The recommender fetches your ratings from BoardGameGeek and uses them to find games you haven't played that you're likely to enjoy. Three recommendation modes are available, each using a different strategy to measure how well a candidate game matches your taste. All modes share a common data pipeline described below.
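Your collection is fetched from BGG's collection endpoint (/xmlapi2/collection), including your personal score, the BGG community average, and your logged play count for each game.
Candidate games come from BGG's similar-games lists, served by the geekitem/recs endpoint; each list is itself computed from co-rating patterns across millions of users.
Feature representation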
Each game is described by a feature vector combining three groups of binary attributes plus one continuous value:
- Mechanics (~200 distinct tags on BGG) — e.g. Worker Placement, Deck Building, Area Control
- Categories (~80 tags) — e.g. Fantasy, Wargame, Economic
- BGG Families (thousands of tags) — highly specific groupings like "Game: Pandemic Series" or "Mechanism: Legacy Games" that cross-cut mechanics and categories
- Complexity weight — BGG's community-rated difficulty, normalised to a 0–1 scale from the 1–5 original
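As a rough illustration of the representation described above, the sketch below encodes a game as a sparse mapping from feature name to value. The `Game` class, its field names, and the `mechanic:` / `category:` / `family:` key prefixes are hypothetical and chosen only for this example. The linear (weight − 1) / 4 scaling is likewise an assumption about how the 1–5 weight is normalised.

```python
from dataclasses import dataclass, field


@dataclass
class Game:
    """Hypothetical container; field names are illustrative, not the recommender's."""
    name: str
    mechanics: set[str] = field(default_factory=set)
    categories: set[str] = field(default_factory=set)
    families: set[str] = field(default_factory=set)
    weight: float = 1.0  # BGG complexity weight on its original 1-5 scale


def normalise_weight(weight: float) -> float:
    """Linearly map the 1-5 complexity weight onto 0-1 (assumed min-max scaling)."""
    return (weight - 1.0) / 4.0


def feature_vector(game: Game) -> dict[str, float]:
    """Sparse vector: 1.0 per tag present, plus the continuous complexity value."""
    vec: dict[str, float] = {}
    for group, tags in (
        ("mechanic", game.mechanics),
        ("category", game.categories),
        ("family", game.families),
    ):
        for tag in tags:
            vec[f"{group}:{tag}"] = 1.0
    vec["complexity"] = normalise_weight(game.weight)
    return vec


example = Game(
    name="Example Game",
    mechanics={"Worker Placement", "Deck Building"},
    categories={"Economic"},
    families={"Mechanism: Legacy Games"},
    weight=3.0,
)
print(feature_vector(example))
```

A sparse dictionary suits this data because any one game carries only a handful of the thousands of possible tags.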
IDF weighting
Rather than treating all features equally, each is weighted by its Inverse Document Frequency (IDF) across the entire game corpus, so a feature's weight falls as the share of games carrying it rises.
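The weighting can be sketched as follows, assuming the common unsmoothed form idf = log(N / df), where N is the number of games and df is the number of games carrying the feature. The exact formula the recommender uses is not shown here, so treat this as one plausible variant. The function names and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def idf_weights(corpus: list[set[str]]) -> dict[str, float]:
    """Return log(N / df) for every feature seen in the corpus.

    corpus holds one set of feature names per game. The unsmoothed
    log(N / df) form is an assumption, not a confirmed formula.
    """
    n_games = len(corpus)
    doc_freq = Counter(feature for features in corpus for feature in features)
    return {f: math.log(n_games / df) for f, df in doc_freq.items()}


def apply_idf(vec: dict[str, float], idf: dict[str, float]) -> dict[str, float]:
    """Scale each feature by its IDF weight.

    Features without an IDF entry (for example a continuous complexity
    value) are left unscaled; that choice is also an assumption.
    """
    return {f: v * idf.get(f, 1.0) for f, v in vec.items()}


# Toy corpus of 10 games: one tag is common, one is rare.
corpus = (
    [{"Dice Rolling"}] * 5
    + [{"Dice Rolling", "Crayon Rail System"}]
    + [{"Hand Management"}] * 4
)
idf = idf_weights(corpus)
weighted = apply_idf({"Dice Rolling": 1.0, "Crayon Rail System": 1.0}, idf)
print(weighted)
```

In this toy corpus the rare tag ends up with a weight several times larger than the common one, which is the behaviour the weighting is meant to produce.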
"Dice Rolling" appears in roughly 60% of games and gets a low IDF weight. "Crayon Rail System" appears in under 3% and gets a high weight. This means the similarity signal responds sharply to rare, distinctive features rather than being drowned out by ubiquitous ones.
Building the taste profile
Your taste profile is a weighted average of the feature vectors of all games you've rated at or above the minimum threshold. Three factors determine each game's contribution weight:
- Rating weight: w = rating − (min_rating − 1). At the default threshold, a 10/10 has four times the influence of a 7/10.
- Play count boost: × (1 + 0.5 × log(plays + 1)). Taking log as the natural logarithm, a game with 50 logged plays contributes about 3× as much as a game with no logged plays at the same rating, and about 2.2× as much as one played once, because repeat plays signal genuine love beyond the initial impression.
- Negative signal: Games rated below the threshold build a separate negative profile, normalised the same way. Your final taste vector subtracts a dampened version of that negative profile.
This means candidates sharing features with games you disliked are actively penalised, not merely ignored.
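The three factors above can be sketched in a few lines. Several details are assumptions made for the example: the default threshold of 7 is inferred from the "four times" comparison, the logarithm is taken to be natural, the `dampening` value of 0.5 is a placeholder because the actual factor is not stated, and the weights for disliked games are supplied by the caller because their formula is not given. All function names are hypothetical.

```python
import math


def contribution_weight(rating: float, plays: int, min_rating: float = 7.0) -> float:
    """Rating weight times play-count boost (natural log assumed)."""
    rating_weight = rating - (min_rating - 1)
    play_boost = 1 + 0.5 * math.log(plays + 1)
    return rating_weight * play_boost


def weighted_average(vectors, weights):
    """Weighted average of sparse vectors (dicts of feature -> value)."""
    total = sum(weights)
    profile = {}
    if total == 0:
        return profile
    for vec, w in zip(vectors, weights):
        for feature, value in vec.items():
            profile[feature] = profile.get(feature, 0.0) + w * value / total
    return profile


def taste_vector(liked, disliked, dampening=0.5):
    """Positive profile minus a dampened negative profile.

    liked and disliked are lists of (feature_vector, weight) pairs.
    dampening=0.5 is a placeholder, not the recommender's real value.
    """
    positive = weighted_average([v for v, _ in liked], [w for _, w in liked])
    negative = weighted_average([v for v, _ in disliked], [w for _, w in disliked])
    taste = dict(positive)
    for feature, value in negative.items():
        taste[feature] = taste.get(feature, 0.0) - dampening * value
    return taste


liked = [({"a": 1.0, "b": 1.0}, 2.0), ({"a": 1.0}, 2.0)]
disliked = [({"b": 1.0, "c": 1.0}, 1.0)]
print(taste_vector(liked, disliked))
```

Feature `c` appears only in the disliked game, so it ends up negative in the taste vector; that negative component is what later penalises candidates sharing it.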
Cosine similarity
For each candidate, the angle between its feature vector and your taste vector is measured using cosine similarity:
similarity = (candidate · taste) / (‖candidate‖ × ‖taste‖)
A value of 1.0 means perfect alignment; 0 means no overlap; negative values (possible because of the negative profile) mean the candidate actively conflicts with your taste. Cosine similarity is scale-invariant — a game with many mechanics isn't automatically more similar just because its vector is longer.
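A minimal sketch of the measure over sparse vectors follows. The function name and the dict-based representation are choices made for this example, and returning 0.0 for an all-zero vector is an assumed convention rather than documented behaviour.

```python
import math


def cosine_similarity(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine of the angle between two sparse vectors (feature -> value)."""
    dot = sum(value * b.get(feature, 0.0) for feature, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumed convention: an empty vector matches nothing
    return dot / (norm_a * norm_b)


taste = {"worker_placement": 0.8, "economic": 0.5, "dice_rolling": -0.3}
candidate = {"worker_placement": 1.0, "economic": 1.0}
print(cosine_similarity(candidate, taste))
```

Multiplying every value in `candidate` by the same positive constant leaves the result unchanged, which is the scale invariance described above.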
Final score
BGG's Bayes-adjusted community rating (which shrinks scores toward the global mean to penalise low-vote-count games) provides a quality floor. Without it, obscure games with perfect niche fits would dominate over well-loved, broadly appealing titles.
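To see why a Bayes-adjusted rating penalises low-vote-count games, consider the generic form of such an average, in which a fixed number of "prior" votes at the global mean is mixed into each game's real votes. This is a sketch of the general technique only: the prior mean and prior vote count below are invented for illustration and should not be read as BGG's actual parameters.

```python
def bayes_average(ratings: list[float], prior_mean: float, prior_votes: int) -> float:
    """Generic Bayesian average: real votes blended with prior votes at the mean."""
    return (prior_mean * prior_votes + sum(ratings)) / (prior_votes + len(ratings))


# Illustrative prior only; not BGG's real values.
PRIOR_MEAN = 5.5
PRIOR_VOTES = 100

obscure = bayes_average([10.0, 10.0], PRIOR_MEAN, PRIOR_VOTES)   # two perfect votes
popular = bayes_average([8.0] * 1000, PRIOR_MEAN, PRIOR_VOTES)   # many good votes
print(obscure, popular)
```

With only two votes the obscure game stays close to the prior mean despite its perfect raw average, while the widely rated game keeps most of its raw score.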
The data source
True collaborative filtering requires a matrix of user × item ratings. BGG has this internally — tens of millions of ratings from over a million users — but doesn't expose it via its public API. Instead, BGG publishes pre-computed "similar games" lists for every title on the site via the geekitem/recs endpoint.
These lists are themselves the output of collaborative filtering: BGG computes them from co-rating patterns. If many users who gave Game A a high rating also gave Game B a high rating, the two games appear in each other's similar lists. We use this signal directly rather than reconstructing the ratings matrix ourselves.
Co-occurrence scoring
In Collaborative mode, similar-game lists are fetched for every game you've liked (not just the top 5 seeds used in Content mode). Each candidate is then scored by the weighted fraction of your liked games that co-occur with it:
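co-occurrence score = Σ( weight_i for liked games whose similar-games list includes the candidate )
                    / Σ( weight_i for all liked games )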
Weights are the same rating × play-count values used in Content mode. A score of 1.0 would mean every liked game you have co-occurs with this candidate; in practice 0.05–0.20 represents a strong signal.
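The weighted-fraction scoring can be sketched as below. The function name and the input shapes are assumptions for this example: `liked_weights` maps each liked game to its contribution weight, and `similar_lists` maps each liked game to the set of titles in its similar-games list. Filtering out games you already own is left out for brevity.

```python
def co_occurrence_scores(
    liked_weights: dict[str, float],
    similar_lists: dict[str, set[str]],
) -> dict[str, float]:
    """Score each candidate by the weighted fraction of liked games listing it."""
    total = sum(liked_weights.values())
    scores: dict[str, float] = {}
    if total == 0:
        return scores
    for game, weight in liked_weights.items():
        for candidate in similar_lists.get(game, set()):
            scores[candidate] = scores.get(candidate, 0.0) + weight / total
    return scores


liked_weights = {"Game A": 4.0, "Game B": 2.0, "Game C": 2.0}
similar_lists = {
    "Game A": {"Candidate X", "Candidate Y"},
    "Game B": {"Candidate X"},
    "Game C": {"Candidate Z"},
}
scores = co_occurrence_scores(liked_weights, similar_lists)
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(name, score)
```

Candidate X ranks first here because it appears in the lists of two liked games, including the most heavily weighted one. The toy scores are far higher than the 0.05–0.20 range quoted above only because the example has three liked games.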
What it finds differently
Content-based filtering can miss games that look superficially different from your taste profile — different setting, different player count, unusual mechanic mix — but consistently delight the same community of players. Collaborative filtering has no concept of game features at all; it finds latent patterns in human preference that don't reduce to any single attribute.
Conversely, if you have a small collection with few liked games, the co-occurrence signal is weak. Collaborative mode works best when you have 20+ games rated above the threshold.
Hybrid mode combines both approaches at the similarity step. The candidate pool is expanded to include co-occurrence candidates from all liked games plus the hot games list, giving the broadest possible set of options. The similarity term is an equal blend of the content similarity and the collaborative co-occurrence score, and that blended similarity then enters the final score:
score = 0.65 × similarity + 0.35 × (bayes_rating / 9)
The content signal keeps results grounded in the features you demonstrably enjoy. The CF signal adds the community dimension — games your taste cohort loves regardless of how they're described. Together they tend to produce more varied top-10 lists than either mode alone.
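A small sketch of how the two signals might be combined follows. It assumes the "equal blend" means a 0.5/0.5 average of the content similarity and the co-occurrence score, which is then fed into the score formula given above. Whether the recommender rescales either signal before blending is not stated, so none is applied here. The function names and the sample candidates are hypothetical.

```python
def hybrid_similarity(content_sim: float, cf_score: float) -> float:
    """Equal blend of the two signals (0.5 / 0.5 weights are an assumption)."""
    return 0.5 * content_sim + 0.5 * cf_score


def final_score(similarity: float, bayes_rating: float) -> float:
    """score = 0.65 x similarity + 0.35 x (bayes_rating / 9)."""
    return 0.65 * similarity + 0.35 * (bayes_rating / 9)


# (name, content similarity, co-occurrence score, Bayes-adjusted rating)
candidates = [
    ("Candidate X", 0.60, 0.15, 7.8),
    ("Candidate Y", 0.30, 0.20, 8.2),
    ("Candidate Z", 0.75, 0.00, 6.1),
]
ranked = sorted(
    ((name, final_score(hybrid_similarity(c, cf), bayes)) for name, c, cf, bayes in candidates),
    key=lambda item: item[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Because raw co-occurrence scores tend to be much smaller than cosine similarities, an unscaled equal blend lets the content signal dominate; a real implementation may normalise the two signals first.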
| Mode | Best when… | Weaker when… |
|---|---|---|
| Content | You have a small collection, or you want tight thematic matches — games that feel obviously related to your favourites | Your taste is broad and doesn't reduce to a consistent mechanic/category profile |
| Collaborative | You have 20+ liked games and want to discover titles that might surprise you thematically but fit your player community | Your collection is small — the co-occurrence signal needs enough liked games to be reliable |
| Hybrid | You want varied suggestions and have a reasonably sized collection; good general-purpose default | You want very tight thematic matches (use Content) or maximum surprise (use Collaborative) |