I think you will find that for small documents that aren't really
documents at all, but collections of data points, such as a product
library, you won't use the built-in scoring at all. The built-in
scoring works well for long works of text such as books and articles. For
a product library, you will use an array of custom boosts through the
function score query. The key is to get all those data points into your
documents so that you can boost on matches.

For example, with "xbox," you could have a keywords field that includes
xbox only for consoles. Or maybe Xbox is the title of the console itself,
while games just list Xbox under their console compatibility. Then you
boost matches on the title so that only the console scores higher.

For "macbook," you could have an accessories flag, where items flagged as
an accessory receive a negative boost.
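
Something like the following, as a rough sketch of those two ideas in one
function score query. Python is used here only to build the request body.
The `keywords` and `is_accessory` fields are hypothetical names for the
data points described above, and the factors 3.0 and 0.2 are arbitrary.
Since `boost_factor` multiplies the score, a factor below 1 is what plays
the role of the "negative boost" (later Elasticsearch releases renamed
`boost_factor` to `weight`).

```python
import json


def build_product_query(text, keyword_boost=3.0, accessory_factor=0.2):
    """Build a function_score request body for a product search.

    `keywords` and `is_accessory` are hypothetical fields that would
    have to be indexed on each product document.
    """
    term = text.lower()
    return {
        "query": {
            "function_score": {
                "query": {"term": {"title_simple": term}},
                "functions": [
                    # Promote products whose curated keywords include the term
                    {"filter": {"term": {"keywords": term}},
                     "boost_factor": keyword_boost},
                    # Demote anything flagged as an accessory
                    {"filter": {"term": {"is_accessory": True}},
                     "boost_factor": accessory_factor},
                ],
                # Multiply the factors of all matching functions together
                "score_mode": "multiply",
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_product_query("macbook"), indent=2))
```

The same body works for "xbox" or "macbook" because the rules live in the
indexed data, not in the query text.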

For Apple food vs. Apple products, you can use sales data or user history. 

The key to relevancy that works for your organization is giving
Elasticsearch all the data points it needs to base its decisions on. For
products, your best solution is a big old set of constant score queries
wrapped in some wild function score queries.
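
A minimal sketch of what that could look like, assuming the rules are kept
as plain tuples and turned into one request body in Python. The helper
names, the `is_accessory` field, and all boost values are made up for
illustration, and how the constant scores combine exactly depends on the
Elasticsearch version and similarity settings.

```python
def constant_clause(field, value, boost):
    """One data point: a fixed score if `field` holds `value`."""
    return {"constant_score": {"filter": {"term": {field: value}},
                               "boost": boost}}


def build_rule_query(term, boosts, demotions):
    """Combine constant-score clauses and wrap them in function_score.

    boosts:    list of (field, value, boost) placed in bool/should
    demotions: list of (field, value, factor) applied multiplicatively
    """
    return {
        "query": {
            "function_score": {
                "query": {
                    "bool": {
                        # The document has to match the search term at all
                        "must": [constant_clause("title_simple", term, 1.0)],
                        # Each matching data point contributes its fixed boost
                        "should": [constant_clause(f, v, b)
                                   for f, v, b in boosts],
                    }
                },
                "functions": [
                    {"filter": {"term": {f: v}}, "boost_factor": factor}
                    for f, v, factor in demotions
                ],
            }
        }
    }


# Hypothetical rules; the values depend on how each field is analyzed
query = build_rule_query(
    "ipad",
    boosts=[("brand", "apple", 5.0),
            ("category_simple", "electronics", 2.0)],
    demotions=[("is_accessory", True, 0.2)],
)
```

Adding a business rule then means adding a tuple, not writing a new query.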

On Tuesday, January 7, 2014 12:36:43 PM UTC-5, David Mitchell wrote:
>
> What is the best way to make products more relevant outside of the default 
> scoring?
>
> I have an unknown number of business rules that will dictate a document's 
> "relativity," meaning that even if one document scores higher than 
> another, the lower-scoring document may be more relevant to the user. 
>
> Given two products with similar titles but different attributes and the 
> query "ipad", I'd like to promote one over the other:
>
> {
>    "title_simple": "iPad Mini Case",
>    "description_simple": "Royce Leather iPad Mini Case:...",
>    "category": "Computers & Accessories",
>    "brand" : "Royce Leather",
>    "id": 794809052574
> }
>
> {
>   "title_simple": "Apple iPad mini (16GB, Wi-Fi + Sprint 4G, White)",
>   "description_simple": "iPad mini features a beautiful 7.9\" display...",
>   "category": "Electronics",
>   "brand" : "Apple",
>   "id": 885909689712
> }
>
>
> A simple query scores the iPad case high:
>
> {
>    "query": { "term": { "title_simple": "ipad" }}
> }
>
>
> But business rules dictate that the actual iPad be on the top. 
>
> I can run a filter or score based on the attribute or brand to get what 
> I'm looking for:
>
> {
>    "query": {
>       "function_score": {
>           "query": { "term": { "title_simple": "ipad" } },
>           "functions" : [{
>                   "filter" : { "term": { "category_simple": "electronics" } },
>                   "boost_factor" : 2
>           }]          
>       }
>    }
> }
>
> But building a bunch of these isn't scalable or reasonable. 
>
> I have an unknown number of these and that number will continue to grow. 
> Some other examples:
>
> - query "xbox" should promote consoles over games
> - query "macbook" should promote Apple computers over macbook sleeves
> - query "Apple" should promote Apple products and not food
>
> Building a thousand queries based on function filters is unreasonable and 
> unscalable. 
>
> Some possible solutions I've considered:
>
> - building a lookup table that will build the filter portion of the query 
> (this could get unmaintainable)
> - Including a pre-calculated score in the document (unfortunately, doesn't 
> work on a per query basis, as the score may change based on the user's 
> needs)
> - Extending the DefaultSimilarity class (I'm not sure how this helps me in 
> this scenario, though)
>
> What have other people done to solve these problems? Is there something 
> else that I'm missing that could help?
>
> Here's a runnable gist - 
> https://gist.github.com/dlmitchell/826e8fb7ca89bed30e4a/raw/613be2c202b26faaaa5899bdcfeac714737beb49/sample_mapping.sh