Re: Okay to use negative value for boost on boolean query?

Joel Potischman Tue, 03 Mar 2015 06:40:25 -0800

Thanks again. I was hoping for the easiest way, but the most powerful will 
have to do! :-)
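
For anyone landing on this thread later, here is a rough sketch of how the 
function_score approach might look for the tiebreaker case discussed below. 
It is not the template we actually shipped. The helper name 
with_flag_penalty and the 0.99 factor are made up for illustration, and it 
assumes an Elasticsearch version whose function_score functions accept 
"weight" (as I understand it, older releases spell this "boost_factor"). The 
idea is to leave the existing bool query untouched and multiply the score of 
flagged documents by a factor slightly below 1:

```python
import json


def with_flag_penalty(original_query, flag_field="some_flag", factor=0.99):
    """Wrap an existing query in a function_score query.

    Documents matching the filter (flag_field == true) have their score
    multiplied by `factor`. Hypothetical helper, not part of any client library.
    """
    return {
        "query": {
            "function_score": {
                # The unchanged query that produces the base score.
                "query": original_query,
                "functions": [
                    {
                        # The function only applies to flagged documents.
                        "filter": {"term": {flag_field: True}},
                        "weight": factor,
                    }
                ],
                # Combine the function result with the base score by multiplying.
                "boost_mode": "multiply",
            }
        }
    }


# The two phrase clauses from the query quoted further down in this thread.
existing = {
    "bool": {
        "should": [
            {"match": {"display_name.raw": {"query": "{{q}}", "type": "phrase"}}},
            {
                "match": {
                    "display_name.raw_folded": {
                        "boost": 5,
                        "query": "{{q}}",
                        "type": "phrase",
                    }
                }
            },
        ]
    }
}

body = with_flag_penalty(existing)
print(json.dumps(body, indent=4))
```

Because the penalty is a multiplier rather than an extra "should" clause, it 
should not change the scores of unflagged documents, which is the property 
the small-boost clause could not give me.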


Cheers,

-joel

On Tuesday, March 3, 2015 at 3:59:29 AM UTC-5, Jörg Prante wrote:
>
> You are right with function score query. This is surely the most powerful 
> way to manipulate scores the way you like to do.
>
> Jörg
>
> On Tue, Mar 3, 2015 at 2:26 AM, Joel Potischman <joel.po...@beatport.com> 
> wrote:
>
>> Thanks Jörg, that makes sense.
>>
>> I've made that change and it works but I'm still struggling to have 
>> scoring behave the way I want. I simplified the query in my original post 
>> for clarity. The actual query, with the new flag, is more like this:
>>
>> {
>>     "query": {
>>         "bool": {
>>             "should": [
>>                 {
>>                     "match": {
>>                         "display_name.raw": {
>>                             "query": "{{q}}",
>>                             "type": "phrase"
>>                         }
>>                     }
>>                 },
>>                 {
>>                     "match": {
>>                         "display_name.raw_folded": {
>>                             "boost": 5,
>>                             "query": "{{q}}",
>>                             "type": "phrase"
>>                         }
>>                     }
>>                 },
>>                 {
>>                     "bool": {
>>                         "must_not": {
>>                             "term": {
>>                                 "some_flag": true
>>                             }
>>                         },
>>                         "boost": 0.00001
>>                     }
>>                 }
>>             ]
>>         }
>>     }
>> }
>>
>> I have the *display_name* field indexed two additional ways, "raw" and 
>> "raw_folded". "raw" is an exact phrase match, and "raw_folded" is the same 
>> thing with accents/diacritics stripped, so if a query matches raw_folded, 
>> it will always match raw as well and score higher.
>>
>> I want to use this new clause on *some_flag* to only *slightly* decrease 
>> scoring, but I'm finding it very difficult to do so without wildly 
>> swinging the scores of other records due to boost normalization. Ideally, 
>> having this flag set to true would reduce the score by, say, 1%. The exact 
>> number is not important; I just want to make sure that when multiple 
>> records match, those with this flag set to true rank slightly lower. Think 
>> of it as a tiebreaker flag.
>>
>> I know about function_score queries, but that would be a major risk to 
>> implement now, as a) I believe it would require a substantial rewrite of my 
>> template, b) I've never used them before, and c) we are going live very 
>> soon. If that's really the right way to do this I'll ticket it for after 
>> launch, but I'm hopeful there's a way to do this that only involves minor 
>> tweaks to our existing templates. Any (additional!) guidance is very much 
>> appreciated!
>>
>> -joel
>>
>> On Monday, March 2, 2015 at 1:25:49 PM UTC-5, Jörg Prante wrote:
>>>
>>> Negative boosts are not supported. The challenge in downranking is that 
>>> each boost value will contribute to the score and push docs higher, even 
>>> when using very small boost values or negative values. That is not the 
>>> intended effect.
>>>
>>> The trick for successful downranking is to reward all docs that do not 
>>> match the condition:
>>>
>>> {
>>>     "bool": {
>>>         "must_not": {
>>>             "term": {
>>>                 "some_flag": true
>>>             }
>>>         },
>>>         "boost": 0.00001
>>>     }
>>> }
>>>
>>> which is equivalent to
>>>
>>> {
>>>     "bool": {
>>>         "must": {
>>>             "term": {
>>>                 "some_flag": false
>>>             }
>>>         },
>>>         "boost": 0.00001
>>>     }
>>> }
>>>
>>> given that some_flag exists in all docs.
>>>
>>> This clause means: reward all docs that do not match the condition 
>>> some_flag=true and push them higher in the result set. In other words, 
>>> penalize all docs that match the condition some_flag=true.
>>>
>>> Jörg
>>>
>>>
>>>
>>> On Mon, Mar 2, 2015 at 7:03 PM, Joel Potischman <joel.po...@beatport.com
>>> > wrote:
>>>
>>>> I have a query template that currently returns results exactly as 
>>>> desired. I've been given a requirement to very slightly downrank results 
>>>> that have an optional boolean field set to True. The intent is to ensure 
>>>> that we return everything that matches the query, but if multiple records 
>>>> match, any with this flag set to true come up later in results. In case 
>>>> it's relevant, most records will not contain this flag. I came up with the 
>>>> following (simplified) version of my query which works great:
>>>>
>>>> {
>>>>     "query": {
>>>>         "bool": {
>>>>             "should": [
>>>>                 {
>>>>                     "match": {
>>>>                         "my_field": {
>>>>                             "query": "{{q}}"
>>>>                         }
>>>>                     }
>>>>                 },
>>>>                 {
>>>>                     "bool": {
>>>>                         "must": {
>>>>                             "term": {
>>>>                                 "some_flag": true
>>>>                             }
>>>>                         },
>>>>                         "boost": -0.00001
>>>>                     }
>>>>                 }
>>>>             ]
>>>>         }
>>>>     }
>>>> }
>>>>
>>>> A colleague said that we learned in our Elasticsearch training last 
>>>> year that we should avoid negative boosts, and I should rewrite the second 
>>>> clause as follows
>>>>
>>>> {
>>>>     "bool": {
>>>>         "must_not": {
>>>>             "term": {
>>>>                 "some_flag": false
>>>>             }
>>>>         },
>>>>         "boost": 0.00001
>>>>     }
>>>> }
>>>>
>>>> I don't recall learning that, and this construction strikes me as less 
>>>> performant, as it must modify most records instead of just the minority 
>>>> that will have *some_flag=true*.
>>>>
>>>> Because we're both relatively new to Elasticsearch, we'd very much 
>>>> appreciate someone with more experience weighing in. I'm happy to change 
>>>> it if it's the right thing to do. I'm just not sure I believe it is, and 
>>>> if so, why.
>>>>
>>>> Thanks in advance.
>>>>
>>>> -joel
>>>>
>>>> -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "elasticsearch" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to elasticsearc...@googlegroups.com.
>>>> To view this discussion on the web visit https://groups.google.com/d/
>>>> msgid/elasticsearch/2113c100-7283-464e-b989-d58853d24a17%
>>>> 40googlegroups.com 
>>>> <https://groups.google.com/d/msgid/elasticsearch/2113c100-7283-464e-b989-d58853d24a17%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>> .
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>
>

