Re: Okay to use negative value for boost on boolean query?

joergpra...@gmail.com Tue, 03 Mar 2015 01:00:07 -0800

You are right about the function score query. It is surely the most
powerful way to manipulate scores the way you want.
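
For illustration, a minimal sketch of how the tiebreaker could look as a
function score query. It assumes Elasticsearch 1.4 or later (where the
"weight" parameter is available), reuses the simplified "my_field" clause
from the original post as a stand-in for the real bool query, and picks
0.99 only as an example value for the roughly 1% reduction mentioned:

```json
{
    "query": {
        "function_score": {
            "query": {
                "bool": {
                    "should": [
                        {
                            "match": {
                                "my_field": {
                                    "query": "{{q}}"
                                }
                            }
                        }
                    ]
                }
            },
            "functions": [
                {
                    "filter": {
                        "term": {
                            "some_flag": true
                        }
                    },
                    "weight": 0.99
                }
            ],
            "boost_mode": "multiply"
        }
    }
}
```

With "boost_mode" set to "multiply", docs matching the filter should have
their query score multiplied by 0.99, while docs that match no function
should keep their score unchanged. This is untested against the real
template, so treat it as a starting point rather than a drop-in change.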


Jörg

On Tue, Mar 3, 2015 at 2:26 AM, Joel Potischman <joel.potisch...@beatport.com> wrote:

> Thanks Jörg, that makes sense.
>
> I've made that change and it works but I'm still struggling to have
> scoring behave the way I want. I simplified the query in my original post
> for clarity. The actual query, with the new flag, is more like this:
>
> {
>     "query": {
>         "bool": {
>             "should": [
>                 {
>                     "match": {
>                         "display_name.raw": {
>                             "query": "{{q}}",
>                             "type": "phrase"
>                         }
>                     }
>                 },
>                 {
>                     "match": {
>                         "display_name.raw_folded": {
>                             "boost": 5,
>                             "query": "{{q}}",
>                             "type": "phrase"
>                         }
>                     }
>                 },
>                 {
>                     "bool": {
>                         "must_not": {
>                             "term": {
>                                 "some_flag": true
>                             }
>                         },
>                         "boost": 0.00001
>                     }
>                 }
>             ]
>         }
>     }
> }
>
> I have the *display_name* field indexed two additional ways, "raw" and
> "raw_folded". "raw" is an exact phrase match, and "raw_folded" is the same
> thing with accents/diacritics stripped, so if a query matches raw_folded,
> it will always match raw as well and score higher.
>
> I want to use this new clause on *some_flag* to only *slightly* decrease
> scoring, but I'm finding it very difficult to do so without wildly
> swinging the scores of other records due to boost normalization. Ideally,
> setting this flag to true would reduce the score by, say, 1%. The exact
> number is not important; I just want to make sure that when multiple
> records match, those with this flag set to true rank slightly lower.
> Think of it as a tiebreaker flag.
>
> I know about function_score queries, but that would be a major risk to
> implement now, as a) I believe it would require a substantial rewrite of my
> template, b) I've never used them before, and c) we are going live very
> soon. If that's really the right way to do this I'll ticket it for after
> launch, but I'm hopeful there's a way to do this that only involves minor
> tweaks to our existing templates. Any (additional!) guidance is very much
> appreciated!
>
> -joel
>
> On Monday, March 2, 2015 at 1:25:49 PM UTC-5, Jörg Prante wrote:
>>
>> Negative boosts are not supported. The challenge in downranking is that
>> every boost value contributes to the score and pushes docs higher, even
>> when the boost is very small or negative. This is not what you would
>> expect.
>>
>> The trick for successful downranking is to reward all docs that do not
>> match the condition:
>>
>> {
>>     "bool": {
>>         "must_not": {
>>             "term": {
>>                 "some_flag": true
>>             }
>>         },
>>         "boost": 0.00001
>>     }
>> }
>>
>> which is equivalent to
>>
>> {
>>     "bool": {
>>         "must": {
>>             "term": {
>>                 "some_flag": false
>>             }
>>         },
>>         "boost": 0.00001
>>     }
>> }
>>
>> given that some_flag exists in all docs.
>>
>> This clause means: reward all docs that do not match the condition
>> some_flag=true and push them higher in the result set. In other words,
>> penalize all docs that match the condition some_flag=true.
>>
>> Jörg
>>
>>
>>
>> On Mon, Mar 2, 2015 at 7:03 PM, Joel Potischman <joel.po...@beatport.com>
>> wrote:
>>
>>> I have a query template that currently returns results exactly as
>>> desired. I've been given a requirement to very slightly downrank results
>>> that have an optional boolean field set to True. The intent is to ensure
>>> that we return everything that matches the query, but if multiple records
>>> match, any with this flag set to true come up later in results. In case
>>> it's relevant, most records will not contain this flag. I came up with the
>>> following (simplified) version of my query which works great:
>>>
>>> {
>>>     "query": {
>>>         "bool": {
>>>             "should": [
>>>                 {
>>>                     "match": {
>>>                         "my_field": {
>>>                             "query": "{{q}}"
>>>                         }
>>>                     }
>>>                 },
>>>                 {
>>>                     "bool": {
>>>                         "must": {
>>>                             "term": {
>>>                                 "some_flag": true
>>>                             }
>>>                         },
>>>                         "boost": -0.00001
>>>                     }
>>>                 }
>>>             ]
>>>         }
>>>     }
>>> }
>>>
>>> A colleague said that we learned in our Elasticsearch training last year
>>> that we should avoid negative boosts, and that I should rewrite the second
>>> clause as follows:
>>>
>>> {
>>>     "bool": {
>>>         "must_not": {
>>>             "term": {
>>>                 "some_flag": false
>>>             }
>>>         },
>>>         "boost": 0.00001
>>>     }
>>> }
>>>
>>> I don't recall learning that, and this construction strikes me as less
>>> performant, as it must modify most records instead of just the minority
>>> that will have *some_flag=true*.
>>>
>>> Because we're both relatively new to Elasticsearch, we'd very much
>>> appreciate someone with more experience weighing in. I'm happy to change it
>>> if it's the right thing to do. I'm just not sure I believe it is, and if
>>> so, why.
>>>
>>> Thanks in advance.
>>>
>>> -joel
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elasticsearc...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/elasticsearch/2113c100-7283-464e-b989-d58853d24a17%40googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>

