Re: Okay to use negative value for boost on boolean query?
Thanks again. I was hoping for the easiest way, but the most powerful will have to do! :-)

Cheers,
-joel

On Tuesday, March 3, 2015 at 3:59:29 AM UTC-5, Jörg Prante wrote:
> You are right with function score query. This is surely the most powerful way to manipulate scores the way you like to do.
>
> Jörg
>
> On Tue, Mar 3, 2015 at 2:26 AM, Joel Potischman joel.po...@beatport.com wrote:
>> Thanks Jörg, that makes sense. [...]
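For readers weighing the function_score route mentioned above, here is a minimal sketch of what the wrapper might look like, built as a Python dict so it can be inspected before being sent. The helper name `with_flag_penalty` and the 0.99 factor are hypothetical, chosen to match the "reduce score by say 1%" goal from the thread. It assumes an Elasticsearch release whose function_score query accepts `weight` inside a filtered function (older 1.x releases used `boost_factor` instead), and that documents matching no function filter keep their original score under `boost_mode: multiply`.

```python
import json


def with_flag_penalty(inner_query, flag_field="some_flag", factor=0.99):
    """Wrap an existing query so that docs with flag_field=true have
    their score multiplied by `factor` (hypothetical helper)."""
    return {
        "query": {
            "function_score": {
                # The existing query is reused unchanged.
                "query": inner_query,
                "functions": [
                    {
                        # Only docs matching this filter are affected.
                        "filter": {"term": {flag_field: True}},
                        # Assumption: `weight` is supported by the target release.
                        "weight": factor,
                    }
                ],
                # Multiply the query score by the function result.
                "boost_mode": "multiply",
            }
        }
    }


# The two match clauses from the thread, without the must_not reward clause.
existing_bool = {
    "bool": {
        "should": [
            {"match": {"display_name.raw": {"query": "{{q}}", "type": "phrase"}}},
            {"match": {"display_name.raw_folded": {"boost": 5, "query": "{{q}}", "type": "phrase"}}},
        ]
    }
}

body = with_flag_penalty(existing_bool)
print(json.dumps(body, indent=2))
```

Because the penalty is multiplicative and applied outside the bool query, it should not interact with the boosts on the match clauses, which is the property the additive reward clause lacks. It would still be worth checking the actual scores with the explain API before relying on it.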
Re: Okay to use negative value for boost on boolean query?
Thanks Jörg, that makes sense. I've made that change and it works, but I'm still struggling to have scoring behave the way I want. I simplified the query in my original post for clarity. The actual query, with the new flag, is more like this:

    {
      "query": {
        "bool": {
          "should": [
            {
              "match": {
                "display_name.raw": { "query": "{{q}}", "type": "phrase" }
              }
            },
            {
              "match": {
                "display_name.raw_folded": { "boost": 5, "query": "{{q}}", "type": "phrase" }
              }
            },
            {
              "bool": {
                "must_not": { "term": { "some_flag": true } },
                "boost": 0.1
              }
            }
          ]
        }
      }
    }

I have the *display_name* field indexed two additional ways, raw and raw_folded. raw is an exact phrase match, and raw_folded is the same thing with accents/diacritics stripped, so if a query matches raw_folded, it will always match raw as well and score higher.

I want to use this new clause on *some_flag* to only *slightly* decrease scoring, but I'm finding it very difficult to do so without wildly swinging the scores of other records, due to boost normalization. I'd ideally want the presence of this flag set to true to reduce the score by, say, 1%. The exact number is not important; I just want to make sure that when multiple records match, those with this flag set to true rank slightly lower. Think of it as a tiebreaker flag.

I know about function_score queries, but that would be a major risk to implement now, as a) I believe it would require a substantial rewrite of my template, b) I've never used them before, and c) we are going live very soon. If that's really the right way to do this I'll ticket it for after launch, but I'm hopeful there's a way to do this that only involves minor tweaks to our existing templates.

Any (additional!) guidance is very much appreciated!

-joel

On Monday, March 2, 2015 at 1:25:49 PM UTC-5, Jörg Prante wrote:
> Negative boosts are not supported. [...]
Okay to use negative value for boost on boolean query?
I have a query template that currently returns results exactly as desired. I've been given a requirement to very slightly downrank results that have an optional boolean field set to True. The intent is to ensure that we return everything that matches the query, but if multiple records match, any with this flag set to true come up later in results. In case it's relevant, most records will not contain this flag.

I came up with the following (simplified) version of my query which works great:

    {
      "query": {
        "bool": {
          "should": [
            { "match": { "my_field": { "query": "{{q}}" } } },
            {
              "bool": {
                "must": { "term": { "some_flag": true } },
                "boost": -0.1
              }
            }
          ]
        }
      }
    }

A colleague said that we learned in our Elasticsearch training last year that we should avoid negative boosts, and I should rewrite the second clause as follows:

    {
      "bool": {
        "must_not": { "term": { "some_flag": false } },
        "boost": 0.1
      }
    }

I don't recall learning that, and this construction strikes me as less performant, as it must modify most records instead of just the minority that will have *some_flag=true*.

Because we're both relatively new to Elasticsearch we'd very much appreciate someone with more experience to weigh in. I'm happy to change it if it's the right thing to do. I'm just not sure I believe it is, and if so, why.

Thanks in advance.

-joel

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/2113c100-7283-464e-b989-d58853d24a17%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Re: Okay to use negative value for boost on boolean query?
Negative boosts are not supported.

The challenge in downranking is that each boost value contributes to the score and pushes docs higher, even when using very small or negative boost values. This is not what is expected.

The trick for successful downranking is to reward all docs that do not match the condition

    { "bool": { "must_not": { "term": { "some_flag": true } }, "boost": 0.1 } }

which is equivalent to

    { "bool": { "must": { "term": { "some_flag": false } }, "boost": 0.1 } }

given that some_flag exists in all docs.

This clause means: reward all docs that do not match the condition some_flag=true and push them higher in the result set. In other words, penalize all docs that match the condition some_flag=true.

Jörg

On Mon, Mar 2, 2015 at 7:03 PM, Joel Potischman joel.potisch...@beatport.com wrote:
> I have a query template that currently returns results exactly as desired. [...]
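To illustrate the "reward everything that does not match" trick in isolation, here is a toy model in Python. It is only a sketch: the additive `toy_score` function, the document list, and the 0.1 reward are made up for illustration, and the model deliberately ignores Lucene's query normalization and coordination factors, which is exactly where real scores can diverge from this simple picture.

```python
def toy_score(text_score, some_flag, reward=0.1):
    """Toy additive model: docs WITHOUT the flag receive a small reward,
    which is the same as penalizing docs WITH the flag."""
    return text_score + (0.0 if some_flag else reward)


# Hypothetical documents; "c" has no some_flag field at all.
docs = [
    {"id": "a", "text_score": 2.0, "some_flag": True},
    {"id": "b", "text_score": 2.0, "some_flag": False},
    {"id": "c", "text_score": 1.0},
    {"id": "d", "text_score": 3.0, "some_flag": True},
]

# A must_not clause on some_flag=true also matches docs that lack the field,
# so a missing flag is treated like False here.
ranked = sorted(
    docs,
    key=lambda d: toy_score(d["text_score"], d.get("some_flag", False)),
    reverse=True,
)
print([d["id"] for d in ranked])  # ['d', 'b', 'a', 'c']
```

Documents "a" and "b" tie on text relevance, and the reward puts the unflagged "b" first, while the much stronger match "d" still ranks at the top despite its flag. Document "c" shows why the must_not form is the safer one when most records omit the flag: a must clause on some_flag=false would not match it, so it would miss the reward.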