Re: Okay to use negative value for boost on boolean query?

2015-03-03 Thread Joel Potischman
Thanks again. I was hoping for the easiest way, but the most powerful will
have to do! :-)

Cheers,

-joel

On Tuesday, March 3, 2015 at 3:59:29 AM UTC-5, Jörg Prante wrote:
>
> You are right with function score query. This is surely the most powerful 
> way to manipulate scores the way you like to do.
>
> Jörg
>
> On Tue, Mar 3, 2015 at 2:26 AM, Joel Potischman wrote:
>
>> Thanks Jörg, that makes sense.
>>
>> I've made that change and it works but I'm still struggling to have 
>> scoring behave the way I want. I simplified the query in my original post 
>> for clarity. The actual query, with the new flag, is more like this:
>>
>> {
>> "query": {
>> "bool": {
>> "should": [
>> {
>> "match": {
>> "display_name.raw": {
>> "query": "{{q}}",
>> "type": "phrase"
>> }
>> }
>> },
>> {
>> "match": {
>> "display_name.raw_folded": {
>> "boost": 5,
>> "query": "{{q}}",
>> "type": "phrase"
>> }
>> }
>> },
>> {
>> "bool": {
>> "must_not": {
>> "term": {
>> "some_flag": true
>> }
>> },
>> "boost": 0.1
>> }
>> }
>> ]
>> }
>> }
>> }
>>
>> I have the *display_name* field indexed two additional ways - "raw" and 
>> "raw_folded", "raw" is an exact phrase match, and "raw_folded" is the same 
>> thing with accents/diacritics stripped, so if a query matches raw_folded, 
>> it will always match raw as well and score higher.
>>
>> I want to use this new clause on *some_flag* to only *slightly* decrease
>> scoring, but I'm finding it very difficult to do so without wildly swinging
>> the scores of other records due to boost normalization. Ideally, having this
>> flag set to true would reduce the score by, say, 1%. The exact number is not
>> important; I just want to make sure that when multiple records match, those
>> with this flag set to true rank slightly lower. Think of it as a tiebreaker
>> flag.
>>
>> I know about function_score queries but that would be a major risk to 
>> implement now, as a) I believe it would require a substantial rewrite of my 
>> template, and b) I've never used them before, and c) we are going live very 
>> soon. If that's really the right way to do this I'll ticket it for after 
>> launch, but I'm hopeful there's a way to do this that only involves minor 
>> tweaks to our existing templates. Any (additional!) guidance is very much 
>> appreciated!
>>
>> -joel
>>
>> On Monday, March 2, 2015 at 1:25:49 PM UTC-5, Jörg Prante wrote:
>>>
>>> Negative boosts are not supported. The challenge in downranking is that 
>>> each boost value will contribute to the score and push docs higher, also 
>>> when using very small boost values or negative values. This is not what is 
>>> expected.
>>>
>>> The trick for successful downranking is to reward all docs that do not 
>>> match the condition
>>>
>>> {
>>> "bool": {
>>> "must_not": {
>>> "term": {
>>> "some_flag": true
>>> }
>>> },
>>> "boost": 0.1
>>> }
>>> }
>>>
>>> which is equivalent to
>>>
>>> {
>>> "bool": {
>>> "must": {
>>> "term": {
>>> "some_flag": false
>>> }
>>> },
>>> "boost": 0.1
>>> }
>>> }
>>>
>>> given that some_flag exists in all docs.
>>>
>>> This clause means: reward all docs that do not match the condition 
>>> some_flag=true and push them higher in the result set. In other words, 
>>> penalize all docs that match the condition some_flag=true.
>>>
>>> Jörg

Re: Okay to use negative value for boost on boolean query?

2015-03-03 Thread joergpra...@gmail.com
You are right about the function score query. It is surely the most powerful
way to manipulate scores the way you would like.

Jörg
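
A minimal sketch of the function_score approach mentioned above, written as a Python dict that serializes to the request body. It assumes an Elasticsearch version whose function_score accepts a `filter` together with `weight`; the helper names `with_flag_penalty` and `display_name_query`, the 0.99 weight, and the sample query string are hypothetical and only illustrate the shape. The idea is to leave the existing bool query untouched and multiply the score of flagged docs by a factor just below 1.

```python
import json


def display_name_query(q):
    """The two phrase-match clauses from the thread, without the flag clause."""
    return {
        "bool": {
            "should": [
                {"match": {"display_name.raw": {"query": q, "type": "phrase"}}},
                {
                    "match": {
                        "display_name.raw_folded": {
                            "boost": 5,
                            "query": q,
                            "type": "phrase",
                        }
                    }
                },
            ]
        }
    }


def with_flag_penalty(base_query, flag_field="some_flag", weight=0.99):
    """Wrap base_query so docs with flag_field=true get their score scaled by weight."""
    return {
        "query": {
            "function_score": {
                "query": base_query,
                "functions": [
                    # Only docs matching the filter are scaled (assumed syntax).
                    {"filter": {"term": {flag_field: True}}, "weight": weight}
                ],
                "score_mode": "multiply",
                "boost_mode": "multiply",
            }
        }
    }


body = with_flag_penalty(display_name_query("some artist"))
print(json.dumps(body, indent=2))
```

Because the wrapper takes the existing query as-is, the change to a template is limited to the outer function_score layer.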

On Tue, Mar 3, 2015 at 2:26 AM, Joel Potischman <
joel.potisch...@beatport.com> wrote:

> Thanks Jörg, that makes sense.
>
> I've made that change and it works but I'm still struggling to have
> scoring behave the way I want. I simplified the query in my original post
> for clarity. The actual query, with the new flag, is more like this:
>
> {
> "query": {
> "bool": {
> "should": [
> {
> "match": {
> "display_name.raw": {
> "query": "{{q}}",
> "type": "phrase"
> }
> }
> },
> {
> "match": {
> "display_name.raw_folded": {
> "boost": 5,
> "query": "{{q}}",
> "type": "phrase"
> }
> }
> },
> {
> "bool": {
> "must_not": {
> "term": {
> "some_flag": true
> }
> },
> "boost": 0.1
> }
> }
> ]
> }
> }
> }
>
> I have the *display_name* field indexed two additional ways - "raw" and
> "raw_folded", "raw" is an exact phrase match, and "raw_folded" is the same
> thing with accents/diacritics stripped, so if a query matches raw_folded,
> it will always match raw as well and score higher.
>
> I want to use this new clause on *some_flag* to only *slightly* decrease
> scoring, but I'm finding it very difficult to do so without wildly swinging
> the scores of other records due to boost normalization. Ideally, having this
> flag set to true would reduce the score by, say, 1%. The exact number is not
> important; I just want to make sure that when multiple records match, those
> with this flag set to true rank slightly lower. Think of it as a tiebreaker
> flag.
>
> I know about function_score queries but that would be a major risk to
> implement now, as a) I believe it would require a substantial rewrite of my
> template, and b) I've never used them before, and c) we are going live very
> soon. If that's really the right way to do this I'll ticket it for after
> launch, but I'm hopeful there's a way to do this that only involves minor
> tweaks to our existing templates. Any (additional!) guidance is very much
> appreciated!
>
> -joel
>
> On Monday, March 2, 2015 at 1:25:49 PM UTC-5, Jörg Prante wrote:
>>
>> Negative boosts are not supported. The challenge in downranking is that
>> each boost value will contribute to the score and push docs higher, also
>> when using very small boost values or negative values. This is not what is
>> expected.
>>
>> The trick for successful downranking is to reward all docs that do not
>> match the condition
>>
>> {
>> "bool": {
>> "must_not": {
>> "term": {
>> "some_flag": true
>> }
>> },
>> "boost": 0.1
>> }
>> }
>>
>> which is equivalent to
>>
>> {
>> "bool": {
>> "must": {
>> "term": {
>> "some_flag": false
>> }
>> },
>> "boost": 0.1
>> }
>> }
>>
>> given that some_flag exists in all docs.
>>
>> This clause means: reward all docs that do not match the condition
>> some_flag=true and push them higher in the result set. In other words,
>> penalize all docs that match the condition some_flag=true.
>>
>> Jörg

Re: Okay to use negative value for boost on boolean query?

2015-03-02 Thread Joel Potischman
Thanks Jörg, that makes sense.

I've made that change and it works but I'm still struggling to have scoring 
behave the way I want. I simplified the query in my original post for 
clarity. The actual query, with the new flag, is more like this:

{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "display_name.raw": {
              "query": "{{q}}",
              "type": "phrase"
            }
          }
        },
        {
          "match": {
            "display_name.raw_folded": {
              "boost": 5,
              "query": "{{q}}",
              "type": "phrase"
            }
          }
        },
        {
          "bool": {
            "must_not": {
              "term": {
                "some_flag": true
              }
            },
            "boost": 0.1
          }
        }
      ]
    }
  }
}

I have the *display_name* field indexed two additional ways, "raw" and
"raw_folded". "raw" is an exact phrase match, and "raw_folded" is the same
thing with accents/diacritics stripped, so if a query matches raw_folded,
it will always match raw as well and score higher.

I want to use this new clause on *some_flag* to only *slightly* decrease
scoring, but I'm finding it very difficult to do so without wildly swinging
the scores of other records due to boost normalization. Ideally, having this
flag set to true would reduce the score by, say, 1%. The exact number is not
important; I just want to make sure that when multiple records match, those
with this flag set to true rank slightly lower. Think of it as a tiebreaker
flag.

I know about function_score queries but that would be a major risk to 
implement now, as a) I believe it would require a substantial rewrite of my 
template, and b) I've never used them before, and c) we are going live very 
soon. If that's really the right way to do this I'll ticket it for after 
launch, but I'm hopeful there's a way to do this that only involves minor 
tweaks to our existing templates. Any (additional!) guidance is very much 
appreciated!

-joel
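
To make the "reduce score by say 1%" goal concrete, here is a small arithmetic sketch. It is not Elasticsearch code: the `rank` helper, the document names, and the base scores are all made up, and it only assumes that a flagged doc's final score is its base score multiplied by 0.99.

```python
def rank(docs, weight=0.99):
    """docs: list of (name, base_score, flagged). Returns names by adjusted score, highest first."""
    adjusted = [
        (name, score * (weight if flagged else 1.0))
        for name, score, flagged in docs
    ]
    return [name for name, _ in sorted(adjusted, key=lambda d: d[1], reverse=True)]


# Two docs tie on base score; the flagged one drops just below the other.
tied = [("a", 2.0, True), ("b", 2.0, False), ("c", 1.0, False)]
print(rank(tied))

# A flagged doc that is less than 1% ahead on base score also drops below.
close = [("d", 2.01, True), ("b", 2.0, False)]
print(rank(close))
```

The second case shows the trade-off: a multiplicative penalty is a tiebreaker for equal scores, but it also reorders docs whose base scores differ by less than the penalty.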

On Monday, March 2, 2015 at 1:25:49 PM UTC-5, Jörg Prante wrote:
>
> Negative boosts are not supported. The challenge in downranking is that 
> each boost value will contribute to the score and push docs higher, also 
> when using very small boost values or negative values. This is not what is 
> expected.
>
> The trick for successful downranking is to reward all docs that do not 
> match the condition
>
> {
> "bool": {
> "must_not": {
> "term": {
> "some_flag": true
> }
> },
> "boost": 0.1
> }
> }
>
> which is equivalent to
>
> {
> "bool": {
> "must": {
> "term": {
> "some_flag": false
> }
> },
> "boost": 0.1
> }
> }
>
> given that some_flag exists in all docs.
>
> This clause means: reward all docs that do not match the condition 
> some_flag=true and push them higher in the result set. In other words, 
> penalize all docs that match the condition some_flag=true.
>
> Jörg

Re: Okay to use negative value for boost on boolean query?

2015-03-02 Thread joergpra...@gmail.com
Negative boosts are not supported. The challenge in downranking is that
every boost value contributes to the score and pushes docs higher, even
when the boost value is very small or negative. This is not what you would
expect.

The trick for successful downranking is to reward all docs that do not
match the condition

{
  "bool": {
    "must_not": {
      "term": {
        "some_flag": true
      }
    },
    "boost": 0.1
  }
}

which is equivalent to

{
  "bool": {
    "must": {
      "term": {
        "some_flag": false
      }
    },
    "boost": 0.1
  }
}

given that some_flag exists in all docs.

This clause means: reward all docs that do not match the condition
some_flag=true and push them higher in the result set. In other words,
penalize all docs that match the condition some_flag=true.

Jörg
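
The reasoning above can be illustrated with a deliberately simplified toy model, in which a doc's score is just the sum of the should clauses it matches. This ignores coord and query normalization and is not Lucene's actual formula; the `toy_score` function and all numbers are hypothetical.

```python
def toy_score(name_score, flagged, clause_on_flagged, clause_score):
    """Additive toy model: a matching should clause adds clause_score to the doc."""
    clause_matches = flagged if clause_on_flagged else not flagged
    return name_score + (clause_score if clause_matches else 0.0)


# A clause that matches flagged docs lifts them, even with a tiny positive boost.
print(toy_score(1.0, True, True, 0.01), toy_score(1.0, False, True, 0.01))

# A clause that matches the unflagged docs instead leaves flagged docs below them.
print(toy_score(1.0, True, False, 0.1), toy_score(1.0, False, False, 0.1))
```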



On Mon, Mar 2, 2015 at 7:03 PM, Joel Potischman <
joel.potisch...@beatport.com> wrote:

> I have a query template that currently returns results exactly as desired.
> I've been given a requirement to very slightly downrank results that have
> an optional boolean field set to True. The intent is to ensure that we
> return everything that matches the query, but if multiple records match,
> any with this flag set to true come up later in results. In case it's
> relevant, most records will not contain this flag. I came up with the
> following (simplified) version of my query which works great:
>
> {
>   "query": {
>     "bool": {
>       "should": [
>         {
>           "match": {
>             "my_field": {
>               "query": "{{q}}"
>             }
>           }
>         },
>         {
>           "bool": {
>             "must": {
>               "term": {
>                 "some_flag": true
>               }
>             },
>             "boost": -0.1
>           }
>         }
>       ]
>     }
>   }
> }
>
> A colleague said that we learned in our Elasticsearch training last year
> that we should avoid negative boosts, and I should rewrite the second
> clause as follows
>
> {
>   "bool": {
>     "must_not": {
>       "term": {
>         "some_flag": false
>       }
>     },
>     "boost": 0.1
>   }
> }
>
> I don't recall learning that, and this construction strikes me as less
> performant, as it must modify most records instead of just the minority
> that will have *some_flag=true*.
>
> Because we're both relatively new to Elasticsearch we'd very much
> appreciate someone with more experience to weigh in. I'm happy to change it
> if it's the right thing to do. I'm just not sure I believe it is, and if
> so, why.
>
> Thanks in advance.
>
> -joel