Re: Okay to use negative value for boost on boolean query?

2015-03-03 Thread Joel Potischman
Thanks again. I was hoping for the easiest way, but the most powerful will 
have to do! :-)

Cheers,

-joel

On Tuesday, March 3, 2015 at 3:59:29 AM UTC-5, Jörg Prante wrote:

 You are right about the function score query. It is surely the most 
 powerful way to manipulate scores the way you want.

 Jörg


Re: Okay to use negative value for boost on boolean query?

2015-03-02 Thread Joel Potischman
Thanks Jörg, that makes sense.

I've made that change and it works but I'm still struggling to have scoring 
behave the way I want. I simplified the query in my original post for 
clarity. The actual query, with the new flag, is more like this:

{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "display_name.raw": {
              "query": "{{q}}",
              "type": "phrase"
            }
          }
        },
        {
          "match": {
            "display_name.raw_folded": {
              "boost": 5,
              "query": "{{q}}",
              "type": "phrase"
            }
          }
        },
        {
          "bool": {
            "must_not": {
              "term": {
                "some_flag": true
              }
            },
            "boost": 0.1
          }
        }
      ]
    }
  }
}

I have the *display_name* field indexed two additional ways, raw and 
raw_folded. raw is an exact phrase match, and raw_folded is the same 
thing with accents/diacritics stripped, so if a query matches raw_folded, 
it will always match raw as well and score higher.

I want to use this new clause on *some_flag* to only *slightly* decrease 
scoring, but I'm finding it very difficult to do so without wildly 
swinging the scores of other records due to boost normalization. Ideally, 
having this flag set to true would reduce the score by, say, 1%. The exact 
number is not important; I just want to make sure that when multiple 
records match, those with this flag set to true rank slightly lower. Think 
of it as a tiebreaker flag.

I know about function_score queries but that would be a major risk to 
implement now, as a) I believe it would require a substantial rewrite of my 
template, and b) I've never used them before, and c) we are going live very 
soon. If that's really the right way to do this I'll ticket it for after 
launch, but I'm hopeful there's a way to do this that only involves minor 
tweaks to our existing templates. Any (additional!) guidance is very much 
appreciated!

-joel
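
For reference, the 1% tiebreaker described above could be expressed by wrapping the existing bool query in a function_score query. The following is only a minimal sketch, not the template used in this thread: the helper names with_flag_penalty and build_body are hypothetical, the 0.99 weight is an illustrative choice, and it assumes an Elasticsearch version whose function_score supports "weight" with a per-function "filter" (older releases used a different parameter name). The match clauses are copied from the query above, including "type": "phrase".

```python
import json


def with_flag_penalty(base_query, flag_field="some_flag", weight=0.99):
    """Wrap base_query so that docs with flag_field=true keep `weight` of their score.

    Hypothetical helper: docs that do not match the filter should keep
    their original score, since no function applies to them.
    """
    return {
        "function_score": {
            "query": base_query,
            "functions": [
                {
                    # Only docs with the flag set to true are scaled down.
                    "filter": {"term": {flag_field: True}},
                    "weight": weight,
                }
            ],
            # Multiply the original query score by the function result.
            "boost_mode": "multiply",
        }
    }


def build_body(q):
    """Build the request body for search text q (stands in for the {{q}} placeholder)."""
    base = {
        "bool": {
            "should": [
                {"match": {"display_name.raw": {"query": q, "type": "phrase"}}},
                {
                    "match": {
                        "display_name.raw_folded": {
                            "boost": 5,
                            "query": q,
                            "type": "phrase",
                        }
                    }
                },
            ]
        }
    }
    return {"query": with_flag_penalty(base)}


if __name__ == "__main__":
    print(json.dumps(build_body("example"), indent=2))
```

Because the penalty is a multiplier applied after the text clauses are scored, it does not take part in the boost normalization of the should clauses, which is the property the tiebreaker needs.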


Okay to use negative value for boost on boolean query?

2015-03-02 Thread Joel Potischman
I have a query template that currently returns results exactly as desired. 
I've been given a requirement to very slightly downrank results that have 
an optional boolean field set to True. The intent is to ensure that we 
return everything that matches the query, but if multiple records match, 
any with this flag set to true come up later in results. In case it's 
relevant, most records will not contain this flag. I came up with the 
following (simplified) version of my query which works great:

{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "my_field": {
              "query": "{{q}}"
            }
          }
        },
        {
          "bool": {
            "must": {
              "term": {
                "some_flag": true
              }
            },
            "boost": -0.1
          }
        }
      ]
    }
  }
}

A colleague said that we learned in our Elasticsearch training last year 
that we should avoid negative boosts, and I should rewrite the second 
clause as follows

{
  "bool": {
    "must_not": {
      "term": {
        "some_flag": false
      }
    },
    "boost": 0.1
  }
}

I don't recall learning that, and this construction strikes me as less 
performant, as it must modify most records instead of just the minority 
that will have *some_flag=true*.

Because we're both relatively new to Elasticsearch, we'd very much 
appreciate it if someone with more experience would weigh in. I'm happy to 
change it if it's the right thing to do. I'm just not sure I believe it is, 
and if so, why.

Thanks in advance.

-joel

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/2113c100-7283-464e-b989-d58853d24a17%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Okay to use negative value for boost on boolean query?

2015-03-02 Thread joergpra...@gmail.com
Negative boosts are not supported. The challenge in downranking is that
each boost value contributes to the score and pushes docs higher, even
when the boost value is very small or negative, which is not what you
would expect.

The trick for successful downranking is to reward all docs that do not
match the condition

{
  "bool": {
    "must_not": {
      "term": {
        "some_flag": true
      }
    },
    "boost": 0.1
  }
}

which is equivalent to

{
  "bool": {
    "must": {
      "term": {
        "some_flag": false
      }
    },
    "boost": 0.1
  }
}

given that some_flag exists in all docs.

This clause means: reward all docs that do not match the condition
some_flag=true and push them higher in the result set. In other words,
penalize all docs that match the condition some_flag=true.

Jörg
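
The effect of rewarding non-matching docs can be illustrated with a toy calculation. This is a deliberately simplified sketch, not how Lucene computes scores: it assumes the final score is just the text score plus a fixed bonus for docs that match the must_not clause, and it ignores coordination factors and query normalization. The names toy_score and rank, the sample docs, and the 0.1 bonus are all made up for illustration.

```python
def toy_score(text_score, flagged, bonus=0.1):
    """Toy additive model: unflagged docs match the must_not clause and get the bonus."""
    return text_score + (0.0 if flagged else bonus)


def rank(docs, bonus=0.1):
    """Return docs sorted by descending toy score."""
    return sorted(
        docs,
        key=lambda d: toy_score(d["text_score"], d["some_flag"], bonus),
        reverse=True,
    )


# Hypothetical docs: "a" and "b" tie on text relevance, "c" is a stronger match.
docs = [
    {"id": "a", "text_score": 2.0, "some_flag": True},
    {"id": "b", "text_score": 2.0, "some_flag": False},
    {"id": "c", "text_score": 3.0, "some_flag": True},
]

if __name__ == "__main__":
    print([d["id"] for d in rank(docs)])  # ['c', 'b', 'a']
```

In this model the flagged doc "a" drops below the otherwise identical "b", while the clearly better match "c" stays on top even though it is flagged, which is the tiebreaker behaviour asked for in the thread.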



