Re: DisjunctionMaxQuery and scoring

2012-04-20 Thread Benson Margulies
Uwe Schindler > H.-H.-Meier-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > > >> -Original Message- >> From: Uwe Schindler [mailto:u...@thetaphi.de] >> Sent: Friday, April 20, 2012 8:16 AM >> To: java-user@lucene.apach

RE: DisjunctionMaxQuery and scoring

2012-04-19 Thread Uwe Schindler
ay, April 20, 2012 8:16 AM > To: java-user@lucene.apache.org; david_murgatr...@hotmail.com > Subject: RE: DisjunctionMaxQuery and scoring > > Hi, > > I think > > BooleanQuery bq = new BooleanQuery(false); doesn't quite accomplish > > the desired "name I

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread Robert Muir
On Thu, Apr 19, 2012 at 8:32 PM, David Murgatroyd wrote: > In contrast, I think the desire > is that one and only one of the terms in the document match those in the > BooleanQuery so that "Rich" would score higher than "Dick Rich", given > document length normalization. It's almost like a desire

RE: DisjunctionMaxQuery and scoring

2012-04-19 Thread Uwe Schindler
Hi, > I think > BooleanQuery bq = new BooleanQuery(false); doesn't quite accomplish the > desired "name IN (dick, rich)" scoring behavior. This is because (name:dick | > name:rich) with coord=false would score the 'document' "Dick Rich" higher > than "Rich" because the former has two term matches
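
The difference between summing the clause scores (a BooleanQuery with coord disabled) and taking the best clause (DisjunctionMaxQuery) can be sketched with plain arithmetic. This is only an illustration, not a call into Lucene: the class name, method names, and per-term scores of 1.0 are hypothetical, and length normalization is deliberately ignored.

```java
public class DisMaxSketch {
    // BooleanQuery-style scoring with coord disabled: add up every matching clause.
    static double sumScore(double[] clauseScores) {
        double sum = 0.0;
        for (double s : clauseScores) {
            sum += s;
        }
        return sum;
    }

    // DisjunctionMaxQuery-style scoring: best clause plus tieBreaker times the rest.
    // Assumes non-negative clause scores.
    static double disMaxScore(double[] clauseScores, double tieBreaker) {
        double max = 0.0;
        double sum = 0.0;
        for (double s : clauseScores) {
            max = Math.max(max, s);
            sum += s;
        }
        return max + tieBreaker * (sum - max);
    }

    public static void main(String[] args) {
        double[] dickRich = {1.0, 1.0}; // hypothetical: name:dick and name:rich both match
        double[] rich = {1.0};          // hypothetical: only name:rich matches

        System.out.println("sum    Dick Rich: " + sumScore(dickRich));
        System.out.println("sum    Rich:      " + sumScore(rich));
        System.out.println("dismax Dick Rich: " + disMaxScore(dickRich, 0.0));
        System.out.println("dismax Rich:      " + disMaxScore(rich, 0.0));
    }
}
```

With a tie-breaker of 0, the two documents tie under the max rule, whereas the sum rule rewards the document that matches both name variants.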

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread Benson Margulies
FWIW, there seems to be an explain bug in 2.9.1 that is fixed in 3.6.0, so I'm no longer confused about the actual behavior. On Thu, Apr 19, 2012 at 8:32 PM, David Murgatroyd wrote: > [apologies for the earlier errant send] > > I think >  BooleanQuery bq = new BooleanQuery(false); > doesn't quit

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread David Murgatroyd
[apologies for the earlier errant send] I think BooleanQuery bq = new BooleanQuery(false); doesn't quite accomplish the desired "name IN (dick, rich)" scoring behavior. This is because (name:dick | name:rich) with coord=false would score the 'document' "Dick Rich" higher than "Rich" because the f

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread Robert Muir
On Thu, Apr 19, 2012 at 6:36 PM, Benson Margulies wrote: > I see why I'm so confused, but I think I need to construct a simpler test > case. > > My top-level BooleanQuery, which has disableCoord=false, has 22 > clauses. All but three are ordinary SHOULD TermQueries. the remainder > are a spanNear

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread David Murgatroyd
On Apr 19, 2012, at 6:36 PM, Benson Margulies wrote: > I see why I'm so confused, but I think I need to construct a simpler test > case. > > My top-level BooleanQuery, which has disableCoord=false, has 22 > clauses. All but three are ordinary SHOULD TermQueries. the remainder > are a spanNe

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread Benson Margulies
I see why I'm so confused, but I think I need to construct a simpler test case. My top-level BooleanQuery, which has disableCoord=false, has 22 clauses. All but three are ordinary SHOULD TermQueries. The remainder are a spanNear, a nested BooleanQuery, and an empty PhraseQuery (that's a bug).

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread Benson Margulies
On Thu, Apr 19, 2012 at 5:10 PM, Robert Muir wrote: > On Thu, Apr 19, 2012 at 5:05 PM, Benson Margulies > wrote: >> On Thu, Apr 19, 2012 at 4:21 PM, Robert Muir wrote: >>> On Thu, Apr 19, 2012 at 3:49 PM, Benson Margulies >>> wrote: On Thu, Apr 19, 2012 at 1:34 PM, Robert Muir wrote: >>

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread Robert Muir
On Thu, Apr 19, 2012 at 5:05 PM, Benson Margulies wrote: > On Thu, Apr 19, 2012 at 4:21 PM, Robert Muir wrote: >> On Thu, Apr 19, 2012 at 3:49 PM, Benson Margulies >> wrote: >>> On Thu, Apr 19, 2012 at 1:34 PM, Robert Muir wrote: On Thu, Apr 19, 2012 at 1:26 PM, Benson Margulies wr

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread Benson Margulies
On Thu, Apr 19, 2012 at 4:21 PM, Robert Muir wrote: > On Thu, Apr 19, 2012 at 3:49 PM, Benson Margulies > wrote: >> On Thu, Apr 19, 2012 at 1:34 PM, Robert Muir wrote: >>> On Thu, Apr 19, 2012 at 1:26 PM, Benson Margulies >>> wrote: I am trying to solve a problem using DisjunctionMaxQuer

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread Robert Muir
On Thu, Apr 19, 2012 at 3:49 PM, Benson Margulies wrote: > On Thu, Apr 19, 2012 at 1:34 PM, Robert Muir wrote: >> On Thu, Apr 19, 2012 at 1:26 PM, Benson Margulies >> wrote: >>> I am trying to solve a problem using DisjunctionMaxQuery. >>> >>> >>> Consider a query like: >>> >>> a:b OR c:d OR e:

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread Benson Margulies
Turning on disableCoord for a nested boolean query does not seem to change the overall maxCoord term as displayed in explain.

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread Benson Margulies
On Thu, Apr 19, 2012 at 1:34 PM, Robert Muir wrote: > On Thu, Apr 19, 2012 at 1:26 PM, Benson Margulies > wrote: >> I am trying to solve a problem using DisjunctionMaxQuery. >> >> >> Consider a query like: >> >> a:b OR c:d OR e:f OR ... >> name:richard OR name:dick OR name:dickie OR name:rich ..

Re: DisjunctionMaxQuery and scoring

2012-04-19 Thread Robert Muir
On Thu, Apr 19, 2012 at 1:26 PM, Benson Margulies wrote: > I am trying to solve a problem using DisjunctionMaxQuery. > > > Consider a query like: > > a:b OR c:d OR e:f OR ... > name:richard OR name:dick OR name:dickie OR name:rich ... > > At most, one of the richard names matches. So the match sco

DisjunctionMaxQuery and scoring

2012-04-19 Thread Benson Margulies
I am trying to solve a problem using DisjunctionMaxQuery. Consider a query like: a:b OR c:d OR e:f OR ... name:richard OR name:dick OR name:dickie OR name:rich ... At most, one of the richard names matches. So the match score gets dragged down by the long list of things that don't match, as the
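
As a rough illustration of the drag-down effect described in this question, the sketch below assumes the classic coordination factor (matching clauses divided by total clauses) multiplied into the sum of the matching clause scores. It is plain arithmetic rather than Lucene code; the class name, method name, clause counts in `main`, and the per-clause score of 1.0 are all hypothetical.

```java
public class CoordSketch {
    // Sum of the matching clause scores, scaled by coord = matched / total clauses.
    static double scoreWithCoord(double[] matchedScores, int totalClauses) {
        double sum = 0.0;
        for (double s : matchedScores) {
            sum += s;
        }
        double coord = (double) matchedScores.length / totalClauses;
        return sum * coord;
    }

    public static void main(String[] args) {
        // Hypothetical: one field clause and one name variant match, each scoring 1.0.
        double[] matched = {1.0, 1.0};

        // Same matches, but the query lists more name alternatives that cannot match.
        System.out.println("4 clauses:  " + scoreWithCoord(matched, 4));
        System.out.println("22 clauses: " + scoreWithCoord(matched, 22));
    }
}
```

The matching clauses are identical in both calls; only the number of non-matching alternatives grows, and the coord factor shrinks the score accordingly.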