: My original assumption for the DisMax Handler was that it would just take the
: original query string and pass it to every field in its field list using the
: field's configured analyzer stack. Maybe in the end add some stuff for the
: special options and so on ... and then send the query to Lucene. Can you explain
: why this approach was not chosen?

because then it wouldn't be the DisMaxRequestHandler.

seriously: the point of dismax is to build up a DisjunctionMaxQuery for 
each "chunk" in the query string and populate those DisjunctionMaxQueries 
with the Queries produced by analyzing that "chunk" against each field in 
the qf -- then all of the DisjunctionMaxQueries are grouped into a 
BooleanQuery with a minNrShouldMatch.
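
as a rough illustration of that structure (a toy sketch, not the actual 
handler code): the hypothetical dismax_structure function below splits q 
on whitespace to get the "chunks" and skips per-field analysis entirely, 
both of which are simplifying assumptions.  it only shows how the chunks 
and the qf fields combine into the nested query.

```python
def dismax_structure(q, qf, tie=0.1):
    """Toy sketch: render the nested query shape as a string.

    Assumptions (not from the real handler): chunks are whitespace
    separated, and no field analyzer is applied to them.
    """
    dismaxes = []
    for chunk in q.split():
        # one DisjunctionMaxQuery per chunk, one clause per qf field
        clauses = " | ".join(f"{field}:{chunk}" for field in qf)
        dismaxes.append(f"({clauses})~{tie}")
    # all DisjunctionMaxQueries grouped as clauses of one BooleanQuery
    return "(" + " ".join(dismaxes) + ")"


print(dismax_structure("blue tooth", ["category", "name"]))
```

for q=blue tooth and qf=category name this prints the same shape as the 
debugQuery output quoted in this thread.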

if you look at the query toString from debugQuery (using a non trivial qf 
param and a q string containing more than one "chunk") you can see what i 
mean.  your example shows it pretty well actually...

: > : > : > ((category:blue | name:blue)~0.1 (category:tooth | name:tooth)~0.1)

the point is to build those DisjunctionMaxQueries -- so that each "chunk" 
only contributes significantly based on the highest scoring field that 
chunk appears in ... in your example someone typing "blue tooth" can get a 
match when a doc matches blue in one field and tooth in another -- that 
wouldn't be possible with the approach you describe.  the Query structure 
also means that a doc where "tooth" appears in both the category and name 
fields but "blue" doesn't appear at all won't score as high as a doc that 
matches "blue" in category and "tooth" in name (although you have to look 
at the score explanations to really see what i mean by that)
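
to make the scoring point concrete, here is a toy model (not Lucene's 
real scoring).  it assumes every field match scores a flat 1.0, ignores 
tf/idf, norms, boosts and coord, and uses hypothetical helper names.  the 
only parts taken from the real thing are the DisjunctionMaxQuery rule 
(best field score plus tie times the sum of the other field scores) and 
the BooleanQuery summing its matching clauses.

```python
def dismax_score(field_scores, tie=0.1):
    """Best field score plus tie * (sum of the other matching fields)."""
    matched = [s for s in field_scores if s > 0]
    if not matched:
        return 0.0
    best = max(matched)
    return best + tie * (sum(matched) - best)


def doc_score(doc, chunks, qf, tie=0.1):
    """Sum one dismax score per chunk (toy assumption: a match scores 1.0)."""
    total = 0.0
    for chunk in chunks:
        per_field = [
            1.0 if chunk in doc.get(field, "").split() else 0.0
            for field in qf
        ]
        total += dismax_score(per_field, tie)
    return total


qf = ["category", "name"]
chunks = ["blue", "tooth"]

spread = {"category": "blue", "name": "tooth"}    # each chunk in one field
doubled = {"category": "tooth", "name": "tooth"}  # "tooth" twice, no "blue"

print(doc_score(spread, chunks, qf))   # both chunks match somewhere
print(doc_score(doubled, chunks, qf))  # second field adds only tie * 1.0
```

under these assumptions the doc matching "tooth" in two fields gets only 
a small tie-breaker bonus for the second field, so it ranks below the doc 
that matches each chunk once.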


There are certainly a lot of improvements that could be made to dismax ... 
more customization in terms of how the query string is parsed before 
building up the DisjunctionMaxQueries and calling the individual field 
analyzers would certainly be one way it could improve ... but so far no 
one has attempted anything like that.




-Hoss
