I'm aware that using a field tokenized with KeywordTokenizerFactory in a
dismax 'qf' is often going to result in 0 hits on that field (whenever a
whitespace-containing query is entered). But I do it anyway, because when
a non-whitespace-containing query is entered, it does hit. And in those
cases where it doesn't hit, I figure okay, well, the other fields in qf
will hit or not, and that's good enough.
And usually that works. But it works _differently_ when my query
contains an ampersand (or any other punctuation), resulting in 0 hits when
it shouldn't, and I can't figure out why.
Basically,
&defType=dismax&mm=100%&q=one : two&qf=text_field
gets hits. The ":" is thrown out by the analysis on text_field, but the mm
still passes somehow, right?
But, in the same index:
&defType=dismax&mm=100%&q=one : two&qf=text_field keyword_tokenized_text_field
gets 0 hits. Somehow, maybe, the inclusion of the
keyword_tokenized_text_field in the qf causes dismax to calculate the mm
differently: it decides there are three tokens in there and they all must
match, and the token ":" can never match because it's not in my index
(it's stripped out)... but somehow this isn't a problem unless I include a
keyword-tokenized field in the qf?
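To make that guess concrete, here is a toy model in Python. It is only a sketch of the behaviour I suspect, not Solr's actual code: the two analyzer functions and the helper names (standard_like, keyword_like, build_clauses, required_matches) are made up for illustration. The assumption is that dismax builds one clause per whitespace-separated chunk of the query, that a field contributes to a clause only if its analyzer produces a token for that chunk, and that a clause with no contributing fields is dropped before mm is applied.

```python
import re

def standard_like(chunk):
    # Toy stand-in for a word-based analyzer: keeps lowercase
    # alphanumeric runs, so pure punctuation yields no tokens.
    return re.findall(r"[a-z0-9]+", chunk.lower())

def keyword_like(chunk):
    # Toy stand-in for a keyword-tokenized field: the whole input
    # becomes a single token, punctuation included.
    return [chunk] if chunk else []

def build_clauses(query, fields):
    # One clause per whitespace-separated chunk; a clause lists the
    # (field, term) alternatives that survived analysis.
    clauses = []
    for chunk in query.split():
        alternatives = []
        for field, analyze in fields.items():
            tokens = analyze(chunk)
            if tokens:
                alternatives.append((field, " ".join(tokens)))
        if alternatives:  # assumed: an empty clause is dropped entirely
            clauses.append(alternatives)
    return clauses

def required_matches(clauses, mm_percent=100):
    # Assumed: mm is a percentage of the clauses that survived.
    return len(clauses) * mm_percent // 100

text_only = build_clauses("one : two", {"text_field": standard_like})
both = build_clauses(
    "one : two",
    {"text_field": standard_like,
     "keyword_tokenized_text_field": keyword_like},
)

print(len(text_only), required_matches(text_only))
print(len(both), required_matches(both))
print(both[1])
```

Under those assumptions the text_field-only query ends up with two clauses, while adding the keyword-tokenized field keeps a third clause for ":" that only that field can satisfy, so mm=100% would demand a match that no document has. That would be consistent with the (isbn_t::) clause and the ~3 in the debugging trace further down, but I haven't confirmed it against the Solr source.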
This is really confusing. If anyone has any idea what I'm talking about
and can shed any light on it, it would be much appreciated.
The conclusion I am reaching is just NEVER include anything but a more
or less ordinarily tokenized field in a dismax qf. Sadly, it was useful
for certain use cases for me.
Oh, hey, the debugging trace would probably be useful:
<lst name="debug">
<str name="rawquerystring">churchill : roosevelt</str>
<str name="querystring">churchill : roosevelt</str>
<str name="parsedquery">
+((DisjunctionMaxQuery((isbn_t:churchill | title1_t:churchil)~0.01)
DisjunctionMaxQuery((isbn_t::)~0.01)
DisjunctionMaxQuery((isbn_t:roosevelt | title1_t:roosevelt)~0.01))~3)
DisjunctionMaxQuery((title2_unstem:"churchill roosevelt"~3^240.0 |
text:"churchil roosevelt"~3^10.0 | title2_t:"churchil roosevelt"~3^50.0
| author_unstem:"churchill roosevelt"~3^400.0 |
title_exactmatch:churchill roosevelt^500.0 | title1_t:"churchil
roosevelt"~3^60.0 | title1_unstem:"churchill roosevelt"~3^320.0 |
author2_unstem:"churchill roosevelt"~3^240.0 | title3_unstem:"churchill
roosevelt"~3^80.0 | subject_t:"churchil roosevelt"~3^10.0 |
other_number_unstem:"churchill roosevelt"~3^40.0 |
subject_unstem:"churchill roosevelt"~3^80.0 | title_series_t:"churchil
roosevelt"~3^40.0 | title_series_unstem:"churchill roosevelt"~3^60.0 |
text_unstem:"churchill roosevelt"~3^80.0)~0.01)
</str>
<str name="parsedquery_toString">
+(((isbn_t:churchill | title1_t:churchil)~0.01 (isbn_t::)~0.01
(isbn_t:roosevelt | title1_t:roosevelt)~0.01)~3)
(title2_unstem:"churchill roosevelt"~3^240.0 | text:"churchil
roosevelt"~3^10.0 | title2_t:"churchil roosevelt"~3^50.0 |
author_unstem:"churchill roosevelt"~3^400.0 | title_exactmatch:churchill
roosevelt^500.0 | title1_t:"churchil roosevelt"~3^60.0 |
title1_unstem:"churchill roosevelt"~3^320.0 | author2_unstem:"churchill
roosevelt"~3^240.0 | title3_unstem:"churchill roosevelt"~3^80.0 |
subject_t:"churchil roosevelt"~3^10.0 | other_number_unstem:"churchill
roosevelt"~3^40.0 | subject_unstem:"churchill roosevelt"~3^80.0 |
title_series_t:"churchil roosevelt"~3^40.0 |
title_series_unstem:"churchill roosevelt"~3^60.0 |
text_unstem:"churchill roosevelt"~3^80.0)~0.01
</str>
<lst name="explain"/>
<str name="QParser">DisMaxQParser</str>
<null name="altquerystring"/>
<null name="boostfuncs"/>
<lst name="timing">
<double name="time">6.0</double>
<lst name="prepare">
<double name="time">3.0</double>
<lst name="org.apache.solr.handler.component.QueryComponent">
<double name="time">2.0</double>
</lst>
<lst name="org.apache.solr.handler.component.FacetComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.HighlightComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.StatsComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.SpellCheckComponent">
<double name="time">0.0</double>
</lst>
<lst name="org.apache.solr.handler.component.DebugComponent">
<double name="time">0.0</double>
</lst>
</lst>