[ 
https://issues.apache.org/jira/browse/SOLR-6602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161987#comment-14161987
 ] 

Frances Webb commented on SOLR-6602:
------------------------------------

It looks like the analysis on name_tokenized drops the fifth "term" (the lone 
single quote) entirely, so when you search only against name_tokenized, just 
four terms have to match. Once feederstate is added, the fifth term is no 
longer eliminated by every search field's analysis. Because it survives the 
analysis for feederstate, a string field that is not tokenized, it is no 
longer exempt from matching under mm=100%.
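
As a rough illustration of that counting (a toy model, not Solr's code), the 
sketch below takes the five query chunks shown in the parsed query in the 
issue and asks which of them produce a term in at least one qf field. The 
function names are hypothetical, and text_analyzer is only a crude stand-in 
for text_general: it does not reproduce the real tokens, only whether a chunk 
yields any term at all.

```python
import re

def text_analyzer(chunk):
    """Crude stand-in for a tokenizing text field: lowercase word runs only."""
    return re.findall(r"\w+", chunk.lower())

def string_analyzer(chunk):
    """Stand-in for an unanalyzed string field: the chunk is one verbatim term."""
    return [chunk]

def required_clauses(chunks, analyzers):
    """Keep the chunks that yield a term in at least one field.

    A chunk that every field's analysis reduces to nothing contributes no
    clause, so it cannot count against mm.
    """
    return [c for c in chunks if any(analyze(c) for analyze in analyzers)]

# The five chunks as they appear in the parsed query quoted in the issue.
CHUNKS = [
    "abc_<iframe",
    "src='loadLocale.js'",
    "onload='javascript:document.XSSed=",
    "name",
    "'",
]

only_text = required_clauses(CHUNKS, [text_analyzer])
with_string = required_clauses(CHUNKS, [text_analyzer, string_analyzer])
print(len(only_text), len(with_string))  # prints "4 5"
```

With only the tokenizing field, the lone quote yields no term and four 
clauses remain. Adding the string field keeps the quote as a fifth clause, 
which then has to match somewhere when mm=100%.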

This isn't an uncommon problem, but it isn't a Solr bug. It calls for 
re-evaluating your search and/or index strategies. Should the single quote 
by itself be populated into feederstate so the fifth term will match? If it 
should, its absence might itself be a bug. Should the search string be 
cleaned up before passing it to Solr? Or maybe feederstate should be a text 
field with punctuation eliminated and a keyword tokenizer? That would cause 
the fifth term to be eliminated from both fields and thus exempted from 
matching.
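
One way the "clean up the search string first" option might look on the 
client side is sketched below. It is deliberately narrow: it only strips 
trailing single quotes, which mirrors the reporter's observation that the 
document is found once the last character is removed. The helper names are 
hypothetical, and a real application would need its own rules for what 
punctuation is safe to drop.

```python
from urllib.parse import urlencode

def trim_trailing_quotes(q):
    """Drop single quotes dangling at the end of the user's input.

    Double quotes are left alone so that a closing phrase quote survives.
    """
    return q.rstrip("'")

def encode_q(q):
    """URL-encode the cleaned string as the q request parameter."""
    return urlencode({"q": trim_trailing_quotes(q)})

# The decoded q value from the issue description.
RAW_Q = "abc_<iframe src='loadLocale.js' onload='javascript:document.XSSed=\"name\"'"

print(encode_q(RAW_Q))
```

The encoded value is the q parameter from the issue's URLs without the final 
%27, so the dangling quote never reaches the query parser.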

> dismax query does not match with additional field in qf
> -------------------------------------------------------
>
>                 Key: SOLR-6602
>                 URL: https://issues.apache.org/jira/browse/SOLR-6602
>             Project: Solr
>          Issue Type: Bug
>          Components: query parsers
>    Affects Versions: 4.10
>            Reporter: Andreas Hubold
>
> A query using the Solr dismax query parser does not match anymore after I've 
> added another field to the qf parameter. I'd expect that an additional field 
> in the qf parameter would not lead to fewer matches. 
> *Test setup*
> A document with rather strange content in a field "name_tokenized" of type 
> "text_general":
> {noformat}
> abc_<iframe src='loadLocale.js' onload='javascript:document.XSSed="name"' 
> width=0 height=0>
> {noformat}
> can be found when using the following dismax query with qf set to field 
> "name_tokenized" only: 
> {noformat}
> http://localhost:44080/solr/studio/editor?deftype=dismax&q=abc_%3Ciframe+src%3D%27loadLocale.js%27+onload%3D%27javascript%3Adocument.XSSed%3D%22name%22%27&debug=true&echoParams=all&qf=name_tokenized^2
> {noformat}
> When submitting exactly the same query but with an additional field 
> "feederstate" of type "string" in the qf parameter, I don't get any results.
> {noformat}
> http://localhost:44080/solr/studio/editor?deftype=dismax&q=abc_%3Ciframe+src%3D%27loadLocale.js%27+onload%3D%27javascript%3Adocument.XSSed%3D%22name%22%27&debug=true&echoParams=all&qf=name_tokenized^2%20feederstate
> {noformat}
> The decoded value of q is: {noformat}abc_<iframe src='loadLocale.js' 
> onload='javascript:document.XSSed="name"'{noformat} and it seems the trailing 
> single-quote causes problems here. (In fact, I can find the document when I 
> remove the last char)
> The parsed query for the latter case is
> {noformat}
> (
>   +((
>     DisjunctionMaxQuery((feederstate:abc_<iframe | ((name_tokenized:abc_ 
> name_tokenized:iframe)^2.0))~0.1)
>     DisjunctionMaxQuery((feederstate:src='loadLocale.js' | 
> ((name_tokenized:src name_tokenized:loadlocale.js)^2.0))~0.1)
>     DisjunctionMaxQuery((feederstate:onload='javascript:document.XSSed= | 
> ((name_tokenized:onload name_tokenized:javascript:document.xssed)^2.0))~0.1)
>     DisjunctionMaxQuery((feederstate:name | name_tokenized:name^2.0)~0.1)
>     DisjunctionMaxQuery((feederstate:')~0.1)
>   )~5)
>   DisjunctionMaxQuery((textbody:"abc_ iframe src loadlocale.js onload 
> javascript:document.xssed name" | name_tokenized:"abc_ iframe src 
> loadlocale.js onload javascript:document.xssed name"^2.0)~0.1)
> )/no_coord
> {noformat}
> I've configured the called search handler with {{<str name="mm">100%</str>}} 
> so that all of the 5 dismax queries at the top must match. But this one does 
> not match: {{DisjunctionMaxQuery((feederstate:')~0.1)}}
> (All mentioned field types are taken from the example schema.xml.)



