Hongtai:

First, many thanks for reporting this in such detail. It really helps, and
it’s obvious you’ve dug into the problem rather than just throwing it over
the wall.

Please go ahead and raise a JIRA.

Best,
Erick

On Mar 2, 2020, at 03:45, Hongtai Xue <[email protected]> wrote:



Hi,



Our team found some strange behavior in the Solr query parser.

In certain cases, clauses on an unindexed field are silently ignored.



For a query like q=A:1 OR B:1 OR A:2 OR B:2,

if field B is not indexed (but has docValues="true"), "B:1" is lost.



But if you write the query as q=A:1 OR A:2 OR B:1 OR B:2,

it works perfectly.



The only difference between the two queries is the order of the clauses:

one is ABAB, the other is AABB.
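
To make the ABAB versus AABB difference concrete, here is a minimal Python sketch that regroups "field:value" clauses so that clauses on the same field sit next to each other. The helper names (group_clauses_by_field, or_query) are hypothetical and not part of Solr. Whether regrouping avoids the problem in general is an assumption based only on the two queries above.

```python
from collections import OrderedDict


def group_clauses_by_field(clauses):
    """Reorder 'field:value' clauses so same-field clauses are adjacent.

    Fields keep the order of their first appearance, and clauses keep
    their relative order within each field.
    """
    grouped = OrderedDict()
    for clause in clauses:
        field, _, _value = clause.partition(":")
        grouped.setdefault(field, []).append(clause)
    return [c for field_clauses in grouped.values() for c in field_clauses]


def or_query(clauses):
    """Join clauses into a single OR query string."""
    return " OR ".join(clauses)


abab = ["A:1", "B:1", "A:2", "B:2"]   # interleaved order (the failing form)
aabb = group_clauses_by_field(abab)    # grouped order (the working form)
print(or_query(aabb))                  # A:1 OR A:2 OR B:1 OR B:2
```

This only handles simple field:value clauses. It does not parse nested groups, ranges, or quoted phrases.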



■ Reproduction steps and example

You can easily reproduce this problem on a Solr collection created with the
_default configset and loaded with the exampledocs/books.csv data.



1. Create a collection with the _default configset.

bin/solr create -c books -s 2 -rf 2



2. Post books.csv.

bin/post -c books example/exampledocs/books.csv



3. Run the following query.

http://localhost:8983/solr/books/select?q=%2B%28name_str%3AFoundation+OR+cat%3Abook+OR+name_str%3AJhereg+OR+cat%3Acd%29&debug=query
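
For anyone who wants to generate the request URLs rather than paste them, here is a minimal Python sketch using only the standard library. The function name debug_query_url is hypothetical, and the host, port, and collection name are taken from the steps above. The sketch only builds the URL and does not contact Solr.

```python
from urllib.parse import urlencode

# Endpoint from the reproduction steps above (local Solr, "books" collection).
SOLR_SELECT = "http://localhost:8983/solr/books/select"


def debug_query_url(clauses):
    """Build a /select URL that ORs the clauses and asks for debug=query."""
    q = "+(" + " OR ".join(clauses) + ")"
    # urlencode escapes '+', '(', ')' and ':' and turns spaces into '+'.
    return SOLR_SELECT + "?" + urlencode({"q": q, "debug": "query"})


interleaved = ["name_str:Foundation", "cat:book", "name_str:Jhereg", "cat:cd"]
grouped = ["name_str:Foundation", "name_str:Jhereg", "cat:book", "cat:cd"]

print(debug_query_url(interleaved))
print(debug_query_url(grouped))
```

The two printed URLs correspond to the interleaved and grouped clause orders compared in this report.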





I printed the query-parsing debug information.

You can see that "name_str:Foundation" is lost.



query: "name_str:Foundation OR cat:book OR name_str:Jhereg OR cat:cd"

(Please note that, in hex, "Jhereg" is "4a 68 65 72 65 67" and "Foundation"
is "46 6f 75 6e 64 61 74 69 6f 6e".)
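
As a quick way to check those byte strings, here is a small Python sketch. The helper name hex_bytes is hypothetical, and it assumes the terms are encoded as UTF-8, which for these ASCII-only terms matches the bytes quoted above.

```python
def hex_bytes(term):
    """Return the UTF-8 bytes of a term as space-separated lowercase hex."""
    return " ".join(f"{b:02x}" for b in term.encode("utf-8"))


print(hex_bytes("Jhereg"))      # 4a 68 65 72 65 67
print(hex_bytes("Foundation"))  # 46 6f 75 6e 64 61 74 69 6f 6e
```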

--------

  "debug":{
    "rawquerystring":"+(name_str:Foundation OR cat:book OR name_str:Jhereg OR cat:cd)",
    "querystring":"+(name_str:Foundation OR cat:book OR name_str:Jhereg OR cat:cd)",
    "parsedquery":"+(cat:book cat:cd (name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]]))",
    "parsedquery_toString":"+(cat:book cat:cd name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]])",
    "QParser":"LuceneQParser"}}

--------



But for the query "name_str:Foundation OR name_str:Jhereg OR cat:book OR
cat:cd",

everything is OK: "name_str:Foundation" is not lost.

--------

  "debug":{
    "rawquerystring":"+(name_str:Foundation OR name_str:Jhereg OR cat:book OR cat:cd)",
    "querystring":"+(name_str:Foundation OR name_str:Jhereg OR cat:book OR cat:cd)",
    "parsedquery":"+(cat:book cat:cd ((name_str:[[46 6f 75 6e 64 61 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]]) (name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]])))",
    "parsedquery_toString":"+(cat:book cat:cd (name_str:[[46 6f 75 6e 64 61 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]] name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]]))",
    "QParser":"LuceneQParser"}}

--------

http://localhost:8983/solr/books/select?q=%2B%28name_str%3AFoundation+OR+name_str%3AJhereg+OR+cat%3Abook+OR+cat%3Acd%29&debug=query
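
The comparison of the two debug outputs can also be done mechanically. Here is a minimal Python sketch that reports which name_str terms are missing from a "parsedquery" string. The function name missing_terms is hypothetical. The sketch assumes each term on the unindexed field shows up as a hex range of the form field:[[..] TO [..]], as in the outputs in this report. The two parsedquery values are copied from those outputs with the mail line wrapping removed.

```python
def missing_terms(parsed_query, field, terms):
    """Return the terms whose range clause does not appear in parsed_query."""
    missing = []
    for term in terms:
        # Terms appear as space-separated hex bytes in the debug output.
        hex_term = " ".join(f"{b:02x}" for b in term.encode("utf-8"))
        expected = f"{field}:[[{hex_term}] TO [{hex_term}]]"
        if expected not in parsed_query:
            missing.append(term)
    return missing


# "parsedquery" for the interleaved order (name_str, cat, name_str, cat).
interleaved_parsed = (
    "+(cat:book cat:cd (name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]]))"
)
# "parsedquery" for the grouped order (name_str, name_str, cat, cat).
grouped_parsed = (
    "+(cat:book cat:cd ((name_str:[[46 6f 75 6e 64 61 74 69 6f 6e] TO "
    "[46 6f 75 6e 64 61 74 69 6f 6e]]) (name_str:[[4a 68 65 72 65 67] TO "
    "[4a 68 65 72 65 67]])))"
)

terms = ["Foundation", "Jhereg"]
print(missing_terms(interleaved_parsed, "name_str", terms))  # ['Foundation']
print(missing_terms(grouped_parsed, "name_str", terms))      # []
```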



We did a little research, and we wonder whether this is a bug in
SolrQueryParser.

More specifically, we think the if statement here might be wrong:

https://github.com/apache/lucene-solr/blob/branch_8_4/solr/core/src/java/org/apache/solr/parser/SolrQueryParserBase.java#L711



Could you please tell us whether this is a bug, or whether our query is
simply written incorrectly?



Thanks,

Hongtai Xue
