[ 
https://issues.apache.org/jira/browse/SOLR-14300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057682#comment-17057682
 ] 

Hongtai Xue commented on SOLR-14300:
------------------------------------

Hi, I have attached a patch that fixes this issue.
h3. about the bug

The if statement here is wrong.
{code:java}
 for (BooleanClause clause : clauses) {
     ...
     // NOTE: for the query "B:1 OR B:2",
     // by the time the parser reaches "B:2",
     // fieldValues here is not null, because "B:1" is already stored in it.
     fieldValues = fmap.get(sfield); 
     ...
     if ((fieldValues == null && useTermsQuery) || !sfield.indexed()) {
         // <-- BUG: if B is not indexed, fieldValues is overwritten and "B:1" is lost
         fieldValues = new ArrayList<>(2);
         fmap.put(sfield, fieldValues);
     }
     ...
 }
{code}
As the comment above notes, if sfield is not indexed, fieldValues is always overwritten,
 even when it is not null.

Another question is why only "q=A:1 OR B:1 OR A:2 OR B:2" causes the problem,
 while "q=A:1 OR A:2 OR B:1 OR B:2" is fine.

The answer is 
[here|https://github.com/apache/lucene-solr/blob/branch_8_4/solr/core/src/java/org/apache/solr/parser/SolrQueryParserBase.java#L705].
 The buggy code only runs when the field changes from one clause to the next. 
 If consecutive clauses use the same field, it is skipped.
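
To make the ordering effect concrete, here is a minimal sketch that models the loop quoted above. It is a simplified simulation, not the Solr source: the classes {{Field}} and {{Clause}}, the {{group}} method, and the {{fixed}} flag are hypothetical names introduced only for illustration, and indexed fields that do not get a list are simply skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of the grouping loop; NOT the actual Solr source.
class FieldGroupingSketch {

    // Hypothetical stand-in for SchemaField: a name and an "indexed" flag.
    static final class Field {
        final String name;
        final boolean indexed;
        Field(String name, boolean indexed) { this.name = name; this.indexed = indexed; }
    }

    // Hypothetical stand-in for one "field:value" clause.
    static final class Clause {
        final Field field;
        final String value;
        Clause(Field field, String value) { this.field = field; this.value = value; }
    }

    static final Field A = new Field("A", true);   // indexed
    static final Field B = new Field("B", false);  // not indexed

    // q=A:1 OR B:1 OR A:2 OR B:2
    static List<Clause> abab() {
        return Arrays.asList(new Clause(A, "1"), new Clause(B, "1"),
                             new Clause(A, "2"), new Clause(B, "2"));
    }

    // q=A:1 OR A:2 OR B:1 OR B:2
    static List<Clause> aabb() {
        return Arrays.asList(new Clause(A, "1"), new Clause(A, "2"),
                             new Clause(B, "1"), new Clause(B, "2"));
    }

    // Collects values per field name. "fixed" selects the patched condition.
    static Map<String, List<String>> group(List<Clause> clauses,
                                           boolean useTermsQuery, boolean fixed) {
        Map<String, List<String>> fmap = new LinkedHashMap<>();
        Field lastField = null;
        List<String> fieldValues = null;
        for (Clause clause : clauses) {
            Field sfield = clause.field;
            if (sfield != lastField) {   // this branch only runs when the field changes
                lastField = sfield;
                fieldValues = fmap.get(sfield.name);
                boolean createList = fixed
                        ? fieldValues == null && (useTermsQuery || !sfield.indexed)
                        : (fieldValues == null && useTermsQuery) || !sfield.indexed;
                if (createList) {
                    // With the old condition this replaces an existing list for B.
                    fieldValues = new ArrayList<>(2);
                    fmap.put(sfield.name, fieldValues);
                }
            }
            if (fieldValues != null) {
                fieldValues.add(clause.value);
            }
        }
        return fmap;
    }

    public static void main(String[] args) {
        System.out.println(group(abab(), false, false)); // old condition, ABAB: {B=[2]}
        System.out.println(group(aabb(), false, false)); // old condition, AABB: {B=[1, 2]}
        System.out.println(group(abab(), false, true));  // patched condition, ABAB: {B=[1, 2]}
    }
}
```

In this model, the ABAB order re-enters the field-change branch for B a second time, which is where the old condition discards the list that already holds "1".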
h3. how to fix

It is a simple bug, and the fix changes only one line. 
{code:java}
-            if ((fieldValues == null && useTermsQuery) || !sfield.indexed()) {
+            if (fieldValues == null && (useTermsQuery || !sfield.indexed())) {
{code}
fieldValues will only be initialized when it's null.
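
As a quick check of the one-line change above, the following sketch enumerates all input combinations and reports where the two conditions disagree. The class and method names are hypothetical and exist only for this illustration; the two boolean expressions are copied from the diff, with {{isNull}} standing for {{fieldValues == null}} and {{indexed}} for {{sfield.indexed()}}.

```java
// Compares the old and patched conditions over every input combination.
class ConditionComparison {

    static boolean oldCondition(boolean isNull, boolean useTermsQuery, boolean indexed) {
        return (isNull && useTermsQuery) || !indexed;
    }

    static boolean newCondition(boolean isNull, boolean useTermsQuery, boolean indexed) {
        return isNull && (useTermsQuery || !indexed);
    }

    // Returns how many of the 8 combinations give different results.
    static int countDifferences(boolean print) {
        boolean[] values = {false, true};
        int differences = 0;
        for (boolean isNull : values) {
            for (boolean useTermsQuery : values) {
                for (boolean indexed : values) {
                    boolean before = oldCondition(isNull, useTermsQuery, indexed);
                    boolean after = newCondition(isNull, useTermsQuery, indexed);
                    if (before != after) {
                        differences++;
                        if (print) {
                            System.out.println("isNull=" + isNull
                                    + " useTermsQuery=" + useTermsQuery
                                    + " indexed=" + indexed
                                    + " old=" + before + " new=" + after);
                        }
                    }
                }
            }
        }
        return differences;
    }

    public static void main(String[] args) {
        System.out.println("differences: " + countDifferences(true));
    }
}
```

The two conditions differ only when the list already exists and the field is not indexed, which is exactly the case where the old code replaced a populated list.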
h3. test

We confirmed that the issue is fixed. 
 The following two queries now produce the same parsed query.
 * query1: 
[http://localhost:8983/solr/books/select?q=+(name_str:Foundation+OR+cat:book+OR+name_str:Jhereg+OR+cat:cd)&debug=query]
{code:json}
  "debug":{
    "rawquerystring":" (name_str:Foundation OR cat:book OR name_str:Jhereg OR cat:cd)",
    "querystring":" (name_str:Foundation OR cat:book OR name_str:Jhereg OR cat:cd)",
    "parsedquery":"cat:book cat:cd ((name_str:[[46 6f 75 6e 64 61 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]]) (name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]]))",
    "parsedquery_toString":"cat:book cat:cd (name_str:[[46 6f 75 6e 64 61 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]] name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]])",
    "QParser":"LuceneQParser"}
{code}

 * query2: 
[http://localhost:8983/solr/books/select?q=+(name_str:Foundation+OR+name_str:Jhereg+OR+cat:book+OR+cat:cd)&debug=query]
{code:json}
 "debug":{
    "rawquerystring":" (name_str:Foundation OR name_str:Jhereg OR cat:book OR cat:cd)",
    "querystring":" (name_str:Foundation OR name_str:Jhereg OR cat:book OR cat:cd)",
    "parsedquery":"cat:book cat:cd ((name_str:[[46 6f 75 6e 64 61 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]]) (name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]]))",
    "parsedquery_toString":"cat:book cat:cd (name_str:[[46 6f 75 6e 64 61 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]] name_str:[[4a 68 65 72 65 67] TO [4a 68 65 72 65 67]])",
    "QParser":"LuceneQParser"}}
{code}

 

> Some conditional clauses on unindexed field will be ignored by query parser 
> in some specific cases
> --------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-14300
>                 URL: https://issues.apache.org/jira/browse/SOLR-14300
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: query parsers
>    Affects Versions: 7.3, 7.4, 7.5, 7.6, 7.7, 8.0, 8.1, 8.2, 8.3, 8.4
>         Environment: Solr 7.3.1 
> centos7.5
>            Reporter: Hongtai Xue
>            Priority: Minor
>              Labels: newbie, patch
>             Fix For: 7.3, 7.4, 7.5, 7.6, 7.7, 8.0, 8.1, 8.2, 8.3, 8.4
>
>         Attachments: SOLR-14300.patch
>
>
> In some specific cases, some conditional clauses on an unindexed field will be 
> ignored.
>  * For a query like q=A:1 OR B:1 OR A:2 OR B:2,
>  if field B is not indexed (but docValues="true"), "B:1" will be lost.
>   
>  * But if you write the query as q=A:1 OR A:2 OR B:1 OR B:2,
>  it works correctly.
> The only difference between the two queries is the order of the clauses:
>  one is *ABAB*, the other is *AABB*.
>  
> *Steps to reproduce*
>  You can easily reproduce this problem on a Solr collection with the _default 
> configset and the exampledocs/books.csv data.
>  # Create a _default collection.
> {code:java}
> bin/solr create -c books -s 2 -rf 2{code}
>  # post books.csv.
> {code:java}
> bin/post -c books example/exampledocs/books.csv{code}
>  # Run the following queries.
>  ** query1: 
> [http://localhost:8983/solr/books/select?q=+(name_str:Foundation+OR+cat:book+OR+name_str:Jhereg+OR+cat:cd)&debug=query]
>  ** query2: 
> [http://localhost:8983/solr/books/select?q=+(name_str:Foundation+OR+name_str:Jhereg+OR+cat:book+OR+cat:cd)&debug=query]
>  ** You will find that the parsed queries are different.
>  *** query1.  ("name_str:Foundation" is lost.)
> {code:json}
>  "debug":{
>      "rawquerystring":"+(name_str:Foundation OR cat:book OR name_str:Jhereg 
> OR cat:cd)",
>      "querystring":"+(name_str:Foundation OR cat:book OR name_str:Jhereg OR 
> cat:cd)",
>      "parsedquery":"+(cat:book cat:cd (name_str:[[4a 68 65 72 65 67] TO [4a 
> 68 65 72 65 67]]))",
>      "parsedquery_toString":"+(cat:book cat:cd name_str:[[4a 68 65 72 65 67] 
> TO [4a 68 65 72 65 67]])",
>      "QParser":"LuceneQParser"}}{code}
>  *** query2.  ("name_str:Foundation" isn't lost.)
> {code:json}
>    "debug":{
>      "rawquerystring":"+(name_str:Foundation OR name_str:Jhereg OR cat:book 
> OR cat:cd)",
>      "querystring":"+(name_str:Foundation OR name_str:Jhereg OR cat:book OR 
> cat:cd)",
>      "parsedquery":"+(cat:book cat:cd ((name_str:[[46 6f 75 6e 64 61 74 69 6f 
> 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]]) (name_str:[[4a 68 65 72 65 67] TO 
> [4a 68 65 72 65 67]])))",
>      "parsedquery_toString":"+(cat:book cat:cd (name_str:[[46 6f 75 6e 64 61 
> 74 69 6f 6e] TO [46 6f 75 6e 64 61 74 69 6f 6e]] name_str:[[4a 68 65 72 65 
> 67] TO [4a 68 65 72 65 67]]))",
>      "QParser":"LuceneQParser"}{code}



