Yannick Welsch created LUCENE-10680:
---------------------------------------

             Summary: UnifiedHighlighter's term extraction not working for some 
query rewrites
                 Key: LUCENE-10680
                 URL: https://issues.apache.org/jira/browse/LUCENE-10680
             Project: Lucene - Core
          Issue Type: Bug
          Components: modules/highlighter
            Reporter: Yannick Welsch


UnifiedHighlighter rewrites the query against an empty index when extracting 
the terms from the query (see 
https://github.com/apache/lucene/blob/d5d6dc079395c47cd6d12dcce3bcfdd2c7d9dc63/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/UnifiedHighlighter.java#L149).

The rewrite step can unfortunately drop the terms that are to be extracted.

Take for example the boolean query 
`+field:value -ConstantScore(FieldExistsQuery [field=other_field])` when 
highlighting on `field`.

On an empty index, the `FieldExistsQuery` rewrites to a `MatchAllDocsQuery`. 
Because that clause is `MUST_NOT`, the enclosing boolean query in turn rewrites 
to a `MatchNoDocsQuery`, which discards the `MUST` clause. As a result, the 
`field:value` term is never extracted.
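
The chain of rewrites described above can be sketched with a small toy model. 
This is not Lucene code: every type here (`ToyQuery`, `ToyTerm`, 
`ToyFieldExists`, `ToyBoolean`, and so on) is hypothetical and only imitates the 
behaviour reported in this issue, and the `ConstantScore` wrapper is left out 
for brevity. It assumes term extraction visits `MUST` clauses and skips 
`MUST_NOT` clauses.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model only: these types imitate the behaviour described in the report
// and are NOT Lucene classes.
class RewriteDropsTerms {

    enum Occur { MUST, MUST_NOT }

    interface ToyQuery {
        ToyQuery rewriteOnEmptyIndex();
        void collectTerms(Set<String> out);
    }

    static final class ToyTerm implements ToyQuery {
        final String field;
        final String value;
        ToyTerm(String field, String value) { this.field = field; this.value = value; }
        public ToyQuery rewriteOnEmptyIndex() { return this; }
        public void collectTerms(Set<String> out) { out.add(field + ":" + value); }
    }

    static final class ToyMatchAll implements ToyQuery {
        public ToyQuery rewriteOnEmptyIndex() { return this; }
        public void collectTerms(Set<String> out) { /* carries no terms */ }
    }

    static final class ToyMatchNone implements ToyQuery {
        public ToyQuery rewriteOnEmptyIndex() { return this; }
        public void collectTerms(Set<String> out) { /* carries no terms */ }
    }

    static final class ToyFieldExists implements ToyQuery {
        final String field;
        ToyFieldExists(String field) { this.field = field; }
        // Assumed behaviour, per the report: on an empty index this becomes match-all.
        public ToyQuery rewriteOnEmptyIndex() { return new ToyMatchAll(); }
        public void collectTerms(Set<String> out) { /* carries no terms */ }
    }

    static final class ToyBoolean implements ToyQuery {
        final List<Occur> occurs = new ArrayList<>();
        final List<ToyQuery> queries = new ArrayList<>();

        ToyBoolean add(Occur occur, ToyQuery query) {
            occurs.add(occur);
            queries.add(query);
            return this;
        }

        public ToyQuery rewriteOnEmptyIndex() {
            ToyBoolean rewritten = new ToyBoolean();
            for (int i = 0; i < queries.size(); i++) {
                ToyQuery child = queries.get(i).rewriteOnEmptyIndex();
                // A MUST_NOT match-all excludes everything, so the whole
                // boolean query collapses and its MUST clauses are lost.
                if (occurs.get(i) == Occur.MUST_NOT && child instanceof ToyMatchAll) {
                    return new ToyMatchNone();
                }
                rewritten.add(occurs.get(i), child);
            }
            return rewritten;
        }

        public void collectTerms(Set<String> out) {
            for (int i = 0; i < queries.size(); i++) {
                if (occurs.get(i) == Occur.MUST) {
                    queries.get(i).collectTerms(out);
                }
            }
        }
    }

    // Models "+field:value -FieldExists(other_field)".
    static ToyQuery exampleQuery() {
        return new ToyBoolean()
            .add(Occur.MUST, new ToyTerm("field", "value"))
            .add(Occur.MUST_NOT, new ToyFieldExists("other_field"));
    }

    static Set<String> extractTerms(ToyQuery query, boolean rewriteFirst) {
        ToyQuery target = rewriteFirst ? query.rewriteOnEmptyIndex() : query;
        Set<String> terms = new LinkedHashSet<>();
        target.collectTerms(terms);
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(extractTerms(exampleQuery(), false)); // prints [field:value]
        System.out.println(extractTerms(exampleQuery(), true));  // prints []
    }
}
```

In this model, extracting terms from the original query yields `field:value`, 
while extracting after the empty-index rewrite yields nothing, which mirrors the 
missing highlight described here.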



