Yannick Welsch created LUCENE-10680:
---------------------------------------
Summary: UnifiedHighlighter's term extraction not working for some
query rewrites
Key: LUCENE-10680
URL: https://issues.apache.org/jira/browse/LUCENE-10680
Project: Lucene - Core
Issue Type: Bug
Components: modules/highlighter
Reporter: Yannick Welsch
UnifiedHighlighter rewrites the query against an empty index when extracting
the terms from the query (see
https://github.com/apache/lucene/blob/d5d6dc079395c47cd6d12dcce3bcfdd2c7d9dc63/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/UnifiedHighlighter.java#L149).
The rewrite step can unfortunately drop the terms that are to be extracted.
Take for example the boolean query "+field:value
-ConstantScore(FieldExistsQuery [field=other_field])" when highlighting on
"field".
On an empty index, the `FieldExistsQuery` rewrites to a `MatchAllDocsQuery`.
Because it is a `MUST_NOT` clause, the overall boolean query then rewrites to a
`MatchNoDocsQuery`, dropping the `MUST` clause in the process, so the
`field:value` term is never extracted.
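The mechanism can be sketched with a small toy model. This is not Lucene code: `RewriteSketch` and its nested classes (`Term`, `FieldExists`, `MatchAll`, `MatchNone`, `Bool`) are hypothetical stand-ins that only imitate the rewrite rules described above, and the `ConstantScore` wrapper is left out for brevity. The real rewrite logic in Lucene is more involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model only: these classes imitate the shape of the queries in this
// report. They are NOT Lucene's API.
final class RewriteSketch {
    enum Occur { MUST, MUST_NOT }

    interface Query {}

    static final class Term implements Query {
        final String field, text;
        Term(String field, String text) { this.field = field; this.text = text; }
    }

    static final class FieldExists implements Query {
        final String field;
        FieldExists(String field) { this.field = field; }
    }

    static final class MatchAll implements Query {}

    static final class MatchNone implements Query {}

    static final class Clause {
        final Occur occur;
        final Query query;
        Clause(Occur occur, Query query) { this.occur = occur; this.query = query; }
    }

    static final class Bool implements Query {
        final List<Clause> clauses;
        Bool(Clause... clauses) { this.clauses = Arrays.asList(clauses); }
    }

    // Rewrites a query as if it were run against an index with no documents.
    static Query rewriteOnEmptyIndex(Query q) {
        if (q instanceof FieldExists) {
            return new MatchAll(); // with no documents, "all documents have the field" holds vacuously
        }
        if (q instanceof Bool) {
            List<Clause> rewritten = new ArrayList<>();
            for (Clause c : ((Bool) q).clauses) {
                Query inner = rewriteOnEmptyIndex(c.query);
                if (c.occur == Occur.MUST_NOT && inner instanceof MatchAll) {
                    return new MatchNone(); // excluding everything matches nothing; MUST clauses are discarded
                }
                rewritten.add(new Clause(c.occur, inner));
            }
            return new Bool(rewritten.toArray(new Clause[0]));
        }
        return q;
    }

    // Collects the term texts for one field, skipping prohibited clauses.
    static Set<String> extractTerms(Query q, String field) {
        Set<String> terms = new LinkedHashSet<>();
        collect(q, field, terms);
        return terms;
    }

    private static void collect(Query q, String field, Set<String> out) {
        if (q instanceof Term) {
            Term t = (Term) q;
            if (t.field.equals(field)) out.add(t.text);
        } else if (q instanceof Bool) {
            for (Clause c : ((Bool) q).clauses) {
                if (c.occur != Occur.MUST_NOT) collect(c.query, field, out);
            }
        }
    }

    // Models "+field:value -FieldExistsQuery [field=other_field]".
    static Query exampleQuery() {
        return new Bool(
            new Clause(Occur.MUST, new Term("field", "value")),
            new Clause(Occur.MUST_NOT, new FieldExists("other_field")));
    }

    public static void main(String[] args) {
        Query q = exampleQuery();
        System.out.println(extractTerms(q, "field"));                      // [value]
        System.out.println(extractTerms(rewriteOnEmptyIndex(q), "field")); // []
    }
}
```

In this model, extracting terms from the original query yields `value`, while extracting them after the empty-index rewrite yields nothing, which mirrors the behaviour reported here.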