[
https://issues.apache.org/jira/browse/LUCENE-2013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771458#action_12771458
]
Mark Miller commented on LUCENE-2013:
-------------------------------------
Nice catch - I think I like this method better than the core modifications.
bq. but this also means that no third-party queries have any way to influence
their highlighting.
Unfortunately, I think that's already the deal in many cases. The Highlighter is
a very special case - ugly, but the current state of things. We will hopefully
get away from that eventually.
> QueryScorer and SpanRegexQuery are incompatible.
> ------------------------------------------------
>
> Key: LUCENE-2013
> URL: https://issues.apache.org/jira/browse/LUCENE-2013
> Project: Lucene - Java
> Issue Type: Bug
> Components: contrib/highlighter
> Affects Versions: 2.9
> Environment: Lucene-Java 2.9
> Reporter: Benjamin Keil
> Fix For: 3.0
>
> Attachments: lucene-2013-2009-10-28-2135.patch,
> lucene-2013-2009-10-28.patch, lucene-2013-2009-10-29-0136.patch,
> LUCENE-2013.patch
>
>
> Since the resolution of #LUCENE-1685, users are not supposed to rewrite their
> queries before submitting them to QueryScorer:
> bq.------------------------------------------------------------------------
> bq.r800796 | markrmiller | 2009-08-04 06:56:11 -0700 (Tue, 04 Aug 2009) | 1
> line
> bq.
> bq.LUCENE-1685: The position aware SpanScorer has become the default scorer
> for Highlighting. The SpanScorer implementation has replaced QueryScorer and
> the old term highlighting QueryScorer has been renamed to QueryTermScorer.
> Multi-term queries are also now expanded by default. If you were previously
> rewriting the query for multi-term query highlighting, you should no longer
> do that (unless you switch to using QueryTermScorer). The SpanScorer API (now
> QueryScorer) has also been improved to more closely match the API of the
> previous QueryScorer implementation.
> bq.------------------------------------------------------------------------
> This is a great convenience for the most part, but it's causing me
> difficulties with SpanRegexQuerys, as the WeightedSpanTermExtractor uses
> Query.extractTerms() to collect the fields used in the query, but
> SpanRegexQuery does not implement this method, so highlighting any query with
> a SpanRegexQuery throws an UnsupportedOperationException. If this issue is
> circumvented, there is still the issue of SpanRegexQuery throwing an
> exception when someone calls its getSpans() method.
> I can provide the patch that I am currently using, but I'm not sure that my
> solution is optimal. It adds two methods to SpanQuery:
> extractFields(Set<String> fields) which is equivalent to
> fields.add(getField()) except when MaskedFieldQuerys get involved, and
> mustBeRewrittenToGetSpans() which returns true for SpanQuery, false for
> SpanTermQuery, and is overridden in each composite SpanQuery to return a
> value depending on its components. In this way SpanRegexQuery (and any other
> custom SpanQuerys) do not need to be adjusted.
> Currently the collection of fields and non-weighted terms is done in a
> single step. In the proposed patch the WeightedSpanTerm extraction from a
> SpanQuery proceeds in two steps. First, if the QueryScorer's field is null,
> then the fields are collected from the SpanQuery using the extractFields()
> method. Second, the terms are collected using extractTerms(), rewriting the
> query for each field if mustBeRewrittenToGetSpans() returns true.
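The two-step extraction described in the issue can be sketched roughly as follows. This is a toy model only, not the attached patch and not Lucene's API: every class here (ToySpanQuery, ToySpanTermQuery, ToySpanRegexQuery, ToySpanOrQuery, SpanExtractionSketch) is hypothetical, and a plain Map of field name to indexed terms stands in for the index reader that a real rewrite would consult. Only the method names extractFields(), mustBeRewrittenToGetSpans(), extractTerms() and getField() are taken from the description above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for a span query; NOT Lucene's SpanQuery.
abstract class ToySpanQuery {
    abstract String getField();
    // Default from the description: equivalent to fields.add(getField()).
    void extractFields(Set<String> fields) { fields.add(getField()); }
    // Default from the description: unknown span queries need a rewrite first.
    boolean mustBeRewrittenToGetSpans() { return true; }
    // Queries that cannot enumerate their terms throw, as the issue reports.
    void extractTerms(Set<String> terms) { throw new UnsupportedOperationException(); }
    // The map (field -> indexed terms) is an assumed substitute for an index reader.
    ToySpanQuery rewrite(Map<String, List<String>> index) { return this; }
}

final class ToySpanTermQuery extends ToySpanQuery {
    private final String field, text;
    ToySpanTermQuery(String field, String text) { this.field = field; this.text = text; }
    String getField() { return field; }
    @Override boolean mustBeRewrittenToGetSpans() { return false; }
    @Override void extractTerms(Set<String> terms) { terms.add(field + ":" + text); }
}

// Inherits the defaults, so it needs no changes of its own (the point of the proposal).
final class ToySpanRegexQuery extends ToySpanQuery {
    private final String field, pattern;
    ToySpanRegexQuery(String field, String pattern) { this.field = field; this.pattern = pattern; }
    String getField() { return field; }
    @Override ToySpanQuery rewrite(Map<String, List<String>> index) {
        List<ToySpanQuery> expanded = new ArrayList<>();
        List<String> indexed = index.get(field);
        if (indexed != null) {
            for (String t : indexed) {
                if (t.matches(pattern)) expanded.add(new ToySpanTermQuery(field, t));
            }
        }
        return new ToySpanOrQuery(field, expanded);
    }
}

// Composite query: each answer depends on its components.
final class ToySpanOrQuery extends ToySpanQuery {
    private final String field;
    private final List<ToySpanQuery> clauses;
    ToySpanOrQuery(String field, List<ToySpanQuery> clauses) { this.field = field; this.clauses = clauses; }
    String getField() { return field; }
    @Override void extractFields(Set<String> fields) {
        for (ToySpanQuery c : clauses) c.extractFields(fields);
    }
    @Override boolean mustBeRewrittenToGetSpans() {
        for (ToySpanQuery c : clauses) if (c.mustBeRewrittenToGetSpans()) return true;
        return false;
    }
    @Override void extractTerms(Set<String> terms) {
        for (ToySpanQuery c : clauses) c.extractTerms(terms);
    }
    @Override ToySpanQuery rewrite(Map<String, List<String>> index) {
        List<ToySpanQuery> rewritten = new ArrayList<>();
        for (ToySpanQuery c : clauses) rewritten.add(c.rewrite(index));
        return new ToySpanOrQuery(field, rewritten);
    }
}

final class SpanExtractionSketch {
    static Set<String> extract(ToySpanQuery query, String scorerField, Map<String, List<String>> index) {
        // Step 1: collect fields from the query only when the scorer has no field.
        Set<String> fields = new TreeSet<>();
        if (scorerField == null) query.extractFields(fields); else fields.add(scorerField);
        // Step 2: collect terms, rewriting once per field only where required.
        Set<String> terms = new TreeSet<>();
        for (String field : fields) {
            List<String> indexed = index.containsKey(field) ? index.get(field) : Collections.<String>emptyList();
            ToySpanQuery q = query.mustBeRewrittenToGetSpans()
                    ? query.rewrite(Collections.singletonMap(field, indexed))
                    : query;
            q.extractTerms(terms);
        }
        return terms;
    }

    public static void main(String[] args) {
        Map<String, List<String>> index = new HashMap<>();
        index.put("body", Arrays.asList("lucene", "lucid", "solr"));
        ToySpanQuery query = new ToySpanOrQuery("body", Arrays.<ToySpanQuery>asList(
                new ToySpanTermQuery("body", "solr"),
                new ToySpanRegexQuery("body", "luc.*")));
        System.out.println(extract(query, null, index));
    }
}
```

In this sketch the regex query overrides nothing except rewrite(), which mirrors the claim that custom SpanQuerys would not need to be adjusted. Whether the real patch handles masked fields or per-field rewriting in the same way is not something this toy model can show.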