[ https://issues.apache.org/jira/browse/LUCENE-2013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Benjamin Keil updated LUCENE-2013:
----------------------------------

    Attachment: lucene-2013-2009-10-28.patch

Patch for LUCENE-2013

> QueryScorer and SpanRegexQuery are incompatible.
> ------------------------------------------------
>
>                 Key: LUCENE-2013
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2013
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: contrib/highlighter
>    Affects Versions: 2.9
>         Environment: Lucene-Java 2.9
>            Reporter: Benjamin Keil
>         Attachments: lucene-2013-2009-10-28.patch
>
>
> Since the resolution of #LUCENE-1685, users are not supposed to rewrite their
> queries before submitting them to QueryScorer:
> bq.{{------------------------------------------------------------------------
> r800796 | markrmiller | 2009-08-04 06:56:11 -0700 (Tue, 04 Aug 2009) | 1 line
> LUCENE-1685: The position aware SpanScorer has become the default scorer for
> Highlighting. The SpanScorer implementation has replaced QueryScorer and the
> old term highlighting QueryScorer has been renamed to QueryTermScorer.
> Multi-term queries are also now expanded by default. If you were previously
> rewritting the query for multi-term query highlighting, you should no longer
> do that (unless you switch to using QueryTermScorer). The SpanScorer API (now
> QueryScorer) has also been improved to more closely match the API of the
> previous QueryScorer implementation.
> ------------------------------------------------------------------------}}
> This is a great convenience for the most part, but it's causing me
> difficulties with {{SpanRegexQuery}}s. The {{WeightedSpanTermExtractor}}
> uses {{Query.extractTerms()}} to collect the fields used in the query, but
> {{SpanRegexQuery}} does not implement this method, so highlighting any query
> that contains a {{SpanRegexQuery}} throws an UnsupportedOperationException.
> Even if this issue is circumvented, {{SpanRegexQuery}} still throws an
> exception when someone calls its {{getSpans()}} method.
> I can provide the patch that I am currently using, but I'm not sure that my
> solution is optimal. It adds two methods to {{SpanQuery}}:
> {{extractFields(Set<String> fields)}}, which is {{fields.add(getField())}} for
> everything except {{MaskedFieldQuery}}, and {{mustBeRewrittenToGetSpans()}},
> which returns {{true}} for {{SpanQuery}}, {{false}} for {{SpanTermQuery}},
> and is overridden in each composite {{SpanQuery}} to return a value depending
> on its components. In this way {{SpanRegexQuery}} (and any other custom
> {{SpanQuery}}s) do not need to be adjusted.
> Currently the collection of fields and non-weighted terms is done in a
> single step. In the proposed patch the {{WeightedSpanTerm}} extraction from
> a {{SpanQuery}} proceeds in two steps. First, if the {{QueryScorer}}'s field
> is {{null}}, then the fields are collected from the {{SpanQuery}} using the
> {{extractFields()}} method. Second, the terms are collected using
> {{extractTerms()}}, rewriting the query for each field if
> {{mustBeRewrittenToGetSpans()}} returns {{true}}.
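
The two-step extraction described in the issue can be illustrated with a small, self-contained sketch. This is not the attached patch and does not use Lucene: every `Sketch*` class, the `TwoStepExtraction` helper, and the fake index (a map from field name to indexed terms) are hypothetical stand-ins. Using the scorer's own field when it is non-null is also an assumption; the issue only describes the `null` case.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;

/** Hypothetical stand-in for SpanQuery; not the Lucene class. */
abstract class SketchSpanQuery {
    abstract String getField();
    /** Proposed default: a span query contributes its own field. */
    void extractFields(Set<String> fields) { fields.add(getField()); }
    /** Proposed default: assume the query must be rewritten first. */
    boolean mustBeRewrittenToGetSpans() { return true; }
    /** Toy rewrite against a fake index (field -> indexed terms). */
    SketchSpanQuery rewrite(Map<String, List<String>> index) { return this; }
    /** Unimplemented by default, as reported for SpanRegexQuery. */
    void extractTerms(Set<String> terms) { throw new UnsupportedOperationException(); }
}

final class SketchSpanTermQuery extends SketchSpanQuery {
    private final String field;
    private final String text;
    SketchSpanTermQuery(String field, String text) { this.field = field; this.text = text; }
    @Override String getField() { return field; }
    @Override boolean mustBeRewrittenToGetSpans() { return false; }
    @Override void extractTerms(Set<String> terms) { terms.add(field + ":" + text); }
}

/** Composite query: every answer depends on its clauses. */
final class SketchSpanOrQuery extends SketchSpanQuery {
    private final String field;
    private final List<SketchSpanQuery> clauses;
    SketchSpanOrQuery(String field, List<SketchSpanQuery> clauses) {
        this.field = field;
        this.clauses = clauses;
    }
    @Override String getField() { return field; }
    @Override void extractFields(Set<String> fields) {
        for (SketchSpanQuery c : clauses) c.extractFields(fields);
    }
    @Override boolean mustBeRewrittenToGetSpans() {
        for (SketchSpanQuery c : clauses) if (c.mustBeRewrittenToGetSpans()) return true;
        return false;
    }
    @Override SketchSpanQuery rewrite(Map<String, List<String>> index) {
        List<SketchSpanQuery> rewritten = new ArrayList<>();
        for (SketchSpanQuery c : clauses) {
            rewritten.add(c.mustBeRewrittenToGetSpans() ? c.rewrite(index) : c);
        }
        return new SketchSpanOrQuery(field, rewritten);
    }
    @Override void extractTerms(Set<String> terms) {
        for (SketchSpanQuery c : clauses) c.extractTerms(terms);
    }
}

/** Inherits both defaults, so it needs no changes beyond rewrite(). */
final class SketchSpanRegexQuery extends SketchSpanQuery {
    private final String field;
    private final Pattern pattern;
    SketchSpanRegexQuery(String field, String regex) {
        this.field = field;
        this.pattern = Pattern.compile(regex);
    }
    @Override String getField() { return field; }
    @Override SketchSpanQuery rewrite(Map<String, List<String>> index) {
        List<SketchSpanQuery> matches = new ArrayList<>();
        List<String> indexed = index.get(field);
        if (indexed != null) {
            for (String t : indexed) {
                if (pattern.matcher(t).matches()) matches.add(new SketchSpanTermQuery(field, t));
            }
        }
        return new SketchSpanOrQuery(field, matches);
    }
}

final class TwoStepExtraction {
    /** Step 1: collect fields from the query only when the scorer has no field. */
    static Set<String> collectFields(SketchSpanQuery query, String scorerField) {
        Set<String> fields = new TreeSet<>();
        if (scorerField == null) query.extractFields(fields);
        else fields.add(scorerField); // assumption: a non-null scorer field is used as-is
        return fields;
    }
    /** Step 2: collect terms, rewriting once per field when required. */
    static Set<String> collectTerms(SketchSpanQuery query, Set<String> fields,
                                    Map<String, List<String>> index) {
        Set<String> terms = new TreeSet<>();
        if (!query.mustBeRewrittenToGetSpans()) {
            query.extractTerms(terms);
            return terms;
        }
        for (String field : fields) {
            Map<String, List<String>> oneField =
                Collections.singletonMap(field, index.get(field));
            query.rewrite(oneField).extractTerms(terms);
        }
        return terms;
    }
}
```

In this sketch, calling `extractTerms()` directly on a query that contains a `SketchSpanRegexQuery` still throws, which mirrors the reported failure, while going through `collectFields()` and then `collectTerms()` rewrites the regex clause into plain term clauses first.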