[ 
https://issues.apache.org/jira/browse/LUCENE-2013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Keil updated LUCENE-2013:
----------------------------------

    Attachment: lucene-2013-2009-10-28.patch

Patch for LUCENE-2013

> QueryScorer and SpanRegexQuery are incompatible.
> ------------------------------------------------
>
>                 Key: LUCENE-2013
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2013
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: contrib/highlighter
>    Affects Versions: 2.9
>         Environment: Lucene-Java 2.9
>            Reporter: Benjamin Keil
>         Attachments: lucene-2013-2009-10-28.patch
>
>
> Since the resolution of #LUCENE-1685, users are not supposed to rewrite their 
> queries before submitting them to QueryScorer:
> bq.{{------------------------------------------------------------------------
> r800796 | markrmiller | 2009-08-04 06:56:11 -0700 (Tue, 04 Aug 2009) | 1 line
> LUCENE-1685: The position aware SpanScorer has become the default scorer for 
> Highlighting. The SpanScorer implementation has replaced QueryScorer and the 
> old term highlighting QueryScorer has been renamed to QueryTermScorer. 
> Multi-term queries are also now expanded by default. If you were previously 
> rewriting the query for multi-term query highlighting, you should no longer 
> do that (unless you switch to using QueryTermScorer). The SpanScorer API (now 
> QueryScorer) has also been improved to more closely match the API of the 
> previous QueryScorer implementation.
> ------------------------------------------------------------------------}}
> This is a great convenience for the most part, but it causes problems with 
> {{SpanRegexQuery}}.  {{WeightedSpanTermExtractor}} uses 
> {{Query.extractTerms()}} to collect the fields used in the query, but 
> {{SpanRegexQuery}} does not implement this method, so highlighting any query 
> that contains a {{SpanRegexQuery}} throws an 
> {{UnsupportedOperationException}}.  Even if this is circumvented, 
> {{SpanRegexQuery}} still throws an exception when its {{getSpans()}} method 
> is called.
> I can provide the patch that I am currently using, but I'm not sure that my 
> solution is optimal.  It adds two methods to {{SpanQuery}}: 
> {{extractFields(Set<String> fields)}}, which is {{fields.add(getField())}} for 
> everything except {{FieldMaskingSpanQuery}}, and 
> {{mustBeRewrittenToGetSpans()}}, which returns {{true}} for {{SpanQuery}}, 
> {{false}} for {{SpanTermQuery}}, and is overridden in each composite 
> {{SpanQuery}} to return a value depending on its components.  In this way 
> {{SpanRegexQuery}} (and any other custom {{SpanQuery}}) does not need to be 
> adjusted.
> Currently, fields and non-weighted terms are collected in a single step.  In 
> the proposed patch, the {{WeightedSpanTerm}} extraction from a {{SpanQuery}} 
> proceeds in two steps.  First, if the {{QueryScorer}}'s field is {{null}}, 
> the fields are collected from the {{SpanQuery}} using the 
> {{extractFields()}} method.  Second, the terms are collected using 
> {{extractTerms()}}, rewriting the query for each field if 
> {{mustBeRewrittenToGetSpans()}} returns {{true}}.
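
To make the proposal in the description easier to follow, here is a minimal, self-contained sketch of the two methods and the two-step extraction. It does not use the real Lucene classes: ToySpanQuery, ToySpanTermQuery, ToySpanRegexQuery, ToySpanOrQuery and SpanExtractionSketch are hypothetical stand-ins invented for illustration, and the rewrite against an IndexReader is only recorded as a string. It is not taken from the attached patch and may differ from it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for SpanQuery; NOT the real Lucene class.
abstract class ToySpanQuery {
    abstract String getField();

    // Proposed method 1: report the field(s) this query searches.
    void extractFields(Set<String> fields) {
        fields.add(getField());
    }

    // Proposed method 2: conservative default, so unknown custom
    // queries are rewritten before their spans are requested.
    boolean mustBeRewrittenToGetSpans() {
        return true;
    }
}

// Stand-in for SpanTermQuery: spans are available without rewriting.
final class ToySpanTermQuery extends ToySpanQuery {
    private final String field;
    private final String text;
    ToySpanTermQuery(String field, String text) { this.field = field; this.text = text; }
    @Override String getField() { return field; }
    @Override boolean mustBeRewrittenToGetSpans() { return false; }
}

// Stand-in for a multi-term query such as SpanRegexQuery:
// it inherits both defaults and needs no changes of its own.
final class ToySpanRegexQuery extends ToySpanQuery {
    private final String field;
    private final String pattern;
    ToySpanRegexQuery(String field, String pattern) { this.field = field; this.pattern = pattern; }
    @Override String getField() { return field; }
}

// Stand-in for a composite query: the answer depends on its clauses.
final class ToySpanOrQuery extends ToySpanQuery {
    private final List<ToySpanQuery> clauses;
    ToySpanOrQuery(ToySpanQuery... clauses) { this.clauses = Arrays.asList(clauses); }
    @Override String getField() { return clauses.get(0).getField(); }
    @Override boolean mustBeRewrittenToGetSpans() {
        for (ToySpanQuery clause : clauses) {
            if (clause.mustBeRewrittenToGetSpans()) return true;
        }
        return false;
    }
}

final class SpanExtractionSketch {
    // Step 1: use the scorer's field if set, otherwise ask the query.
    static Set<String> fieldsToHighlight(ToySpanQuery query, String scorerField) {
        Set<String> fields = new LinkedHashSet<String>();
        if (scorerField == null) {
            query.extractFields(fields);
        } else {
            fields.add(scorerField);
        }
        return fields;
    }

    // Step 2: per field, note whether a rewrite would precede term extraction.
    static List<String> plan(ToySpanQuery query, String scorerField) {
        List<String> steps = new ArrayList<String>();
        for (String field : fieldsToHighlight(query, scorerField)) {
            String action = query.mustBeRewrittenToGetSpans() ? "rewrite+extract:" : "extract:";
            steps.add(action + field);
        }
        return steps;
    }

    public static void main(String[] args) {
        ToySpanQuery term = new ToySpanTermQuery("body", "lucene");
        ToySpanQuery regex = new ToySpanRegexQuery("body", "luc.*");
        System.out.println(plan(term, null));
        System.out.println(plan(new ToySpanOrQuery(term, regex), null));
    }
}
```

In this sketch a composite that contains the regex stand-in reports that a rewrite is needed, while a composite made only of term queries does not, which is the behaviour the description asks of each composite SpanQuery.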

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

