[
https://issues.apache.org/jira/browse/LUCENE-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexandre Patry updated LUCENE-5019:
------------------------------------
Description:
In SimpleSpanFragmenter, when a query term is followed by a stop word, the
fragment runs until the end of the document.
When a query term is encountered (line 80), SimpleSpanFragmenter waits for the
token following it before allowing the fragment to end (lines 68 to 72). When a
stop word follows the query term (or the next token has a position increment
greater than 1), the position SimpleSpanFragmenter is waiting for is skipped,
so the token it is waiting for never arrives.
The attached patch fixes this by waiting for the first token that follows the
query term, rather than for the token at the position immediately after it.
was:
In SimpleFragmentScorer, when a query term is followed by a stop word, the
fragment will run until the end of the document.
When a query term is encountered (line 80), SimpleFragmentScorer waits for the
token following it before allowing the fragment to end (lines 68 to 72). When a
stop word follows the query word (or any token with a position increment
greater than 1), its position is skipped and the token SimpleFragmentScorer is
waiting for never arrives.
The attached patch fixes that by waiting for the first token following the
query word instead of the token at the position after the query term.
Summary: SimpleSpanFragmenter can create very long fragments (was:
SimpleFragmentScorer can create very long fragments)
> SimpleSpanFragmenter can create very long fragments
> ---------------------------------------------------
>
> Key: LUCENE-5019
> URL: https://issues.apache.org/jira/browse/LUCENE-5019
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/highlighter
> Affects Versions: 4.3
> Reporter: Alexandre Patry
> Priority: Minor
> Attachments: simple-span-fragmenter.patch
>
>
> In SimpleSpanFragmenter, when a query term is followed by a stop word, the
> fragment runs until the end of the document.
> When a query term is encountered (line 80), SimpleSpanFragmenter waits for
> the token following it before allowing the fragment to end (lines 68 to 72).
> When a stop word follows the query term (or the next token has a position
> increment greater than 1), the position SimpleSpanFragmenter is waiting for
> is skipped, so the token it is waiting for never arrives.
> The attached patch fixes this by waiting for the first token that follows the
> query term, rather than for the token at the position immediately after it.
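
The behaviour described above can be illustrated with a small, self-contained
simulation. This is only a sketch under stated assumptions: it is neither the
Lucene source nor the attached patch, it has no Lucene dependency, and the
class name, method name and the "strict" flag are hypothetical. It tracks a
running token position from position increments and a "wait for" position set
one past each query term. With strict matching (wait until the position is
exactly equal) a skipped position blocks every later token; with lenient
matching (wait until the position is reached or passed) the first following
token releases the wait.

```java
// Hypothetical sketch: not the Lucene implementation and not the attached patch.
final class WaitForPositionSketch {

    /**
     * Returns, for each token, whether a fragment would still be blocked
     * from ending at that token.
     *
     * posIncrements[i] is the position increment of token i (2 or more when
     * a stop word was removed just before it); isQueryTerm[i] marks query terms.
     * strict = true  : wait for a token at exactly (query position + 1).
     * strict = false : wait for the first token at or after that position.
     */
    static boolean[] blockedTokens(int[] posIncrements, boolean[] isQueryTerm, boolean strict) {
        boolean[] blocked = new boolean[posIncrements.length];
        int position = -1;   // running token position
        int waitForPos = -1; // -1 means "not waiting"

        for (int i = 0; i < posIncrements.length; i++) {
            position += posIncrements[i];

            if (waitForPos != -1) {
                boolean arrived = strict ? position == waitForPos : position >= waitForPos;
                if (!arrived) {
                    blocked[i] = true; // still waiting: the fragment may not end here
                    continue;
                }
                waitForPos = -1;       // the awaited token arrived
            }

            if (isQueryTerm[i]) {
                waitForPos = position + 1; // wait for the token after the query term
            }
        }
        return blocked;
    }

    public static void main(String[] args) {
        // Token 1 is a query term; token 2 has increment 2 (a stop word was dropped).
        int[] increments = {1, 1, 2, 1, 1};
        boolean[] queryTerms = {false, true, false, false, false};

        // Strict matching: position 2 never occurs, so every later token stays blocked.
        System.out.println(java.util.Arrays.toString(blockedTokens(increments, queryTerms, true)));
        // prints [false, false, true, true, true]

        // Lenient matching: the first following token (position 3) releases the wait.
        System.out.println(java.util.Arrays.toString(blockedTokens(increments, queryTerms, false)));
        // prints [false, false, false, false, false]
    }
}
```

When every increment is 1, both modes give the same result, which is consistent
with the problem only showing up after stop words or other position gaps.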
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira