David Smiley created LUCENE-7620:
------------------------------------
Summary: UnifiedHighlighter: add target character width BreakIterator wrapper
Key: LUCENE-7620
URL: https://issues.apache.org/jira/browse/LUCENE-7620
Project: Lucene - Core
Issue Type: Improvement
Components: modules/highlighter
Reporter: David Smiley
Assignee: David Smiley
The original Highlighter includes a {{SimpleFragmenter}} that delineates
fragments (aka Passages) by a character width. The default is 100 characters.
It would be great to support something similar for the UnifiedHighlighter.
It's useful in its own right and of course it helps users transition to the UH.
I'd like to implement it as a wrapper around another BreakIterator, perhaps a
sentence one. That way you get back Passages made up of whole sentences, so
they look nice instead of breaking midway through a sentence, and you still
get some control by specifying a target number of characters. This
BreakIterator wouldn't be a general-purpose java.text.BreakIterator, since it
would assume it's called exactly the way the UnifiedHighlighter calls it. It
would probably be compatible with the PostingsHighlighter too.
I don't propose doing this by default; besides, it's easy enough to pick your
BreakIterator config.
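
A rough sketch of what such a wrapper might look like, using only java.text.BreakIterator. The class name TargetLengthBreakIterator and its constructor are hypothetical, not existing Lucene API. It assumes the highlighter finds a passage start with preceding(int) and the passage end with next() or following(int), and it treats the target as a minimum: the end is pushed to the first delegate boundary at or after start + targetLength.

```java
import java.text.BreakIterator;
import java.text.CharacterIterator;

/** Hypothetical sketch only; not an existing Lucene class. */
final class TargetLengthBreakIterator extends BreakIterator {
  private final BreakIterator delegate; // e.g. a sentence BreakIterator
  private final int targetLength;       // target passage width in characters

  TargetLengthBreakIterator(BreakIterator delegate, int targetLength) {
    if (targetLength < 1) {
      throw new IllegalArgumentException("targetLength must be >= 1");
    }
    this.delegate = delegate;
    this.targetLength = targetLength;
  }

  /** First delegate boundary at or after offset + targetLength, capped at the end of the text. */
  @Override
  public int following(int offset) {
    int goal = offset + targetLength;
    if (goal >= delegate.getText().getEndIndex()) {
      int last = delegate.last();
      return offset >= last ? DONE : last;
    }
    // following() is strictly-after, so asking for goal - 1 yields a boundary >= goal
    return delegate.following(goal - 1);
  }

  /** Assumed call pattern: invoked right after preceding(int) has positioned the delegate. */
  @Override
  public int next() {
    return following(delegate.current());
  }

  // Passage starts and everything else are left to the delegate unchanged.
  @Override public int preceding(int offset) { return delegate.preceding(offset); }
  @Override public int previous() { return delegate.previous(); }
  @Override public int first() { return delegate.first(); }
  @Override public int last() { return delegate.last(); }
  @Override public int current() { return delegate.current(); }
  @Override public CharacterIterator getText() { return delegate.getText(); }
  @Override public void setText(CharacterIterator text) { delegate.setText(text); }

  @Override
  public int next(int n) {
    // Deliberately unsupported: this is not a general-purpose BreakIterator.
    throw new UnsupportedOperationException("not a general-purpose BreakIterator");
  }
}
```

Wrapping BreakIterator.getSentenceInstance(Locale.ROOT) with a target of 100 would then give passages that end on a sentence boundary once they are at least 100 characters long, or at the end of the text if that comes first.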