[jira] Commented: (LUCENE-794) SpanScorer and SimpleSpanFragmenter for Contrib Highlighter

Mark Harwood (JIRA) Mon, 12 Mar 2007 11:34:30 -0800

    [ https://issues.apache.org/jira/browse/LUCENE-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12480175 ]


Mark Harwood commented on LUCENE-794:
-------------------------------------

>>At a minimum, the Term fields could be set back to their original value after 
>>doing the Span search..

Hmm. That wouldn't fly if the same Query object is being reused in a 
multi-threaded server environment: another thread could run a search while 
the Term fields are temporarily changed.

>>I really don't see how it is possible to ignore fields in another way though

I can think of one. Your current approach modifies the query to suit the 
MemoryIndex content. Another approach would be to modify the MemoryIndex 
content to suit the query. Your code creates a MemoryIndex when presented with 
the text of a field. If it recognised that it was being used in 
"field-insensitive mode", it could extract the query terms and create a 
MemoryIndex field for each unique field name in the set of query terms, using 
the same source text each time (a CachedTokenStreamAnalyzer could be used to 
avoid re-tokenizing this text for every field).
This approach would of course use some more memory, but it avoids the 
unpleasantness of changing the contents of Query objects.
I haven't fully considered the implications of this idea yet. Initial thoughts?
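
To make the idea a bit more concrete, here is a rough sketch in plain Java. 
It deliberately does not use the real MemoryIndex, Query or analyzer classes: 
QueryTerm, tokenizeOnce, uniqueFields, buildIndex and matches are all 
hypothetical stand-ins invented for illustration, and the "index" is just a 
map from field name to a token list. It only shows the shape of the approach: 
tokenize the source text once, then register those same tokens under every 
field name that appears in the query terms, leaving the query untouched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class FieldInsensitiveIndexSketch {

    /** Hypothetical stand-in for a query term: a field name plus the term text. */
    static final class QueryTerm {
        final String field;
        final String text;

        QueryTerm(String field, String text) {
            this.field = field;
            this.text = text;
        }
    }

    /** Tokenizes the text a single time; stands in for a cached token stream. */
    static List<String> tokenizeOnce(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** Collects each distinct field name used by the query terms, in first-seen order. */
    static Set<String> uniqueFields(List<QueryTerm> queryTerms) {
        Set<String> fields = new LinkedHashSet<>();
        for (QueryTerm term : queryTerms) {
            fields.add(term.field);
        }
        return fields;
    }

    /**
     * Builds a toy in-memory index with one entry per field named in the query.
     * Every entry shares the same token list, so the text is tokenized only once.
     */
    static Map<String, List<String>> buildIndex(String sourceText, List<QueryTerm> queryTerms) {
        List<String> cachedTokens = Collections.unmodifiableList(tokenizeOnce(sourceText));
        Map<String, List<String>> index = new LinkedHashMap<>();
        for (String field : uniqueFields(queryTerms)) {
            index.put(field, cachedTokens);
        }
        return index;
    }

    /** True if the term's text occurs in the tokens indexed under the term's own field. */
    static boolean matches(Map<String, List<String>> index, QueryTerm term) {
        List<String> tokens = index.get(term.field);
        return tokens != null && tokens.contains(term.text);
    }

    public static void main(String[] args) {
        List<QueryTerm> queryTerms = Arrays.asList(
                new QueryTerm("title", "lucene"),
                new QueryTerm("body", "highlighter"));
        Map<String, List<String>> index =
                buildIndex("Lucene contrib Highlighter", queryTerms);
        for (QueryTerm term : queryTerms) {
            System.out.println(term.field + ":" + term.text + " -> " + matches(index, term));
        }
    }
}
```

The query terms are only read, never modified, so the same query could be 
shared between threads. The cost is one extra index entry per distinct field 
name, which is the memory trade-off mentioned above.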

Cheers
Mark

> SpanScorer and SimpleSpanFragmenter for Contrib Highlighter
> -----------------------------------------------------------
>
>                 Key: LUCENE-794
>                 URL: https://issues.apache.org/jira/browse/LUCENE-794
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Other
>            Reporter: Mark Miller
>            Priority: Minor
>         Attachments: CachedTokenStream.java, CachedTokenStream.java, 
> CachedTokenStream.java, DefaultEncoder.java, Encoder.java, Formatter.java, 
> Highlighter.java, Highlighter.java, Highlighter.java, Highlighter.java, 
> Highlighter.java, HighlighterTest.java, HighlighterTest.java, 
> HighlighterTest.java, HighlighterTest.java, MemoryIndex.java, 
> QuerySpansExtractor.java, QuerySpansExtractor.java, QuerySpansExtractor.java, 
> QuerySpansExtractor.java, SimpleFormatter.java, spanhighlighter.patch, 
> spanhighlighter2.patch, spanhighlighter3.patch, spanhighlighter_patch_4.zip, 
> SpanHighlighterTest.java, SpanHighlighterTest.java, SpanScorer.java, 
> SpanScorer.java, WeightedSpanTerm.java
>
>
> This patch adds a new Scorer class (SpanQueryScorer) to the Highlighter 
> package that scores just like QueryScorer, but scores a 0 for Terms that did 
> not cause the Query hit. This gives 'actual' hit highlighting for the range 
> of SpanQuerys and PhraseQuery. There is also a new Fragmenter that attempts 
> to fragment without breaking up Spans.
> See http://issues.apache.org/jira/browse/LUCENE-403 for some background.
> There is a dependency on MemoryIndex.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


