[ 
https://issues.apache.org/jira/browse/LUCENE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13601017#comment-13601017
 ] 

Robert Muir commented on LUCENE-4825:
-------------------------------------

I don't see this highlighter as doing that, I guess.

I see it as taking query *terms* (not matches!!!!) and intersecting them with a 
BreakIterator in increasing offset order, ranking the resulting passages as it 
goes.
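
As a rough illustration of that idea (a toy sketch, not the PostingsHighlighter 
code itself), the following walks term offsets in increasing order and snaps 
each one to a sentence found with java.text.BreakIterator. The class 
PassageSketch, the helper termOffsets (a substring search standing in for 
offsets read from postings), and the raw hit-count score are all assumptions 
made for the example.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class PassageSketch {

    /** A passage [start, end) plus a naive score: the number of term hits inside it. */
    static final class Passage {
        final int start;
        final int end;
        int score;

        Passage(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Hypothetical stand-in for postings offsets: case-insensitive substring search. */
    static int[] termOffsets(String text, String... terms) {
        String lower = text.toLowerCase(Locale.ROOT);
        List<Integer> offsets = new ArrayList<>();
        for (String term : terms) {
            String t = term.toLowerCase(Locale.ROOT);
            if (t.isEmpty()) {
                continue; // an empty term would match everywhere
            }
            for (int i = lower.indexOf(t); i >= 0; i = lower.indexOf(t, i + 1)) {
                offsets.add(i);
            }
        }
        int[] sorted = offsets.stream().mapToInt(Integer::intValue).toArray();
        Arrays.sort(sorted); // hits are visited in increasing offset order
        return sorted;
    }

    /** Snaps each hit to its enclosing sentence, counts hits per sentence, then ranks. */
    static List<Passage> rank(String text, int[] sortedOffsets) {
        BreakIterator bi = BreakIterator.getSentenceInstance(Locale.ROOT);
        bi.setText(text);
        List<Passage> passages = new ArrayList<>();
        Passage current = null;
        for (int offset : sortedOffsets) {
            if (current == null || offset >= current.end) {
                int start = bi.preceding(offset + 1); // last boundary at or before the hit
                int end = bi.next();                  // first boundary after the hit
                if (end == BreakIterator.DONE) {
                    end = text.length();
                }
                current = new Passage(start, end);
                passages.add(current);
            }
            current.score++;
        }
        // Highest hit count first; ties keep document order because the sort is stable.
        passages.sort((a, b) -> Integer.compare(b.score, a.score));
        return passages;
    }

    public static void main(String[] args) {
        String text = "Cats sleep. Dogs bark and dogs run. Birds sing.";
        for (Passage p : rank(text, termOffsets(text, "dogs", "sing"))) {
            System.out.println(p.score + "\t" + text.substring(p.start, p.end).trim());
        }
    }
}
```

Note that nothing here looks at whether a phrase or span actually matched: 
every occurrence of a query term counts toward its passage.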

{quote}
We would need to read the spans from the positional queries in order to 
highlight only the proper terms, otherwise the output is wrong from a user 
perspective.
{quote}

Then the user is wrong, and should use another highlighter. This one is about 
good document summarization with respect to the query terms. It's not about 
visualizing exact matches to Lucene queries.

If the user doesn't care about 'search' but about 'matching' at the expense of 
everything else, they already have two other highlighters in Lucene that focus 
on this (making the wrong tradeoffs, in my opinion)!

                
> PostingsHighlighter support for positional queries
> --------------------------------------------------
>
>                 Key: LUCENE-4825
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4825
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/highlighter
>    Affects Versions: 4.2
>            Reporter: Luca Cavanna
>
> I've been playing around with the brand new PostingsHighlighter. I'm really 
> happy with the result in terms of quality of the snippets and performance.
> On the other hand, I noticed it doesn't support positional queries. If you 
> make a span query, for example, all the single terms will be highlighted, 
> even though they haven't contributed to the match. That reminds me of the 
> difference between the QueryTermScorer and the QueryScorer (using the 
> standard Highlighter).
> I've been trying to adapt what the QueryScorer does, especially the 
> extraction of the query terms together with their positions (what 
> WeightedSpanTermExtractor does). Next step would be to take that information 
> into account within the formatter and highlight only the terms that actually 
> contributed to the match. I'm not quite ready yet with a patch to contribute 
> this back, but I certainly intend to do so. That's why I opened the issue and 
> in the meantime I would like to hear what you guys think about it and 
> discuss how best we can fix it. I think it would be a big improvement for 
> this new highlighter, which is already great!
