[jira] [Commented] (LUCENE-4825) PostingsHighlighter support for positional queries

Luca Cavanna (JIRA) Wed, 13 Mar 2013 00:34:17 -0700

    [ https://issues.apache.org/jira/browse/LUCENE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600918#comment-13600918 ]


Luca Cavanna commented on LUCENE-4825:
--------------------------------------

Hey Robert,
sorry, but I don't quite understand why it would become an orange. :)

I mean, the PostingsHighlighter does (among other things) two great things:
1) it reads offsets from the postings list, as its name says
2) it summarizes the content, giving nice sentences as output

I think the two features above are a great improvement and pretty much what 
everybody would like to have!

I'm proposing to add support for positional queries as a third, optional 
feature. We would need to read the spans from the positional queries in order 
to highlight only the terms that actually matched; otherwise the output is 
wrong from a user's perspective. Would this make it that much slower? I don't 
mean to reanalyze the text...
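
To make the difference concrete, here is a minimal, self-contained sketch in 
plain Java. It does not use any Lucene API: the Token class, the whitespace 
tokenizer and the highlightNear helper are hypothetical stand-ins for what the 
postings (positions plus offsets) and a span query would give us. It only 
shows why term-only highlighting marks too much for a positional query.

```java
import java.util.ArrayList;
import java.util.List;

class PositionalHighlightSketch {

    /** A term occurrence with its position and character offsets (hypothetical stand-in for postings data). */
    static final class Token {
        final String term;
        final int position;
        final int start;
        final int end;

        Token(String term, int position, int start, int end) {
            this.term = term;
            this.position = position;
            this.start = start;
            this.end = end;
        }
    }

    /** Splits on single spaces; an assumption standing in for index-time analysis. */
    static List<Token> tokenize(String text) {
        List<Token> tokens = new ArrayList<>();
        int pos = 0, i = 0, n = text.length();
        while (i < n) {
            while (i < n && text.charAt(i) == ' ') i++;
            int start = i;
            while (i < n && text.charAt(i) != ' ') i++;
            if (i > start) tokens.add(new Token(text.substring(start, i), pos++, start, i));
        }
        return tokens;
    }

    /** Term-only highlighting: every occurrence of either term is marked. */
    static String highlightTerms(String text, String first, String second) {
        List<Token> hits = new ArrayList<>();
        for (Token t : tokenize(text)) {
            if (t.term.equals(first) || t.term.equals(second)) hits.add(t);
        }
        return mark(text, hits);
    }

    /** Position-aware highlighting: mark only pairs where second follows first with at most slop tokens in between. */
    static String highlightNear(String text, String first, String second, int slop) {
        List<Token> tokens = tokenize(text);
        boolean[] matched = new boolean[tokens.size()];
        for (Token a : tokens) {
            if (!a.term.equals(first)) continue;
            for (Token b : tokens) {
                int gap = b.position - a.position - 1;
                if (b.term.equals(second) && gap >= 0 && gap <= slop) {
                    matched[a.position] = true; // position equals list index here
                    matched[b.position] = true;
                }
            }
        }
        List<Token> hits = new ArrayList<>();
        for (Token t : tokens) {
            if (matched[t.position]) hits.add(t);
        }
        return mark(text, hits);
    }

    /** Wraps each hit in <b> tags using its offsets; hits must be in offset order. */
    static String mark(String text, List<Token> hits) {
        StringBuilder sb = new StringBuilder();
        int last = 0;
        for (Token t : hits) {
            sb.append(text, last, t.start).append("<b>").append(text, t.start, t.end).append("</b>");
            last = t.end;
        }
        return sb.append(text.substring(last)).toString();
    }

    public static void main(String[] args) {
        String text = "brown fox and a brown dog";
        // Term-only: the first "brown" is marked although it is not part of the match.
        System.out.println(highlightTerms(text, "brown", "dog"));
        // Position-aware, slop 0 (i.e. the phrase "brown dog"): only the matching pair is marked.
        System.out.println(highlightNear(text, "brown", "dog", 0));
    }
}
```

With the sample text, the term-only variant marks both occurrences of "brown", 
while the position-aware variant marks only the "brown dog" pair. Note that the 
sketch works purely on positions and offsets and never reanalyzes the text.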

Don't get me wrong, you must be right, but I would like to understand more. 

Are you saying that instead of adding 3) to 1) and 2), we should have another 
highlighter that does 1), 2) and 3)?




                
> PostingsHighlighter support for positional queries
> --------------------------------------------------
>
>                 Key: LUCENE-4825
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4825
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/highlighter
>    Affects Versions: 4.2
>            Reporter: Luca Cavanna
>
> I've been playing around with the brand new PostingsHighlighter. I'm really 
> happy with the result in terms of quality of the snippets and performance.
> On the other hand, I noticed it doesn't support positional queries. If you 
> make a span query, for example, all the single terms will be highlighted, 
> even though they haven't contributed to the match. That reminds me of the 
> difference between the QueryTermScorer and the QueryScorer (using the 
> standard Highlighter).
> I've been trying to adapt what the QueryScorer does, especially the 
> extraction of the query terms together with their positions (what 
> WeightedSpanTermExtractor does). The next step would be to take that 
> information into account within the formatter and highlight only the terms 
> that actually contributed to the match. I'm not quite ready yet with a patch 
> to contribute this back, but I certainly intend to do so. That's why I 
> opened the issue, and in the meantime I would like to hear what you guys 
> think about it and discuss how best we can fix it. I think it would be a big 
> improvement for this new highlighter, which is already great!
