[ 
https://issues.apache.org/jira/browse/LUCENE-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054114#comment-13054114
 ] 

Robert Muir commented on LUCENE-3234:
-------------------------------------

You can change it if you don't mind. However, I agree it would be good to 
figure out whether there is an n^2 here. This might have some effect on what 
the default value should be... ideally there is some way we could fix the n^2.

Is there a way to turn your test case into a benchmark, or do you have a 
separate benchmark (the example you mentioned where it blows up really badly)? 
Either could help in looking at what's going on.
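
As a rough illustration of the kind of measurement that could expose an n^2, 
here is a toy model in Java. It is not FastVectorHighlighter code: the class 
and method names are hypothetical, and it simply assumes that every contiguous 
run of matching terms counts as one candidate phrase. It counts how many 
candidates get examined as the number of matches doubles, with and without a 
limit; a ratio near 4 on doubling would suggest quadratic growth.

```java
// Toy model only: this is NOT FastVectorHighlighter code.
// Assumption: every contiguous run of matching terms is one candidate phrase.
final class PhraseLimitSketch {

    /**
     * Counts the candidate phrases examined for the given number of matching
     * term positions. A negative phraseLimit means "no limit".
     */
    static long candidatesExamined(int matches, int phraseLimit) {
        long examined = 0;
        for (int start = 0; start < matches; start++) {
            for (int end = start; end < matches; end++) {
                // Stop as soon as the cap on examined candidates is reached.
                if (phraseLimit >= 0 && examined >= phraseLimit) {
                    return examined;
                }
                examined++;
            }
        }
        return examined;
    }

    /** Work ratio when the number of matches doubles (about 4 if quadratic). */
    static double doublingRatio(int matches, int phraseLimit) {
        return (double) candidatesExamined(2 * matches, phraseLimit)
                / candidatesExamined(matches, phraseLimit);
    }

    public static void main(String[] args) {
        for (int n = 500; n <= 4000; n *= 2) {
            System.out.printf("n=%d unlimited=%d limit1=%d ratio=%.2f%n",
                    n,
                    candidatesExamined(n, -1),
                    candidatesExamined(n, 1),
                    doublingRatio(n, -1));
        }
    }
}
```

A real benchmark would time the highlighter itself on documents of increasing 
size rather than count iterations in a model, but the same doubling check 
applies to the measured times.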


> Provide limit on phrase analysis in FastVectorHighlighter
> ---------------------------------------------------------
>
>                 Key: LUCENE-3234
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3234
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Mike Sokolov
>         Attachments: LUCENE-3234.patch
>
>
> With larger documents, FVH can spend a lot of time trying to find the 
> best-scoring snippet as it examines every possible phrase formed from 
> matching terms in the document.  If one is willing to accept
> less-than-perfect scoring by limiting the number of phrases that are 
> examined, substantial speedups are possible.  This is analogous to the 
> Highlighter limit on the number of characters to analyze.
> The patch includes an artificial test case that shows a more than 1000x speedup.  In a 
> more normal test environment, with English documents and random queries, I am 
> seeing speedups of around 3-10x when setting phraseLimit=1, which has the 
> effect of selecting the first possible snippet in the document.  Most of our 
> sites operate in this way (just show the first snippet), so this would be a 
> big win for us.
> With phraseLimit = -1, you get the existing FVH behavior. At larger values of 
> phraseLimit, you may not get substantial speedup in the normal case, but you 
> do get the benefit of protection against blow-up in pathological cases.

