[ https://issues.apache.org/jira/browse/LUCENE-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002609#comment-13002609 ]

Mark Miller commented on LUCENE-2939:
-------------------------------------

bq. we should not have shoved this in at the last minute,

We didn't? Marking something as 3.1 is the best way to get it considered for 
last-minute inclusion, blocker or not. It certainly doesn't mean it's not going 
to be pushed back out after discussion.

In any case, if you are not for it, that decides it - I'm not willing to do the 
work right now.

bq. So, here are my questions:

1. I don't remember; probably since 2.9.
2. Because it's a completely different approach.

It's been around for a while. I saw one guy who stayed on Solr 1.3 over 1.4 
because of it. Most people will try the fast vector highlighter and say, oh 
nice, it's fast - but it doesn't highlight wildcard queries or various other 
query types, etc. They either accept one bug over the other, or stick with an 
older version. Honestly, if that continues for another release, it's no skin 
off my nose. But neither are most bugs.



> Highlighter should try and use maxDocCharsToAnalyze in 
> WeightedSpanTermExtractor when adding a new field to MemoryIndex as well as 
> when using CachingTokenStream
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2939
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2939
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: contrib/highlighter
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>            Priority: Minor
>             Fix For: 3.1.1, 3.2, 4.0
>
>         Attachments: LUCENE-2939.patch, LUCENE-2939.patch, LUCENE-2939.patch
>
>
> Huge documents can be drastically slower to highlight than they need to be, 
> because the entire field is added to the memory index.
> This cost can be greatly reduced in many cases if we try to respect 
> maxDocCharsToAnalyze.
> Things can be improved even further by respecting this setting with 
> CachingTokenStream.
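The core idea in the description above is just truncation: only the first 
maxDocCharsToAnalyze characters of a field need to be analyzed for 
highlighting, so there is no reason to feed the whole field to MemoryIndex. 
Here is a minimal self-contained sketch of that truncation step - it is NOT 
the actual Lucene patch, and the class and method names are made up for 
illustration; only the setting name mirrors the Highlighter's 
maxDocCharsToAnalyze:

```java
// Sketch only: illustrates capping how many characters of a field value get
// analyzed, the same cost-saving the patch applies before populating
// MemoryIndex. Not Lucene code; CharLimitSketch/limitChars are hypothetical.
public class CharLimitSketch {

    /** Return at most maxDocCharsToAnalyze chars of the field value.
     *  A non-positive limit means "no limit". */
    static String limitChars(String fieldValue, int maxDocCharsToAnalyze) {
        if (maxDocCharsToAnalyze <= 0
                || fieldValue.length() <= maxDocCharsToAnalyze) {
            return fieldValue;
        }
        return fieldValue.substring(0, maxDocCharsToAnalyze);
    }

    public static void main(String[] args) {
        // A "huge document": one million characters.
        String huge = "a".repeat(1_000_000);
        // Only the first 50k chars would be analyzed and added to the
        // in-memory index, instead of all 1M.
        String analyzed = limitChars(huge, 50_000);
        System.out.println(analyzed.length()); // 50000
    }
}
```

The same cap can be applied when caching the token stream, which is the 
second improvement the description mentions.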

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
