[ https://issues.apache.org/jira/browse/LUCENE-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002602#comment-13002602 ]
Robert Muir commented on LUCENE-2939:
-------------------------------------

{quote}
I mind your attitude. Changing the issue target 2 seconds after Grant with no discussion. Declaring on your own that it won't get in. Not trying to get to a real conversation about the issue (which you clearly don't fully understand if you think storing term vectors will help). These things are my issue, not any so called push back.
{quote}

It's not an attitude, and it's not personal. It's trying to stop last-minute stuff from being shoved into the release right before the RC, especially if it's not fully formed patches ready to be committed.

{quote}
Well man, you need us on your team too. Performance bug is a technical valid reason for a -1 on a release. I'm not threatening that - but I'm pointing out that everyone needs to be on board - not just the RM. Taking the time for fair discussion is not a waste of time.
{quote}

I totally agree with you here. But some people might say that if the bug has been around since, say, 2.4 or 2.9, it's not critical that it be fixed in 3.1 at the last minute, and still +1 the release.

As I stated earlier on this issue, I'm sympathetic to performance bugs: performance bugs are bugs too. But we need to evaluate risk-reward here.

Just don't forget that there are other performance problems with large documents in Lucene (some have been around a while), and we aren't trying to shove any last-minute fixes for those in.

So, here are my questions:
# What version of Lucene was this performance bug introduced in? Is it something we introduced in version 3.1? If so, it's more serious than if it's something that's been around since 2.9.
# Why is fast-vector highlighter with TVs "ok", but highlighter with TVs slow?

> Highlighter should try and use maxDocCharsToAnalyze in WeightedSpanTermExtractor when adding a new field to MemoryIndex as well as when using CachingTokenStream
> ----------------------------------------------------------------
>
>                 Key: LUCENE-2939
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2939
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: contrib/highlighter
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: LUCENE-2939.patch, LUCENE-2939.patch, LUCENE-2939.patch
>
>
> huge documents can be drastically slower than need be because the entire field is added to the memory index
> this cost can be greatly reduced in many cases if we try and respect maxDocCharsToAnalyze
> things can be improved even further by respecting this setting with CachingTokenStream

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira