[ https://issues.apache.org/jira/browse/LUCENE-3113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034956#comment-13034956 ]
Steven Rowe commented on LUCENE-3113:
-------------------------------------

+1

bq. the ShingleAnalyzerWrapper was double-resetting

Your patch just removes the reset call:

{noformat}
@@ -201,7 +201,6 @@
       TokenStream result = defaultAnalyzer.reusableTokenStream(fieldName, reader);
       if (result == streams.wrapped) {
         /* the wrapped analyzer reused the stream */
-        streams.shingle.reset();
       } else {
         /* the wrapped analyzer did not, create a new shingle around the new one */
         streams.wrapped = result;
{noformat}

but inverting the condition would read better:

{noformat}
       TokenStream result = defaultAnalyzer.reusableTokenStream(fieldName, reader);
-      if (result == streams.wrapped) {
-        /* the wrapped analyzer reused the stream */
-        streams.shingle.reset();
-      } else {
-        /* the wrapped analyzer did not, create a new shingle around the new one */
+      if (result != streams.wrapped) {
+        // The wrapped analyzer did not reuse the stream.
+        // Wrap the new stream with a new ShingleFilter.
         streams.wrapped = result;
         streams.shingle = new ShingleFilter(streams.wrapped);
       }
{noformat}

> fix analyzer bugs found by MockTokenizer
> ----------------------------------------
>
>                 Key: LUCENE-3113
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3113
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: modules/analysis
>            Reporter: Robert Muir
>             Fix For: 3.2, 4.0
>
>         Attachments: LUCENE-3113.patch, LUCENE-3113.patch
>
>
> In LUCENE-3064, we beefed up MockTokenizer with assertions, and I've switched
> over the analysis tests to use MockTokenizer for better coverage.
> However, this found a few bugs (one of which is LUCENE-3106):
> * incrementToken() after it returns false in CommonGramsQueryFilter,
> HyphenatedWordsFilter, ShingleFilter, SynonymFilter
> * missing end() implementation for PrefixAwareTokenFilter
> * double reset() in QueryAutoStopWordAnalyzer and ReusableAnalyzerBase
> * missing correctOffset()s in MockTokenizer itself.
> I think it would be nice to just fix all the bugs on one issue... I've fixed
> everything except Shingle and Synonym
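
For readers who want to see the double-reset effect in isolation, here is a minimal sketch. It does not use the Lucene API: `CountingStream`, `WrappingFilter`, `SavedStreams`, and `resetsOnReuse` are hypothetical stand-ins. It assumes that a filter's reset() propagates to the stream it wraps and that the consumer calls reset() once before consuming. Under those assumptions it counts how many resets the inner stream sees when a stream is reused, with and without the extra reset call that the patch removes.

```java
public class DoubleResetSketch {

    /** Hypothetical stand-in for a token stream; it only counts reset() calls. */
    static class CountingStream {
        int resets = 0;
        void reset() { resets++; }
    }

    /** Hypothetical filter whose reset() propagates to the stream it wraps (an assumption). */
    static class WrappingFilter extends CountingStream {
        final CountingStream input;
        WrappingFilter(CountingStream input) { this.input = input; }
        @Override void reset() { super.reset(); input.reset(); }
    }

    /** Cached pair of streams, loosely modeled on the "streams" object in the patch. */
    static class SavedStreams {
        CountingStream wrapped;
        WrappingFilter shingle;
    }

    /**
     * Reuse logic with the inverted condition from the comment above.
     * Passing extraReset = true re-adds the reset call that the patch removes.
     */
    static WrappingFilter reusableStream(SavedStreams streams, CountingStream result,
                                         boolean extraReset) {
        if (result != streams.wrapped) {
            // The wrapped analyzer did not reuse the stream: wrap the new one.
            streams.wrapped = result;
            streams.shingle = new WrappingFilter(streams.wrapped);
        } else if (extraReset) {
            streams.shingle.reset();
        }
        return streams.shingle;
    }

    /** Returns how many resets the inner stream sees on the second (reused) request. */
    static int resetsOnReuse(boolean extraReset) {
        SavedStreams streams = new SavedStreams();
        CountingStream inner = new CountingStream();
        reusableStream(streams, inner, extraReset).reset(); // first use, consumer resets once
        inner.resets = 0;                                   // only count the reused request
        reusableStream(streams, inner, extraReset).reset(); // second use, stream is reused
        return inner.resets;
    }

    public static void main(String[] args) {
        System.out.println("with extra reset:    " + resetsOnReuse(true));
        System.out.println("without extra reset: " + resetsOnReuse(false));
    }
}
```

In this sketch the inner stream is reset twice per reused request when the extra call is present and once when it is removed, which is the kind of condition an assertion-checking mock stream can flag.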