[ https://issues.apache.org/jira/browse/LUCENE-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771335#action_12771335 ]
Uwe Schindler commented on LUCENE-2014:
---------------------------------------

This is the problem, you are right. I thought about that, too. The question is: why does the PosIncr get such strange values even when the filter is the source of the tokens? Nobody else modifies it?

> position increment bug: smartcn
> -------------------------------
>
> Key: LUCENE-2014
> URL: https://issues.apache.org/jira/browse/LUCENE-2014
> Project: Lucene - Java
> Issue Type: Bug
> Components: contrib/analyzers
> Reporter: Robert Muir
> Attachments: LUCENE-2014.patch, LUCENE-2014.patch
>
>
> If I use LUCENE_VERSION >= 2.9 with the smart Chinese analyzer, it will crash
> IndexWriter with any reasonable amount of Chinese text.
> It's especially annoying because it happens in 2.9.1 RC as well.
> This is because the position increments for tokens after stopwords are bogus.
> Here's an example (from the test case), where the position increment should be 2,
> but is instead 91975314!
> {code}
> public void testChineseStopWords2() throws Exception {
>   Analyzer ca = new SmartChineseAnalyzer(Version.LUCENE_CURRENT); /* will load stopwords */
>   String sentence = "Title:San"; // : is a stopword
>   String result[] = { "titl", "san"};
>   int startOffsets[] = { 0, 6 };
>   int endOffsets[] = { 5, 9 };
>   int posIncr[] = { 1, 2 };
>   assertAnalyzesTo(ca, sentence, result, startOffsets, endOffsets, posIncr);
> }
> {code}
> junit.framework.AssertionFailedError: posIncrement 1 expected:<2> but was:<91975314>
>   at junit.framework.Assert.fail(Assert.java:47)
>   at junit.framework.Assert.failNotEquals(Assert.java:280)
>   at junit.framework.Assert.assertEquals(Assert.java:64)
>   at junit.framework.Assert.assertEquals(Assert.java:198)
>   at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:83)
>   ...
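The increments the quoted test expects ({ 1, 2 }) can be illustrated with a minimal sketch. It assumes the usual convention that each removed stopword adds one to the position increment of the next emitted token. The class PosIncrSketch and the method positionIncrements are hypothetical names for illustration only; the code does not use or reproduce Lucene's StopFilter or the smartcn filters, and it says nothing about where the bogus value in this issue actually comes from.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PosIncrSketch {

    /**
     * Hypothetical helper: returns the position increment of every token
     * that survives stopword removal, in emission order.
     */
    static List<Integer> positionIncrements(List<String> tokens, Set<String> stopWords) {
        List<Integer> increments = new ArrayList<>();
        int skipped = 0; // positions consumed by stopwords since the last emitted token
        for (String token : tokens) {
            if (stopWords.contains(token)) {
                skipped++; // the stopword is dropped, but its position still counts
            } else {
                increments.add(1 + skipped);
                skipped = 0; // start counting afresh for the next emitted token
            }
        }
        return increments;
    }

    public static void main(String[] args) {
        // "Title:San" after stemming and splitting, with ":" treated as a stopword
        List<String> tokens = Arrays.asList("titl", ":", "san");
        Set<String> stopWords = new HashSet<>(Arrays.asList(":"));
        System.out.println(positionIncrements(tokens, stopWords)); // prints [1, 2]
    }
}
```

In this model the increment of a token is always one plus the number of stopwords dropped directly before it, so a value such as 91975314 after a single stopword cannot come from the skipping logic alone.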