[ https://issues.apache.org/jira/browse/LUCENE-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir updated LUCENE-2014:
--------------------------------

    Lucene Fields: [New, Patch Available]  (was: [New])
    Fix Version/s: 3.0
         Assignee: Robert Muir

> position increment bug: smartcn
> -------------------------------
>
>                 Key: LUCENE-2014
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2014
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: contrib/analyzers
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>             Fix For: 3.0
>
>         Attachments: LUCENE-2014.patch, LUCENE-2014.patch
>
>
> If I use LUCENE_VERSION >= 2.9 with the smart Chinese analyzer, it will crash
> IndexWriter with any reasonable amount of Chinese text.
> It's especially annoying because it happens in 2.9.1 RC as well.
> This is because the position increments for tokens after stopwords are bogus.
> Here's an example (from a test case), where the position increment should be 2,
> but is instead 91975314!
> {code}
> public void testChineseStopWords2() throws Exception {
>   Analyzer ca = new SmartChineseAnalyzer(Version.LUCENE_CURRENT); /* will load stopwords */
>   String sentence = "Title:San"; // : is a stopword
>   String result[] = { "titl", "san"};
>   int startOffsets[] = { 0, 6 };
>   int endOffsets[] = { 5, 9 };
>   int posIncr[] = { 1, 2 };
>   assertAnalyzesTo(ca, sentence, result, startOffsets, endOffsets, posIncr);
> }
> {code}
> junit.framework.AssertionFailedError: posIncrement 1 expected:<2> but was:<91975314>
>   at junit.framework.Assert.fail(Assert.java:47)
>   at junit.framework.Assert.failNotEquals(Assert.java:280)
>   at junit.framework.Assert.assertEquals(Assert.java:64)
>   at junit.framework.Assert.assertEquals(Assert.java:198)
>   at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:83)
> ...
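
For readers unfamiliar with position increments, the values expected by the test can be reproduced with a small standalone sketch. This is not the SmartChineseAnalyzer code and not the attached patch; the class and method names below are hypothetical, and the sketch assumes the analyzer produces the three tokens "titl", ":", "san" before stopword removal. Each kept token gets an increment of 1 plus the number of stopwords dropped since the previous kept token, and that count is reset after every kept token.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only: a hypothetical helper, independent of the Lucene API. */
public class PositionIncrementSketch {

    /**
     * Returns the position increment of each token that survives stopword
     * removal, in order. Assumes every input token occupies one position.
     */
    public static List<Integer> incrementsAfterStopwordRemoval(
            List<String> tokens, Set<String> stopwords) {
        List<Integer> increments = new ArrayList<>();
        int skipped = 0; // positions consumed by stopwords since the last kept token
        for (String token : tokens) {
            if (stopwords.contains(token)) {
                skipped++; // the removed token still occupies a position
            } else {
                increments.add(1 + skipped);
                skipped = 0; // reset so the gap is not carried into later tokens
            }
        }
        return increments;
    }

    public static void main(String[] args) {
        // Assumed pre-filter token stream for the sentence "Title:San".
        List<String> tokens = Arrays.asList("titl", ":", "san");
        Set<String> stopwords = new HashSet<>(Arrays.asList(":"));
        System.out.println(incrementsAfterStopwordRemoval(tokens, stopwords)); // prints [1, 2]
    }
}
```

Under these assumptions the second kept token ("san") gets an increment of 2, which matches the posIncr array in the test. A value such as 91975314 cannot come from this rule for a three-token input.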