[ https://issues.apache.org/jira/browse/LUCENE-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
wangzhenghang updated LUCENE-3026: ---------------------------------- Description: That's all because of org.apache.lucene.analysis.cn.smart.hhmm.SegGraph's makeIndex() method: public List<SegToken> makeIndex() { List<SegToken> result = new ArrayList<SegToken>(); int s = -1, count = 0, size = tokenListTable.size(); List<SegToken> tokenList; short index = 0; while (count < size) { if (isStartExist(s)) { tokenList = tokenListTable.get(s); for (SegToken st : tokenList) { st.index = index; result.add(st); index++; } count++; } s++; } return result; } here 'short index = 0;' should be 'int index = 0;'. And that's reported here http://code.google.com/p/imdict-chinese-analyzer/issues/detail?id=2 and http://code.google.com/p/imdict-chinese-analyzer/issues/detail?id=11, the author XiaoPingGao have already fixed this bug:http://code.google.com/p/imdict-chinese-analyzer/source/browse/trunk/src/org/apache/lucene/analysis/cn/smart/hhmm/SegGraph.java was: That all because of org.apache.lucene.analysis.cn.smart.hhmm.SegGraph's makeIndex() method: public List<SegToken> makeIndex() { List<SegToken> result = new ArrayList<SegToken>(); int s = -1, count = 0, size = tokenListTable.size(); List<SegToken> tokenList; short index = 0; while (count < size) { if (isStartExist(s)) { tokenList = tokenListTable.get(s); for (SegToken st : tokenList) { st.index = index; result.add(st); index++; } count++; } s++; } return result; } 'short index = 0;' should be 'int index = 0;'. And that's reported here http://code.google.com/p/imdict-chinese-analyzer/issues/detail?id=2, http://code.google.com/p/imdict-chinese-analyzer/issues/detail?id=11, the author XiaoPingGao have already fixed this bug:http://code.google.com/p/imdict-chinese-analyzer/source/browse/trunk/src/org/apache/lucene/analysis/cn/smart/hhmm/SegGraph.java > smartcn analysis throw NullPointer exception when the length of analysed text > over 32767 > ---------------------------------------------------------------------------------------- > > Key: LUCENE-3026 > URL: https://issues.apache.org/jira/browse/LUCENE-3026 > Project: Lucene - Java > Issue Type: Bug > Components: contrib/analyzers > Affects Versions: 3.1, 4.0 > Reporter: wangzhenghang > > That's all because of org.apache.lucene.analysis.cn.smart.hhmm.SegGraph's > makeIndex() method: > public List<SegToken> makeIndex() { > List<SegToken> result = new ArrayList<SegToken>(); > int s = -1, count = 0, size = tokenListTable.size(); > List<SegToken> tokenList; > short index = 0; > while (count < size) { > if (isStartExist(s)) { > tokenList = tokenListTable.get(s); > for (SegToken st : tokenList) { > st.index = index; > result.add(st); > index++; > } > count++; > } > s++; > } > return result; > } > here 'short index = 0;' should be 'int index = 0;'. And that's reported here > http://code.google.com/p/imdict-chinese-analyzer/issues/detail?id=2 and > http://code.google.com/p/imdict-chinese-analyzer/issues/detail?id=11, the > author XiaoPingGao have already fixed this > bug:http://code.google.com/p/imdict-chinese-analyzer/source/browse/trunk/src/org/apache/lucene/analysis/cn/smart/hhmm/SegGraph.java -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org