
Re: getting full english word from tokenizing with SmartChineseAnalyzer

2015-08-14 Thread Michael Mastroianni

The easiest thing to do is to create your own analyzer, cut and paste the code from org.apache.lucene.analysis.cn.smart.SmartChineseAnalyzer into it, and get rid of the line in createComponents(String fieldName, Reader reader) that says: result = new PorterStemFilter(result);
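A minimal sketch of what that advice could look like, offered as an illustration rather than a copy of the Lucene source. It assumes Lucene 4.8 through 4.10 with the lucene-analyzers-smartcn module on the classpath, where (as far as I recall) SmartChineseAnalyzer builds its chain on HMMChineseTokenizer. The class name UnstemmedSmartChineseAnalyzer, the tokens() helper, and the field name "body" are hypothetical. The stock analyzer's stop-word filter is left out for brevity, so punctuation tokens may appear in the output; copy that step from the original createComponents() if you need it.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.cn.smart.HMMChineseTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

// Hypothetical class name; assumes Lucene 4.8 to 4.10 plus lucene-analyzers-smartcn.
public final class UnstemmedSmartChineseAnalyzer extends Analyzer {

    @Override
    protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
        // Tokenizer that (to my recollection) SmartChineseAnalyzer uses in these versions.
        Tokenizer tokenizer = new HMMChineseTokenizer(reader);
        // SmartChineseAnalyzer wraps the stream in PorterStemFilter at this point;
        // leaving that step out keeps English words unstemmed.
        // The stop-word filter is also omitted to keep the sketch short.
        return new TokenStreamComponents(tokenizer);
    }

    // Illustration helper (not part of Lucene): collect the terms produced for a text.
    static List<String> tokens(String text) throws IOException {
        List<String> terms = new ArrayList<>();
        Analyzer analyzer = new UnstemmedSmartChineseAnalyzer();
        TokenStream stream = analyzer.tokenStream("body", new StringReader(text));
        try {
            CharTermAttribute term = stream.addAttribute(CharTermAttribute.class);
            stream.reset();
            while (stream.incrementToken()) {
                terms.add(term.toString());
            }
            stream.end();
        } finally {
            stream.close();
            analyzer.close();
        }
        return terms;
    }

    public static void main(String[] args) throws IOException {
        // Prints the token list for a mixed Chinese/English sample.
        System.out.println(tokens("日本小将 japanese player"));
    }
}
```

On Lucene 5.x the createComponents signature drops the Reader argument, so the sketch would need adjusting there.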

Re: getting full english word from tokenizing with SmartChineseAnalyzer

2015-08-14 Thread Wayne Xin

Thanks Michael. That works well. Not sure why SmartChineseAnalyzer is final, otherwise we could override createComponents().

New output: 女单方面王适娴 second seed 和头号种子卫冕冠军西班牙选手马林 first seed 同处 1 4 区 3 号种子李雪芮和韩国选手 korean player 成池铉处在 2 4 区不过成池铉先要过日本小将 japanese player

Re: getting full english word from tokenizing with SmartChineseAnalyzer

2015-08-14 Thread Wayne Xin

Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

-----Original Message-----
From: Wayne Xin [mailto:wayne_...@hotmail.com]
Sent: Friday, August 14, 2015 8:44 PM
To: java-user@lucene.apache.org
Subject: Re: getting full english word from tokenizing with SmartChineseAnalyzer

Thanks

RE: getting full english word from tokenizing with SmartChineseAnalyzer

2015-08-14 Thread Uwe Schindler

, August 14, 2015 8:44 PM
To: java-user@lucene.apache.org
Subject: Re: getting full english word from tokenizing with SmartChineseAnalyzer

Thanks Michael. That works well. Not sure why SmartChineseAnalyzer is final, otherwise we could override createComponents(). New output: 女单方面王适娴