Re: Question about chinese and WildcardQuery

2012-06-28 Thread Paco Avila
Thanks, using WhitespaceAnalyzer works, but I don't understand why StandardAnalyzer does not work when, according to the ChineseAnalyzer deprecation note, I should use StandardAnalyzer: @deprecated Use {@link StandardAnalyzer} instead, which has the same functionality. This is very annoying. 2012/6/27 Li

Re: Question about chinese and WildcardQuery

2012-06-28 Thread Li Li
In Chinese, there are no boundaries between words. It is written like: Iamok. You should tokenize it to I am ok. If you want to search for *amo*, you need to treat I am ok as one token. In Chinese, fuzzy search is not very useful. Even with StandardAnalyzer, it's fine to use a boolean query, because Iamok is
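To make the Iamok analogy concrete, here is a minimal plain-Python sketch. It does not call Lucene; `boolean_and` and `wildcard_any` are hypothetical helpers that only mimic how a term-level query sees tokens. The point is that a wildcard is tested against each token separately, while a boolean AND only asks whether every required term is present.

```python
import fnmatch

def wildcard_any(pattern, tokens):
    # A wildcard pattern is checked against one token at a time,
    # so it can never span the boundary between two tokens.
    return any(fnmatch.fnmatchcase(tok, pattern) for tok in tokens)

def boolean_and(required_terms, tokens):
    # Boolean AND: every required term must be present as a token.
    return set(required_terms) <= set(tokens)

segmented = ["I", "am", "ok"]   # text tokenized into words
single_token = ["Iamok"]        # same text kept as one token

print(wildcard_any("*amo*", segmented))       # False: no single token contains "amo"
print(wildcard_any("*amo*", single_token))    # True: "Iamok" contains "amo"
print(boolean_and(["am", "ok"], segmented))   # True: both terms are present
```

So with segmented text a boolean query over the individual terms can still find the document, while `*amo*` only works if the whole string is one token.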

Re: Question about chinese and WildcardQuery

2012-06-28 Thread wangjing
It's best to keep the Analyzer used for searching consistent with the Analyzer used to build the index. On Thu, Jun 28, 2012 at 2:31 PM, Paco Avila monk...@gmail.com wrote: Thanks, using WhitespaceAnalyzer works, but I don't understand why StandardAnalyzer does not work when, according to the ChineseAnalyzer deprecation note, I should use StandardAnalyzer:
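A rough illustration of why the two analyzers should match, again in plain Python rather than Lucene. `build_index`, `search`, and the two tokenizer functions are made-up stand-ins, and the per-character tokenizer only approximates what is described in this thread for Chinese text.

```python
import re
from collections import defaultdict

def per_character_tokens(text):
    # Approximation: each CJK ideograph is its own token,
    # runs of Latin letters/digits stay together.
    return re.findall(r"[\u4e00-\u9fff]|[A-Za-z0-9]+", text)

def whitespace_tokens(text):
    # Split on whitespace only; a string without spaces stays one token.
    return text.split()

def build_index(docs, analyze):
    # Tiny inverted index: term -> set of document ids.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index

def search(index, query, analyze):
    # Analyze the query, then require every query term (AND).
    terms = analyze(query)
    if not terms:
        return set()
    return set.intersection(*[index.get(t, set()) for t in terms])

docs = {1: "专项信息管理.doc"}
index = build_index(docs, whitespace_tokens)

print(search(index, "专项信息管理.doc", whitespace_tokens))     # {1}
print(search(index, "专项信息管理.doc", per_character_tokens))  # set()
```

The second search finds nothing because the query terms (single characters) were never written to an index that was built with whole-string tokens.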

Re: Question about chinese and WildcardQuery

2012-06-28 Thread Paco Avila
Thanks for the info. 2012/6/28 Li Li fancye...@gmail.com In Chinese, there are no boundaries between words. It is written like: Iamok. You should tokenize it to I am ok. If you want to search for *amo*, you need to treat I am ok as one token. In Chinese, fuzzy search is not very useful. Even using

Question about chinese and WildcardQuery

2012-06-27 Thread Paco Avila
Hi there, I have to index Chinese content and I don't get the expected results when searching. It seems that WildcardQuery does not work properly with Chinese characters. See the attached sample code. I store the string 专项信息管理.doc using the StandardAnalyzer and after that search for 专项信* and

Re: Question about chinese and WildcardQuery

2012-06-27 Thread Li Li
StandardAnalyzer will segment each character into its own token; you should use WhitespaceAnalyzer, or your own analyzer that keeps the string as one token, for wildcard search. On 2012-6-27 at 6:20 PM, Paco Avila monk...@gmail.com wrote: Hi there, I have to index Chinese content and I don't get the expected
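The effect described above can be sketched without Lucene. The snippet below is a plain-Python simulation, not the real analyzers: `per_character_tokens` is an assumed approximation of per-character segmentation for Chinese, and `wildcard_hits` is a hypothetical helper that tests the pattern against each indexed token.

```python
import fnmatch
import re

def per_character_tokens(text):
    # Approximation: each CJK ideograph is its own token,
    # runs of Latin letters/digits stay together.
    return re.findall(r"[\u4e00-\u9fff]|[A-Za-z0-9]+", text)

def whitespace_tokens(text):
    # Split on whitespace only; a string without spaces stays one token.
    return text.split()

def wildcard_hits(pattern, tokens):
    # The pattern is matched against individual tokens, never across them.
    return [tok for tok in tokens if fnmatch.fnmatchcase(tok, pattern)]

text = "专项信息管理.doc"

print(per_character_tokens(text))
# ['专', '项', '信', '息', '管', '理', 'doc']
print(wildcard_hits("专项信*", per_character_tokens(text)))
# []
print(wildcard_hits("专项信*", whitespace_tokens(text)))
# ['专项信息管理.doc']
```

With one token per character, no token starts with 专项信, so the prefix pattern has nothing to match. With the whole string kept as one token, the same pattern matches.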