Re: Question about chinese and WildcardQuery

2012-06-28 Thread Paco Avila
Standard Analyzer, > it's OK to use a boolean query, because "Iamok" is tokenized as I a m o > k. If you search with the boolean query +a +m +o, it's fine. Chinese has many > letters (more than 3000 in common use), and words are very short (most > words have only 2 letters). >
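
The boolean query mentioned above (+a +m +o) requires every listed term to be present in the document. A minimal sketch of that matching rule in plain Java follows. It does not use Lucene: `AllTermsSketch`, `characterTerms` and `matchesAll` are hypothetical names, and the one-term-per-character, lowercased tokenization is only an assumption standing in for what the analyzer produces.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class AllTermsSketch {

    // Assumption: one lowercased term per letter or digit, mirroring the
    // "I a m o k" example. This is a stand-in, not Lucene's StandardAnalyzer.
    static Set<String> characterTerms(String text) {
        Set<String> terms = new HashSet<>();
        text.toLowerCase(Locale.ROOT).codePoints()
            .filter(Character::isLetterOrDigit)
            .forEach(cp -> terms.add(new String(Character.toChars(cp))));
        return terms;
    }

    // "+a +m +o" semantics: the document matches only if it contains every
    // required term. Term order and adjacency are not checked.
    static boolean matchesAll(Set<String> docTerms, List<String> required) {
        return docTerms.containsAll(required);
    }

    public static void main(String[] args) {
        Set<String> doc = characterTerms("Iamok");
        System.out.println(matchesAll(doc, Arrays.asList("a", "m", "o")));
        System.out.println(matchesAll(doc, Arrays.asList("a", "x")));
    }
}
```

Because only the presence of each term is tested, such a query says nothing about the order in which the characters occur in the document.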

Re: Question about chinese and WildcardQuery

2012-06-27 Thread Paco Avila
27 Li Li > The standard analyzer will segment each character into a token; you should use > the whitespace analyzer or your own analyzer that can tokenize it as one token > for wildcard search. > On 2012-6-27 at 6:20 PM, "Paco Avila" wrote: > > > Hi there, > > > > I h
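
The reply above can be illustrated without Lucene. A wildcard pattern is compared against individual indexed terms, so a prefix of several characters cannot match when every character was indexed as its own term. The sketch below is plain Java under stated assumptions: `WildcardTermSketch` and its methods are hypothetical names, and `perCharacterTerms` is only a rough approximation of per-character segmentation, not the real StandardAnalyzer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class WildcardTermSketch {

    // Assumption: every Han character becomes its own term, runs of other
    // letters/digits stay together, and anything else is a separator.
    static List<String> perCharacterTerms(String text) {
        List<String> terms = new ArrayList<>();
        StringBuilder run = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            boolean han = Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN;
            if (!han && Character.isLetterOrDigit(cp)) {
                run.appendCodePoint(cp);
                continue;
            }
            if (run.length() > 0) {
                terms.add(run.toString());
                run.setLength(0);
            }
            if (han) {
                terms.add(new String(Character.toChars(cp)));
            }
        }
        if (run.length() > 0) {
            terms.add(run.toString());
        }
        return terms;
    }

    // Whitespace-style tokenization: the file name stays a single term.
    static List<String> whitespaceTerms(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    // Translate '*' and '?' into a regex; all other characters are literal.
    static Pattern wildcardToPattern(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') regex.append(".*");
            else if (c == '?') regex.append('.');
            else regex.append(Pattern.quote(String.valueOf(c)));
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    // The pattern must match one whole term, never a sequence of terms.
    static boolean matchesAnyTerm(List<String> terms, String wildcard) {
        Pattern p = wildcardToPattern(wildcard);
        for (String term : terms) {
            if (p.matcher(term).matches()) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        String name = "专项信息管理.doc";
        System.out.println(perCharacterTerms(name));
        System.out.println(matchesAnyTerm(perCharacterTerms(name), "专项信*"));
        System.out.println(matchesAnyTerm(whitespaceTerms(name), "专项信*"));
    }
}
```

With per-character terms, no single term starts with the three-character prefix, so the wildcard finds nothing; with the whole name kept as one term, the same pattern matches.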

Question about chinese and WildcardQuery

2012-06-27 Thread Paco Avila
Hi there, I have to index Chinese content and I don't get the expected results when searching. It seems that WildcardQuery does not work properly with Chinese characters. See the attached sample code. I store the string "专项信息管理.doc" using the StandardAnalyzer and then search for "专项信*"