using CJKTokenizerFactory for Japanese language

Kumar Pandey Thu, 11 Nov 2010 08:50:23 -0800

I am exploring Japanese language support in Solr.
Solr seems to provide CJKTokenizerFactory.
How useful is this module? Has anyone used it in production for
Japanese?


One shortfall it seems to have, from what I have been able to read, is
that it can generate a lot of false matches, for example matching kyoto when
searching for tokyo.
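
To make the concern concrete, here is a minimal sketch of overlapping bigram tokenization, which is roughly how I understand a CJK bigram tokenizer to split a run of ideographs. The function name cjk_bigrams is hypothetical and this is not Lucene's actual code; it only illustrates why such matches can happen. The case usually cited runs the other way around: 東京都 (Tokyo-to) contains the bigram 京都 (Kyoto), so a search for Kyoto can hit documents about Tokyo.

```python
def cjk_bigrams(text):
    """Split a run of CJK characters into overlapping two-character tokens.

    Simplified illustration only (assumption): real tokenizers also handle
    mixed scripts, punctuation, and single-character runs differently.
    """
    if len(text) < 2:
        return [text] if text else []
    return [text[i:i + 2] for i in range(len(text) - 1)]


# Indexed text: "Tokyo-to" (Tokyo Metropolis)
doc_tokens = cjk_bigrams("東京都")

# Query: "Kyoto"
query_tokens = cjk_bigrams("京都")

# The query matches if every query token appears among the document tokens.
false_match = all(token in doc_tokens for token in query_tokens)

print(doc_tokens)
print(query_tokens)
print(false_match)
```

A dictionary-based (morphological) tokenizer would be expected to segment 東京都 into words rather than fixed-width pairs, which is the usual way to avoid this kind of overlap.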

I did not see many questions about this module, so I wonder whether people
are actively using it.
If not, are there other solutions on the market that Solr users
recommend?

Thanks
Kumar
