(sorry, tangent. I'll be quick)

On Tue, Aug 4, 2009 at 8:42 AM, Shai Erera<ser...@gmail.com> wrote:
> Interesting ... I don't have access to a Japanese dictionary, so I just
> extract bi-grams.

Shai - if you're interested in parsing Japanese, check out Kakasi. It
can split text into words and convert Kanji to Katakana, Hiragana, or
Romaji; I would then index all of those forms.
http://kakasi.namazu.org/
http://www.kawao.com/java/kakasi/api/com/kawao/kakasi/Kakasi.html
http://kakasi.namazu.org/stable/kakasi-2.3.4.tar.gz <-- contains a GPL
Japanese dictionary
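
For comparison, the dictionary-free bi-gram approach from the quote can
be sketched in plain Java. This is only an illustration: the class and
method names (Bigrams, of) are hypothetical and are not part of Lucene
or Kakasi, and it assumes the input has already been stripped of
whitespace and punctuation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Lucene or Kakasi class.
public class Bigrams {

    // Returns the overlapping two-character tokens of the input.
    // Iterates over code points so surrogate pairs are never split.
    public static List<String> of(String text) {
        int[] codePoints = text.codePoints().toArray();
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i + 1 < codePoints.length; i++) {
            tokens.add(new String(codePoints, i, 2));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints the bi-grams of a four-character Kanji string.
        System.out.println(of("東京都庁"));
    }
}
```

Bi-grams need no dictionary but produce tokens that cross word
boundaries, which is the trade-off against a dictionary-based splitter
such as Kakasi.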

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
