(sorry, tangent. I'll be quick)

On Tue, Aug 4, 2009 at 8:42 AM, Shai Erera<[email protected]> wrote:
> Interesting ... I don't have access to a Japanese dictionary, so I just
> extract bi-grams.
Shai - if you're interested in parsing Japanese, check out Kakasi. It can split text into words and convert Kanji -> Katakana/Hiragana/Romaji, after which I would index all of the forms.

http://kakasi.namazu.org/
http://www.kawao.com/java/kakasi/api/com/kawao/kakasi/Kakasi.html
http://kakasi.namazu.org/stable/kakasi-2.3.4.tar.gz <-- contains GPL Japanese dictionary
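
For comparison, the dictionary-free bi-gram extraction mentioned in the quoted message can be sketched roughly as below. This is only an illustration under my own assumptions: the function name `char_bigrams` is hypothetical, it is not Shai's actual code, and it does not use Kakasi at all. It simply emits every overlapping pair of adjacent characters, ignoring whitespace.

```python
def char_bigrams(text):
    """Return overlapping character bi-grams of text.

    Whitespace is dropped first, so bi-grams may span what were
    separate space-delimited tokens (a simplification for this sketch).
    """
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        # A single character cannot form a pair; return it as-is.
        return ["".join(chars)] if chars else []
    return [chars[i] + chars[i + 1] for i in range(len(chars) - 1)]


if __name__ == "__main__":
    print(char_bigrams("日本語の文"))
```

A word splitter such as Kakasi would instead produce dictionary-based tokens, which is why indexing its output alongside (or instead of) bi-grams may give better matches.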
