Hi Eirik,

I believe the ICU tokenizer does a decent job on text written in non-alphabetic scripts.

Ahmet
On Monday, May 22, 2017, 10:32:22 AM GMT+3, Eirik Hungnes <[email protected]> wrote:

Hi,

There doesn't seem to be any Tokenizer / Analyzer for Vietnamese built into Lucene at the moment. Does anyone know whether something like this exists today or is planned?

We found https://github.com/CaoManhDat/VNAnalyzer, made by Cao Manh Dat, but we're not sure if it's up to date.

Any info is highly appreciated!

Thanks,
Eirik
