On Wed, Jan 9, 2013 at 5:25 PM, Steve Rowe <sar...@gmail.com> wrote:
> Dude. Go look. It allows for per-script specialization, with (non-UAX#29)
> specializations by default for Thai, Lao, Myanmar and Hebrew. See
> DefaultICUTokenizerConfig. It's filled with exactly the opposite of what you
> were describing.

I guess that's a reasonable start. Still has no specialisation for straight Roman script, but I guess it could always be added. TX