That's something we can try.   I don't know how much performance we'd lose
doing that, since our custom filter has to decompose the tokens to do its
operations.   So instead of 0..1 conversions we'd be doing 1..2 conversions
during indexing and searching.
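
For reference, a minimal sketch of what that extra composition step could
look like as its own filter. This is written against the attribute-based
TokenFilter API of later Lucene releases (the 2.4 Token API differs); the
NFCFilter name is hypothetical, and the isNormalized check is an assumed
fast path so tokens that are already composed cost almost nothing:

    import java.io.IOException;
    import java.text.Normalizer;
    import org.apache.lucene.analysis.TokenFilter;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

    // Hypothetical filter: rewrites each token into NFC (composed) form.
    public final class NFCFilter extends TokenFilter {
      private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);

      public NFCFilter(TokenStream in) {
        super(in);
      }

      @Override
      public boolean incrementToken() throws IOException {
        if (!input.incrementToken()) {
          return false;
        }
        String term = termAtt.toString();
        // Fast path: most tokens are already NFC, so skip the copy for them.
        if (!Normalizer.isNormalized(term, Normalizer.Form.NFC)) {
          termAtt.setEmpty().append(Normalizer.normalize(term, Normalizer.Form.NFC));
        }
        return true;
      }
    }

The same filter would have to run in both the indexing and the query
analyzer so that both sides agree on the composed form.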

-----Original Message-----
From: Robert Muir [mailto:rcm...@gmail.com] 
Sent: Saturday, February 21, 2009 8:35 AM
To: java-user@lucene.apache.org
Subject: Re: 2.3.2 -> 2.4.0 StandardTokenizer issue

Normalize your text to NFC. Then it will be \u0043 \u00F3 \u006D \u006F and
will work...
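
(Those four code points spell "Cómo" with a precomposed ó.) A quick
standalone check of that advice with java.text.Normalizer, assuming the
troublesome input carries the decomposed form, i.e. \u006F followed by the
combining acute accent \u0301:

    import java.text.Normalizer;

    public class NFCDemo {
      public static void main(String[] args) {
        // Decomposed input: C, o, combining acute accent, m, o
        String decomposed = "Co\u0301mo";
        String composed = Normalizer.normalize(decomposed, Normalizer.Form.NFC);
        // NFC collapses o + \u0301 into the precomposed \u00F3,
        // yielding \u0043 \u00F3 \u006D \u006F as described above.
        System.out.println(composed.equals("C\u00F3mo")); // true
      }
    }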

