We have a multilingual index and we need to match accented characters against their unaccented equivalents. For example, if a document contains "mângão", the query "mangao" should match it. I guess I would have to build some sort of analyzer/tokenizer for this. Are there any tokenizers already built for Lucene that do this?
Search the archives for a discussion about this, back in June I believe. I'd suggested using ICU to generate sort keys, and indexing those.
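As a rough illustration of the sort-key idea, here is a minimal sketch. It assumes the JDK's built-in java.text.Collator as a stand-in for ICU (ICU4J offers a similar Collator/CollationKey API, which is not shown here), and the class and method names are made up for the example. At primary strength the collator ignores accent and case differences, so "mângão" and "mangao" should produce the same key bytes; you would then index and query on those keys rather than on the raw text.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper, not part of Lucene or ICU.
final class SortKeyFolding {

    // Builds a primary-strength collation key for the given text.
    // Assumption: the JDK collator is used here in place of ICU.
    static byte[] primaryKey(String text) {
        Collator collator = Collator.getInstance(Locale.ROOT);
        // PRIMARY strength compares base letters only (accents and case are ignored).
        collator.setStrength(Collator.PRIMARY);
        // Decompose precomposed characters so combining accents are handled consistently.
        collator.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
        return collator.getCollationKey(text).toByteArray();
    }

    // True when both strings map to the same primary-strength key.
    static boolean sameKey(String a, String b) {
        return Arrays.equals(primaryKey(a), primaryKey(b));
    }

    public static void main(String[] args) {
        // "m\u00e2ng\u00e3o" is "mângão" written with Unicode escapes,
        // so the source file compiles regardless of its encoding.
        String indexed = "m\u00e2ng\u00e3o";
        String query = "mangao";
        System.out.println("keys match: " + sameKey(indexed, query));
    }
}
```

In an index, the key bytes would need to be encoded into a term (hex, for instance) and the same transformation applied to query terms. The keys are opaque, so this suits matching and sorting, not display.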
-- Ken
-- 
Ken Krugler
TransPac Software, Inc.
<http://www.transpac.com>
+1 530-470-9200
