Re: Use of Unicode data in Lucene

2009-03-05 Thread Chris Hostetter
: I can implement the functionality just using the data tables from the Unicode
: Consortium, including http://www.unicode.org/reports/tr39, but there's still
: the issue of the Unicode data license and its compatibility with Apache 2.0.
:
: Does anybody know whether http://www.unicode.org/copyri

Re: Use of Unicode data in Lucene

2009-02-25 Thread Robert Muir
Ken, Just my opinion here... I work with a lot of multilingual data with Lucene. I can't imagine many serious real-world applications doing things such as search that wouldn't need ICU for something anyway... even if it's not the Lucene piece requiring it... I hope this doesn't discourage you from

Use of Unicode data in Lucene

2009-02-25 Thread Ken Krugler
Hi all, I've started working on something similar to https://issues.apache.org/jira/browse/LUCENE-1343, which is about creating a better (more universal) normalizer for words that "look the same". I'd like to avoid the dependency on ICU4J, which (I think) would otherwise prevent the code fr