Use of Unicode data in Lucene

Ken Krugler Wed, 25 Feb 2009 12:23:41 -0800

Hi all,

I've started working on something similar to https://issues.apache.org/jira/browse/LUCENE-1343, which is about creating a better (more universal) normalizer for words that "look the same".

I'd like to avoid the dependency on ICU4J, which (I think) would otherwise prevent the code from being part of the core - due to license issues, it would have to languish in contrib.

I can implement the functionality just using the data tables from the Unicode Consortium, including http://www.unicode.org/reports/tr39, but there's still the issue of the Unicode data license and its compatibility with Apache 2.0.

Does anybody know whether http://www.unicode.org/copyright.html creates an issue? What's the process for vetting a license? Or is this something I should be posting to a different list?


Thanks,

-- Ken
--
Ken Krugler
+1 530-210-6378

