Hi all,
I've started working on something similar to
https://issues.apache.org/jira/browse/LUCENE-1343, which is about
creating a better (more universal) normalizer for words that "look
the same".
I'd like to avoid a dependency on ICU4J, which (I think) would
otherwise keep the code out of the core: because of license issues,
it would have to languish in contrib.
I can implement the functionality using just the data tables from the
Unicode Consortium, including those for
http://www.unicode.org/reports/tr39, but that still leaves the
question of whether the Unicode data license is compatible with
Apache 2.0.
Does anybody know whether the terms at
http://www.unicode.org/copyright.html create a problem? What's the
process for vetting a license? Or is this something I should be
posting to a different list?
Thanks,
-- Ken
--
Ken Krugler
+1 530-210-6378