I posted about this subject earlier too (in OpenSolaris we added Unicode
tolower() and toupper() functions to SQLite 2.x).  But the fact that
SQLite 3.x already supports ICU dissuaded me from pursuing the matter
further.  OpenSolaris has a light-weight Unicode API (licensed under the
CDDL), much like the one you wrote, capable of doing normalization and
case conversions, as well as case- and even normalization-insensitive
string comparison.
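
To make "case- and normalization-insensitive comparison" concrete, here
is a minimal sketch in Python.  It is not the OpenSolaris API; it uses
only Python's standard unicodedata module, and the helper names
caseless_key() and caseless_equal() are made up for illustration.

```python
import unicodedata


def caseless_key(s: str) -> str:
    """Build a comparison key that ignores case and normalization form.

    Decompose (NFD), apply full case folding, then decompose again,
    because case folding can produce text that is no longer normalized.
    """
    folded = unicodedata.normalize("NFD", s).casefold()
    return unicodedata.normalize("NFD", folded)


def caseless_equal(a: str, b: str) -> bool:
    # Two strings match if their keys are identical.
    return caseless_key(a) == caseless_key(b)


# Precomposed capital E-acute vs. lowercase "e" plus a combining acute.
same = caseless_equal("\u00c9cole", "e\u0301COLE")

# The accent itself is still significant: plain "ecole" does not match.
different = caseless_equal("ecole", "\u00e9cole")
```

Note that accents are preserved here; only case and the choice between
precomposed and decomposed forms are ignored.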

It hadn't occurred to me that ICU might be so large as to make use of
alternative libraries interesting.  When we port the relevant app in
OpenSolaris to use SQLite 3.x I'll look again at contributing patches
that add support for the OpenSolaris Unicode APIs.

I hadn't considered something like an "unaccented collation" -- it
sounds tricky.  Which modifiers should be dropped, which should be kept?
That can depend on what language the text is written in, and how much
lossiness or what false positive rates you're willing to accept.  I
recommend that you try normalization-insensitive collation before
resorting to an "unaccented collation."
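
A rough sketch of the difference, assuming Python's sqlite3 and
unicodedata modules rather than any SQLite-internal or OpenSolaris
code.  The collation name NFC_INSENSITIVE and the helper
unaccented_key() are hypothetical, and dropping every nonspacing mark
is only one possible (and crude) definition of "unaccented."

```python
import sqlite3
import unicodedata


def nfc_collation(a: str, b: str) -> int:
    """Compare two strings after normalizing both to NFC.

    Returns a negative, zero, or positive integer, as SQLite expects.
    """
    ka = unicodedata.normalize("NFC", a)
    kb = unicodedata.normalize("NFC", b)
    return (ka > kb) - (ka < kb)


def unaccented_key(s: str) -> str:
    """Crude accent stripping: decompose, then drop nonspacing marks (Mn).

    This is lossy and language-blind, which is the concern raised above.
    """
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")


conn = sqlite3.connect(":memory:")
conn.create_collation("NFC_INSENSITIVE", nfc_collation)
conn.execute("CREATE TABLE words (w TEXT)")
conn.executemany(
    "INSERT INTO words VALUES (?)",
    [("caf\u00e9",), ("cafe\u0301",), ("cafe",)],  # NFC, NFD, no accent
)

# Matches both encodings of the accented word, but not plain "cafe".
matches = conn.execute(
    "SELECT count(*) FROM words WHERE w = ? COLLATE NFC_INSENSITIVE",
    ("caf\u00e9",),
).fetchone()[0]

# Accent stripping conflates distinct words, e.g. German "schoen"
# (written with o-umlaut) and "schon".
collision = unaccented_key("sch\u00f6n") == unaccented_key("schon")
```

The normalization-insensitive collation produces no false positives:
it only unifies different encodings of the same text.  The unaccented
key trades that precision for recall, and how acceptable the trade is
depends on the language.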

Nico