> Anyway, I would definitely Unicode-normalize the strings before putting
> them into the database. You might avoid the special handling for the
> digraphs if you normalize towards the digraph code points: only strings
> actually containing digraphs would escape your optimization.
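
A minimal sketch of the normalize-before-insert idea quoted above, assuming Python's standard sqlite3 and unicodedata modules. The table name and the helper names are hypothetical. It uses NFC only; folding letter pairs into digraph code points would be an extra, custom mapping step that no standard normalization form performs (NFKC in fact goes the other way and splits the digraph code points), so it is left out here.

```python
import sqlite3
import unicodedata


def normalize_for_storage(text: str) -> str:
    """Return the NFC form so canonically equivalent strings are stored identically."""
    return unicodedata.normalize("NFC", text)


def insert_name(conn: sqlite3.Connection, name: str) -> None:
    """Normalize at the boundary, before the string reaches the database."""
    conn.execute("INSERT INTO names (name) VALUES (?)", (normalize_for_storage(name),))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (name TEXT)")  # hypothetical table

# The same name twice: once as 'C' + combining caron, once precomposed.
insert_name(conn, "C\u030Capek")
insert_name(conn, "\u010Capek")

# Both rows now hold the same code point sequence.
distinct_names = [row[0] for row in conn.execute("SELECT DISTINCT name FROM names")]
```

With the data normalized once on the way in, a collation function can assume a single representation per character instead of checking for combining sequences on every comparison.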

Tough stuff. Although I have tried to learn something about Unicode, I am no
expert.

Anyway, the idea is to handle what can be handled safely and fast, and let
the OS do the difficult things. We are at the mercy of the OS here, but I
believe the OS routines correctly handle 100% of common strings and
99.9999% of less frequent strings. Correct me if I am wrong, please.
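
A rough sketch of that split, assuming Python's sqlite3 and locale modules: pure-ASCII pairs take a cheap case-insensitive comparison, everything else is passed to the OS through locale.strcoll. The collation name FASTPATH and the function name are made up for illustration, and this is not the actual implementation discussed in this thread.

```python
import locale
import sqlite3


def fast_path_collation(a: str, b: str) -> int:
    """Compare two strings, returning -1, 0 or 1 as SQLite collations expect."""
    if a.isascii() and b.isascii():
        # Fast path: plain ASCII, case-insensitive code point comparison.
        la, lb = a.lower(), b.lower()
        return (la > lb) - (la < lb)
    # Slow path: let the OS collation (current locale) do the difficult part.
    result = locale.strcoll(a, b)
    return (result > 0) - (result < 0)


conn = sqlite3.connect(":memory:")
conn.create_collation("FASTPATH", fast_path_collation)  # hypothetical name
conn.execute("CREATE TABLE fruit (name TEXT)")
conn.executemany(
    "INSERT INTO fruit (name) VALUES (?)",
    [("banana",), ("Apple",), ("cherry",)],
)
ordered = [
    row[0]
    for row in conn.execute("SELECT name FROM fruit ORDER BY name COLLATE FASTPATH")
]
```

One caveat with any such split: the fast path has to order its strings the same way the OS routine would, otherwise the combined ordering can become inconsistent across the two paths.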

In fact, our Android version uses the ICU (International Components for
Unicode) library, but we want to avoid it in general (performance, size,
maintenance).

As far as Unicode testing is concerned, I'll start a new post.

Regards,
    Jan Slodicka


