> Anyway, I would definitely Unicode-normalize the strings before putting
> them into the database. You might avoid the special handling for the
> digraphs if you normalize towards the digraph code points: only strings
> actually containing digraphs would escape your optimization.
Tough stuff. Although I have tried to learn something about Unicode, I am no
expert.
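As an illustration only, here is a minimal sketch of the quoted suggestion
(normalize before storing), written in Python with the standard `sqlite3` and
`unicodedata` modules. The table name `names` and the helper `normalize_text`
are made up for the example. It uses standard NFC normalization; NFC leaves
the digraph code points (such as U+01C6) untouched, so normalizing *towards*
them would need an additional custom mapping that is not shown here.

```python
import sqlite3
import unicodedata


def normalize_text(s: str) -> str:
    # NFC composes a base letter plus combining mark into a single
    # precomposed code point wherever Unicode defines one.
    return unicodedata.normalize("NFC", s)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (name TEXT)")  # hypothetical table

decomposed = "Z\u030Celjko"  # 'Z' followed by U+030C COMBINING CARON
composed = "\u017Deljko"     # precomposed U+017D ('Z' with caron)

# Both spellings are normalized before they reach the database,
# so they are stored as identical strings.
for raw in (decomposed, composed):
    conn.execute("INSERT INTO names (name) VALUES (?)", (normalize_text(raw),))

distinct_count = conn.execute(
    "SELECT COUNT(DISTINCT name) FROM names"
).fetchone()[0]

# NFC does not touch the digraph code point U+01C6; the compatibility
# form NFKC would split it into 'd' + U+017E instead.
digraph_nfc = unicodedata.normalize("NFC", "\u01C6")
digraph_nfkc = unicodedata.normalize("NFKC", "\u01C6")
```

With normalization done at insert time, a comparison routine can assume one
canonical spelling per string instead of handling both forms.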
Anyway, the idea is to handle ourselves what can be handled safely and fast,
and let the OS do the difficult things. We are at the mercy of the OS here,
but I believe it handles 100% of common strings and 99.9999% of less
frequent strings correctly. Please correct me if I am wrong.
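The idea above can be sketched roughly as follows. This is a simplified
Python illustration, not the actual implementation: the collation name
`fastpath` and the function `fast_collate` are invented for the example, the
"safe and fast" case is assumed to be pure-ASCII strings compared
case-insensitively, and `locale.strcoll` stands in for whatever comparison
function the OS provides.

```python
import locale
import sqlite3

# Ask the C library to use the user's locale for collation. If the
# environment names an unsupported locale, keep the default "C" locale.
try:
    locale.setlocale(locale.LC_COLLATE, "")
except locale.Error:
    pass


def fast_collate(a: str, b: str) -> int:
    """Compare two strings; return a negative, zero, or positive int."""
    if a.isascii() and b.isascii():
        # Fast path (assumption: ASCII-only strings are the safe case):
        # plain case-insensitive comparison, no OS call needed.
        x, y = a.lower(), b.lower()
        return (x > y) - (x < y)
    # Slow path: delegate everything else to the OS/C library.
    result = locale.strcoll(a, b)
    return (result > 0) - (result < 0)


conn = sqlite3.connect(":memory:")
conn.create_collation("fastpath", fast_collate)
conn.execute("CREATE TABLE t (name TEXT)")
conn.executemany(
    "INSERT INTO t (name) VALUES (?)",
    [("banana",), ("Apple",), ("cherry",)],
)
ordered = [
    row[0]
    for row in conn.execute("SELECT name FROM t ORDER BY name COLLATE fastpath")
]
```

One caveat with this kind of split: the fast path and the OS path must agree
on the relative order of the strings they both could see, otherwise the
combined ordering is not consistent and indexes built on it may misbehave.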
In fact, our Android version uses the ICU (International Components for
Unicode) library, but we want to avoid it in general (performance, size,
maintenance).
As far as Unicode testing is concerned, I'll start a new post.
Regards,
Jan Slodicka