On Mon, Apr 2, 2012 at 5:46 PM, Simon Slavin <[email protected]> wrote:
> Replace part of that routine with something which specifies the locale rather
> than fetching it from OS settings. And store the locale used with the index,
> as a COLLATE setting. Thus leaving it up to whoever writes the CREATE
> command to decide which locale was used. I find that acceptable. This does
> still give you the problem Jean-Christophe noted of sorting multilanguage
> lists of names, but that's inherent in Unicode. Encountering the problem
> just means you're implementing Unicode properly.

If only it were that easy.

A plain C locale (i.e., byte-wise) collation will result in "encountering the problem", but you won't be "implementing Unicode properly"; you won't be implementing it at all! Even if you use some Unicode collation, if you don't handle normalization insensitivity then you're not really doing it right either.

Consider that HFS+ on Mac OS X always normalizes to NFD on file/directory create, but all user input methods I've seen to date produce NFC for all Latin-* characters! This means that if someone does a cut-n-paste of filenames from an HFS+ filesystem then there will be a very difficult-to-detect conflict: the pasted (NFD) name and the same name typed by hand (NFC) look identical on screen but compare as different strings.

Unicode is hard. I want a doll that says that.

Nico
--
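
To make the NFC/NFD mismatch described above concrete, here is a minimal sketch in Python using the standard unicodedata module. The filename and the same_name() helper are made up for illustration, and the plain string comparison only stands in for a byte-wise (C locale) comparison; nothing here claims to show how HFS+ or SQLite compare strings internally.

```python
import unicodedata

# Hypothetical filename, chosen only for illustration.
name = "r\u00e9sum\u00e9.txt"

# What a typical input method produces: precomposed characters (NFC).
typed = unicodedata.normalize("NFC", name)

# What a filesystem that decomposes names on create would hand back (NFD):
# each "e with acute" becomes "e" followed by U+0301 COMBINING ACUTE ACCENT.
stored = unicodedata.normalize("NFD", name)


def same_name(a: str, b: str) -> bool:
    """Normalization-insensitive equality: bring both sides to one form first."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Both strings render identically, yet a naive comparison says they differ.
print(typed == stored)                                   # False
print(typed.encode("utf-8") == stored.encode("utf-8"))   # False
print(len(typed), len(stored))                           # 10 12
print(same_name(typed, stored))                          # True
```

A collation that wants to treat the two spellings as equal would have to normalize both operands (or use a comparison that is defined to be normalization-insensitive) before comparing; a byte-wise comparison cannot do that.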

