On 2 Apr 2012, at 10:44pm, Igor Tandetnik <itandet...@mvps.org> wrote:

> On 4/2/2012 5:34 PM, Simon Slavin wrote:
>> Hmm.  It's there:
>> 
>> <http://www.gnu.org/software/libunistring/manual/libunistring.html#unistr_002eh>
>> 
>> "The following function compares two Unicode strings of possibly different 
>> lengths.
>> 
>> -- Function: int u8_cmp2 (const uint8_t *s1, size_t n1, const uint8_t *s2, size_t n2)
>> -- Function: int u16_cmp2 (const uint16_t *s1, size_t n1, const uint16_t *s2, size_t n2)
>> -- Function: int u32_cmp2 (const uint32_t *s1, size_t n1, const uint32_t *s2, size_t n2)
>> Compares s1 and s2, lexicographically. Returns a negative value if s1 
>> compares smaller than s2, a positive value if s1 compares larger than s2, or 
>> 0 if they compare equal."
>> 
>> I wonder whether it respects languages.
> 
> These don't,

You know, I don't care that much.  Unicode sorting even without languages would 
be a nice plugin for SQLite3, if that makes things so much simpler.
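
A minimal sketch of what such a language-agnostic plugin could look like,
assuming Python's standard sqlite3 module as the host rather than a C
extension built on libunistring. The collation name CODEPOINT and the table
are made up for illustration; the comparator simply orders strings by code
point, with no language rules at all.

```python
import sqlite3

def codepoint_compare(a, b):
    # Python compares str values code point by code point, so this is a
    # purely lexicographic comparison. Returns -1, 0 or 1.
    return (a > b) - (a < b)

conn = sqlite3.connect(":memory:")
# "CODEPOINT" is a hypothetical collation name chosen for this sketch.
conn.create_collation("CODEPOINT", codepoint_compare)

conn.execute("CREATE TABLE words (w TEXT)")
conn.executemany(
    "INSERT INTO words VALUES (?)",
    [("\u00e9clair",), ("zebra",), ("Apple",), ("apple",)],
)

ordered = [row[0] for row in
           conn.execute("SELECT w FROM words ORDER BY w COLLATE CODEPOINT")]
print(ordered)
```

Note how U+00E9 sorts after "z" and "Apple" before "apple": that is the price
of ignoring languages.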

> but u8_strcoll et al supposedly do, based on LC_COLLATE locale category. 
> Herein lies the problem: if you build an index using these functions while 
> running under locale A, then try to run queries against this database in an 
> application running with locale B, bad things happen. From the point of view 
> of the second application, the index is corrupted.
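
The failure mode described above can be sketched without SQLite at all. In
this toy model the "index" is just a sorted list, and the two sort keys are
hypothetical stand-ins for locales A and B (they are not real locale data).
A lookup that assumes the wrong ordering misses a row that is plainly there.

```python
# Hypothetical stand-ins for two locales' sort keys (not real locale data).
def key_locale_a(s):
    return s              # plain code point order

def key_locale_b(s):
    return s.casefold()   # case-insensitive order

def index_lookup(index, value, key):
    """Binary search that trusts `index` to be sorted according to `key`."""
    lo, hi = 0, len(index)
    target = key(value)
    while lo < hi:
        mid = (lo + hi) // 2
        probe = key(index[mid])
        if probe == target:
            return mid
        if probe < target:
            lo = mid + 1
        else:
            hi = mid
    return None

words = ["apple", "Banana", "cherry", "Date", "eggplant"]

# The first application builds the index while "running under locale A".
index = sorted(words, key=key_locale_a)

# The same lookup, first with the matching ordering, then with locale B's.
found_by_a = index_lookup(index, "Banana", key_locale_a)
found_by_b = index_lookup(index, "Banana", key_locale_b)
print(found_by_a, found_by_b)  # prints "0 None"
```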

Replace part of that routine with something that specifies the locale
explicitly rather than fetching it from OS settings, and store the locale used
with the index, as a COLLATE setting.  That leaves it up to whoever writes the
CREATE command to decide which locale is used.  I find that acceptable.  This
does still give you the problem Jean-Christophe noted of sorting multilanguage
lists of names, but that's inherent in Unicode.  Encountering the problem just
means you're implementing Unicode properly.
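
A rough sketch of that idea, again assuming Python's standard sqlite3 module.
The collation names DEMO_CODEPOINT and DEMO_CASEFOLD and their key functions
are hypothetical stand-ins; a real version would call a collation library
with an explicit locale argument.  The point is only that the ordering is
chosen by name in the CREATE statement, not by the process's environment, and
that the name is recorded in the schema alongside the index.

```python
import sqlite3

# Hypothetical stand-ins for locale-specific sort keys. Each name plays the
# role of "the locale used"; none of them reads OS locale settings.
SORT_KEYS = {
    "DEMO_CODEPOINT": lambda s: s,            # plain code point order
    "DEMO_CASEFOLD": lambda s: s.casefold(),  # case-insensitive order
}

def register_collations(conn):
    """Register one SQLite collation per named ordering."""
    for name, key in SORT_KEYS.items():
        def compare(a, b, key=key):  # bind this iteration's key function
            ka, kb = key(a), key(b)
            return (ka > kb) - (ka < kb)
        conn.create_collation(name, compare)

conn = sqlite3.connect(":memory:")
register_collations(conn)

conn.execute("CREATE TABLE people (name TEXT)")
# Whoever writes the CREATE command picks the ordering explicitly.
conn.execute(
    "CREATE INDEX people_by_name ON people (name COLLATE DEMO_CASEFOLD)"
)
conn.executemany(
    "INSERT INTO people VALUES (?)",
    [("cherry",), ("Banana",), ("apple",)],
)

rows = [r[0] for r in conn.execute(
    "SELECT name FROM people ORDER BY name COLLATE DEMO_CASEFOLD")]

# The chosen collation name is stored with the index definition.
index_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'people_by_name'"
).fetchone()[0]
print(rows)
print(index_sql)
```

Any application opening the database later has to register a collation under
the same name with the same behaviour, which is exactly the responsibility
being handed to whoever wrote the CREATE command.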

Simon.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
