Constantine Yannakopoulos wrote
> On Mon, May 12, 2014 at 4:46 PM, Jan Slodicka <

> jano@

> > wrote:
> I understand that it is difficult to find the least greater character of a
> given character if you are unaware of the inner workings of a collation,
> but maybe finding a consistent upper limit for all characters in all
> possible collations is not impossible?
> 
> -- Constantine
> _______________________________________________
> sqlite-users mailing list

> sqlite-users@

> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

I could be wrong when I refused this idea. U+10FFFD (represented as
surrogate pair "\uDBFF\uDFFD") may be that character. So one could replace
"LIKE 'xxx%'" by "BETWEEN('xxx', 'xxx' + '\uDBFF\uDFFD').

U+10FFFD is the highest codepoint from the Unicode Private Use Area (PUA).
Unicode standards do not specify anything for this area. So vendors and/or
apps (MS Word for example) can define here their own characters incl. their
properties (font, sorting etc.). I saw somewhere that somebody placed here
Japanese Kanji characters, for example.

msdn statement for CompareString: "All characters in the PUA are sorted
after all other Unicode characters. Within the area, characters are sorted
in numerical order." Hence Windows is OK.

I could not find any such notion for UCI library, but when I tested their
interactive demo, character U+10FFFD was really sorted at the end. (Even
with Japanese, Hindi etc.) That might mean that at least the standard setup
of the UCI library sorts U+10FFFD in the expected way. Of course, this is no
proof - UCI library can be customized.

After all, it seems that U+10FFFD is a good choice, at least for a vast
majority of cases.

BTW, my tests showed that the speed improvement of the LIKE optimization is
in the range 10x-100x.

Any comments are welcome.



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/LIKE-operator-and-collations-tp73789p75643.html
Sent from the SQLite mailing list archive at Nabble.com.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to