On 19 Mar 2014, at 3:36pm, Alex Loukissas <a...@maginatics.com> wrote:
> Thanks everyone for your comments. IIUC, the correct way of going about
> what I want to do is to use BINARY collation on the column I'm interested
> in and when I want to do unicode-aware case-insensitive lookups, they
> should look something like SELECT * FROM table WHERE LOWER(col_name) =
> LOWER(key), correct? It seems like with ICU support, LOWER( ) will call
> u_foldCase under the covers, which is what I want.

This solution suggests that a good compromise for handling Unicode is a hashing function. It would be equivalent to NOCASE, but for Unicode characters, and instead of just removing case it would also remove accents and various other 'hints'.

How it would handle the various Unicode characters which have no equivalent in any alphabet, I have no idea. It would be reasonable for all 'Right Arrow' characters to have the same hash, but how much data about Unicode would it take to do that? Maybe it should just leave all non-alphabetic characters as they are.

Simon.
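
For illustration only, here is a minimal sketch of the kind of folding function discussed in this message, written in Python against the standard sqlite3 and unicodedata modules. The names fold_key, FOLD and the items table are hypothetical, and the approach (NFD decomposition, dropping combining marks, then casefold) is an assumption standing in for whatever ICU's u_foldCase would do; it is not what SQLite or ICU actually implement. Symbols with no decomposition, such as a plain right arrow, pass through unchanged, though a few symbols that do decompose (for example negated operators) would lose their overlay mark.

```python
import sqlite3
import unicodedata


def fold_key(text):
    """Hypothetical folding function: remove case and combining accents."""
    if text is None:
        return None
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks; every other character is kept as it is.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() is Python's Unicode-aware, more aggressive form of lower().
    return stripped.casefold()


# Register the function with SQLite under the (assumed) name FOLD.
conn = sqlite3.connect(":memory:")
conn.create_function("FOLD", 1, fold_key)

conn.execute("CREATE TABLE items (name TEXT)")
conn.executemany("INSERT INTO items VALUES (?)", [("Éclair",), ("→",)])

# The column keeps its default BINARY collation; folding happens at query time.
rows = conn.execute(
    "SELECT name FROM items WHERE FOLD(name) = FOLD(?)", ("ECLAIR",)
).fetchall()
```

Applying the function to both sides of the comparison mirrors the LOWER(col_name) = LOWER(key) pattern quoted above, at the cost of a full scan unless the folded value is stored or indexed separately.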