Re: [sqlite] Unicode searches
> > On Sat, Apr 5, 2008 at 11:42 PM, Cory Nelson <[EMAIL PROTECTED]> wrote:
> > > Sort order is highly dependent on locale. You can add custom
> > > collations to do this.
>
> <[EMAIL PROTECTED]> wrote:
> > That was not what I was talking about. I was not talking about Sort Order but
> > about Searches.
> > Keith

On Sat, Apr 05, 2008 at 02:58:48PM -0700, Cory Nelson scratched on the wall:
> They are one and the same. Look up collations.

I'm a bit confused why everyone keeps pointing Keith at collations. He asked about searching and matching, and specifically about the LIKE operator.

I'm not trying to be dense (I read the collation web page several times), but I seem to be missing something. As I understand it, collations are used to define sort orders. A collation function must define its own version of greater-than, less-than, and equal. All three of those definitions must be transitive (given A=B, B=C, then A=C; given A [...]

[...] http://www.sqlite.org/lang_expr.html), which is the same page he quoted the bug from. This page describes the LIKE, GLOB, REGEXP, and MATCH operators (all of which return a "match/no-match" value). Also see the like(), glob(), regexp(), and match() functions under the Core Functions section of the same page.

If you need full Unicode support for matches and searches, it looks like your only option is to define a custom like() user function that implements the search and matching behavior you're looking for. If you need full Unicode support for sort-ordering, you also need to define a new collation.

Or maybe you can find one that someone else has already written. A copy of the message Miha Vrhovnik mentioned can be found here:

http://www.mail-archive.com/sqlite-users%40sqlite.org/msg30403.html

It seems the .c file mentioned in this post has been updated since the link was first posted. In addition to a new Unicode-aware LIKE operator, the newer .c file also includes a new NOCASE collation function. So you can get Unicode-aware searching/matching *and* sort-ordering. Perhaps that will fit your needs.

-j

--
Jay A. Kreibich < J A Y @ K R E I B I.C H >

"'People who live in bamboo houses should not throw pandas.' Jesus said that."
- "The Ninja", www.AskANinja.com, "Special Delivery 10: Pop!Tech 2006"
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
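As a rough illustration of the "custom like() user function" idea above, here is a minimal sketch using Python's standard sqlite3 module rather than the C API. The function name unicode_like and the regex-based matching are assumptions made for this example; this is not the sqlite3_unicode.c implementation mentioned in the thread. It registers a two-argument like() so that the LIKE operator calls it; SQLite passes the pattern first and the value second.

```python
import re
import sqlite3


def unicode_like(pattern, value):
    """Hypothetical replacement for like(): '%' and '_' wildcards, Unicode case folding."""
    if pattern is None or value is None:
        return None  # LIKE with a NULL operand yields NULL
    parts = []
    for ch in str(pattern).casefold():
        if ch == "%":
            parts.append(".*")   # % matches any run of characters
        elif ch == "_":
            parts.append(".")    # _ matches exactly one character
        else:
            parts.append(re.escape(ch))
    regex = "".join(parts)
    return 1 if re.fullmatch(regex, str(value).casefold(), re.DOTALL) else 0


conn = sqlite3.connect(":memory:")
# "X LIKE Y" is evaluated as like(Y, X), so the pattern arrives first.
conn.create_function("like", 2, unicode_like)

ascii_match = conn.execute("SELECT 'a' LIKE 'A'").fetchone()[0]
ae_match = conn.execute("SELECT 'æ' LIKE 'Æ'").fetchone()[0]
prefix_match = conn.execute("SELECT 'Ærøskøbing' LIKE 'ær%'").fetchone()[0]
no_match = conn.execute("SELECT 'æ' LIKE 'ø'").fetchone()[0]
print(ascii_match, ae_match, prefix_match, no_match)
```

Two caveats on this sketch: it does not cover the three-argument form used by LIKE ... ESCAPE, and case folding can change string length (for example 'ß' folds to 'ss'), which affects how '_' counts characters. Replacing like() this way also means SQLite can no longer use an index to speed up LIKE prefix searches.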
Re: [sqlite] Unicode searches
Cory, sorry, I had a bad day.

Keith
Re: [sqlite] Unicode searches
Someone sent a sqlite3_unicode.c file to this mailing list in the last week of December or the first week of January; it implemented upper/lower and some other functions. If I remember correctly, the file was released as public domain and used data from the Unicode 5.1 standard.

ICU brings a lot of bulk into the mix, whereas that file added, I believe, about 80k to the final library size. It would be nice if it got incorporated into the tree, of course with defines for those who don't write Unicode data to the db.

Regards,
Miha

"Dan" <[EMAIL PROTECTED]> wrote on 07.04.2008 8:18:50:
>
>On Apr 6, 2008, at 5:10 AM, Keith Stemmer wrote:
>
>> Yes, I can add a custom collation which works for ASCII chars LOL.
>> If you don't understand the problem, just don't reply.
>>
>> By the way, you can read on the SQLite website that the developer
>> describes my problem as a BUG which is nice to read. At least he
>> doesn't call it a feature.
>
>The answer you were provided with is correct and canonical in my
>opinion.
>
>The SQLite source archive (.tar.gz, not sure about the preprocessed
>versions) contains source code for an sqlite extension that binds
>the ICU library to SQLite using the custom collation sequence interface
>(and others). With this extension, SQLite uses the upper/lower case
>tables that are part of unicode.
>
>See ext/icu/README in the source distro for details.
>
>Dan.
Re: [sqlite] Unicode searches
On Apr 6, 2008, at 5:10 AM, Keith Stemmer wrote:

> Yes, I can add a custom collation which works for ASCII chars LOL.
> If you don't understand the problem, just don't reply.
>
> By the way, you can read on the SQLite website that the developer
> describes my problem as a BUG which is nice to read. At least he
> doesn't call it a feature.

The answer you were provided with is correct and canonical in my opinion.

The SQLite source archive (.tar.gz, not sure about the preprocessed versions) contains source code for an SQLite extension that binds the ICU library to SQLite using the custom collation sequence interface (and others). With this extension, SQLite uses the upper/lower case tables that are part of Unicode.

See ext/icu/README in the source distro for details.

Dan.
Re: [sqlite] Unicode searches
Keith Stemmer wrote:

> Yes, I can add a custom collation which works for ASCII chars LOL.

Plain wrong.

> If you don't understand the problem, just don't reply.

Plain unreasonable.

Carefully read (and understand) this:

http://sqlite.org/c3ref/create_collation.html

and this:

http://www.wikihow.com/Be-Polite

regards,
gunnnar
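To make the point about create_collation concrete: a custom collation is not limited to ASCII. The following is a minimal sketch using Python's standard sqlite3 module, which wraps the collation interface linked above. The collation name UNICODE_NOCASE, the towns table, and the comparison via str.casefold() are all assumptions made for this illustration. It also runs a LIKE query against the same column, which is worth comparing with the equality result, since a collation governs comparisons and ordering while the built-in LIKE does its own case handling.

```python
import sqlite3


def unicode_nocase(a, b):
    """Hypothetical collation: compare two strings after Unicode case folding."""
    ka, kb = a.casefold(), b.casefold()
    return (ka > kb) - (ka < kb)  # negative, zero, or positive


conn = sqlite3.connect(":memory:")
conn.create_collation("UNICODE_NOCASE", unicode_nocase)
conn.execute("CREATE TABLE towns (name TEXT COLLATE UNICODE_NOCASE)")
conn.executemany(
    "INSERT INTO towns VALUES (?)", [("Ærø",), ("ærø",), ("Odense",)]
)

# Equality uses the column's collation, so both spellings are found.
equal_rows = conn.execute(
    "SELECT count(*) FROM towns WHERE name = 'ÆRØ'"
).fetchone()[0]

# The built-in LIKE does not consult the collation; on a build without
# the ICU extension it is expected to find no rows here.
like_rows = conn.execute(
    "SELECT count(*) FROM towns WHERE name LIKE 'ÆRØ'"
).fetchone()[0]

# ORDER BY also uses the collation (rowid breaks ties between equal names).
ordered = [r[0] for r in conn.execute("SELECT name FROM towns ORDER BY name, rowid")]
print(equal_rows, like_rows, ordered)
```

Note that this sketch orders by code point after case folding, which is not a locale-aware sort order; a real application would want locale rules such as those ICU provides.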
Re: [sqlite] Unicode searches
Yes, I can add a custom collation which works for ASCII chars LOL.
If you don't understand the problem, just don't reply.

By the way, you can read on the SQLite website that the developer describes my problem as a BUG which is nice to read. At least he doesn't call it a feature.

Keith.

> > Sort order is highly dependent on locale. You can add custom
> > collations to do this.

On Sat, Apr 5, 2008 at 11:58 PM, Cory Nelson <[EMAIL PROTECTED]> wrote:

> They are one and the same. Look up collations.
Re: [sqlite] Unicode searches
They are one and the same. Look up collations.

On Sat, Apr 5, 2008 at 2:55 PM, Keith Stemmer <[EMAIL PROTECTED]> wrote:
> That was not what I was talking about. I was not talking about Sort Order but
> about Searches.
> Keith

--
Cory Nelson
http://www.int64.org
Re: [sqlite] Unicode searches
That was not what I was talking about. I was not talking about Sort Order but about Searches.

Keith

On Sat, Apr 5, 2008 at 11:42 PM, Cory Nelson <[EMAIL PROTECTED]> wrote:

> Sort order is highly dependent on locale. You can add custom
> collations to do this.
Re: [sqlite] Unicode searches
Sort order is highly dependent on locale. You can add custom collations to do this.

On Sat, Apr 5, 2008 at 10:41 AM, Keith Stemmer <[EMAIL PROTECTED]> wrote:
> Hello!
>
> I found SQLite quite amazing, but I think there is one showstopper for me.
> It seems that searches for Unicode strings are case sensitive and there is
> no (easy) way around that.
> Could you please confirm or deny this?
>
> Your explanation...
>
> (A bug: SQLite only understands upper/lower case for 7-bit Latin characters.
> Hence the LIKE operator is case sensitive for 8-bit iso8859 characters or
> UTF-8 characters. For example, the expression 'a' LIKE 'A' is TRUE but 'æ'
> LIKE 'Æ' is FALSE.).
>
> seems to destroy all my hopes.
>
> Thank you very much!

--
Cory Nelson
http://www.int64.org
[sqlite] Unicode searches
Hello!

I found SQLite quite amazing, but I think there is one showstopper for me. It seems that searches for Unicode strings are case sensitive and there is no (easy) way around that. Could you please confirm or deny this?

Your explanation...

(A bug: SQLite only understands upper/lower case for 7-bit Latin characters. Hence the LIKE operator is case sensitive for 8-bit iso8859 characters or UTF-8 characters. For example, the expression 'a' LIKE 'A' is TRUE but 'æ' LIKE 'Æ' is FALSE.).

seems to destroy all my hopes.

Thank you very much!
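The behaviour quoted from the SQLite documentation above can be checked directly. This is a small sketch using Python's standard sqlite3 module and an in-memory database; it assumes the underlying SQLite library was built without the ICU extension, in which case the documentation quoted above says the second comparison is false.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Evaluate the two expressions from the documentation note in one query.
ascii_result, non_ascii_result = conn.execute(
    "SELECT 'a' LIKE 'A', 'æ' LIKE 'Æ'"
).fetchone()

print("'a' LIKE 'A' ->", ascii_result)
print("'æ' LIKE 'Æ' ->", non_ascii_result)
```

SQLite has no separate boolean type, so the results come back as the integers 1 (true) and 0 (false).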