Re: [sqlite] bug? like-search with german umlaut is case-sensitive, should not be

2008-11-15 Thread Peter Jacobi
I'm aware that ICU is able to provide a very general solution, but I'm
wondering about two other options:

(1) Just as an OS abstraction layer is in place for I/O, wouldn't it
be possible to use an OS abstraction layer for L10N? So that, for
example, uppercasing is forwarded to LCMapString(LCMAP_UPPERCASE) on
Win32. That would bring the Sqlite behaviour in line with the handling
in the application program itself (provided that it uses OS APIs and
not ICU).

(2) I'm under the impression that the problematic cases (German
sharp-s, Turkic i) are few compared with all the cases where a simple
lookup would make things work. If I'm not mistaken, a lookup table of
2048 entries handling all 2 byte UTF-8 characters would already cover
all the joint character repertoire of all ISO-8859-*  (and their MSFT
counterparts). Thai (in ISO 8859-11) is using three byte UTF-8 but
doesn't have upper/lower case.
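
To make option (2) concrete, here is a minimal sketch of such a table. It is
only an illustration: it fills the 2048 slots (U+0000..U+07FF, everything
UTF-8 encodes in at most two bytes) from Python's built-in Unicode data, and
the names TABLE_SIZE, build_lower_table and simple_lower are made up for this
example. Mappings that are not one-to-one, or that leave the table, are
simply skipped.

```python
# Code points that UTF-8 encodes in one or two bytes: U+0000..U+07FF.
TABLE_SIZE = 0x800  # 2048 slots


def build_lower_table():
    """Return a list mapping each code point below TABLE_SIZE to its lowercase."""
    table = list(range(TABLE_SIZE))  # default: every code point maps to itself
    for cp in range(TABLE_SIZE):
        lowered = chr(cp).lower()
        # Keep only simple one-to-one mappings that stay inside the table.
        # Special cases (for example the dotted capital I, U+0130) are left alone.
        if len(lowered) == 1 and ord(lowered) < TABLE_SIZE:
            table[cp] = ord(lowered)
    return table


LOWER = build_lower_table()


def simple_lower(text):
    """Lowercase text with the table; characters outside it pass through."""
    return "".join(
        chr(LOWER[ord(c)]) if ord(c) < TABLE_SIZE else c for c in text
    )


print(simple_lower("KÖCK"))
```

Characters beyond U+07FF, such as Thai, pass through unchanged, which matches
the observation that they have no upper/lower case anyway.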

Regards,
Peter
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] bug? like-search with german umlaut is case-sensitive, should not be

2008-11-14 Thread Thomas Mittelstaedt
Thanks for that pointer to the icu project. Did not know about that!!

thomas

On Friday, 14.11.2008, at 15:27 +0200, Elefterios
Stamatogiannakis wrote:
> Has anybody successfully compiled sqlite with icu for win32?
> 
> I haven't managed to find a libicu for mingw. Any tips welcome.
> 
> lefteris
> 
> D. Richard Hipp wrote:
> > On Nov 14, 2008, at 8:08 AM, Martin Engelschalk wrote:
> > 
> >> Hi all,
> >>
> >> the ICU project is a very powerful tool to handle codepages, and also
> >> supports regular expressions (using a class named "RegexMatcher", see
> >> http://icu-project.org/apiref/icu4c/classRegexMatcher.html).
> >> So, it should be relatively easy to replace the like() - function in
> >> sqlite (see http://www.sqlite.org/lang_corefunc.html#like and
> >> http://www.sqlite.org/c3ref/create_function.html)
> >>
> > 
> > http://www.sqlite.org/cvstrac/fileview?f=sqlite/ext/icu/README.txt=1.2
> > 
> > D. Richard Hipp
> > [EMAIL PROTECTED]


Re: [sqlite] bug? like-search with german umlaut is case-sensitive, should not be

2008-11-14 Thread Elefterios Stamatogiannakis
Has anybody successfully compiled sqlite with icu for win32?

I haven't managed to find a libicu for mingw. Any tips welcome.

lefteris

D. Richard Hipp wrote:
> On Nov 14, 2008, at 8:08 AM, Martin Engelschalk wrote:
> 
>> Hi all,
>>
>> the ICU project is a very powerful tool to handle codepages, and also
>> supports regular expressions (using a class named "RegexMatcher", see
>> http://icu-project.org/apiref/icu4c/classRegexMatcher.html).
>> So, it should be relatively easy to replace the like() - function in
>> sqlite (see http://www.sqlite.org/lang_corefunc.html#like and
>> http://www.sqlite.org/c3ref/create_function.html)
>>
> 
> http://www.sqlite.org/cvstrac/fileview?f=sqlite/ext/icu/README.txt=1.2
> 
> D. Richard Hipp
> [EMAIL PROTECTED]


Re: [sqlite] bug? like-search with german umlaut is case-sensitive, should not be

2008-11-14 Thread D. Richard Hipp

On Nov 14, 2008, at 8:08 AM, Martin Engelschalk wrote:

> Hi all,
>
> the ICU project is a very powerful tool to handle codepages, and also
> supports regular expressions (using a class named "RegexMatcher", see
> http://icu-project.org/apiref/icu4c/classRegexMatcher.html).
> So, it should be relatively easy to replace the like() - function in
> sqlite (see http://www.sqlite.org/lang_corefunc.html#like and
> http://www.sqlite.org/c3ref/create_function.html)
>

http://www.sqlite.org/cvstrac/fileview?f=sqlite/ext/icu/README.txt=1.2

D. Richard Hipp
[EMAIL PROTECTED]





Re: [sqlite] bug? like-search with german umlaut is case-sensitive, should not be

2008-11-14 Thread Martin Engelschalk
Hi all,

the ICU project is a very powerful tool to handle codepages, and also 
supports regular expressions (using a class named "RegexMatcher", see 
http://icu-project.org/apiref/icu4c/classRegexMatcher.html).
So, it should be relatively easy to replace the like() - function in 
sqlite (see http://www.sqlite.org/lang_corefunc.html#like and 
http://www.sqlite.org/c3ref/create_function.html)
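
As a rough illustration of replacing like() with a user-defined function,
here is a minimal sketch that uses Python's standard sqlite3 module and
Python's own case folding instead of ICU's RegexMatcher. The helper name
unicode_like and the sample row are made up for this example; the table and
column names are taken from the original report. It relies on SQLite calling
like(pattern, value) for the expression "value LIKE pattern".

```python
import re
import sqlite3


def unicode_like(pattern, value):
    """Case-insensitive stand-in for SQLite's two-argument like(pattern, value)."""
    if pattern is None or value is None:
        return None  # NULL in, NULL out
    parts = []
    for ch in str(pattern).casefold():
        if ch == "%":
            parts.append(".*")           # % matches any run of characters
        elif ch == "_":
            parts.append(".")            # _ matches exactly one character
        else:
            parts.append(re.escape(ch))  # everything else matches literally
    regex = "".join(parts)
    matched = re.fullmatch(regex, str(value).casefold(), re.DOTALL)
    return 1 if matched else 0


conn = sqlite3.connect(":memory:")
# Registering a function named "like" with 2 arguments overrides the LIKE operator.
conn.create_function("like", 2, unicode_like)
conn.execute('CREATE TABLE ku2008 ("Empfaenger 1" TEXT)')
conn.execute("INSERT INTO ku2008 VALUES ('Herr KÖCK')")
rows = conn.execute(
    'SELECT * FROM ku2008 WHERE "Empfaenger 1" LIKE ?', ("%köck%",)
).fetchall()
print(rows)
```

Because the comparison uses str.casefold(), 'ß' is folded to 'ss', so '_' no
longer lines up with a single original character in that case. The sketch
also ignores the three-argument form used for LIKE ... ESCAPE.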

Martin

Igor Tandetnik wrote:
> "Thomas Mittelstaedt"
> <[EMAIL PROTECTED]> wrote in
> message news:[EMAIL PROTECTED]
>   
>> Just did a search on my database using
>> SELECT * FROM ku2008 where "Empfaenger 1" like '%köck%';
>>
>> and nothing was found. Doing a SELECT * FROM ku2008 where "Empfaenger
>> 1" like '%kÖck%'; with the capital umlaut did find the record.
>> 
>
> http://sqlite.org/lang_expr.html
>
> "SQLite only understands upper/lower case for 7-bit Latin characters. 
> Hence the LIKE operator is case sensitive for 8-bit iso8859 characters 
> or UTF-8 characters. For example, the expression 'a' LIKE 'A' is TRUE 
> but 'æ' LIKE 'Æ' is FALSE."
>
> Apparently, it's possible to integrate SQLite with ICU 
> (http://icu-project.org/) to support properly localized collation and 
> case folding. I don't know the details, hopefully someone more 
> knowledgeable will chime in.
>
> Igor Tandetnik 

-- 

* Codeswift GmbH *
Traunstr. 30
A-5026 Salzburg-Aigen
Tel: +49 (0) 8662 / 494330
Mob: +49 (0) 171 / 4487687
Fax: +49 (0) 12120 / 204645
[EMAIL PROTECTED]
www.codeswift.com / www.swiftcash.at

Codeswift Professional IT Services GmbH
Firmenbuch-Nr. FN 202820s
UID-Nr. ATU 50576309



Re: [sqlite] bug? like-search with german umlaut is case-sensitive, should not be

2008-11-14 Thread Michael Schlenker
Thomas Mittelstaedt wrote:
> Hello,
> 
> Just did a search on my database using 
> SELECT * FROM ku2008 where "Empfaenger 1" like '%köck%';
> 
> and nothing was found. Doing a SELECT * FROM ku2008 where "Empfaenger 1"
> like '%kÖck%'; with the capital umlaut did find the record. 
> The data is UTF-8! My SQLite version is 3.5.9 on Ubuntu Hardy.
>
Documented bug, see the sqlite expressions documentation page which states:
http://www.sqlite.org/lang_expr.html

(A bug: SQLite only understands upper/lower case for 7-bit Latin characters.
Hence the LIKE operator is case sensitive for 8-bit iso8859 characters or
UTF-8 characters. For example, the expression 'a' LIKE 'A' is TRUE but 'æ'
LIKE 'Æ' is FALSE.).
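
The two expressions from the documentation can be checked directly. This is
a small sketch using Python's standard sqlite3 module; the variable names are
made up for this example. On a build without the ICU extension the second
value should come back as 0, but that depends on how the linked SQLite
library was compiled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Evaluate both comparisons from the documentation in a single query.
ascii_match, latin_match = conn.execute(
    "SELECT 'a' LIKE 'A', 'æ' LIKE 'Æ'"
).fetchone()
print("'a' LIKE 'A' ->", ascii_match)
print("'æ' LIKE 'Æ' ->", latin_match)
```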

But it's hard to fix, as you would need language information for the data to
get the upper/lower thing always correct (just think about the ß -> SS
anomaly in German).
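
The ß -> SS anomaly is easy to see with any Unicode-aware string library. A
short sketch using Python's built-in str methods (the variable names are
made up for this example):

```python
word = "Straße"
upper = word.upper()        # ß has no single-character uppercase here
round_trip = upper.lower()  # lowercasing does not bring the ß back

print(word, "->", upper, "->", round_trip)
print(len(word), len(upper))  # the string grows by one character

# Case folding treats both spellings as equal, independent of language.
print(word.casefold() == upper.casefold())

# The language-specific part: Python always lowercases "I" to "i",
# whereas Turkish text would expect the dotless "ı".
print("I".lower())
```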

Michael

-- 
Michael Schlenker
Software Engineer

CONTACT Software GmbH   Tel.:   +49 (421) 20153-80
Wiener Straße 1-3   Fax:+49 (421) 20153-41
28359 Bremen
http://www.contact.de/  E-Mail: [EMAIL PROTECTED]

Sitz der Gesellschaft: Bremen
Geschäftsführer: Karl Heinz Zachries, Ralf Holtgrefe
Eingetragen im Handelsregister des Amtsgerichts Bremen unter HRB 13215


Re: [sqlite] bug? like-search with german umlaut is case-sensitive, should not be

2008-11-14 Thread Martin Engelschalk
Hello Thomas,

I have the same problem. There is no readily available function for
converting UTF-8 characters outside 7-bit ASCII from lower to upper case, so
SQLite does not use one.
To achieve this, you have to write your own function and/or incorporate
something like ICU into your project. I still have that work ahead of me.

Martin

Thomas Mittelstaedt wrote:
> Hello,
>
> Just did a search on my database using 
> SELECT * FROM ku2008 where "Empfaenger 1" like '%köck%';
>
> and nothing was found. Doing a SELECT * FROM ku2008 where "Empfaenger 1"
> like '%kÖck%'; with the capital umlaut did find the record. 
> The data is UTF-8! My SQLite version is 3.5.9 on Ubuntu Hardy.
>
> thomas



[sqlite] bug? like-search with german umlaut is case-sensitive, should not be

2008-11-14 Thread Thomas Mittelstaedt
Hello,

Just did a search on my database using 
SELECT * FROM ku2008 where "Empfaenger 1" like '%köck%';

and nothing was found. Doing a SELECT * FROM ku2008 where "Empfaenger 1"
like '%kÖck%'; with the capital umlaut did find the record. 
The data is UTF-8! My SQLite version is 3.5.9 on Ubuntu Hardy.

thomas

