Re: [firebird-support] Case and Accent insensitive compares

Paul Vinkenoog p...@vinkenoog.nl [firebird-support] Wed, 15 Jun 2016 09:48:25 -0700

Hello Stefan,

> I expect that an accent insensitive compare treats accented characters
> as the "same" as their un-accented counterparts because the accent
> does not change the character itself but things like pronunciation or
> stress.
>
> So in French, à is similar to a, é is similar to è and you use an
> accent insensitive compare to find Gérard even though your search term
> says Gerard (without the accent).
>
> However, in the German language, the letters Ö and O are two different
> characters with a completely different pronunciation (the same is
> true for A/Ä and U/Ü). As they look similar, the sorting is done so
> that they stay together, but they can _not_ be treated as accented
> versions of each other.


UNICODE_CI_AI is a generic, language-independent collation. Since
ö, ü and ä are not specific to German (they also exist in Dutch, for
instance, and ö and ä in Swedish, and ö and ü in Hungarian, etc.)
it will simply treat them as accented forms of o, u and a.
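
As a rough illustration of what a generic, language-independent
case- and accent-insensitive comparison does, here is a minimal Python
sketch. It is not how Firebird implements UNICODE_CI_AI internally; the
function name fold_ci_ai is made up for this example, and the approach
(decompose, drop combining marks, case-fold) is only an approximation
of such a collation.

```python
import unicodedata

def fold_ci_ai(text):
    """Hypothetical helper: reduce text to a case- and accent-insensitive key."""
    # NFD splits an accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    # casefold() is a more aggressive lower() intended for caseless matching.
    return stripped.casefold()

# The search term without the accent matches the accented name.
print(fold_ci_ai("Gérard") == fold_ci_ai("GERARD"))
# No language knowledge is involved, so the German umlauts fold the same way.
print(fold_ci_ai("Ö") == fold_ci_ai("o"))
```

Because the rule knows nothing about the language of the text, ö is
folded to o in exactly the same way as é is folded to e.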

Also, it is questionable whether you should consider a and ä different
letters, even in German. See e.g.
https://de.wikipedia.org/wiki/Alphabetische_Sortierung

DIN 5007 Variant 1 (used for words, e.g. in dictionaries; section 6.1.1.4.1)

    ä and a are treated as equal
    ö and o are treated as equal
    ü and u are treated as equal
    ß and ss are treated as equal

DIN 5007 Variant 2 (special sorting for lists of names, e.g. in
telephone directories; section 6.1.1.4.2)

    ä and ae are treated as equal
    ö and oe are treated as equal
    ü and ue are treated as equal
    ß and ss are treated as equal
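
To make the difference between the two DIN 5007 variants concrete, the
following Python sketch builds a sort key for each of them from the
equivalences listed above. The names din_variant1_key and
din_variant2_key are invented for this example, and the sketch covers
only the four substitutions quoted here, not the full standard.

```python
# Substitution tables taken from the two lists quoted above.
VARIANT1 = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})
VARIANT2 = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})

def din_variant1_key(word):
    """Hypothetical sort key: umlauts sort like their base vowels."""
    return word.lower().translate(VARIANT1)

def din_variant2_key(word):
    """Hypothetical sort key: umlauts sort like vowel + e."""
    return word.lower().translate(VARIANT2)

names = ["Müller", "Mueller", "Muller", "Mutter"]

# Variant 1 places Müller next to Muller (both have the key "muller").
print(sorted(names, key=din_variant1_key))
# Variant 2 places Müller next to Mueller (both have the key "mueller").
print(sorted(names, key=din_variant2_key))
```

In neither variant is ü a letter of its own: it is equated either with
u or with ue, and only the choice of equivalent differs.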

If you do want to treat them as different letters, you need a German
collation that does just that. However, this collation will not work
correctly with words in some other languages containing ä, ö and ü.
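
The trade-off can be sketched in Python as well. The function below is
purely hypothetical (it does not correspond to any existing Firebird
collation): it keeps ä, ö and ü as distinct letters and folds all other
accents away.

```python
import unicodedata

# Letters that this hypothetical "German-style" rule keeps distinct.
KEEP_DISTINCT = {"ä", "ö", "ü"}

def fold_keep_umlauts(text):
    """Hypothetical key: accent-insensitive, except for ä, ö and ü."""
    out = []
    # NFC first, so each umlaut is a single precomposed character.
    for ch in unicodedata.normalize("NFC", text).casefold():
        if ch in KEEP_DISTINCT:
            out.append(ch)
        else:
            # Strip combining marks from every other character.
            decomposed = unicodedata.normalize("NFD", ch)
            out.append("".join(c for c in decomposed
                               if not unicodedata.combining(c)))
    return "".join(out)

# German: Ö and O are now different, as requested.
print(fold_keep_umlauts("Österreich") == fold_keep_umlauts("Osterreich"))
# French: other accents are still ignored.
print(fold_keep_umlauts("Gérard") == fold_keep_umlauts("Gerard"))
# Dutch: the diaeresis in "coördinatie" is now significant too.
print(fold_keep_umlauts("coördinatie") == fold_keep_umlauts("coordinatie"))
```

The last comparison shows the cost: a Dutch word in which ö is merely
an o with a diaeresis no longer matches its unaccented spelling.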


Cheers,
Paul Vinkenoog
