Package: locales
Version: 2.3.6.ds1-13etch7
Severity: normal
Tags: l10n

I was doing a bit of C++ programming, replacing my own Swedish
collation algorithm with the standard locales (through the standard
C++ std::locale interface), when my unit tests started to fail. It
turned out I could reproduce the failure with the standard sort
utility, so that's what I'll use here.

This quote from /usr/share/i18n/locales/sv_SE describes what the
locale intends to implement, and it's also the rule I am familiar
with from real life:

  % The letter w is normally not present in the Swedish alphabet. It
  % exists in some names in Swedish and foreign words, but is accounted
  % for as a variant of 'v'. Words and names with 'w' are in Swedish
  % ordered alphabetically among the words and names with 'v'. If two
  % words or names are only to be distinguished by 'v' or % 'w', 'v' is
  % placed before 'w'.

And that seems to work *some* of the time ... Of the following three
examples, the first two are OK and show how it should work. The third
is simply wrong: "wword" and "vword" are identical except that one
contains the 'w' variant of the letter 'v', so "wword" should collate
last.

  tuva:~> /bin/echo -e "word\nvorm" | env LC_COLLATE=sv_SE.iso88591 sort
  word
  vorm
  tuva:~> /bin/echo -e "word\nvord" | env LC_COLLATE=sv_SE.iso88591 sort
  vord
  word
  tuva:~> /bin/echo -e "vword\nwword" | env LC_COLLATE=sv_SE.iso88591 sort
  wword
  vword
  tuva:~>

I have not done any further experiments to see what triggers it.
I cannot help suspecting that similar rules for other languages are
affected as well ...

Final side note: Solaris 8 passes this test. That's the only other
Unix I've tested.

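
For reference, a minimal sketch of how the same check could be made through the C++ std::locale interface mentioned above. It is not the code from the unit tests; the helper names sort_with_locale and locale_or_classic are hypothetical, and it assumes the locale name from the sort commands (sv_SE.iso88591) is the one to test. If that locale is not installed, the sketch falls back to the classic "C" locale, which does not apply the Swedish rule at all.

```cpp
#include <algorithm>
#include <locale>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: return a copy of `words` sorted by the collation
// rules of `loc`. std::locale::operator() compares two strings through the
// locale's std::collate facet, so the locale object itself is the comparator.
std::vector<std::string> sort_with_locale(std::vector<std::string> words,
                                          const std::locale& loc) {
    std::sort(words.begin(), words.end(), loc);
    return words;
}

// Hypothetical helper: construct the named locale, or fall back to the
// classic "C" locale when the name is not available on this system.
// (std::locale's constructor throws std::runtime_error for unknown names.)
std::locale locale_or_classic(const char* name) {
    try {
        return std::locale(name);
    } catch (const std::runtime_error&) {
        return std::locale::classic();
    }
}
```

Passing the words "vword" and "wword" together with locale_or_classic("sv_SE.iso88591") to sort_with_locale should, by the rule quoted above, put "vword" first. With the fallback "C" locale the strings are compared by plain character value instead, so a correct result there says nothing about the Swedish locale.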
regards,
Jorgen

-- System Information:
Debian Release: 4.0
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: powerpc (ppc)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.18-3-powerpc
Locale: LANG=sv_SE.utf8, LC_CTYPE=sv_SE.utf8 (charmap=UTF-8)

Versions of packages locales depends on:
ii  debconf [debconf-2.0] 1.5.11etch2        Debian configuration management sy
ii  libc6 [glibc-2.3.6.ds1 2.3.6.ds1-13etch7 GNU C Library: Shared libraries

locales recommends no packages.

-- debconf information:
  locales/default_environment_locale: en_US
  locales/locales_to_be_generated: en_US ISO-8859-1, sv_SE.UTF-8 UTF-8, sv_SE ISO-8859-1