Thanks Jonathan,
I had looked at the package contents to figure out the collating
sequence of fr_CA, and had found /usr/share/i18n/locales/. I then looked
at /usr/share/i18n/locales/fr_CA, then at en_CA, then at iso14651_t1,
and finally at iso14651_t1_common. At that point I decided to stop
guessing and look for actual documentation.
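For the record, the file-hopping I did by hand can be roughly
automated. The sketch below is only an assumption about how the files
are linked: it supposes that each locale source file points to the next
one with a `copy "name"` line inside its LC_COLLATE section, which is
the keyword locale(5) describes. The function name and the default
directory are mine, and I make no claim about what it prints on any
given system.

```python
import re
from pathlib import Path

# Assumption: a collation section delegates to another file with a line
# of the form  copy "other_file"  (the keyword described in locale(5)).
COPY_RE = re.compile(r'^\s*copy\s+"([^"]+)"')


def collate_copy_chain(name, locale_dir="/usr/share/i18n/locales"):
    """Follow `copy` lines in LC_COLLATE sections, starting at `name`.

    Returns the list of file names visited, in order. Stops at the first
    file that is missing, has no `copy` line, or was already visited.
    """
    chain = []
    seen = set()
    while name and name not in seen:
        seen.add(name)
        path = Path(locale_dir) / name
        if not path.is_file():
            break
        chain.append(name)
        name = None  # set again only if this file copies another one
        in_collate = False
        text = path.read_text(encoding="utf-8", errors="replace")
        for line in text.splitlines():
            stripped = line.strip()
            if stripped == "LC_COLLATE":
                in_collate = True
            elif stripped == "END LC_COLLATE":
                break
            elif in_collate:
                match = COPY_RE.match(line)
                if match:
                    name = match.group(1)
                    break
    return chain


if __name__ == "__main__":
    print(collate_copy_chain("fr_CA"))
```

This only tells you which files to read, not what they mean, which is
exactly the part I am missing.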
I agree that not knowing the file's syntax was the final thing that
discouraged me, but even now that I have seen what locale(5) contains,
it is of little help (for me, it doesn't change anything).
I did mean this bug to be about the lack of *specification* of
collation. Linking to a manual that gives hints on how to interpret the
code is better than nothing, but only a fraction of users will dare to
go that way. This is not about strcoll's manpage. I probably shouldn't
have mentioned strcoll() specifically; this is about collation in
general. I believe this should be documented in glibc-doc-reference, in
section 7 "Locales and Internationalization", and be easily reachable
from 5.6 "Collation Functions".
I think an even more general issue is that the influence of choosing a
specific locale doesn't seem to be explained. The documentation explains
what different locales can change, but not what each locale does.
Debian's best-known interface to locale choice is dpkg-reconfigure
locales. I'm not sure my dad would find it obvious that he wants to pick
"fr_CA.UTF-8 UTF-8" there.
I don't think specifying the collation order of each locale would bring
so little benefit. What made me hit this issue is that I was trying to
determine what locale a multilingual program should use (the best
compromise assuming that a single locale will be used). Collation is
important, and I think many people wonder how it works.
I do agree, however, that this will require significant work.
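In the absence of a specification, about the only way I see to compare
candidate locales is to experiment. Here is a minimal sketch of such an
experiment, using Python's standard locale module. The helper name, the
word list and the locale names are just my examples; a locale that is
not generated on the system is reported as None rather than guessed at.

```python
import functools
import locale


def sort_under_locale(words, locale_name):
    """Sort `words` using the LC_COLLATE rules of `locale_name`.

    Returns None if that locale is not available on this system.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # query current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        # locale.strcoll compares two strings under the current LC_COLLATE.
        return sorted(words, key=functools.cmp_to_key(locale.strcoll))
    finally:
        # Restore the previous setting so the caller is not affected.
        locale.setlocale(locale.LC_COLLATE, previous)


if __name__ == "__main__":
    words = ["cote", "côte", "coté", "côté"]
    for name in ("C", "en_CA.UTF-8", "fr_CA.UTF-8"):
        print(name, sort_under_locale(words, name))
```

Of course this only shows what a locale does with the words I happened
to try, which is no substitute for documentation of the rules.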
Anyway, if we stick to the issue of collation, the Unicode collation
algorithm is documented at http://www.unicode.org/reports/tr10/
The specification is non-free, but specifying the parameters of each
locale and linking to it would be enough for me.
As for non-Unicode locales, I don't know.
POSIX 7.3.2 does contain a good amount of useful information. It clearly
describes collating sequence definitions. It also gives the collating
sequence definition of the C locale. That one is quite accessible.
Thanks for that too, Jonathan.