bug#22275: a before null
in what circumstances is this a sensible and/or desirable result?:

$ sort -V <<eoi
> 7.7z
> 7a.7z
> eoi
7a.7z
7.7z
Re: subtle sort bug?
On Thu, 2003-07-03 at 15:07, gregory mott wrote:
> i fail to understand. i've used the same stock definitions:
>
> # --- /usr/share/i18n/locales/g ---
> # build with:
> # localedef -i g -c g

what i had missed was localedef -f

now i've got it

___
Bug-coreutils mailing list
[EMAIL PROTECTED]
http://mail.gnu.org/mailman/listinfo/bug-coreutils
Re: subtle sort bug?
On Tue, 2003-07-01 at 23:18, Paul Eggert wrote:
> gregory mott [EMAIL PROTECTED] writes:
>
> > can you point me to an appropriate RTFM that ideally would layout
> > what encodings are used by what locales, or how to tell what
> > encoding you have/need, etc usw?
>
> Sorry, no; this stuff tends to be scattered around all over the
> place. On my Debian GNU/Linux 3.0r1 system, the file
> /usr/share/i18n/SUPPORTED lists the encodings used by locales, but
> things may be different on your system. For general info about
> encodings you might try www.li18nux.org and/or Ken Lunde's book on
> encodings and character sets
> http://www.praxagora.com/lunde/cjkv-ip.html.

i've read things hither and yon, i remain in the dark..

when i pass textual input to sort, how does sort come to decide or
infer the encoding? you seem to say that a locale is associated with
a particular encoding. well, hmm. on rh9, the locale definitions (eg
/usr/share/i18n/locales/en_IN) appear to be in unicode. i do not see
where a locale becomes associated with any particular encoding (such
as UTF-8 or ISO-8859-15).

it seems i can fix the en_AU failure by specifying:

$ LC_CTYPE=en_AU.UTF-8 LC_COLLATE=en_AU.UTF-8 sort /tmp/sos
groan
grosr
gro
grost
red
rsum
resumed

but that approach doesn't seem to help my personal locale definition:

$ LC_CTYPE=g.UTF-8 LC_COLLATE=g.UTF-8 sort /tmp/sos
groan
grosr
grost
gro
red
resumed
rsum

i fail to understand. i've used the same stock definitions:

# --- /usr/share/i18n/locales/g ---
# build with:
# localedef -i g -c g
LC_CTYPE
copy i18n
END LC_CTYPE
LC_COLLATE
copy iso14651_t1
END LC_COLLATE
LC_TIME
d_fmt U0025U0059U002FU0025U006DU002FU0025U0064
END LC_TIME

can you/anyone give me a clue?
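On the question of how to tell which encoding a locale name selects: the POSIX `locale` utility accepts the keyword `charmap` and prints the character map of whatever locale is in effect. The following is only a minimal sketch, assuming a POSIX shell and an installed `locale` command; the helper name `charmap_of` is hypothetical, and the names printed differ between systems.

```shell
#!/bin/sh
# charmap_of is a hypothetical helper, not part of any package.
# It prints the charmap (encoding) selected by the locale name in $1.
# LC_ALL overrides LC_CTYPE and LC_COLLATE for this one command only.
charmap_of() {
    LC_ALL=$1 locale charmap
}

# The C locale is always available; its charmap name varies by system.
charmap_of C
```

On a system where both are installed, comparing `charmap_of en_AU` with `charmap_of en_AU.UTF-8` should show whether the two names select different encodings.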
Re: subtle sort bug?
On Tue, 2003-07-01 at 21:43, Paul Eggert wrote:
> gregory mott [EMAIL PROTECTED] writes:
>
> > for example, en_IN repeatably produces proper results, but en_AU
> > repeatably fails to handle some special characters properly.
>
> I reproduced your results on my host (Debian GNU/Linux 3.0r1).
> However, on my host what you were doing was a user error, as en_IN
> uses UTF-8 encoding but en_AU uses ISO-8859-1, and those two
> encodings are incompatible. Your email used UTF-8, so most likely
> your /tmp/sos file was UTF-8 (as was mine). One can't reliably sort
> a UTF-8 file using an ISO-8859-1 locale, so it wouldn't be
> surprising that an en_AU sort misfired.

cool! can you point me to an appropriate RTFM that ideally would
layout what encodings are used by what locales, or how to tell what
encoding you have/need, etc usw?
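The incompatibility between the two encodings can be seen at the byte level. The sketch below is an illustration only: the sample word is an assumption (the actual contents of /tmp/sos are not shown in full in this thread), and it assumes an `iconv` command is available. It reads UTF-8 bytes as if they were ISO-8859-1, which is roughly what a program does when it runs in an ISO-8859-1 locale on UTF-8 input.

```shell
#!/bin/sh
# Hypothetical sample word, written with octal escapes so that this
# script stays pure ASCII: \303\251 is the UTF-8 encoding of e-acute.
utf8_word() { printf 'r\303\251sum\303\251'; }

# The word has 6 characters but occupies 8 bytes in UTF-8.
utf8_bytes=$(utf8_word | wc -c | tr -d ' ')

# Reinterpret the same bytes as ISO-8859-1 (one character per byte)
# and convert back to UTF-8 for display. Each accented letter turns
# into two unrelated Latin-1 characters.
misread=$(utf8_word | iconv -f ISO-8859-1 -t UTF-8)

echo "bytes:   $utf8_bytes"
echo "misread: $misread"
```

Because the locale sees two unrelated characters where the file has one, collation rules written for the accented letter never apply to it.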
subtle sort bug?
it seems sort is suffering from some subtle bug. does this happen for
anyone else? is it just my machine, is it a redhat problem, or is it
actually a gnu bug?

for example, en_IN repeatably produces proper results, but en_AU
repeatably fails to handle some special characters properly. but both
of these locales use the same stock definitions:

LC_CTYPE
copy i18n
END LC_CTYPE
LC_COLLATE
copy iso14651_t1
END LC_COLLATE

en_IN producing correct results:

$ LC_CTYPE=en_IN LC_COLLATE=en_IN sort /tmp/sos
groan
grosr
gro
grost
red
rsum
resumed

en_AU producing incorrect results:

$ LC_CTYPE=en_AU LC_COLLATE=en_AU sort /tmp/sos
gro
groan
grosr
grost
rsum
red
resumed

(glibc-common-2.3.2-27.9, coreutils-4.5.3-19)
(redhat 9+up2date)