On Sep 3, 2009 10:08am, C de-Avillez <hgg...@ubuntu.com> wrote:
>
> Interestingly, it works here with the string you used, and fails in the
> following case:
>
> ~ $ for LANG in $(locale -a); do printf "A b\nAA b\nAAA b\n" | sort
> -h|tr -d '\n'; echo; done | uniq -c
>       1 A bAA bAAA b
>      21 AAA bAA bA b
>       1 A bAA bAAA b
>       2 AAA bAA bA b
> ~ $
>
I am seeing the same difference in sort order for the C and POSIX
locales, both with and without the -h option. I believe these two
discrepancies arise because the collation functions for the C and POSIX
locales specify binary ordering, so a space sorts before an 'A'. In the
other locales it seems that longer words are given preference.

With -h:

~/src/core/fake$ locale -a | ./bin/sort | while read LANG ; do printf "%10s " $LANG ; echo -e 'A b\nAA b\nAAA b\n' | ./bin/sort -h | tr -d '\n' ; echo ; done ;
         C A bAA bAAA b
     POSIX A bAA bAAA b
en_AU.utf8 AAA bAA bA b
en_BW.utf8 AAA bAA bA b
en_CA.utf8 AAA bAA bA b
en_DK.utf8 AAA bAA bA b
en_GB.utf8 AAA bAA bA b
en_HK.utf8 AAA bAA bA b
en_IE.utf8 AAA bAA bA b
     en_IN AAA bAA bA b
     en_NG AAA bAA bA b
en_NZ.utf8 AAA bAA bA b
en_PH.utf8 AAA bAA bA b
en_SG.utf8 AAA bAA bA b
en_US.utf8 AAA bAA bA b
en_ZA.utf8 AAA bAA bA b
en_ZW.utf8 AAA bAA bA b

Without -h:

~/src/core/fake$ locale -a | ./bin/sort | while read LANG ; do printf "%10s " $LANG ; echo -e 'A b\nAA b\nAAA b\n' | ./bin/sort | tr -d '\n' ; echo ; done ;
         C A bAA bAAA b
     POSIX A bAA bAAA b
en_AU.utf8 AAA bAA bA b
en_BW.utf8 AAA bAA bA b
en_CA.utf8 AAA bAA bA b
en_DK.utf8 AAA bAA bA b
en_GB.utf8 AAA bAA bA b
en_HK.utf8 AAA bAA bA b
en_IE.utf8 AAA bAA bA b
     en_IN AAA bAA bA b
     en_NG AAA bAA bA b
en_NZ.utf8 AAA bAA bA b
en_PH.utf8 AAA bAA bA b
en_SG.utf8 AAA bAA bA b
en_US.utf8 AAA bAA bA b
en_ZA.utf8 AAA bAA bA b
en_ZW.utf8 AAA bAA bA b
~/src/core/fake$
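
For anyone who wants to reproduce the two orderings without depending on which locales are installed, here is a rough sketch. The first half only relies on the C locale comparing bytes, so the space (0x20) sorts before 'A' (0x41). The second half is an emulation, not what the locale code actually does: it assumes the en_* collation rules skip spaces on their first comparison pass, which would give the same result as the "longer words first" pattern above. The helper names sample and sort_in are made up for this example.

```shell
#!/bin/sh
# Sample input from the thread (without the trailing empty line).
sample() { printf 'A b\nAA b\nAAA b\n'; }

# Sort the sample under the locale named in $1 and join the lines with '|'.
# LC_ALL overrides LANG and all other LC_* variables.
sort_in() { sample | LC_ALL="$1" sort | tr '\n' '|'; }

# Bytewise ordering: ' ' (0x20) sorts before 'A' (0x41).
c_result=$(sort_in C)
echo "C locale:       $c_result"

# ASSUMPTION: emulate a collation pass that ignores spaces by sorting
# bytewise on a space-stripped key, then dropping the key again.
nospace_result=$(sample |
    awk '{ k = $0; gsub(/ /, "", k); print k "\t" $0 }' |
    LC_ALL=C sort |
    cut -f2- |
    tr '\n' '|')
echo "spaces ignored: $nospace_result"
```

If the second line matches what the en_* locales print, then the keys being compared are effectively "AAAb", "AAb" and "Ab", and the apparent preference for longer words would just be 'A' sorting before 'b' at the first position where they differ.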