I've recently found out about a somewhat unintuitive behaviour in ls.
Execute the following commands:

bash> LANG=en_US.utf-8
bash> touch vi-mode vima vimz
bash> ls vi-mode vima vimz
vima vi-mode vimz

The strange sort order in the output of ls is there because ls uses
strcoll to perform sorting, and strcoll seems to only sort on
alphabetic characters and ignore e.g. '-', hence 'vim' and 'vi-m' are
considered equivalent. I realise that collation is different in Unicode
than in plain C, and I can see why one would sort e.g. 'a' and 'A'
together, but completely ignoring hyphens and other nonalphabetic
characters seems to be a very unreasonable behaviour. Manual pages for
strcoll do not explicitly state if non-alphabetic characters should be
considered, only that the collation order specified by LC_COLLATE
should be respected, so I'm not sure if this is a strcoll bug, an ls
bug or simply dissonance between the two, but I think this behaviour
is clearly suboptimal, so I thought I'd report it.
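
For comparison, here is a minimal sketch of the ordering I expected. It
assumes GNU ls and mktemp are available; the variable names dir and
c_order are only for illustration. Setting LC_ALL=C for a single command
selects the C locale, where strcoll compares raw bytes just like strcmp,
so '-' (0x2d) sorts before 'm' (0x6d):

```shell
# Create the three files in a scratch directory so nothing else is touched.
dir=$(mktemp -d)
touch "$dir/vi-mode" "$dir/vima" "$dir/vimz"

# LC_ALL overrides LANG and LC_COLLATE, but only for this one ls invocation.
c_order=$(LC_ALL=C ls "$dir")
printf '%s\n' "$c_order"

# Clean up the scratch directory.
rm -r "$dir"
```

With byte-wise collation this lists vi-mode first, then vima, then vimz.
Setting only LC_COLLATE=C should have the same effect on sorting, as
long as LC_ALL is not set.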

--
Axel


_______________________________________________
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils
