bug#6377: Subject: inaccurate character class processing
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-unknown-linux-gnu' -DCONF_VENDOR='unknown' -DLOCALEDIR='/usr/local/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -I. -I. -I./include -I./lib -g -O2
uname output: Linux pony.netsoft.ro 2.6.32.12-115.fc12.x86_64 #1 SMP Fri Apr 30 19:46:25 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-unknown-linux-gnu

Bash Version: 4.1
Patch Level: 0
Release Status: release

Description:
        (I'm not sure if this is a bash or a coreutils issue.)

        ls [A-Z]* doesn't work as expected/documented. I'd want/expect it to list the filenames starting with an uppercase letter.

        Thank you for looking at it!

Repeat-By:
        In an empty directory, create files like

            touch a A b B z Z

        Now,

            ls [A-Z]*

        outputs

            A b B z Z

        (why 'b' and 'z' - and/or where's 'a'...?!!)

        and

            ls [a-z]*

        outputs

            a A b B z

        (why 'A' and 'B' - and/or where's 'Z'...?!!)
bug#6377: Subject: inaccurate character class processing
tags 6377 + notabug

On 08/06/10 14:48, Iosif Fettich wrote:
> (I'm not sure if this is a bash or a coreutils issue.)
> ls [A-Z]* doesn't work as expected/documented.

The logic is in bash, but it's not a bug. Bash is using the collating sequence of your locale:

    $ touch a A b B z Z
    $ echo [A-Z]*
    A b B z Z
    $ export LANG=C
    $ echo [A-Z]*
    A B Z
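Whether `export LANG=C` changes the result depends on the other locale variables: under the POSIX rules, LC_ALL overrides LC_COLLATE, which in turn overrides LANG. The lookup can be sketched roughly as below. The function name `effective_collate` is hypothetical, and the fallback to C when nothing is set is an assumption (POSIX leaves that default to the implementation).

```shell
# effective_collate LC_ALL_VALUE LC_COLLATE_VALUE LANG_VALUE
# Hypothetical helper: prints the locale name that governs collation,
# following the precedence LC_ALL > LC_COLLATE > LANG.
effective_collate() {
    # An empty value is treated the same as an unset one.
    # Falling back to "C" is an assumption; the real default is
    # implementation-defined.
    printf '%s\n' "${1:-${2:-${3:-C}}}"
}

# Report the collation locale for the current environment.
effective_collate "${LC_ALL-}" "${LC_COLLATE-}" "${LANG-}"
```

If this prints something other than C or POSIX after `export LANG=C`, then LC_ALL or LC_COLLATE is still set and is taking precedence.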
bug#6377: Subject: inaccurate character class processing
On Tue, Jun 8, 2010 at 4:48 PM, Iosif Fettich ifett...@netsoft.ro wrote:
> ...
> ls [a-z]*
> outputs
> a A b B z
> (why 'A' and 'B' - and/or where's 'Z'...?!!)

It's a classic problem with the locale: in some locale definitions the range [a-z] contains the capital letters, i.e. a-z is aAbB ... z. Z sorts after z, so it falls outside the range.

As a workaround you can export LC_COLLATE=C, or maybe use [[:lower:]] instead of [a-z].
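Both workarounds can be tried side by side. The following is a minimal sketch, assuming a case-sensitive filesystem and a scratch directory from `mktemp` (the variable names are made up for illustration). It uses `echo` rather than `ls`, since the shell performs the expansion either way.

```shell
# Work in a throwaway directory so no real files are touched.
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1
touch a A b B z Z

# Character classes test a property of each character, so they do not
# depend on where letters fall in the locale's collating sequence.
upper_class=$(echo [[:upper:]]*)
lower_class=$(echo [[:lower:]]*)

# Ranges evaluated under C collation follow ASCII order. LC_ALL is unset
# inside the subshell because it would otherwise override LC_COLLATE.
upper_range=$(unset LC_ALL; LC_COLLATE=C; echo [A-Z]*)
lower_range=$(unset LC_ALL; LC_COLLATE=C; echo [a-z]*)

printf '[[:upper:]]*           -> %s\n' "$upper_class"
printf '[[:lower:]]*           -> %s\n' "$lower_class"
printf 'LC_COLLATE=C, [A-Z]*   -> %s\n' "$upper_range"
printf 'LC_COLLATE=C, [a-z]*   -> %s\n' "$lower_range"

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```

With the six files above, both approaches should select A B Z for the uppercase patterns and a b z for the lowercase ones. The character-class form has the advantage of not touching any locale setting.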
bug#6377: Subject: inaccurate character class processing
On Tue, Jun 08, 2010 at 04:48:08PM +0300, Iosif Fettich wrote:
> ls [A-Z]* doesn't work as expected/documented. I'd want/expect it to list the filenames starting with an uppercase letter.

The results of this are dependent upon your locale. If your locale is set to C or POSIX, you will get what you expect. If your locale is set to something else (such as en_US.utf8) then you will get something completely different. I explain why this happens, on http://mywiki.wooledge.org/locale.

The glob in your command is expanded by bash (not ls), so in order to get the results you want, your locale variables would have to be set to C/POSIX *before* expanding the glob. In other words,

    LANG=C ls [A-Z]*

will not work, since that sets the variable after expanding the glob. This would work, although it's extremely awkward (IMHO):

    LANG=C bash -c 'ls [A-Z]*'

Another approach would be to permanently (or semi-permanently, e.g. just for one shell session) set the LC_COLLATE variable. Thus,

    export LC_COLLATE=C
    ls [A-Z]*

This will cause the ordering of glob results (and also of results generated by ls itself, for example ls with no arguments, or ls dirname) to be in ASCII order, without throwing away the other locale features.
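The timing issue can be seen without involving locales at all, because a prefix assignment is applied only after the command's own words have been expanded. The sketch below first shows this with an ordinary variable (`word` is a made-up name), then repeats the child-shell form with a glob. It uses LC_ALL=C instead of LANG=C, a deliberate substitution so that the result holds even when LC_ALL or LC_COLLATE is already set, and it assumes a case-sensitive filesystem.

```shell
word=before

# "$word" is expanded by the current shell before the prefix assignment
# takes effect, so echo still receives the old value.
early=$(word=after echo "$word")

# Here the single-quoted text reaches the child shell unexpanded; the
# child expands it with the new value already in its environment.
late=$(word=after bash -c 'echo "$word"')

printf 'same command: %s\n' "$early"
printf 'child shell:  %s\n' "$late"

# The same reasoning applied to a glob, in a throwaway directory.
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1
touch a A b B z Z

# The child bash starts in the C locale and only then expands [A-Z]*.
child_glob=$(LC_ALL=C bash -c 'echo [A-Z]*')
printf 'child glob:   %s\n' "$child_glob"

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```

The first line printed should read "before" and the second "after", which mirrors why `LANG=C ls [A-Z]*` sees a glob that was already expanded under the old locale.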