On 2011-06-03 05:09, Greg Wooledge wrote:
> Oh, look, there's more!
> [...]
> See? Both tr(1) and ls(1) do it too!
Right; I forgot about ls (because "alias ls='LC_COLLATE=C ls'" has been in
my .bashrc for so long that I completely forgot it was there :) ), and
didn't think to try tr -- although tr appears to be case-sensitive under
all locales for me.
And yours looks broken -- how does
    echo Hello World | tr A-Z a-z
result in a bunch of non-ASCII characters? (That's how it looked when it
got here, at any rate -- maybe one of our mail servers did something.)
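For anyone following along: as far as I know, POSIX leaves tr's range
expressions such as A-Z unspecified outside the POSIX locale, which would
explain differing results between systems. A minimal sketch that avoids
ranges altogether by using character classes and pinning the locale
(to_lower is a hypothetical helper name, not something from this thread):

```shell
# Hypothetical helper: lowercase a string without relying on A-Z ranges.
# LC_ALL=C pins the locale so only ASCII letters are mapped.
to_lower() {
    printf '%s\n' "$1" | LC_ALL=C tr '[:upper:]' '[:lower:]'
}

to_lower 'Hello World'    # prints: hello world
```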
>> Even grep, whose man page says it obeys LC_COLLATE and the locale,
>> actually has [a-c] equivalent to [abc] on all locales. Someone must have
>> snuck in and fixed it.
> You must live in a strange and peculiar world.
> imadev:~/qwerty$ type grep
> grep is hashed (/usr/bin/grep)
> imadev:~/qwerty$ echo 'brown cow' | grep '[A-C]'
> brown cow
> imadev:~/qwerty$ echo 'BROWN COW' | grep '[a-c]'
> BROWN COW
> Is every single bit of your knowledge born out of familiarity with just
> ONE operating system with weird extensions?
I tried this on a few systems with different distros on them. The only
non-Linux systems I have access to are running things so old that they
only have the C locale, so I can't be sure about them. (For instance,
the only Solaris system I can get at doesn't support Unicode at all.)
Here are a couple:
[c69:~]$ uname -sr
Linux 2.6.32-31-generic
[c69:~]$ type grep
grep is /bin/grep
[c69:~]$ locale | grep LC_COLLATE
LC_COLLATE="en_US.UTF-8"
[c69:~]$ echo 'brown cow' | grep '[A-C]'
[c69:~]$ echo 'BROWN COW' | grep '[a-c]'
[c69:~]$
u-elive ~(0)$ uname -sr
Linux 2.6.32-26-generic
u-elive ~(0)$ locale | grep LC_COL
LC_COLLATE="en_US.UTF-8"
u-elive ~(0)$ type grep
grep is /bin/grep
u-elive ~(0)$ echo 'brown cow' | grep '[A-C]'
u-elive ~(0)$ echo 'BROWN COW' | grep '[a-c]'
u-elive ~(0)$
The university used to have a big AIX system, but again, I think it was
a version from before the days of Unicode locales. Maybe grep working as
expected is a Linux thing?
And if so, then you're right about me living in a "strange and peculiar
world" -- it's the world of people and organizations too poor to afford
proprietary Unices; one would never see HP-UX in a world like that when
there are cheaper alternatives. I'll spin up some BSDs in a virtual
machine and check those later -- anything else I should try?
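To repeat the same check on other systems, something like the sketch
below might do. range_check is a made-up helper name, and the locale
names passed to it are assumptions about what is installed on a given
box. Only the C-locale result is predictable; the result for any other
locale depends on the grep implementation, which is the whole question
here.

```shell
# Hypothetical helper: report whether the range [A-C] matches lowercase
# letters under the locale named in $1.
range_check() {
    if printf '%s\n' 'brown cow' | LC_ALL="$1" grep -q '[A-C]'; then
        echo "$1: [A-C] matches lowercase"
    else
        echo "$1: [A-C] is case-sensitive"
    fi
}

range_check C              # in the C locale the range is plain ASCII order
range_check en_US.UTF-8    # assumed locale name; result varies by system
```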
> You ought to report the bug in your vendor's grep(1) implementation, if
> it is actually broken as you describe.
And no, I'm going to keep very quiet about this "bug" in grep -- because
it's working the way I want/expect it to, and fixing it would doubtless
break many, many shell scripts and cause data loss for a fair number of
people :)
~Felix.