[Bug 120687] Re: Caseless collate sequence in en_GB.UTF8

2012-05-26 Thread Astara
UTF-8 collating has Upper case sorted before lower case.
from:  http://unicode.org/reports/tr10/#Case_Comparisons

6.6 Case Comparisons

In some languages, it is common to sort lowercase before uppercase; in
other languages this is reversed. Often this is more dependent on the
individual concerned, and is not standard across a single language. It
is strongly recommended that implementations provide parameterization
that allows uppercase to be sorted before lowercase, and provides
information as to the standard (if any) for particular countries. This
can easily be done to the Default Unicode Collation Element Table before
tailoring by remapping the L3 weights (see Section 7, Weight
Derivation). It can be done after tailoring by finding the case pairs
and swapping the collation elements.




Anyone not following the above should probably not claim Unicode
compatibility.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/120687

Title:
  Caseless collate sequence in en_GB.UTF8

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/bash/+bug/120687/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 120687] Re: Caseless collate sequence in en_GB.UTF8

2011-06-03 Thread Raphaël Droz
Any sane "user-friendly" distribution must default to LC_COLLATE=C for
terminal use.
I have already lost unrecoverable data, as in #571958. I now export
LC_COLLATE=C in .bashrc, but I'm not perverse enough to think that should be
an obligatory step for terminal users.
(LFS users probably know about collations and read bash(1) a long time ago.)

About the GUI:
LC_COLLATE is a shell configuration variable.
The GUI can use something else; metacity could offer an option like "respect
LC_COLLATE to sort files".

A UTF-8 LC_COLLATE is far too counter-intuitive and risky.
Please fix, at least, /etc/skel/.bashrc.
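
A minimal sketch of what such a .bashrc fragment could look like, assuming the goal is byte-order collation in shells only. The helper name collate_overridden and the warning text are hypothetical and not taken from any shipped Ubuntu file; the check is there because LC_ALL, when set, takes precedence over LC_COLLATE.

```shell
# Hypothetical fragment for ~/.bashrc or /etc/skel/.bashrc.
# Use byte-order (C) collation so ranges like [A-Z] are case-sensitive.
export LC_COLLATE=C

# LC_ALL overrides every LC_* category, so the line above has no effect
# while LC_ALL names some other locale. Report that case.
collate_overridden() {
    case "${LC_ALL:-}" in
        ""|C|POSIX) return 1 ;;   # unset or already C: LC_COLLATE applies
        *)          return 0 ;;   # another locale wins over LC_COLLATE
    esac
}

if collate_overridden; then
    echo "warning: LC_ALL=${LC_ALL} overrides LC_COLLATE=C" >&2
fi
```

Because the variable is set in the shell's own startup file, programs started from the desktop session rather than from a terminal would not inherit it.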



[Bug 120687] Re: Caseless collate sequence in en_GB.UTF8

2011-03-24 Thread sordna
Looks like this affects programs such as GNU grep and egrep ... note I'm using 
quotes around the A-Z character class to avoid any shell interference:   
$ echo hello | grep '[A-Z]'
hello

The above behavior is COMPLETELY WRONG AND UNACCEPTABLE. I am utterly
shocked that I have to change my default environment to do a simple
task such as identifying upper-case characters with grep. Note I'm using
en_US.UTF-8 (not en_GB).

LC_COLLATE should default to C under all circumstances, unless the user
explicitly wants grep and other programs to behave in a totally weird
and unexpected way. Even better, perhaps libc should treat an undefined
LC_COLLATE the same as C.

Either way, regular expressions should be honored in a sane Linux / Unix
operating system.  Users should not have to jump through hoops to make a
freshly installed system behave in a normal, unsurprising way.
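
For a single command, the two usual workarounds can be sketched as follows, assuming a POSIX grep. The function names are hypothetical. LC_ALL=C forces byte-order ranges for that one grep process only, while [[:upper:]] selects upper-case characters by class and does not depend on collation order.

```shell
# Succeed if the argument contains an ASCII upper-case letter.
# LC_ALL=C applies only to this grep invocation, not to the calling shell.
has_ascii_upper() {
    printf '%s\n' "$1" | LC_ALL=C grep -q '[A-Z]'
}

# Same check with a POSIX character class instead of a range.
has_upper_class() {
    printf '%s\n' "$1" | grep -q '[[:upper:]]'
}

if has_ascii_upper hello; then echo "hello: upper"; else echo "hello: no upper"; fi
if has_upper_class Hello; then echo "Hello: upper"; else echo "Hello: no upper"; fi
```

The first line printed is "hello: no upper" and the second is "Hello: upper".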



[Bug 120687] Re: Caseless collate sequence in en_GB.UTF8

2010-12-22 Thread Roberto Gordo Saez
Of course it should go upstream, but I don't understand why it is
outside the scope of Ubuntu to fix a problem for its users. Ubuntu
chose Unity, a non-standard desktop environment, yet refuses to choose
a "non-standard" collate sequence (which actually provides more
"standard" behavior for many of its users than the upstream choice). I
certainly can't understand the reasoning behind that.



[Bug 120687] Re: Caseless collate sequence in en_GB.UTF8

2010-12-22 Thread era
So should libc6 be updated to provide a non-standard case-sensitive
collate sequence for each available locale?  I think that goes outside
the scope of what Ubuntu can and should do, but seems like the ultimate
solution, if upstream can be persuaded.



[Bug 120687] Re: Caseless collate sequence in en_GB.UTF8

2010-12-21 Thread Roberto Gordo Saez
It is a problem for me too. And it is worse when using the es_ES locale,
because we can't use LC_COLLATE=C: it is very important to match
accented characters in file and directory names, which LC_COLLATE=C does
not do. This is ridiculous and counterintuitive:

touch A B
ls [a-b]*
A

This would be fixed if the collation order placed upper-case and
lower-case letters in separate blocks, as LC_COLLATE=C does.
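
One way to keep the language locale and still match by case is a character class rather than a range. The sketch below is illustrative only: the function name is hypothetical and it works in a scratch directory created with mktemp. In a UTF-8 locale the [[:lower:]] class would normally also cover accented lower-case letters, though that depends on the locale definition installed.

```shell
# Create A, B, a, b in a scratch directory and list the names that
# start with a lower-case letter, selected by class instead of by range.
demo_lower_glob() {
    dir=$(mktemp -d) || return 1
    touch "$dir/A" "$dir/B" "$dir/a" "$dir/b"
    ( cd "$dir" || exit 1
      for f in [[:lower:]]*; do
          # Skip the literal pattern if nothing matched.
          if [ -e "$f" ]; then printf '%s\n' "$f"; fi
      done )
    rm -rf "$dir"
}

demo_lower_glob    # prints a and b, one per line
```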



[Bug 120687] Re: Caseless collate sequence in en_GB.UTF8

2010-12-21 Thread era
#15: Obviously, there isn't much Ubuntu can do to help people who do not
use the OS-supplied startup scripts anyway.

Personally, I routinely ditch the default .bashrc on new installations
and replace it with a one-liner which will take me through future
upgrades:

. /etc/skel/.bashrc

It tends to grow more additions over time, of course, but this at least
should provide a healthy future-proof baseline.  It would be nice if
something like this was the default, but that's a separate (wishlist)
bug, #194108.



[Bug 120687] Re: Caseless collate sequence in en_GB.UTF8

2010-12-20 Thread StephanBeal
@#11: /etc/skel/.bashrc is only useful for new installations. i've been
toting around this same home directory for over 10 years, and have a
.bashrc i have lovingly maintained throughout that time. This particular
"bug"/behaviour nuked that lovingly-maintained home directory (along
with its .bashrc), and i had to pull several tens of GB from backups to
recover.



[Bug 120687] Re: Caseless collate sequence in en_GB.UTF8

2010-05-10 Thread StephanBeal
@#5: yes, the (buried) documentation states that the "preferred" way is
to not assume that char ranges are case-sensitive (apparently SUSv3 also
recommends this), but the fact remains that Unix users have, for 30+
years, been relying on case-insensitive ranges. Bash behaves differently
on some systems when LC_ALL is _unset_. Some systems i've tested (RHEL +
Ubuntu) treat an _unset_ LC_ALL as a case-insensitive locale, whereas
others (e.g. Solaris and some Linuxes) treat it equivalently to
LC_ALL=C. The latter behaviour is "historically correct."

i was hit by this in the same manner as the original reporter, nuking
30GB of home directories when i did:

  mv home HOME
  rm -fr [a-z]*

before cleaning up a drive for a new Ubuntu installation (full details
are in #571958).
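
A sketch of how the range could be expanded case-sensitively, assuming bash or another POSIX shell. The function name and the directory names are hypothetical. Note that a prefix assignment such as "LC_ALL=C rm -fr [a-z]*" would not be enough, because the shell expands the glob in its own locale before rm is started; the locale has to change in the shell that performs the expansion, here a subshell.

```shell
# Expand [a-z]* under C collation inside a subshell, so the locale
# change does not leak into the calling shell. Only prints names.
c_locale_lowercase() {
    ( LC_ALL=C; export LC_ALL
      cd "$1" || exit 1
      for f in [a-z]*; do
          # Skip the literal pattern if nothing matched.
          if [ -e "$f" ]; then printf '%s\n' "$f"; fi
      done )
}

scratch=$(mktemp -d)
mkdir "$scratch/HOME" "$scratch/alice" "$scratch/bob"
c_locale_lowercase "$scratch"    # lists alice and bob, never HOME
rm -rf "$scratch"
```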



[Bug 120687] Re: Caseless collate sequence in en_GB.UTF8

2010-05-10 Thread StephanBeal
@#13: Correction:

"been relying on case-insensitive ranges"
==
"been relying on case-SENSITIVE ranges"



[Bug 120687] Re: Caseless collate sequence in en_GB.UTF8

2010-05-10 Thread Gabe Gorelick
Still an issue on Lucid.



[Bug 120687] Re: Caseless collate sequence in en_GB.UTF8

2010-01-18 Thread era
In reply to comment #4: what's wrong with setting it in the shell's
configuration?  I don't believe the GUI reads your .bashrc so you could
set it in /etc/skel/.bashrc



[Bug 120687] Re: Caseless collate sequence in en_GB.UTF8

2010-01-12 Thread lieven moors
sorry, LC_LOCALE should read LC_COLLATE



[Bug 120687] Re: Caseless collate sequence in en_GB.UTF8

2010-01-12 Thread lieven moors
I think it is an important feature of a shell to be able to
specify character ranges, and distinguish between upper
and lower case while doing this.
Can somebody explain to me why 'export LC_LOCALE=C' is 
not set by default in .bashrc ?
I also want to confirm this bug is still present in Karmic,
and I almost had the same accident the original bug reporter
experienced (removing important directories).
I was lucky to have double-checked the shell expansion before doing it.
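
Such a check can be sketched as a small helper that prints what a pattern expands to without deleting anything. The name preview_glob is hypothetical, and the sketch assumes the pattern contains no spaces, since it relies on unquoted expansion inside the function.

```shell
# Print what a pattern expands to in the given directory, one name per
# line, without touching anything. Pass the pattern quoted; it is
# expanded here, in the locale of the current shell.
preview_glob() {
    ( cd "$1" || exit 1
      for f in $2; do
          # Skip the literal pattern if nothing matched.
          if [ -e "$f" ]; then printf '%s\n' "$f"; fi
      done )
}

# Typical use: look at the list first, and only then run the real rm.
preview_glob . '[A-Z]*'
```

Prefixing the destructive command with echo, as in "echo rm [A-Z]*", gives the same preview with less typing.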



[Bug 120687] Re: Caseless collate sequence in en_GB.UTF8

2009-08-05 Thread era
Reproducible in Jaunty and Karmic alpha 3

** Changed in: bash (Ubuntu)
   Status: Incomplete => Confirmed



[Bug 120687] Re: Caseless collate sequence in en_GB.UTF8

2008-06-07 Thread Mika Fischer
Maybe this should be discussed on the devel mailing list or someone could
start a spec?

In any case bash is not the right package for this bug and I don't know
what is...



[Bug 120687] Re: Caseless collate sequence in en_GB.UTF8

2008-02-19 Thread Gert Kulyk
@Dirk: If you look at the /usr/share/doc/bash/COMPAT.gz file, §13 states:
[...]
The portable way to specify upper case letters is [:upper:] instead of A-Z; 
lower case may be specified as [:lower:] instead of a-z. 
[...]

The default /bin/sh in Ubuntu is dash, which seems to behave like older
versions of bash in that it ignores the LC_COLLATE environment variable
(other shells, e.g. zsh, should behave like bash now). As a result, shell
scripts using [A-Z] or the like do not destroy anything, provided they
call /bin/sh and not /bin/bash, of course.
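
The portable form quoted from COMPAT can be sketched in a script like this. The function name is hypothetical; the class is written inside a bracket expression, [[:upper:]], and used in a case pattern so that the result does not depend on LC_COLLATE.

```shell
# Succeed if the first character of $1 is an upper-case letter.
starts_with_upper() {
    case "$1" in
        [[:upper:]]*) return 0 ;;
        *)            return 1 ;;
    esac
}

for name in HOME home; do
    if starts_with_upper "$name"; then
        echo "$name: starts with upper case"
    else
        echo "$name: does not start with upper case"
    fi
done
```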

I do not like the idea of changing LC_COLLATE, especially for
non-English environments.



[Bug 120687] Re: Caseless collate sequence in en_GB.UTF8

2008-02-11 Thread Colin Watson
Personally I like LC_COLLATE=C and set it everywhere, and I can see why
people want to set this in the installer. However, when this has come up
in the past, the problem has been that GUI users object; with some
justification, they want the sort order in GUI applications to match
that defined for their language. en_GB users largely don't care, but in
other languages it makes much more of a difference.

Is there a good way to set LC_COLLATE=C for shell users but not for GUI
programs other than putting it in one's shell configuration? I don't
think so at present.



[Bug 120687] Re: Caseless collate sequence in en_GB.UTF8

2008-02-08 Thread Matthias Klose
This suggestion may make sense, but bash is not the correct place to do
this; set it in /etc/environment.
If we want to support this for fresh installations, we have to change it
in the installers as well.



[Bug 120687] Re: Caseless collate sequence in en_GB.UTF8

2007-12-13 Thread Ralph Janke
** Changed in: bash (Ubuntu)
   Importance: Undecided => Low

** Description changed:

  do this in a gash (temporary) directory:
  
  touch A a B b
  ls [A-Z]*
  
  you get:-
  
  A  b  B
  
- What most people (especially unix users with >25 years experience) is:-
+ What most people (especially unix users with >25 years experience)
+ expect is:-
  
  A B
  
  I found out about this by accident yesterday by doing "rm [A-Z]*" in a
  directory expecting only files with an initial uppercase letter to be
  removed. You can imagine my surprise when every file (except those
  starting in 'a') was removed. Fortunately most of the files were
  either redundant or backed up, but it still caused me a completely
  unnecessary hour's work to restore the damage.
  
  Obviously the collating sequence is aAbBcCdD... but that really does
  *not* make it right. Other linux distros do not have this problem, but
  then they seem to set:
  
  export LC_COLLATE=C
  
  as standard,  which is missing in a standard ubuntu installation
  (6.06lts -> 7.04)
  
  That *is* the work around, but I would respectfully suggest that you set
  it as standard before someone destroys something irreplaceable!
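
The difference between the two collating sequences can be seen with sort, independently of globbing. This is a minimal sketch: the function name is hypothetical, and the second command's output is not shown because it depends on which locale is active and installed.

```shell
# Sort four one-letter names under byte-order (C) collation.
sort_c() {
    printf '%s\n' b A a B | LC_ALL=C sort
}
sort_c    # prints A, B, a, b, one per line (upper case first)

# For comparison, the same input under the session's own locale.
printf '%s\n' b A a B | sort
```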



[Bug 120687] Re: Caseless collate sequence in en_GB.UTF8

2007-06-27 Thread Ralph Janke
I can confirm this behaviour on dapper as well as feisty

** Changed in: bash (Ubuntu)
Sourcepackagename: None => bash
   Status: New => Confirmed



[Bug 120687] Re: Caseless collate sequence in en_GB.UTF8

2007-06-16 Thread Dirk
What most people (especially unix users with >25 years experience) is:

should read:

What most people (especially unix users with >25 years experience)
expect is:
