On 2/16/07, Chet Ramey <[EMAIL PROTECTED]> wrote:
Tim Waugh wrote:
>> strcoll indicates that, in the "en_US" locale, `h' sorts between `A' and
>> `Z'. In the "C" locale, it does not. This is consistent with the
>> collating sequences I posted earlier.
>
> Here is what Ulrich Drepper has to say on the matter (see
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=217359#c5):
>
> "[...] The strcoll result has nothing whatsoever to do with
> the range match. strcoll uses collation weights, ranges use
> collation sequence values, completely different concept.
This is an academic distinction of little or no practical value. There
is no portable interface that exposes the difference to the programmer.
Except ``ls [a-z]*''.
In the absence of any locale specification (via LANG and LC_* environment
variables), it is incomprehensible to claim that it is reasonable to include
upper case letters in the range of matched characters using a range of
lower case letters. Your quoted paragraph:
A collation sequence definition shall define the relative
order between collating elements (characters and multi-character
collating elements) in the locale. This order is expressed in
terms of collation values; that is, by assigning each element
one or more collation values (also known as collation weights).
This implies to me that the collation weights are what determines the
collation sequence order.
Two points:
1. A collation sequence is not the same as a character range; [a-z] is a range.
2. No locale is specified, so the default must apply. The default is C,
not en_US or whatever.
But that's the point: I can't "look at the locale definition." And
there is no library function that will allow me to do so. I make do
with what strcoll() gives me.
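To make the strcoll() approach concrete, here is a minimal sketch of a range test built only on strcoll(). It is not bash's actual code: the helper name in_range_strcoll_in_locale and its return convention are assumptions made for illustration.

```c
#include <locale.h>
#include <string.h>

/* Hypothetical helper (not taken from bash): returns 1 if c collates
   between lo and hi inclusive according to strcoll() in the named locale,
   0 if it does not, and -1 if the locale cannot be selected. */
int in_range_strcoll_in_locale(const char *locale, char c, char lo, char hi)
{
    /* strcoll() compares strings, so wrap each character in a 1-char string. */
    char cs[2] = { c, '\0' };
    char ls[2] = { lo, '\0' };
    char hs[2] = { hi, '\0' };

    if (setlocale(LC_COLLATE, locale) == NULL)
        return -1;

    return strcoll(ls, cs) <= 0 && strcoll(cs, hs) <= 0;
}
```

In the "C" locale strcoll() orders characters as strcmp() does, so in_range_strcoll_in_locale("C", 'h', 'A', 'Z') reports that `h' is outside the range. In a locale such as "en_US", if it is installed, the answer depends on that locale's collation data, which is the behavior under dispute in this thread.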
bash should be entirely consistent with fnmatch. The best way
to do that is to use fnmatch. Then, if there is a problem, we jump
on Ulrich instead of you. :)
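As a rough illustration of delegating the match to the C library, the sketch below calls POSIX fnmatch() after selecting a locale. The wrapper name glob_matches_in_locale and its return convention are assumptions for this example, not anything bash or glibc provides.

```c
#include <fnmatch.h>
#include <locale.h>

/* Hypothetical wrapper: returns 1 if name matches pattern under the named
   locale, 0 if it does not match, and -1 if the locale cannot be selected
   or fnmatch() reports an error. */
int glob_matches_in_locale(const char *locale, const char *pattern,
                           const char *name)
{
    int rc;

    if (setlocale(LC_ALL, locale) == NULL)
        return -1;

    rc = fnmatch(pattern, name, 0);
    if (rc == 0)
        return 1;
    if (rc == FNM_NOMATCH)
        return 0;
    return -1;
}
```

With the "C" locale, glob_matches_in_locale("C", "[a-z]*", "hello") matches and the same pattern against "Hello" does not. Whatever the library decides in other locales, a shell that calls fnmatch() this way would at least agree with it.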
> From all I can see so far it's entirely bash's fault by not
> implementing globbing correctly. bash really must use the
> fnmatch code from glibc itself."
Why would I do that? That does nothing to enhance portability.
Yes, it does. You can use the fnmatch module from gnulib to
backfill where necessary.
I can put in the old subtraction code that ignores the locale myself,
since, as far as I can tell, that's the only portable part of the
glibc fnmatch code. It would be a step backward to ignore the
locale information, though.
It is a step forward to ignore locale information when there is no
locale information, though. :-)
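For reference, a locale-independent range test of the "subtraction" kind mentioned above might look like the following sketch. It is a guess at the general technique, not the code that was in bash or glibc, and the name in_range_codepoint is hypothetical. It assumes single-byte characters in an ASCII-compatible encoding.

```c
#include <assert.h>

/* Hypothetical helper: nonzero if the byte value of c lies between lo and
   hi inclusive. Locale settings are never consulted. */
int in_range_codepoint(unsigned char c, unsigned char lo, unsigned char hi)
{
    assert(lo <= hi);  /* a reversed range such as [z-a] is not handled here */

    /* One subtraction and one unsigned compare: if c < lo, the difference
       is negative and converts to a huge unsigned value, so the test fails. */
    return (unsigned)(c - lo) <= (unsigned)(hi - lo);
}
```

Under ASCII, in_range_codepoint('h', 'a', 'z') is true, while both in_range_codepoint('H', 'a', 'z') and in_range_codepoint('h', 'A', 'Z') are false, which is the result the "C" locale is expected to give for [a-z] and [A-Z].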
Thanks - Bruce
_______________________________________________
Bug-bash mailing list
Bug-bash@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-bash