Re: [BUG+PATCH] Wrong name sort.

2007-03-06 Thread Oswald Buddenhagen
On Mon, Mar 05, 2007 at 08:21:41PM +0100, Egmont Koblinger wrote:
> Perhaps it might make sense to have an option in mc where you can
> _manually_ choose between strcmp() and strcoll() (but hey, that's what
> LC_COLLATE is for!),

... in theory.
i don't think LC_COLLATE as it stands is a good idea for filenames. some
locales ignore punctuation for whatever reason, probably because it's
defined that way by the respective dictionary standard. but it's
counterproductive to mangle the ordering of filenames that way,
particularly when it depends on the locale and not on some setting
explicitly relating to file names.

___
Mc-devel mailing list
http://mail.gnome.org/mailman/listinfo/mc-devel


[BUG+PATCH] Wrong name sort.

2007-03-05 Thread Martin Petricek

I found a sorting bug in the latest CVS version.
I have files sorted by name, with case-insensitive sort. However, the
sorting behaves strangely: a bunch of files that should be sorted
either before or after filter.cc and filter.h (I am not sure whether
"." sorts before or after letters) got stuffed into the listing
between these two files. It seems that dots are ignored entirely when
sorting.

Now it looks like this:
│ filter.cc             │    89 │ -rw--- │
│ filtereddataset.cc    │   544 │ -rw--- │
│ filtereddataset.dep   │   149 │ -rw--- │
│ filtereddataset.h     │   451 │ -rw--- │
│ filtereddataset.o     │ 94288 │ -rw--- │
│ filtereddataslice.cc  │   658 │ -rw--- │
│ filtereddataslice.dep │   155 │ -rw--- │
│ filtereddataslice.h   │   555 │ -rw--- │
│ filtereddataslice.o   │ 96936 │ -rw--- │
│ filter.h              │   172 │ -rw--- │

However, when I switched to case-sensitive sort, the files were sorted
as expected. When I looked in the source (dir.c, sort_name function),
I saw that strcmp is used for case-sensitive comparison and either
strcoll or g_strcasecmp for case-insensitive comparison (the latter in
fact uses strcasecmp if it is available).
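To make the dispatch described above concrete, here is a minimal sketch of
such a comparator. It is not the actual sort_name code from dir.c: the
function name compare_names and its two flag parameters are hypothetical,
and POSIX strcasecmp stands in for g_strcasecmp to avoid a glib dependency.

```c
#include <assert.h>
#include <string.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Hypothetical comparator, NOT mc's real sort_name().
 * case_sensitive: non-zero selects plain byte comparison.
 * use_locale:     non-zero selects strcoll() on the case-insensitive path. */
int compare_names(const char *a, const char *b,
                  int case_sensitive, int use_locale)
{
    assert(a != NULL && b != NULL);

    if (case_sensitive)
        return strcmp(a, b);        /* byte order: '.' sorts before letters */
    if (use_locale)
        return strcoll(a, b);       /* order depends on LC_COLLATE */
    return strcasecmp(a, b);        /* ASCII case folding, no locale rules */
}
```

Only the strcoll branch consults the locale, which would explain why the
case-sensitive listing looks right while the case-insensitive one does not.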

There seems to be a problem with strcoll in libc6; see
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=368270 for example.

Suggested fix:
I think one extra test should be added there:
if a.b is sorted between aa and ac, do not use strcoll.
See the attached patch, which does exactly that.

I tested it and it fixes the problem.
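The attachment is not readable in the archive, so the following is only a
guess at what such a test could look like, not the patch itself. The helper
names sorted_between and collation_ignores_dots are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if `mid` collates strictly between `lo` and `hi`
 * under the current LC_COLLATE setting, 0 otherwise. */
int sorted_between(const char *mid, const char *lo, const char *hi)
{
    assert(mid != NULL && lo != NULL && hi != NULL);
    return strcoll(lo, mid) < 0 && strcoll(mid, hi) < 0;
}

/* Probe described in the mail: if "a.b" lands between "aa" and "ac",
 * the active collation is treating the dot as (nearly) invisible. */
int collation_ignores_dots(void)
{
    return sorted_between("a.b", "aa", "ac");
}
```

In the default "C" locale the probe returns 0, because "." has a lower byte
value than "a" and so "a.b" sorts before "aa". A caller would fall back to
strcasecmp or strcmp only when the probe returns 1.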

Martin Petricek


sort_order_bugfix.patch
Description: Binary data


Re: [BUG+PATCH] Wrong name sort.

2007-03-05 Thread Egmont Koblinger
On Mon, Mar 05, 2007 at 07:49:06PM +0100, Martin Petricek wrote:

> I have files sorted by name, with case-insensitive sort. However, the
> sorting behaves strangely: a bunch of files that should be sorted
> either before or after filter.cc and filter.h (I am not sure whether
> "." sorts before or after letters) got stuffed into the listing
> between these two files. It seems that dots are ignored entirely when
> sorting.

Yes, this is how strcoll() behaves in many locales. You don't like it, but
others may.

> I think one extra test should be added there:
> if a.b is sorted between aa and ac, do not use strcoll
> See attached patch which does exactly that.

I don't think this kind of autodetection is the right way. Actually I think
this is a very wrong way.

If you think strcoll() behaves incorrectly for one particular language, go
and fix that locale, or set your LC_COLLATE variable to use some other locale
for sorting (e.g. export LC_COLLATE=C).
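For reference, the same override can be done from inside a program. This is
a minimal sketch, assuming a program that otherwise adopts the user's locale;
the function names use_c_collation and collate_names are hypothetical. In the
"C" locale strcoll() orders strings the same way strcmp() does.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Adopt the environment's locale, then force byte-order collation only.
 * Returns 1 if the LC_COLLATE override succeeded, 0 otherwise. */
int use_c_collation(void)
{
    setlocale(LC_ALL, "");                      /* honour LANG / LC_* */
    return setlocale(LC_COLLATE, "C") != NULL;  /* override collation only */
}

/* Thin wrapper so callers keep using strcoll() unchanged. */
int collate_names(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcoll(a, b);
}
```

After use_c_collation() succeeds, filter.h collates before
filtereddataset.cc again, while other locale categories such as messages
and character classification are left alone.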

Perhaps it might make sense to have an option in mc where you can _manually_
choose between strcmp() and strcoll() (but hey, that's what LC_COLLATE is
for!), but this kind of autodetection is the worst option I can imagine:
no-one would ever understand what mc does and why unless s/he looks at the
source. This behavior would be absolutely counterintuitive.
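A manual switch of that kind could be sketched as follows. Nothing here
exists in mc: the enum name_collation, its values, and sort_names are
hypothetical names used only to illustrate an explicit, user-visible choice.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user setting; not an existing mc option. */
enum name_collation { COLLATE_BYTES, COLLATE_LOCALE };

/* qsort() has no context argument, so the mode lives in a static. */
static enum name_collation current_mode = COLLATE_BYTES;

static int cmp_entries(const void *pa, const void *pb)
{
    const char *a = *(const char *const *)pa;
    const char *b = *(const char *const *)pb;

    return current_mode == COLLATE_LOCALE ? strcoll(a, b) : strcmp(a, b);
}

/* Sort an array of file names in place using the chosen collation. */
void sort_names(const char **names, size_t n, enum name_collation mode)
{
    assert(names != NULL || n == 0);
    current_mode = mode;
    qsort(names, n, sizeof *names, cmp_entries);
}
```

With COLLATE_BYTES the result is the same under every locale, so the user
always knows which rule produced the ordering.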



-- 
Egmont