Jim Meyering wrote:
> For each fix, I usually try to determine when the bug was introduced
> and mention that in NEWS.
>
> Both of these date back to the very beginning, since sort from
> textutils-1.13 (yes, I actually built it ;-) exhibits the same incorrect
> behavior, and the code in that function barely changed between my initial
> import in 1992 and that 1.13.
>
> How many more +16-year-old bugs are lurking?

Interesting :)

I also noticed that FreeBSD/Mac OS X use coreutils sort,
so they have the same issue.

Also, the i18n patch in Fedora 8 at least seems to
change which of the two key specs misbehaves:

upstream buggy coreutils:

$ printf "a y\na z\n" | sort -k1,1b #buggy
a z
a y
$ printf "a y\na z\n" | sort -k1b,1 #ok
a y
a z

Fedora 8:

$ printf "a y\na z\n" | sort -k1,1b #ok
a y
a z
$ printf "a y\na z\n" | sort -k1b,1 #buggy
a z
a y

So I think I'll add the attached test to check for that.

cheers,
Pádraig

From b857242886538be7c1ae387c85b86e9f96fecb80 Mon Sep 17 00:00:00 2001
From: =?utf-8?q?P=C3=A1draig=20Brady?= <p...@draigbrady.com>
Date: Fri, 27 Feb 2009 08:40:42 +0000
Subject: [PATCH] tests: sort: Check skipping blanks in multibyte locales

* tests/misc/sort: On Fedora 8 at least, sort -k1b,1 mishandles
blanks in multibyte locales, so add appropriate test.
---
 tests/misc/sort |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/tests/misc/sort b/tests/misc/sort
index 3af2388..b6ee905 100755
--- a/tests/misc/sort
+++ b/tests/misc/sort
@@ -24,6 +24,10 @@ my $prog = 'sort';
 # Turn off localization of executable's output.
 @ENV{qw(LANGUAGE LANG LC_ALL)} = ('C') x 3;
 
+my $locale = $ENV{LOCALE_FR_UTF8};
+! defined $locale || $locale eq 'none'
+  and $locale = 'C';
+
 # Since each test is run with a file name and with redirected stdin,
 # the name in the diagnostic is either the file name or "-".
 # Normalize each diagnostic to use '-'.
@@ -216,6 +220,11 @@ my @Tests =
 # next field are not included in the sort. I.E. order should not change here.
 ["18f", '-k1,1b', {IN=>"a y\na z\n"}, {OUT=>"a y\na z\n"}],
 
+# When ignoring leading blanks for start position, ensure blanks from
+# next field are not included in the sort. I.E. order should not change here.
+# This was noticed as an issue on fedora 8 (only in multibyte locales).
+["18g", '-k1b,1', {IN=>"a y\na z\n"}, {OUT=>"a y\na z\n"}, {ENV => "LC_ALL=$locale"}],
+
 # This looks odd, but works properly -- 2nd keyspec is never
 # used because all lines are different.
 ["19a", '+0 +1nr', {IN=>"b 2\nb 1\nb 3\n"}, {OUT=>"b 1\nb 2\nb 3\n"}],
-- 
1.5.3.6
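
For anyone who wants to check a particular sort build by hand, the two key specs discussed above can be exercised from the shell roughly as follows. This is only a sketch: `check_key` is a made-up helper name, not part of the coreutils test suite, and the script assumes that `LOCALE_FR_UTF8` (the variable the patch reads) names an installed UTF-8 locale, falling back to C otherwise as the patch does.

```shell
#!/bin/sh
# Sketch only: check_key is a hypothetical helper, not coreutils test code.

# Mirror the patch's fallback: use $LOCALE_FR_UTF8 if it is set and not
# 'none', otherwise run in the C locale.
locale=${LOCALE_FR_UTF8:-none}
[ "$locale" = none ] && locale=C

# Print "ok" if sorting with key spec $1 leaves the two input lines in
# their original order, "buggy" otherwise.
check_key() {
  expected=$(printf 'a y\na z\n')
  actual=$(printf 'a y\na z\n' | LC_ALL=$locale sort "$1")
  if [ "$actual" = "$expected" ]; then
    echo ok
  else
    echo buggy
  fi
}

# Try both placements of the 'b' modifier from the message.
for spec in -k1,1b -k1b,1; do
  printf '%s: %s\n' "$spec" "$(check_key "$spec")"
done
```

On a sort with neither problem, both lines should report ok. Going by the outputs quoted above, the upstream build would presumably flag -k1,1b and the Fedora 8 build would flag -k1b,1, the latter only when a multibyte locale is in effect.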

_______________________________________________
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils