[PATCH]: ls: do not show long iso time format for en_* locales

2009-09-25 Thread Ondřej Vašík
Hello,
as reported in https://bugzilla.redhat.com/show_bug.cgi?id=525134 by
Daniel Qarras, ls -l shows iso long format for en_* locales. This is
caused by
http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=6837183d42a0ccdf7b7106794ea693c5b609aea5
. After that commit ls in locale time style checks if the translation
differs from defaults. If not, it fallbacks to long iso time style
format. However - for english locales is this default time style format
(same as C) expected, so the check for missing translation is wrong. 

Attached patch should fix this, allowing the default timestyle for en_*
locales. However - I guess it would be maybe better to remove the check
for possibly messed translation completely - as the default time-style
(the same as C style) could be in use in more LC_TIME styles and
fallback to C style for locales with missing translation is not that bad
behaviour (imho better than long iso style).

Greetings,
 Ondřej Vašík

From e82f581055af6eadcf3e99ff7aa3f5f3479c7c22 Mon Sep 17 00:00:00 2001
From: =?utf-8?q?Ond=C5=99ej=20Va=C5=A1=C3=ADk?= 
Date: Fri, 25 Sep 2009 15:20:47 +0200
Subject: [PATCH] ls: do not show long iso time format for en_* locales

* src/ls.c (decode_switches): Do not fallback to long iso time format for
  en_* locales.
  Introduced by commit 6837183d, 11-08-2005 and reported
  in https://bugzilla.redhat.com/show_bug.cgi?id=525134 by Daniel Qarras.
* tests/misc/ls-time: test it
* NEWS: mention it
---
 NEWS   |3 +++
 src/ls.c   |8 ++--
 tests/misc/ls-time |7 +++
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index 1571c9c..502355a 100644
--- a/NEWS
+++ b/NEWS
@@ -16,6 +16,9 @@ GNU coreutils NEWS-*- outline -*-
   when the color of a more specific type is disabled.
   [bug introduced in coreutils-5.90]
 
+  ls -l now correctly show locale timestamp of files instead of long iso format
+  [bug introduced in coreutils-6.0]
+
 ** Portability
 
   On Solaris 9, many commands would mistakenly treat file/ the same as
diff --git a/src/ls.c b/src/ls.c
index 1bb6873..3a1d2d1 100644
--- a/src/ls.c
+++ b/src/ls.c
@@ -2027,13 +2027,17 @@ decode_switches (int argc, char **argv)
 if (hard_locale (LC_TIME))
   {
 /* Ensure that the locale has translations for both
-   formats.  If not, fall back on long-iso format.  */
+   formats (translation differs from default).  If not,
+   fall back on long-iso format, unless unchanged
+   format is expected (for english locales).  */
 int i;
+const char *lc_time = setlocale (LC_TIME, NULL);
 for (i = 0; i < 2; i++)
   {
 char const *locale_format =
   dcgettext (NULL, long_time_format[i], LC_TIME);
-if (locale_format == long_time_format[i])
+if (lc_time && strncmp(lc_time, "en_", 3) &&
+locale_format == long_time_format[i])
   goto case_long_iso_time_style;
 long_time_format[i] = locale_format;
   }
diff --git a/tests/misc/ls-time b/tests/misc/ls-time
index abdd429..86868a7 100755
--- a/tests/misc/ls-time
+++ b/tests/misc/ls-time
@@ -123,4 +123,11 @@ EOF
   fail=1
 fi
 
+# The output for english locale should differ from long iso format
+# This failed between 6.0 and 7.7
+LC_ALL=en_US ls -l c >en_output
+ls -l --time-style=long-iso c >liso_output
+compare en_output liso_output && { fail=1;
+ echo "Long format timestamp for en_US locale is same as for long iso." 1>&2; }
+
 Exit $fail
-- 
1.5.6.1.156.ge903b



signature.asc
Description: Toto je digitálně	 podepsaná část	 zprávy


Re: [PATCH]: ls: do not show long iso time format for en_* locales

2009-09-25 Thread Paul Eggert
Ondřej Vašík  writes:

> as reported in https://bugzilla.redhat.com/show_bug.cgi?id=525134 by
> Daniel Qarras, ls -l shows iso long format for en_* locales.

I just now read that Bugzilla report, and the diagnosis and the
patch do not seem correct.  The diagnosis says:

> In ls.c (case locale_time_style)  is dcgettext (NULL, long_time_format[i],
> LC_TIME); ... that translates the string, but the translation is THE SAME as
> the default - as the format is the same for en_* locales.

But that is not what the ls.c source code does.  The code does this:

char const *locale_format =
  dcgettext (NULL, long_time_format[i], LC_TIME);
if (locale_format == long_time_format[i])
  goto case_long_iso_time_style;

The "==" test returns true when dcgettext returns the msgid (its 2nd
argument) because it finds no translation.  If it found a translation,
dcgettext would return a different string, so the "==" test would
return false, and the code would use the translation.  Even if the
translation has the same _contents_ as the msgid, it will have a
different _address_, so the code is correct as-is and does not need
this modification.

Also, the proposed patch would use U.S. styles for all English
locales, which certainly is not right.

I suspect the diagnosis given by Jim Meyering in comment #3 at that
bug report is correct, and that something is going wrong at install
time.




Re: [PATCH]: ls: do not show long iso time format for en_* locales

2009-09-25 Thread Pádraig Brady
Paul Eggert wrote:
> Ondřej Vašík  writes:
> 
>> as reported in https://bugzilla.redhat.com/show_bug.cgi?id=525134 by
>> Daniel Qarras, ls -l shows iso long format for en_* locales.
> 
> I just now read that Bugzilla report, and the diagnosis and the
> patch do not seem correct.  The diagnosis says:
> 
>> In ls.c (case locale_time_style)  is dcgettext (NULL, long_time_format[i],
>> LC_TIME); ... that translates the string, but the translation is THE SAME as
>> the default - as the format is the same for en_* locales.
> 
> But that is not what the ls.c source code does.  The code does this:
> 
> char const *locale_format =
>   dcgettext (NULL, long_time_format[i], LC_TIME);
> if (locale_format == long_time_format[i])
>   goto case_long_iso_time_style;
> 
> The "==" test returns true when dcgettext returns the msgid (its 2nd
> argument) because it finds no translation.

Right. We don't have any translations for "en".

I noticed this previously and assumed it was on purpose.
I.E. we can override the default time style by
providing an en "translation" for the time formats.

Note one can't use LC_TIME=C to set the format,
as that will also cause nl_langinfo to lookup the abbreviated
months in that locale. I.E. the following would show
english abbreviations: LC_TIME=C LANG=fr_FR.utf8 ls -l

What I do is to set TIME_STYLE as follows:

# traditional unix time format with abbreviated month translated from locale
export TIME_STYLE='+%b %e  %Y
%b %e %H:%M'

cheers,
Pádraig.




Re: [PATCH]: ls: do not show long iso time format for en_* locales

2009-09-26 Thread Jim Meyering
Pádraig Brady wrote:

> Paul Eggert wrote:
>> Ondřej Vašík  writes:
>>
>>> as reported in https://bugzilla.redhat.com/show_bug.cgi?id=525134 by
>>> Daniel Qarras, ls -l shows iso long format for en_* locales.
>>
>> I just now read that Bugzilla report, and the diagnosis and the
>> patch do not seem correct.  The diagnosis says:
>>
>>> In ls.c (case locale_time_style)  is dcgettext (NULL, long_time_format[i],
>>> LC_TIME); ... that translates the string, but the translation is THE SAME as
>>> the default - as the format is the same for en_* locales.
>>
>> But that is not what the ls.c source code does.  The code does this:
>>
>> char const *locale_format =
>>   dcgettext (NULL, long_time_format[i], LC_TIME);
>> if (locale_format == long_time_format[i])
>>   goto case_long_iso_time_style;
>>
>> The "==" test returns true when dcgettext returns the msgid (its 2nd
>> argument) because it finds no translation.
>
> Right. We don't have any translations for "en".

This is key.

Looking at Makefile.in.in, you can see that
the file /usr/share/locale/en/LC_TIME/coreutils.mo is never installed,
and neither is .../LC_MESSAGES/coreutils.mo, because coreutils has
no po/en.po file.  Pre-optimizers might argue that not installing the
latter is a good thing, because gettext will then not waste time with the
fstat+mmap+close that it normally performs after each successful .mo file
open.  Not to mention the cost of each subsequent failed message lookup.

I try not to pre-optimize, so question whether the Makefile.in.in patch
is worthwhile, but from an aesthetics point of view, I prefer not to
install the LC_MESSAGES/coreutils.mo file.  The core of the patch
is this two-line addition:

++  lang=en; for lc in '' $(EXTRA_LOCALE_CATEGORIES); do \
++test -n "$$lc" && mv -f $(message_mo) $(lc_mo_file); done

All of the rest is factoring.  I'm leaning towards rewriting it to
insert the non-factored equivalent.  However, one advantage of using the
patch is that it records the context: if we update to a newer gettext
that changes the body of that rule, the patch will no longer apply,
and that will make bootstrap fail.  One alternative that could be used
with the 3-line-insertion approach is to record (and always cross-check)
a checksum of the rule contents.

With this patch, in an en* locale, ls -l now uses the conventional
date formats.

Here's an incomplete patch.
It needs a test and a NEWS entry.
Ondřej, can you adjust your test to work (or skip)
if there is no en* locale?

>From 53c88d531d88e5d4ef393a61758bc3fee4894e48 Mon Sep 17 00:00:00 2001
From: Jim Meyering 
Date: Sat, 26 Sep 2009 14:59:05 +0200
Subject: [PATCH] ls: use conventional date format in long listing for en* 
locales

* bootstrap.conf: Generate po/en.po with identity translations for
the two LC_TIME strings required by ls.c.
* configure.ac: Require gettext version 0.17.
* po/Makefile.in.in-patch: Patch autopoint-provided file so that
it provides the LC_TIME .mo file, but not the LC_MESSAGES one.
* po/.gitattributes: Allow space-before-TAB in the patch.
---
 bootstrap.conf  |   41 +
 configure.ac|2 +-
 po/.gitattributes   |1 +
 po/Makefile.in.in-patch |   65 +++
 4 files changed, 108 insertions(+), 1 deletions(-)
 create mode 100644 po/.gitattributes
 create mode 100644 po/Makefile.in.in-patch

diff --git a/bootstrap.conf b/bootstrap.conf
index 726092c..c2e349d 100644
--- a/bootstrap.conf
+++ b/bootstrap.conf
@@ -320,10 +320,51 @@ tar-
 # Automake requires that ChangeLog exist.
 touch ChangeLog || exit 1

+# The following is to accommodate coreutils' use of LC_TIME.
+# Two steps:
+# - create a nearly empty po/en.po
+# - patch po/Makefile.in.in to do what we want at install-time
+#
+# The existence of po/en.po ensures that $(localedir)/en/LC_TIME/coreutils.mo
+# will be created at install time.  That file is required by ls.c's
+# dcgettext call, when run in an en* locale.
+date=$(date '+%F %H:%M%z')
+ver=$(cat .prev-version)
+pkg=$(sed -n 's/AC_INIT(\[GNU \(.*\)\],.*/\1/p' configure.ac)
+
+cat < po/en.po
+# Copyright 2009 Free Software Foundation, Inc.
+# This file is distributed under the same license as the $pkg package.
+msgid ""
+msgstr ""
+"Project-Id-Version: $pkg-$ver\n"
+"Report-Msgid-Bugs-To: bug-$...@gnu.org\n"
+"POT-Creation-Date: $date\n"
+"PO-Revision-Date: $date\n"
+"Last-Translator: nobody\n"
+"Language-Team: none\n"
+"MIME-Version: 1.0\n"
+"Content-Type: text/plain; charset=UTF-8\n"
+"Content-Transfer-Encoding: 8bit\n"
+
+msgid "%b %e  %Y"
+msgstr "%b %e  %Y"
+
+msgid "%b %e %H:%M"
+msgstr "%b %e %H:%M"
+EOF
+
 bootstrap_epilogue()
 {
   # Change paths in gnulib-tests/gnulib.mk from "../.." to "..".
   m=gnulib-tests/gnulib.mk
   sed 's,\.\./\.\.,..,g' $m > $m-t
   mv -f $m-t $m
+
+  # Patch the autopoint-supplied po/Makefile.in.in to ensure that we
+

Re: [PATCH]: ls: do not show long iso time format for en_* locales

2009-09-26 Thread Ondřej Vašík
Paul Eggert wrote:
> Ondřej Vašík  writes:
> 
> > as reported in https://bugzilla.redhat.com/show_bug.cgi?id=525134 by
> > Daniel Qarras, ls -l shows iso long format for en_* locales.
> 
> I just now read that Bugzilla report, and the diagnosis and the
> patch do not seem correct.  The diagnosis says:
> 
> > In ls.c (case locale_time_style)  is dcgettext (NULL, long_time_format[i],
> > LC_TIME); ... that translates the string, but the translation is THE SAME as
> > the default - as the format is the same for en_* locales.
> 
> But that is not what the ls.c source code does.  The code does this:
> 
> char const *locale_format =
>   dcgettext (NULL, long_time_format[i], LC_TIME);
> if (locale_format == long_time_format[i])
>   goto case_long_iso_time_style;
> 
> The "==" test returns true when dcgettext returns the msgid (its 2nd
> argument) because it finds no translation.  If it found a translation,
> dcgettext would return a different string, so the "==" test would
> return false, and the code would use the translation.  Even if the
> translation has the same _contents_ as the msgid, it will have a
> different _address_, so the code is correct as-is and does not need
> this modification.

Ah, sorry... you are right, the address should be different, so the code
is correct, I got confused somehow.
> 
> Also, the proposed patch would use U.S. styles for all English
> locales, which certainly is not right.
> 
> I suspect the diagnosis given by Jim Meyering in comment #3 at that
> bug report is correct, and that something is going wrong at install
> time.

But as Pádraig wrote in the reply, there are no translation for en_*
languages, so long iso style is used - which is imho wrong. The patch is
fixing it (although it seems to be only a workaround and wrong
approach). Better would be to have translations even for en_* locales in
some cases - like this. This is better way than this workaround... Is it
possible add those translations?

Sorry for noise...

Greetings,
 Ondřej


signature.asc
Description: Toto je digitálně	 podepsaná část	 zprávy


Re: [PATCH]: ls: do not show long iso time format for en_* locales

2009-09-26 Thread Ondřej Vašík
Jim Meyering wrote:
> Here's an incomplete patch.
> It needs a test and a NEWS entry.
> Ondřej, can you adjust your test to work (or skip)
> if there is no en* locale?

Maybe something like that (attachment)?
It's checking for existence of locale binary and en_US locale and
performing the test only if both exists. I guess en_US locale is better
than random choosing of one en_* locale. Also NEWS entry added although
it maybe needs some tweaks as usually.

Greetings,
 Ondřej

From 3e327dd41c55c5f25180c779a766549b90af59fa Mon Sep 17 00:00:00 2001
From: =?utf-8?q?Ond=C5=99ej=20Va=C5=A1=C3=ADk?= 
Date: Fri, 25 Sep 2009 15:20:47 +0200
Subject: [PATCH] tests/misc/ls-time: ls shouldn't show long iso time format for en_* locales

* tests/misc/ls-time: test if ls doesn't show long iso time format
  when en_US locale is present
* NEWS: mention that ls no longer shows long iso time format for en_* locales
---
 NEWS   |4 
 tests/misc/ls-time |   13 +
 2 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/NEWS b/NEWS
index 1571c9c..2cf8b33 100644
--- a/NEWS
+++ b/NEWS
@@ -16,6 +16,10 @@ GNU coreutils NEWS-*- outline -*-
   when the color of a more specific type is disabled.
   [bug introduced in coreutils-5.90]
 
+  ls -l now correctly shows en_* locale timestamps of files instead of
+  long iso format.
+  [bug introduced in coreutils-6.0]
+
 ** Portability
 
   On Solaris 9, many commands would mistakenly treat file/ the same as
diff --git a/tests/misc/ls-time b/tests/misc/ls-time
index abdd429..39108f0 100755
--- a/tests/misc/ls-time
+++ b/tests/misc/ls-time
@@ -123,4 +123,17 @@ EOF
   fail=1
 fi
 
+# The output for english locale should differ from long iso format
+# This failed between 6.0 and 7.7
+# Do the test only when locale binary and any en_US locale is available
+if (locale --version) > /dev/null 2>&1; then
+  en_locale=`locale -a | grep "^en_US$"`
+  if test -n $en_locale; then
+LC_ALL=$en_locale ls -l c >en_output
+ls -l --time-style=long-iso c >liso_output
+compare en_output liso_output && { fail=1;
+  echo "Long format timestamp for $en_locale locale is same as for long iso." 1>&2; }
+  fi
+fi
+
 Exit $fail
-- 
1.5.6.1.156.ge903b



signature.asc
Description: Toto je digitálně	 podepsaná část	 zprávy


Re: [PATCH]: ls: do not show long iso time format for en_* locales

2009-09-27 Thread Pádraig Brady
Jim Meyering wrote:
> Pádraig Brady wrote:
> 
>> Paul Eggert wrote:
>>> Ondřej Vašík  writes:
>>>
 as reported in https://bugzilla.redhat.com/show_bug.cgi?id=525134 by
 Daniel Qarras, ls -l shows iso long format for en_* locales.
>>> I just now read that Bugzilla report, and the diagnosis and the
>>> patch do not seem correct.  The diagnosis says:
>>>
 In ls.c (case locale_time_style)  is dcgettext (NULL, long_time_format[i],
 LC_TIME); ... that translates the string, but the translation is THE SAME 
 as
 the default - as the format is the same for en_* locales.
>>> But that is not what the ls.c source code does.  The code does this:
>>>
>>> char const *locale_format =
>>>   dcgettext (NULL, long_time_format[i], LC_TIME);
>>> if (locale_format == long_time_format[i])
>>>   goto case_long_iso_time_style;
>>>
>>> The "==" test returns true when dcgettext returns the msgid (its 2nd
>>> argument) because it finds no translation.
>> Right. We don't have any translations for "en".
> 
> This is key.
> 
> Looking at Makefile.in.in, you can see that
> the file /usr/share/locale/en/LC_TIME/coreutils.mo is never installed,
> and neither is .../LC_MESSAGES/coreutils.mo, because coreutils has
> no po/en.po file.  Pre-optimizers might argue that not installing the
> latter is a good thing, because gettext will then not waste time with the
> fstat+mmap+close that it normally performs after each successful .mo file
> open.  Not to mention the cost of each subsequent failed message lookup.
> 
> I try not to pre-optimize, so question whether the Makefile.in.in patch
> is worthwhile, but from an aesthetics point of view, I prefer not to
> install the LC_MESSAGES/coreutils.mo file.  The core of the patch
> is this two-line addition:
> 
> ++lang=en; for lc in '' $(EXTRA_LOCALE_CATEGORIES); do \ 
> ++  test -n "$$lc" && mv -f $(message_mo) $(lc_mo_file); done
> 
> All of the rest is factoring.  I'm leaning towards rewriting it to
> insert the non-factored equivalent.  However, one advantage of using the
> patch is that it records the context: if we update to a newer gettext
> that changes the body of that rule, the patch will no longer apply,
> and that will make bootstrap fail.  One alternative that could be used
> with the 3-line-insertion approach is to record (and always cross-check)
> a checksum of the rule contents.
> 
> With this patch, in an en* locale, ls -l now uses the conventional
> date formats.
> 
> Here's an incomplete patch.
> It needs a test and a NEWS entry.
> Ondřej, can you adjust your test to work (or skip)
> if there is no en* locale?
> 
>>From 53c88d531d88e5d4ef393a61758bc3fee4894e48 Mon Sep 17 00:00:00 2001
> From: Jim Meyering 
> Date: Sat, 26 Sep 2009 14:59:05 +0200
> Subject: [PATCH] ls: use conventional date format in long listing for en* 
> locales
> 
> * bootstrap.conf: Generate po/en.po with identity translations for
> the two LC_TIME strings required by ls.c.
> * configure.ac: Require gettext version 0.17.
> * po/Makefile.in.in-patch: Patch autopoint-provided file so that
> it provides the LC_TIME .mo file, but not the LC_MESSAGES one.
> * po/.gitattributes: Allow space-before-TAB in the patch.
> ---
>  bootstrap.conf  |   41 +
>  configure.ac|2 +-
>  po/.gitattributes   |1 +
>  po/Makefile.in.in-patch |   65 
> +++

So that will apply generate an en.po with the
traditional unix format to apply to en_* for e.g.

$ locale -a | sed -n 's/\(en_..\).*/\1/p' | sort -u |
> while read LANG; do echo $LANG $(locale territory); done

en_AG Antigua and Barbuda
en_AU Australia
en_BW Botswana
en_CA Canada
en_DK Denmark
en_GB Great Britain
en_HK Hong Kong
en_IE Ireland
en_IN India
en_NG Nigeria
en_NZ New Zealand
en_PH Philippines
en_SG Singapore
en_US USA
en_ZA South Africa
en_ZW Zimbabwe

Is that not functionally equivalent to Ondřej's patch
which is much simpler and probably more efficient?
His patch will also use an en_ translation if supplied
(say if en_HK wanted a different format).

cheers,
Pádraig.






Re: [PATCH]: ls: do not show long iso time format for en_* locales

2009-09-27 Thread Jim Meyering
Pádraig Brady wrote:
...
> So that will apply generate an en.po with the
> traditional unix format to apply to en_* for e.g.

Right.

> $ locale -a | sed -n 's/\(en_..\).*/\1/p' | sort -u |
>> while read LANG; do echo $LANG $(locale territory); done
>
> en_AG Antigua and Barbuda
> en_AU Australia
> en_BW Botswana
...
> Is that not functionally equivalent to Ondřej's patch

Yes.
However, ...

> which is much simpler and probably more efficient?
> His patch will also use an en_ translation if supplied

his added setlocale call (unnecessary in a non-en_* locale)
is not a plus.  Though that could be fixed.

> (say if en_HK wanted a different format).

The only advantage is that my patch uses the existing framework,
rather than adding special case code in ls.c proper.
Whether that is worth the apparent complexity...

If you prefer his patch and want to adjust it and
handle the rest, I have no objection.




Re: [PATCH]: ls: do not show long iso time format for en_* locales

2009-09-27 Thread Pádraig Brady
Jim Meyering wrote:
> The only advantage is that my patch uses the existing framework,
> rather than adding special case code in ls.c proper.
> Whether that is worth the apparent complexity...
> 
> If you prefer his patch and want to adjust it and
> handle the rest, I have no objection.

Yes it's debatable.
I'll take a closer look at both
and send an updated one if I think
the special case in ls.c more appropriate.

cheers,
Pádraig.




Re: [PATCH]: ls: do not show long iso time format for en_* locales

2009-09-28 Thread Pádraig Brady
Pádraig Brady wrote:
> Jim Meyering wrote:
>> The only advantage is that my patch uses the existing framework,
>> rather than adding special case code in ls.c proper.
>> Whether that is worth the apparent complexity...
>>
>> If you prefer his patch and want to adjust it and
>> handle the rest, I have no objection.
> 
> Yes it's debatable.
> I'll take a closer look at both
> and send an updated one if I think
> the special case in ls.c more appropriate.

Thinking more about this I'm wondering about special casing en_* at all.

The result of this patch is that for most people the usual timestamp
format changes from 1 (ISO) to 3 fields (POSIX).

So the first minor issue I have is that ISO has been the usual
format for 4 years at least, so I suspect that this might trigger
bugs in scripts parsing ls output. I do prefer the traditional
POSIX specified format myself and I'm surprised that no one
reported this until now. In summary I'm about 60:40 for making
the change, and if we do I'll add appropriate text to NEWS.

The other question I have is why do we assume ISO anyway when a
format translation it not available? For example we've no translations
for en_PH or tl_PH and so at the moment they'll get ISO format
even though Tagalog month abbreviations are available:

$ LANG=tl_PH locale abmon
Ene;Peb;Mar;Abr;May;Hun;Hul;Ago;Sep;Okt;Nob;Dis

Now if we do apply the special casing for en_* then you'll have
different date formats for en_PH and tl_PH. Really the date
format is associated with the country rather than the language.
(Note I don't think we can determine whether abmon is specific
to the locale or whether it's just the "C" default).

So I think if we accept the first point above that we would change the
default format to POSIX for most people I think we should just remove the
code defaulting to ISO if a translation is not available ?

I.E. revert part of 6837183d as follows:

diff --git a/src/ls.c b/src/ls.c
index 1bb6873..4531b94 100644
--- a/src/ls.c
+++ b/src/ls.c
@@ -2014,7 +2014,6 @@ decode_switches (int argc, char **argv)
 break;

   case long_iso_time_style:
-  case_long_iso_time_style:
 long_time_format[0] = long_time_format[1] = "%Y-%m-%d %H:%M";
 break;

@@ -2030,13 +2029,8 @@ decode_switches (int argc, char **argv)
formats.  If not, fall back on long-iso format.  */
 int i;
 for (i = 0; i < 2; i++)
-  {
-char const *locale_format =
-  dcgettext (NULL, long_time_format[i], LC_TIME);
-if (locale_format == long_time_format[i])
-  goto case_long_iso_time_style;
-long_time_format[i] = locale_format;
-  }
+  long_time_format[i] =
+dcgettext (NULL, long_time_format[i], LC_TIME);
   }
   }
   /* Note we leave %5b etc. alone so user widths/flags are honored.  */




Re: [PATCH]: ls: do not show long iso time format for en_* locales

2009-09-28 Thread Pádraig Brady
Pádraig Brady wrote:
> Pádraig Brady wrote:
>> Jim Meyering wrote:
>>> The only advantage is that my patch uses the existing framework,
>>> rather than adding special case code in ls.c proper.
>>> Whether that is worth the apparent complexity...
>>>
>>> If you prefer his patch and want to adjust it and
>>> handle the rest, I have no objection.
>> Yes it's debatable.
>> I'll take a closer look at both
>> and send an updated one if I think
>> the special case in ls.c more appropriate.
> 
> Thinking more about this I'm wondering about special casing en_* at all.
> 
> The result of this patch is that for most people the usual timestamp
> format changes from 1 (ISO) to 3 fields (POSIX).

ISO is 2 fields actually :P

> 
> So the first minor issue I have is that ISO has been the usual
> format for 4 years at least, so I suspect that this might trigger
> bugs in scripts parsing ls output. I do prefer the traditional
> POSIX specified format myself and I'm surprised that no one
> reported this until now. In summary I'm about 60:40 for making
> the change, and if we do I'll add appropriate text to NEWS.
> 
> The other question I have is why do we assume ISO anyway when a
> format translation it not available? For example we've no translations
> for en_PH or tl_PH and so at the moment they'll get ISO format
> even though Tagalog month abbreviations are available:
> 
> $ LANG=tl_PH locale abmon
> Ene;Peb;Mar;Abr;May;Hun;Hul;Ago;Sep;Okt;Nob;Dis
> 
> Now if we do apply the special casing for en_* then you'll have
> different date formats for en_PH and tl_PH. Really the date
> format is associated with the country rather than the language.
> (Note I don't think we can determine whether abmon is specific
> to the locale or whether it's just the "C" default).
> 
> So I think if we accept the first point above that we would change the
> default format to POSIX for most people I think we should just remove the
> code defaulting to ISO if a translation is not available ?

The full patch is attached.

cheers,
Pádraig.
>From 2829fa07edc1ef3f550521b53999dc53c89ff215 Mon Sep 17 00:00:00 2001
From: =?utf-8?q?P=C3=A1draig=20Brady?= 
Date: Mon, 28 Sep 2009 17:32:15 +0100
Subject: [PATCH] ls: use the POSIX date style when the locale does not specify one

Previously we defaulted to "long-iso" format in locales without
specific format translations, like the en_* locales for example.
This reverts part of commit 6837183d, 08-11-2005, "ls ... acts like
--time-style='posix-long-iso' if the locale settings are messed up"
* src/ls.c (decode_switches): Only use the ISO format when specified.
* NEWS: Mention the change in behavior.
Reported at http://bugzilla.redhat.com/525134
---
 NEWS |5 +
 src/ls.c |   10 ++
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/NEWS b/NEWS
index 1571c9c..0380975 100644
--- a/NEWS
+++ b/NEWS
@@ -32,6 +32,11 @@ GNU coreutils NEWS-*- outline -*-
   last component (possibly via a dangling symlink) can be created,
   since mkdir will succeed in that case.
 
+  ls -l now uses the traditional three field time style rather than
+  the two field numeric ISO style, in locales where a specific style
+  has not been specified. This currently affects all English language
+  locales for example. [old behavior was introduced in coreutils-6.0]
+
 ** Improvements
 
   rm: rewrite to use gnulib's fts
diff --git a/src/ls.c b/src/ls.c
index 1bb6873..4531b94 100644
--- a/src/ls.c
+++ b/src/ls.c
@@ -2014,7 +2014,6 @@ decode_switches (int argc, char **argv)
 break;
 
   case long_iso_time_style:
-  case_long_iso_time_style:
 long_time_format[0] = long_time_format[1] = "%Y-%m-%d %H:%M";
 break;
 
@@ -2030,13 +2029,8 @@ decode_switches (int argc, char **argv)
formats.  If not, fall back on long-iso format.  */
 int i;
 for (i = 0; i < 2; i++)
-  {
-char const *locale_format =
-  dcgettext (NULL, long_time_format[i], LC_TIME);
-if (locale_format == long_time_format[i])
-  goto case_long_iso_time_style;
-long_time_format[i] = locale_format;
-  }
+  long_time_format[i] =
+dcgettext (NULL, long_time_format[i], LC_TIME);
   }
   }
   /* Note we leave %5b etc. alone so user widths/flags are honored.  */
-- 
1.6.2.5



Re: [PATCH]: ls: do not show long iso time format for en_* locales

2009-09-28 Thread Jim Meyering
Pádraig Brady wrote:
> The full patch is attached.
...

That looks fine.  Thanks.
However, I'm a little reluctant to change back.
Let's wait a day or two, in case Paul Eggert has an objection.

Would you please add a test to exercise this,
perhaps based on the one from Ondřej?

>>From 2829fa07edc1ef3f550521b53999dc53c89ff215 Mon Sep 17 00:00:00 2001
> From: =?utf-8?q?P=C3=A1draig=20Brady?= 
> Date: Mon, 28 Sep 2009 17:32:15 +0100
> Subject: [PATCH] ls: use the POSIX date style when the locale does not 
> specify one
...




Re: [PATCH]: ls: do not show long iso time format for en_* locales

2009-09-29 Thread Paul Eggert
Jim Meyering  writes:

> However, I'm a little reluctant to change back.
> Let's wait a day or two, in case Paul Eggert has an objection.

Here are some objections to the change, under the assumption that
we're in a poorly-configured environment (as the behavior is
unaffected in well-configured ones):

* The change will cause column-alignment problems in non-English
  locales where %b generates different numbers of columns for
  different months.

* There are more users in non-English locales than in non-"C" English
  locales, and the harm in the non-English case (incomprehensible
  dates) is much greater than the harm in the English case
  (comprehensible but ugly dates).

* The existing code is better for the poorly-configured case where
  only one of the two translations is missing.

Unless I'm missing something, I'd leave it alone, as it sounds like
the change will cause more trouble than it'll cure.




Re: [PATCH]: ls: do not show long iso time format for en_* locales

2009-09-30 Thread Pádraig Brady
Paul Eggert wrote:
> Jim Meyering  writes:
> 
>> However, I'm a little reluctant to change back.
>> Let's wait a day or two, in case Paul Eggert has an objection.
> 
> Here are some objections to the change, under the assumption that
> we're in a poorly-configured environment (as the behavior is
> unaffected in well-configured ones):
> 
> * The change will cause column-alignment problems in non-English
>   locales where %b generates different numbers of columns for
>   different months.

Good point, but a separate issue which is already handled:
http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commit;h=612b647d

> * There are more users in non-English locales than in non-"C" English
>   locales, and the harm in the non-English case (incomprehensible
>   dates) is much greater than the harm in the English case
>   (comprehensible but ugly dates).

Yes this is the crux of the question.

With the patch we have:
no_coreutils_style_translation && wrong_sys_abmon => English month shown

While currently we have:
no_coreutils_style_translation => ISO date shown (4 chars wider)

A quick check on my system shows the first condition where
abmon is wrong, triggers for 3 locales. We've no control over
abmon so one could argue that we should not worry about it.
It might even help get these fixed up?

  locale -a | sed -n 's/\(.*utf8.*\)/\1/p' | sort -u |
  while read LANG; do
printf "$LANG\t"; locale abmon
  done |
  grep "Jan;Feb;Mar;Apr;May;Jun;Jul;Aug;Sep;Oct;Nov;Dec" |
  cut -f1 | grep -v ^en_

  ar_SA.utf8
  sid_ET.utf8
  ug_CN.utf8


On the other hand we'll display ISO dates for languages
for which we've no translations, which on my system is:

  (
   locale -a | cut -s -d_ -f1 | sort -u | sort -u

   cd ~/git/coreutils/po && printf "%s\n" *.po |
   sed 's/\([^._]*\).*/\1/' | sort -u
  ) |
  sort | uniq -u | wc -l

  108

> 
> * The existing code is better for the poorly-configured case where
>   only one of the two translations is missing.

Very unlikely, but a valid point.

> 
> Unless I'm missing something, I'd leave it alone, as it sounds like
> the change will cause more trouble than it'll cure.

Ok, I'm fine to hold off to see if there are any more
objections from the 108 languages above (we've only
had 1 objection in 4 years for english at present).

thanks,
Pádraig.

On a related note, I notice that we don't check the return
from setlocale() and so will use english months
when there is no system locale installed at all.