Package: coreutils
Version: 6.10-6
Severity: Important
Tags: patch

==== NOTE ====
Please note that this bug has already been reported and resolved
on the upstream mailing list. ([EMAIL PROTECTED])

Bug report:
http://lists.gnu.org/archive/html/bug-coreutils/2008-07/msg00209.html

Resolution:
http://lists.gnu.org/archive/html/bug-coreutils/2008-08/msg00016.html
==== NOTE ====

Hello,

As you can guess from the subject line, the date program
distributed with coreutils malfunctions in the Turkish locale.
To be specific, it only malfunctions when it tries to process
English day or month names containing the letter "i".

This happens because Turkish has four distinct "i" letters, and the
relationship between the dotted and dotless forms differs from the
relationship between "i" and "I" in English.
In Turkish: "ı" -> "I" and "i" -> "İ"

Here is a short command line conversation which clearly
demonstrates the problem:

===
$ LANG=tr_TR.UTF-8 date -d "Fri"   ### Malfunction in action!
date: invalid date `Fri'

$ LANG=tr_TR.UTF-8 date -d "FRI"   ### "I" works - already uppercase
Cum Haz 27 00:00:00 EEST 2008

$ LANG=en_US.UTF-8 date -d "Fri"   ### English locale is okay
Fri Jun 27 00:00:00 EEST 2008

$ LANG=en_US.UTF-8 date -d "FRI"
Fri Jun 27 00:00:00 EEST 2008
===

The reason for this malfunction can be seen by looking at the
following lines of code in the "lookup_word" function in
"lib/getdate.c":

===
  /* Make it uppercase.  */
  for (p = word; *p; p++)
    {
      unsigned char ch = *p;
      *p = toupper (ch);
    }
===

As you can see, even though the program is going to process English
day/month names, it uses the locale-dependent "toupper()" function.
And because the relationship between "i" and "I" is different in
Turkish, when date converts "Fri" to uppercase according to Turkish
capitalization rules, it ends up with something other than "FRI". (***)
And later on in the same function, because the resulting string does
not match "FRI", date concludes that "Fri" is not a valid day name
in the Turkish locale.

The following patch by Jim Meyering fixes this problem by calling
the locale-independent "c_toupper()" instead of "toupper()" (and
likewise "c_isspace()" and "c_isalpha()" instead of "isspace()" and
"isalpha()") in the relevant parts of the date-parsing code.

Regards,

M. Vefa Bıçakcı

===
Note: It might be necessary to update lib/getdate.c as well.
(It is generated from lib/getdate.y and will probably be
regenerated, but just to be sure.)

(***): To be specific, in tr_TR.UTF-8, it ends up with "FRi".
This is because the "toupper()" and "tolower()" functions operate on
single bytes and cannot return the multibyte UTF-8 sequences needed
to represent letters such as "idotabove" and "idotless", so "i" is
left unchanged. Please note that this problem exists in the
non-Unicode Turkish locale too: in tr_TR.ISO-8859-9 the "toupper()"
and "tolower()" functions return the corresponding 8-bit ISO-8859-9
characters, so "i" becomes "İ" rather than "I".

=== 8< ===
Patch by Jim Meyering

diff --git a/lib/getdate.y b/lib/getdate.y
index 695fd59..a94bf8b 100644
--- a/lib/getdate.y
+++ b/lib/getdate.y
@@ -60,7 +60,7 @@
 # undef static
 #endif

-#include <ctype.h>
+#include <c-ctype.h>
 #include <limits.h>
 #include <stdio.h>
 #include <stdlib.h>
@@ -900,7 +900,7 @@ lookup_word (parser_control const *pc, char *word)
   for (p = word; *p; p++)
     {
       unsigned char ch = *p;
-      *p = toupper (ch);
+      *p = c_toupper (ch);
     }

   for (tp = meridian_table; tp->name; tp++)
@@ -965,7 +965,7 @@ yylex (YYSTYPE *lvalp, parser_control *pc)

   for (;;)
     {
-      while (c = *pc->input, isspace (c))
+      while (c = *pc->input, c_isspace (c))
        pc->input++;

       if (ISDIGIT (c) || c == '-' || c == '+')
@@ -976,7 +976,7 @@ yylex (YYSTYPE *lvalp, parser_control *pc)
          if (c == '-' || c == '+')
            {
              sign = c == '-' ? -1 : 1;
-             while (c = *++pc->input, isspace (c))
+             while (c = *++pc->input, c_isspace (c))
                continue;
              if (! ISDIGIT (c))
                /* skip the '-' sign */
@@ -1080,7 +1080,7 @@ yylex (YYSTYPE *lvalp, parser_control *pc)
            }
        }

-      if (isalpha (c))
+      if (c_isalpha (c))
        {
          char buff[20];
          char *p = buff;
@@ -1092,7 +1092,7 @@ yylex (YYSTYPE *lvalp, parser_control *pc)
                *p++ = c;
              c = *++pc->input;
            }
-         while (isalpha (c) || c == '.');
+         while (c_isalpha (c) || c == '.');

          *p = '\0';
          tp = lookup_word (pc, buff);
@@ -1205,7 +1205,7 @@ get_date (struct timespec *result, char const *p, struct timespec const *now)
   if (! tmp)
     return false;

-  while (c = *p, isspace (c))
+  while (c = *p, c_isspace (c))
     p++;

   if (strncmp (p, "TZ=\"", 4) == 0)
--
=== >8 ===



