Package: coreutils
Version: 6.10-6
Severity: Important
Tags: patch

==== NOTE ====
Please note that this bug was reported to and resolved in the upstream
mailing-list. ([EMAIL PROTECTED])

Bug report: http://lists.gnu.org/archive/html/bug-coreutils/2008-07/msg00209.html
Resolution: http://lists.gnu.org/archive/html/bug-coreutils/2008-08/msg00016.html
==== NOTE ====

Hello,

As you can guess from the subject line, the date program distributed
with coreutils malfunctions in the Turkish locale. To be specific, it
only malfunctions when it tries to process English day or month names
containing the letter "i". This happens because there are four "i"s in
Turkish, and the relationship between the dotted and dotless "i"s is
different from the relationship between "i" and "I" in English.

In Turkish: "ı" -> "I" and "i" -> "İ"

Here is a short command line conversation which clearly demonstrates
the problem:

===
$ LANG=tr_TR.UTF-8 date -d "Fri"    ### Malfunction in action!
date: invalid date `Fri'
$ LANG=tr_TR.UTF-8 date -d "FRI"    ### "I" works - already uppercase
Cum Haz 27 00:00:00 EEST 2008
$ LANG=en_US.UTF-8 date -d "Fri"    ### English locale is okay
Fri Jun 27 00:00:00 EEST 2008
$ LANG=en_US.UTF-8 date -d "FRI"
Fri Jun 27 00:00:00 EEST 2008
===

The reason for this malfunction can be seen by looking at the following
lines of code in the "lookup_word" function in "lib/getdate.c":

===
  /* Make it uppercase. */
  for (p = word; *p; p++)
    {
      unsigned char ch = *p;
      *p = toupper (ch);
    }
===

As you can see, even though the program is going to process English
day/month names, it uses the locale-dependent "toupper()" function.
Because the relationship between "i" and "I" is different in Turkish,
when date converts "Fri" to uppercase according to Turkish
capitalization rules, it ends up with something other than "FRI". (***)
Later on in the same function, because the resulting string does not
match "FRI", date concludes that "Fri" is not a valid day name in the
Turkish locale.

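To make the idea of a locale-independent conversion concrete, here is a
minimal sketch of my own. It is not the gnulib code: "ascii_toupper",
"ascii_upcase" and "matches_keyword" are hypothetical names chosen for
illustration, and the sketch assumes an ASCII-compatible execution
character set. It only shows that mapping a-z by hand gives "FRI" for
"Fri" no matter which locale is active.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a locale-independent toupper:
   maps only ASCII a-z and leaves every other byte untouched. */
int ascii_toupper(int c)
{
    return (c >= 'a' && c <= 'z') ? c - 'a' + 'A' : c;
}

/* Uppercase WORD in place using ASCII rules only. */
void ascii_upcase(char *word)
{
    assert(word != NULL);
    for (char *p = word; *p; p++) {
        unsigned char ch = (unsigned char)*p;
        *p = (char)ascii_toupper(ch);
    }
}

/* Return 1 if WORD equals the uppercase English keyword KEY,
   ignoring ASCII case; return 0 otherwise or if WORD is too long. */
int matches_keyword(const char *word, const char *key)
{
    char buf[20];
    size_t len = strlen(word);

    if (len >= sizeof buf)
        return 0;
    memcpy(buf, word, len + 1);   /* copy including the terminating NUL */
    ascii_upcase(buf);
    return strcmp(buf, key) == 0;
}
```

With these helpers, matches_keyword("Fri", "FRI") is true regardless of
the LANG setting, because the current locale is never consulted.
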
The following patch by Jim Meyering fixes this problem by making sure
that the "c_toupper()" function is called instead of "toupper()" in the
relevant part of the coreutils code for date.

Regards,

M. Vefa Bıçakcı

===
Note: It might be necessary to update lib/getdate.c as well. (It will
probably be regenerated but just to be sure.)

(***): To be specific, in tr_TR.UTF-8, it ends up with "FRi". This is
because the "toupper()" and "tolower()" functions do not return the
UTF-8 characters which are needed to represent letters such as
"idotabove" and "idotless". Please note that this problem exists in the
non-Unicode Turkish locale too, as the "toupper()" and "tolower()"
functions return the corresponding 8-bit characters in ISO-8859-9
encoding in tr_TR.ISO-8859-9.
===

8< ===
Patch by Jim Meyering

diff --git a/lib/getdate.y b/lib/getdate.y
index 695fd59..a94bf8b 100644
--- a/lib/getdate.y
+++ b/lib/getdate.y
@@ -60,7 +60,7 @@
 # undef static
 #endif
 
-#include <ctype.h>
+#include <c-ctype.h>
 #include <limits.h>
 #include <stdio.h>
 #include <stdlib.h>
@@ -900,7 +900,7 @@ lookup_word (parser_control const *pc, char *word)
   for (p = word; *p; p++)
     {
       unsigned char ch = *p;
-      *p = toupper (ch);
+      *p = c_toupper (ch);
     }
 
   for (tp = meridian_table; tp->name; tp++)
@@ -965,7 +965,7 @@ yylex (YYSTYPE *lvalp, parser_control *pc)
 
   for (;;)
     {
-      while (c = *pc->input, isspace (c))
+      while (c = *pc->input, c_isspace (c))
         pc->input++;
 
       if (ISDIGIT (c) || c == '-' || c == '+')
@@ -976,7 +976,7 @@ yylex (YYSTYPE *lvalp, parser_control *pc)
           if (c == '-' || c == '+')
             {
               sign = c == '-' ? -1 : 1;
-              while (c = *++pc->input, isspace (c))
+              while (c = *++pc->input, c_isspace (c))
                 continue;
               if (! ISDIGIT (c))
                 /* skip the '-' sign */
@@ -1080,7 +1080,7 @@ yylex (YYSTYPE *lvalp, parser_control *pc)
             }
         }
 
-      if (isalpha (c))
+      if (c_isalpha (c))
         {
           char buff[20];
           char *p = buff;
@@ -1092,7 +1092,7 @@ yylex (YYSTYPE *lvalp, parser_control *pc)
               *p++ = c;
               c = *++pc->input;
             }
-          while (isalpha (c) || c == '.');
+          while (c_isalpha (c) || c == '.');
           *p = '\0';
 
           tp = lookup_word (pc, buff);
@@ -1205,7 +1205,7 @@ get_date (struct timespec *result, char const *p, struct timespec const *now)
   if (! tmp)
     return false;
 
-  while (c = *p, isspace (c))
+  while (c = *p, c_isspace (c))
     p++;
 
   if (strncmp (p, "TZ=\"", 4) == 0)
--
=== >8 ===