[Bug libstdc++/38411] [4.4 Regression] Revision 142439 caused 22_locale/locale/cons/7.cc execution test
--- Comment #4 from paolo dot carlini at oracle dot com 2008-12-05 08:24 ---
Are you using the same glibc on x86_64 and ia64? The two failing testcases (cons/7.cc and members/char/2.cc; the others are implied) are essentially the same: something is different on that ia64 machine about the localedata having to do with the numeric decimal point and grouping. I'll try to debug this further (because 38368 is a real issue and we want a fix), but since I can't reproduce it on my machines, I need to know the values of the various dp1, dp2, etc. in the failing VERIFY (assert).
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38411
--- Comment #5 from jakub at gcc dot gnu dot org 2008-12-05 08:36 ---
The only thing I remember was fr_FR locale changing some stuff and Fedora backing out that change as it was done upstream too late in the cycle to get feedback from the French community.
--- Comment #6 from paolo dot carlini at oracle dot com 2008-12-05 08:47 ---
Yes, I'm not sure it is the same issue. Anyway, the problem can only be in this idea:

  _M_data->_M_thousands_sep = *(__nl_langinfo_l(THOUSANDS_SEP, __cloc));
  ...
  if (_M_data->_M_thousands_sep == '\0')
    {
      _M_data->_M_thousands_sep = ',';

that is, we are trying to standardize on ',' (the same we have for the C locale) in case the localedata is \0 for the thousands separator. Apparently, for some versions of glibc, it causes problems; I'm still trying to disentangle the logic... Jakub, how does it sound to you?
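The fallback described in comment #6 can be sketched in isolation. This is only an illustration, not the libstdc++ source: the helper names `thousands_sep_or_comma` and `current_thousands_sep` are hypothetical, and the public POSIX call `nl_langinfo(THOUSEP)` stands in for the internal `__nl_langinfo_l(THOUSANDS_SEP, __cloc)` quoted above.

```cpp
#include <cassert>
#include <langinfo.h>   // POSIX: nl_langinfo, THOUSEP

// Hypothetical helper (not libstdc++ code): reduce the locale's raw
// thousands separator string to one char, substituting ',' when the
// locale data supplies an empty string.
inline char thousands_sep_or_comma(const char* raw)
{
  assert(raw != nullptr);
  char sep = *raw;      // first byte only, as in the quoted snippet
  if (sep == '\0')
    sep = ',';          // same value the C++ classic locale reports
  return sep;
}

// Hypothetical helper: apply the rule to the current global C locale.
inline char current_thousands_sep()
{
  return thousands_sep_or_comma(nl_langinfo(THOUSEP));
}
```

With this rule, a locale whose separator is empty on one glibc and ' ' on another yields ',' on the first and ' ' on the second, which is the kind of difference a test hard-wired to one named locale would trip over.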
--- Comment #7 from paolo dot carlini at oracle dot com 2008-12-05 08:50 ---
By the way, yes, the fr_FR locale is heavily used in those tests...
--- Comment #8 from paolo dot carlini at oracle dot com 2008-12-05 08:55 ---
I suspect that in the localedata of fr_FR, the thousands separator may have changed in some glibc from ' ' (0x20) to '\0'. Jakub, can you confirm that?
--- Comment #9 from jakub at gcc dot gnu dot org 2008-12-05 09:10 ---
Forcing , thousands separator when none should be used is very weird. Does C++ standard mandate that behavior? "" means thousands shouldn't be separated by any separator. In most cases such locales also have grouping 0;0 or -1, but there are buggy? locales, e.g. bg_BG, that specify empty thousands_sep, yet have grouping 3;3. For empty thousands_sep glibc just forces no grouping:

  if ((wide && thousands_sepwc == L'\0')
      || (! wide && *thousands_sep == '\0'))
    grouping = NULL;

BTW, thousands_sep is a multibyte string, it can be multiple bytes (or none, as discussed here). Say ru_RU has <U2002>, which is 3 bytes:

  <U2002>     /xe2/x80/x82         EN SPACE

and a bunch of locales are using <U00A0>, which is 2 bytes:

  <U00A0>     /xc2/xa0         NO-BREAK SPACE

_NL_NUMERIC_THOUSANDS_SEP_WC is always just one wchar_t though. thousands_sep is one of the things that changed in fr_FR this year, see sourceware #6040.
--- Comment #10 from jakub at gcc dot gnu dot org 2008-12-05 09:13 ---
To reply to #c8, it changed the other way around, from "" to "<U0020>" in April 2008. But as I said, Fedora 9, which shipped glibc 2.9, was backing out these changes (Fedora 10 has them already).
--- Comment #11 from paolo dot carlini at oracle dot com 2008-12-05 09:16 ---
(In reply to comment #9)
> Forcing , thousands separator when none should be used is very weird.

It is the same the C locale has. We only want consistency, see 38368.

> Does C++ standard mandate that behavior? "" means thousands shouldn't be
> separated by any separator. In most cases such locales also have grouping
> 0;0 or -1, but there are buggy?

I suppose so, because we have a comment in the code (resulting from feedback Benjamin and/or I got from glibc people) clearly saying that '\0' implies no grouping. We always worked under this hypothesis.

> locales, e.g. bg_BG, that specify empty thousands_sep, yet have grouping 3;3.
> For empty thousands_sep glibc just forces no grouping:
>   if ((wide && thousands_sepwc == L'\0')
>       || (! wide && *thousands_sep == '\0'))
>     grouping = NULL;

Ok...

> BTW, thousands_sep is a multibyte string, it can be multiple bytes

This is just a C++ standard issue, unfortunately...

Paolo.
--- Comment #12 from paolo dot carlini at oracle dot com 2008-12-05 09:17 ---
(In reply to comment #10)
> To reply to #c8, it changed the other way around, from "" to "<U0020>" in
> April 2008. But as I said, Fedora 9, which shipped glibc 2.9, was backing
> out these changes (Fedora 10 has them already).

Crazy. Anyway, that is the problem. I will change the tests to not rely on such named locales.
--- Comment #13 from jakub at gcc dot gnu dot org 2008-12-05 09:38 ---
I see you already disable grouping if it is empty, good. If _M_thousands_sep must be a single _CharT, then for char I guess you should transliterate it if the string is longer than one character. Either by using glibc transliteration (more generic, but slower), or by hardcoding the few multibyte strings that are used in glibc locales ATM. That's just <U00A0> in current glibc (transliterate to ' ') and in some older libcs was also that <U2002> (also to ' '). Maybe change everything longer than one byte to ' ' ;).
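The last suggestion in comment #13 (map anything longer than one byte to ' ') could look roughly like the following. This is a sketch only: `narrow_thousands_sep` is a hypothetical name, nothing here is taken from libstdc++ or glibc, and the glibc transliteration alternative is not shown.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: reduce a locale's multibyte thousands separator
// to the single char that a numpunct<char>-style facet can hold.
inline char narrow_thousands_sep(const char* mb_sep)
{
  assert(mb_sep != nullptr);
  const std::size_t len = std::strlen(mb_sep);
  if (len == 0)
    return '\0';        // no separator: the caller should disable grouping
  if (len == 1)
    return mb_sep[0];   // already a single byte, use it unchanged
  return ' ';           // e.g. "\xc2\xa0" (U+00A0) or "\xe2\x80\x82" (U+2002)
}
```

Returning '\0' for an empty string leaves the no-grouping decision to the caller, which matches the "disable grouping if it is empty" behavior mentioned at the start of the comment.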
--- Comment #14 from paolo dot carlini at oracle dot com 2008-12-05 09:42 ---
(In reply to comment #13)
> If _M_thousands_sep must be a single _CharT, then for char I guess you
> should transliterate it if the string is longer than one character. Either
> by using glibc transliteration (more generic, but slower), or by hardcoding
> the few multibyte strings that are used in glibc locales ATM. That's just
> <U00A0> in current glibc (transliterate to ' ') and in some older libcs was
> also that <U2002> (also to ' '). Maybe change everything longer than one
> byte to ' ' ;).

Ok, thanks. I think we have to think more about this issue, it's as old as v3 ;) For 4.5 maybe...
--

paolo dot carlini at oracle dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |paolo dot carlini at oracle
                   |dot org                     |dot com
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|0                           |1
   Last reconfirmed|-00-00 00:00:00             |2008-12-05 09:43:10
               date|                            |
--- Comment #15 from jakub at gcc dot gnu dot org 2008-12-05 09:48 ---
Well, if you take the first byte from a multibyte sequence, then it is IMNSHO something that should be solved for 4.4 too. For say ru_RU.UTF-8 that means you emit invalid UTF-8 if you separate digits with say '\xc2': \x33\xc2\x33\x33\x33 is invalid UTF-8.
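The point in comment #15 can be checked with a small structural UTF-8 test. The sketch below is an assumption-laden illustration: `looks_like_utf8` and `group_1_3` are hypothetical helpers, and the check is deliberately simplified (it verifies lead and continuation bytes only, not overlong forms or surrogates).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Simplified structural check: every lead byte must be followed by the
// right number of continuation bytes (10xxxxxx). Overlong encodings and
// surrogate code points are NOT rejected by this sketch.
inline bool looks_like_utf8(const std::string& s)
{
  std::size_t i = 0;
  while (i < s.size())
    {
      const unsigned char lead = static_cast<unsigned char>(s[i]);
      std::size_t extra;
      if (lead < 0x80)
        extra = 0;                          // ASCII
      else if (lead >= 0xC2 && lead <= 0xDF)
        extra = 1;                          // 2-byte sequence
      else if (lead >= 0xE0 && lead <= 0xEF)
        extra = 2;                          // 3-byte sequence
      else if (lead >= 0xF0 && lead <= 0xF4)
        extra = 3;                          // 4-byte sequence
      else
        return false;                       // stray continuation or bad lead
      if (s.size() - i <= extra)
        return false;                       // sequence is truncated
      for (std::size_t k = 1; k <= extra; ++k)
        {
          const unsigned char c = static_cast<unsigned char>(s[i + k]);
          if ((c & 0xC0) != 0x80)
            return false;                   // not a continuation byte
        }
      i += extra + 1;
    }
  return true;
}

// Hypothetical helper: format the digits 3333 with `sep` after the first digit.
inline std::string group_1_3(const std::string& sep)
{
  return "3" + sep + "333";
}
```

Passing only the first byte of the two-byte NO-BREAK SPACE as the separator gives the byte sequence quoted in the comment, which the check rejects; passing the full two-byte sequence or a plain ' ' is accepted.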
--- Comment #16 from paolo dot carlini at oracle dot com 2008-12-05 09:51 ---
(In reply to comment #15)
> Well, if you take first byte from a multibyte sequence, then it is IMNSHO
> something that should be solved for 4.4 too. For say ru_RU.UTF-8 that means
> you emit invalid UTF-8 if you separate digits with say '\xc2',
> \x33\xc2\x33\x33\x33 is invalid UTF-8.

Maybe, but, as I should learn from you ;) this is definitely not a regression, the time is short and the issue is tricky. Therefore, I don't think I will tackle it directly, unless you are willing to help, of course!
--- Comment #17 from paolo dot carlini at oracle dot com 2008-12-05 09:54 ---
Created an attachment (id=16831)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=16831&action=view)
Draft

Draft patch using is_IS instead of fr_FR.
--- Comment #18 from paolo dot carlini at oracle dot com 2008-12-05 09:55 ---
HJ, can you test it and report? Thanks!
--- Comment #19 from tsyvarev at ispras dot ru 2008-12-05 11:08 ---
It seems that the C++ standard contains a contradiction about the thousands separator in the C locale.

22.2.3.1, p1 says:

  The instantiations required in Table 51 (22.1.1.1.1), namely numpunct<wchar_t> and numpunct<char>, provide classic C numeric formats, i.e. they contain information equivalent to that contained in the C locale or their wide character counterparts as if obtained by a call to widen.

Also, 22.2.3.1.2 p.2 says:

  char_type do_thousands_sep() const;
  Returns: A character for use as the digit group separator. The required instantiations return ',' or L','.

It appears that, according to the C++ standard, the thousands separator for the C locale is ','. But according to the ISO standard of the C (POSIX) locale (Section 7.3, Locale Definition), the thousands separator in this locale should be '\0', which means N/A or not assigned.

Or is this reasoning wrong?

--

tsyvarev at ispras dot ru changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tsyvarev at ispras dot ru
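The two views contrasted in comment #19 can be observed side by side with standard facilities only. In this sketch `SepReport` and `classic_separators` are hypothetical names; the code assumes the global C locale is still "C", which is the state at program startup.

```cpp
#include <cassert>
#include <clocale>
#include <locale>
#include <string>

// Hypothetical aggregate holding what each library reports.
struct SepReport
{
  char cxx_sep;              // numpunct<char>::thousands_sep(), classic locale
  std::string cxx_grouping;  // numpunct<char>::grouping(), classic locale
  std::string c_sep;         // localeconv()->thousands_sep, C library view
};

// Assumes setlocale() has not been called, so the C library is in "C".
inline SepReport classic_separators()
{
  const std::numpunct<char>& np =
    std::use_facet<std::numpunct<char>>(std::locale::classic());
  const std::lconv* lc = std::localeconv();
  return SepReport{ np.thousands_sep(), np.grouping(), lc->thousands_sep };
}
```

The C++ facet reports ',' as the separator while the C library reports an empty string. Because the classic facet's grouping is also empty, no separator is ever inserted when formatting in that locale, so the two answers do not produce different output.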
--- Comment #20 from paolo at gcc dot gnu dot org 2008-12-05 13:09 ---
Subject: Bug 38411

Author: paolo
Date: Fri Dec 5 13:07:53 2008
New Revision: 142472

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=142472
Log:
2008-12-05  Paolo Carlini  [EMAIL PROTECTED]

        PR libstdc++/38411
        * testsuite/22_locale/numpunct/members/char/2.cc: Use is_IS instead
        of fr_FR.
        * testsuite/22_locale/numpunct/members/wchar_t/2.cc: Likewise.
        * testsuite/22_locale/locale/cons/7.cc: Likewise.

Modified:
    trunk/libstdc++-v3/ChangeLog
    trunk/libstdc++-v3/testsuite/22_locale/locale/cons/7.cc
    trunk/libstdc++-v3/testsuite/22_locale/numpunct/members/char/2.cc
    trunk/libstdc++-v3/testsuite/22_locale/numpunct/members/wchar_t/2.cc
--- Comment #21 from paolo dot carlini at oracle dot com 2008-12-05 13:09 ---
Fixed.

--

paolo dot carlini at oracle dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED
   Target Milestone|---                         |4.4.0
--- Comment #2 from hjl dot tools at gmail dot com 2008-12-05 05:25 ---
Revision 142439 is the cause.

--

hjl dot tools at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[4.4 Regression]            |[4.4 Regression] Revision
                   |22_locale/locale/cons/7.cc  |142439 caused
                   |execution test              |22_locale/locale/cons/7.cc
                   |                            |execution test
--- Comment #3 from hjl dot tools at gmail dot com 2008-12-05 05:34 ---
Jakub, there was a similar problem with locale on Linux before. Do you remember it?

--

hjl dot tools at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at redhat dot com