[Bug fortran/47007] Values from namelist file should not depend on locale settings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007

--- Comment #24 from Janne Blomqvist jb at gcc dot gnu.org ---
Author: jb
Date: Mon Nov 10 00:17:16 2014
New Revision: 217273

URL: https://gcc.gnu.org/viewcvs?rev=217273&root=gcc&view=rev
Log:
PR 47007 and 61847 Locale failures in libgfortran.

2014-11-10  Janne Blomqvist  j...@gcc.gnu.org

    PR libfortran/47007
    PR libfortran/61847
    * config.h.in: Regenerated.
    * configure: Regenerated.
    * configure.ac (AC_CHECK_HEADERS_ONCE): Check for xlocale.h.
    (AC_CHECK_FUNCS_ONCE): Check for newlocale, freelocale, uselocale,
    strerror_l.
    * io/io.h (locale.h): Include.
    (xlocale.h): Include if present.
    (c_locale): New variable.
    (old_locale): New variable.
    (old_locale_ctr): New variable.
    (old_locale_lock): New variable.
    (st_parameter_dt): Add old_locale member.
    * io/transfer.c (data_transfer_init): Set locale to C if doing
    formatted transfer.
    (finalize_transfer): Reset locale to previous.
    * io/unit.c (c_locale): New variable.
    (old_locale): New variable.
    (old_locale_ctr): New variable.
    (old_locale_lock): New variable.
    (init_units): Init c_locale, init old_locale_lock.
    (close_units): Free c_locale.
    * runtime/error.c (locale.h): Include.
    (xlocale.h): Include if present.
    (gf_strerror): Use strerror_l if available. Reset locale to
    LC_GLOBAL_LOCALE for strerror_r branch.

2014-11-10  Janne Blomqvist  j...@gcc.gnu.org

    PR libfortran/47007
    PR libfortran/61847
    * gfortran.texi: Add note about locale issues to thread-safety
    section.

Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/gfortran.texi
    trunk/libgfortran/ChangeLog
    trunk/libgfortran/config.h.in
    trunk/libgfortran/configure
    trunk/libgfortran/configure.ac
    trunk/libgfortran/io/io.h
    trunk/libgfortran/io/transfer.c
    trunk/libgfortran/io/unit.c
    trunk/libgfortran/runtime/error.c
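For readers who want to see the general shape of the approach named in the ChangeLog (switch the calling thread to the C locale for a formatted transfer, then switch back), here is a minimal sketch in C. It is not the libgfortran code: the function names are hypothetical, the variable c_locale only borrows its name from the log, and the locking and counter mentioned in the log are left out. It assumes a POSIX 2008 system where newlocale, uselocale and freelocale are declared in locale.h (some systems, such as Darwin, put them in xlocale.h instead).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <stdlib.h>

/* Hypothetical stand-in for the cached "C" locale object. */
static locale_t c_locale;

/* Create the "C" locale object once, e.g. at library start-up.
   Returns nonzero on success. */
int transfer_locale_init(void)
{
    c_locale = newlocale(LC_ALL_MASK, "C", (locale_t)0);
    return c_locale != (locale_t)0;
}

/* Parse one real value while the calling thread is temporarily in the
   C locale.  uselocale only affects the calling thread. */
double read_real_c_locale(const char *text)
{
    assert(c_locale != (locale_t)0);
    locale_t old_locale = uselocale(c_locale);  /* switch this thread to C */
    double value = strtod(text, NULL);
    uselocale(old_locale);                      /* restore what was active */
    return value;
}

/* Release the locale object, e.g. at library shutdown. */
void transfer_locale_fini(void)
{
    freelocale(c_locale);
    c_locale = (locale_t)0;
}
```

A caller would run transfer_locale_init() once, call read_real_c_locale() for each value, and finish with transfer_locale_fini(). Because uselocale is per-thread, the global locale chosen by the application is left untouched.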
[Bug fortran/47007] Values from namelist file should not depend on locale settings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007

Janne Blomqvist jb at gcc dot gnu.org changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
             Status|NEW                          |RESOLVED
         Resolution|---                          |FIXED
           Assignee|unassigned at gcc dot gnu.org|jb at gcc dot gnu.org

--- Comment #25 from Janne Blomqvist jb at gcc dot gnu.org ---
Fixed on trunk, closing.
[Bug fortran/47007] Values from namelist file should not depend on locale settings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007

Janne Blomqvist jb at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                URL|                            |https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00277.html

--- Comment #23 from Janne Blomqvist jb at gcc dot gnu.org ---
Patch at https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00277.html
[Bug fortran/47007] Values from namelist file should not depend on locale settings
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007

Dominique d'Humieres dominiq at lps dot ens.fr changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEW
[Bug fortran/47007] Values from namelist file should not depend on locale settings
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007

Tobias Burnus burnus at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |burnus at gcc dot gnu.org

--- Comment #18 from Tobias Burnus burnus at gcc dot gnu.org ---
Given that we already have some special handling for DECIMAL=comma, I wonder whether one couldn't simply undo the setting by using, e.g.,

      const char *decimal;
      size_t decimal_len;
    #if defined USE_LOCALECONV && !defined USE_NL_LANGINFO
      const struct lconv *lc = localeconv ();
    #endif

    #ifdef USE_NL_LANGINFO
      decimal = nl_langinfo (DECIMAL_POINT);
      decimal_len = strlen (decimal);
      assert (decimal_len > 0);
    #elif defined USE_LOCALECONV
      decimal = lc->decimal_point;
      if (decimal == NULL || *decimal == '\0')
        decimal = ".";
      decimal_len = strlen (decimal);
    #else
      decimal = ".";
      decimal_len = 1;
    #endif

And then simply replace the existing decimal character ("." or ",") in the string/file that was read by the one obtained for the locale (decimal_len bytes long) - before sending it to strtof/strtod/strtold/strtoflt128.

(Note: decimal_len is probably nearly always == 1, but it might be different.)
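The substitution idea in comment #18 could look roughly like the following C sketch. The helper name parse_real_any_locale and its from_comma flag are made up for illustration; only localeconv and strtod are real library calls. The sketch rewrites the file's decimal character to whatever the active locale expects and then lets strtod do the conversion. It also assumes the input holds a single real value, since every occurrence of the decimal character is rewritten.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse a real whose decimal symbol is '.' (or ','
   when from_comma is nonzero), whatever the active locale's decimal
   point happens to be. */
double parse_real_any_locale(const char *text, int from_comma)
{
    const struct lconv *lc = localeconv();
    const char *decimal = lc->decimal_point;
    if (decimal == NULL || *decimal == '\0')
        decimal = ".";
    size_t decimal_len = strlen(decimal);
    assert(decimal_len > 0);

    char file_decimal = from_comma ? ',' : '.';
    size_t n = strlen(text);

    /* Worst case: every input character expands to decimal_len bytes. */
    char *buf = malloc(n * decimal_len + 1);
    if (buf == NULL)
        return 0.0;

    char *out = buf;
    for (size_t i = 0; i < n; i++) {
        if (text[i] == file_decimal) {
            memcpy(out, decimal, decimal_len);  /* locale's decimal point */
            out += decimal_len;
        } else {
            *out++ = text[i];
        }
    }
    *out = '\0';

    double value = strtod(buf, NULL);
    free(buf);
    return value;
}
```

This avoids touching the locale at all, at the cost of copying every field before conversion, which is the overhead comment #12 worries about.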
[Bug fortran/47007] Values from namelist file should not depend on locale settings
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007

--- Comment #19 from Tobias Burnus burnus at gcc dot gnu.org ---
Seemingly, g++ prefers to unset the locale. Namely, libstdc++-v3/config/locale/generic/c_locale.cc has:

      // Assumes __s formatted for "C" locale.
      char* __old = setlocale(LC_ALL, 0);
      const size_t __len = strlen(__old) + 1;
      char* __sav = new char[__len];
      memcpy(__sav, __old, __len);
      setlocale(LC_ALL, "C");
    ...
      setlocale(LC_ALL, __sav);
      delete [] __sav;
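A plain C rendering of the same save/switch/restore pattern might look like this. The function name is hypothetical and the sketch narrows the switch to LC_NUMERIC. The copy matters because the string returned by setlocale may be overwritten by the next setlocale call. Note that setlocale changes the locale for the whole process, so this is not safe if other threads depend on the locale at the same time.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert text with strtod while LC_NUMERIC is
   temporarily "C", then put the previous setting back. */
double strtod_swapping_locale(const char *text)
{
    assert(text != NULL);

    /* Query the current setting and keep a private copy of it. */
    const char *old = setlocale(LC_NUMERIC, NULL);
    char *saved = NULL;
    if (old != NULL) {
        size_t len = strlen(old) + 1;
        saved = malloc(len);
        if (saved != NULL)
            memcpy(saved, old, len);
    }

    setlocale(LC_NUMERIC, "C");
    double value = strtod(text, NULL);

    /* Restore the caller's locale if we managed to save it. */
    if (saved != NULL) {
        setlocale(LC_NUMERIC, saved);
        free(saved);
    }
    return value;
}
```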
[Bug fortran/47007] Values from namelist file should not depend on locale settings
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007

--- Comment #20 from Tobias Burnus burnus at gcc dot gnu.org ---
And yet another possibility: Some systems (like Darwin) have strtod_l (etc.), which takes a locale argument. GLIBC seems to have it only as a weak symbol but offers __strtod_l.

Looking again at GCC's C++ compiler, one finds in libstdc++-v3/config/locale/gnu/c_locale.cc:

      __v = __strtof_l(__s, &__sanity, __cloc);

    #if __GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ > 2)
      // Prefer strtold_l, as __strtold_l isn't prototyped in more recent
      // glibc versions.
      __v = strtold_l(__s, &__sanity, __cloc);
    #else
      __v = __strtold_l(__s, &__sanity, __cloc);
    #endif

libstdc++ uses locale::facet::_S_get_c_locale() to get the locale. The argument __cloc is in glibc/Darwin (-> xlocale.h) a __locale_t. However, the symbol became official in POSIX 2008 and is now locale_t. Seemingly, one has to obtain the locale with newlocale(..., "C",...) - and later should free it with freelocale.

* * *

To summarize: It is probably best to use __strtold_l as C++ does, if it is available. That should be the fastest, but requires that one caches the locale_t variable (e.g. get it at startup, free it at the end).

If it is not available, one presumably should use comment 19's method of swapping the locale.

For libquadmath's strtoflt128, one either has to add an _l version - or one has to always use the second method.
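As a rough illustration of the strtod_l route, the sketch below caches a "C" locale_t and passes it to strtod_l. It assumes a glibc-style system where defining _GNU_SOURCE before any include makes strtod_l visible in stdlib.h; on Darwin the declaration lives in xlocale.h instead. The wrapper name strtod_c is hypothetical, and the lazy initialisation is not thread-safe.

```c
#define _GNU_SOURCE            /* assumption: exposes strtod_l on glibc */
#include <assert.h>
#include <locale.h>
#include <stdlib.h>

/* Hypothetical wrapper: strtod that always parses as in the C locale,
   without changing the process or thread locale. */
double strtod_c(const char *text, char **end)
{
    /* Cached "C" locale object; never freed in this sketch.  A library
       would create it at start-up and free it with freelocale at exit. */
    static locale_t c_loc;
    if (c_loc == (locale_t)0)
        c_loc = newlocale(LC_ALL_MASK, "C", (locale_t)0);
    assert(c_loc != (locale_t)0);

    return strtod_l(text, end, c_loc);
}
```

Compared with swapping the locale, nothing global is modified here, which is why this variant is attractive wherever the _l functions exist.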
[Bug fortran/47007] Values from namelist file should not depend on locale settings
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007

--- Comment #21 from Tobias Burnus burnus at gcc dot gnu.org ---
(In reply to Tobias Burnus from comment #20)
> has to obtain the locale with newlocale(..., "C",...) - and later should free
> it with freelocale.

Respectively, __newlocale and __freelocale for pre-POSIX-2008 GLIBC. (The __* ones are used by libstdc++-v3/config/locale/gnu/c_locale.cc).
[Bug fortran/47007] Values from namelist file should not depend on locale settings
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007

--- Comment #22 from Andreas Schwab sch...@linux-m68k.org ---
In glibc newlocale(0,"C",0) is cheap and doesn't need to be cached.
[Bug fortran/47007] Values from namelist file should not depend on locale settings
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007

--- Comment #17 from JTappin jtappin at gmail dot com 2012-07-05 23:03:00 UTC ---
Created attachment 27748
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27748
Testcase for List-directed read

By default gtk_init calls setlocale, so if the locale is set to one that uses a comma as the decimal separator, the attached code fails to read the digits after the decimal in the case with the point. However, neither before nor after are the cases with the comma handled as a decimal:

james@amarice-4 Dev $ export LC_ALL=de_DE.UTF8
james@amarice-4 Dev $ ./a.out
100.34560 100.0 100.0 100.0
james@amarice-4 Dev $ export LC_ALL=
james@amarice-4 Dev $ ./a.out
100.34560 100.0 100.34560 100.0
james@amarice-4 Dev $

N.B. I do have a work-around for gtk-fortran by adding a call to gtk_disable_setlocale, which I will add to the gtk_init routine.
[Bug fortran/47007] Values from namelist file should not depend on locale settings
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007 --- Comment #16 from PierreC pchwood at yahoo dot com 2012-03-14 13:42:13 UTC --- I bumped into this problem using f2py to interface a Fortran library into Python. The read statement upon an internal buffer depends on the locale...
[Bug fortran/47007] Values from namelist file should not depend on locale settings
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007

Janne Blomqvist jb at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jb at gcc dot gnu.org

--- Comment #13 from Janne Blomqvist jb at gcc dot gnu.org 2010-12-20 08:48:18 UTC ---
Also vaguely related: PR 36857

I don't think it's a good idea to call setlocale() in the library - I suspect this would (subtly?) break a lot of programs that set the locale and then at some later point call a Fortran library function.
[Bug fortran/47007] Values from namelist file should not depend on locale settings
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007

--- Comment #14 from Alexander Varnin fenixk19 at mail dot ru 2010-12-20 14:04:08 UTC ---
There are internal variants of the strtof/strtod/strtold/etc. functions in glibc that allow the locale of the conversion to be set explicitly. These functions are the basis for the user-visible variants of strto*. If only we could access these internal functions in some way, it would be a solution to the problem, and not a workaround.
[Bug fortran/47007] Values from namelist file should not depend on locale settings
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007

--- Comment #15 from Steve Kargl sgk at troutmask dot apl.washington.edu 2010-12-20 14:48:03 UTC ---
On Mon, Dec 20, 2010 at 02:04:14PM +, fenixk19 at mail dot ru wrote:
> There are internal variants of the strtof/strtod/strtold/etc. functions in glibc that allow the locale of the conversion to be set explicitly. These functions are the basis for the user-visible variants of strto*. If only we could access these internal functions in some way, it would be a solution to the problem, and not a workaround.

It is not a solution to the problem. gfortran runs on numerous platforms where glibc is not libc, nor is glibc available. Also, note that gfortran cannot simply take the glibc code and put it into libgfortran due to licensing.

Tobias is probably correct. gfortran eventually will need to implement its own set of conversion routines.
[Bug fortran/47007] Values from namelist file should not depend on locale settings
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007

--- Comment #5 from Tobias Burnus burnus at gcc dot gnu.org 2010-12-19 11:35:14 UTC ---
(In reply to comment #4)
> The standard says namelist IO will use whatever current decimal separator is. If locale sets the default to ',', then that is what should be used unless explicitly set by the user.

I disagree: The standard says for the OPEN statement (9.5.6.7 DECIMAL= specifier in the OPEN statement, quote of F2008):

"If this specifier is omitted in an OPEN statement that initiates a connection, the default value is POINT."

Thus, for unit-based I/O the answer is clear: A decimal POINT shall be used by default - independent of the LOCALE setting.

For internal I/O: "The initial value of a connection mode (9.5.2) is the value that would be implied by an initial OPEN statement without the corresponding keyword." -- Thus, DECIMAL=POINT is the default (which can be overridden by the DECIMAL= specifier in the data transfer statement - or [if a format specification is used] by the dp/dc edit descriptors).
[Bug fortran/47007] Values from namelist file should not depend on locale settings
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007

--- Comment #6 from Alexander Varnin fenixk19 at mail dot ru 2010-12-19 13:54:58 UTC ---
Created attachment 22820
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22820
Test case
[Bug fortran/47007] Values from namelist file should not depend on locale settings
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007

--- Comment #7 from Alexander Varnin fenixk19 at mail dot ru 2010-12-19 13:58:35 UTC ---
$ gfortran -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.4.4-14ubuntu5' --with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.4 --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.4 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --disable-werror --with-arch-32=i686 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.4.5 (Ubuntu/Linaro 4.4.4-14ubuntu5)

libc6 version - 2.12.1. This is from the Ubuntu 10.10 repo; it is called "Embedded GNU C library" there.
[Bug fortran/47007] Values from namelist file should not depend on locale settings
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007

--- Comment #8 from Alexander Varnin fenixk19 at mail dot ru 2010-12-19 14:13:24 UTC ---
Created attachment 22821
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22821
Write namelist test case

Here is also a test case for writing a namelist. It shows that '.' is used whatever locale is selected. It is funny - I can generate a namelist file on ru_RU with '.' as the decimal, and then the program will fail to read it on the same locale, because it expects ','. Moreover, ',' can't be used as the decimal, because it is used to separate values in a namelist.
[Bug fortran/47007] Values from namelist file should not depend on locale settings
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007

Tobias Burnus burnus at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2010.12.19 14:41:16
     Ever Confirmed|0                           |1

--- Comment #9 from Tobias Burnus burnus at gcc dot gnu.org 2010-12-19 14:41:16 UTC ---
I can reproduce this. It only occurs if one explicitly calls setlocale - as by default the locale for C programs is LC_ALL=C, which works.

As the locale can be changed at any time, one would need to do:

    char *cur_locale;
    cur_locale = setlocale(LC_NUMERIC, NULL);
    setlocale(LC_NUMERIC, "C");
    ...
    setlocale(LC_NUMERIC, cur_locale);

around each and every READ transfer statement ...
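The point that a C program starts in the "C" locale until setlocale is called can be checked with a few lines of C. The helper name current_decimal_point is made up for this sketch; localeconv is the standard call that reports the decimal point of the active locale.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: copy the decimal-point string of the currently
   active locale into buf, truncating if buf is too small. */
void current_decimal_point(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    const char *dp = localeconv()->decimal_point;
    strncpy(buf, dp, size - 1);
    buf[size - 1] = '\0';
}
```

Called at program start, this reports "." regardless of LC_ALL or LANG in the environment. Only after something calls setlocale(LC_ALL, "") or setlocale(LC_NUMERIC, "") (as a toolkit's init routine may do) can the result change to the environment's decimal symbol, which is when the strto* family starts rejecting "." as a decimal point.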
[Bug fortran/47007] Values from namelist file should not depend on locale settings
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007

Jerry DeLisle jvdelisle at gcc dot gnu.org changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
             Status|NEW                          |ASSIGNED
         AssignedTo|unassigned at gcc dot gnu.org|jvdelisle at gcc dot gnu.org

--- Comment #10 from Jerry DeLisle jvdelisle at gcc dot gnu.org 2010-12-19 17:47:54 UTC ---
Putting the Fortran standard aside for a moment, it seems reasonable to me that we do something about this bug, even if it is as an extension.

I have assigned this to myself and will give it some thought before I suggest a solution.

Thanks for the report.
[Bug fortran/47007] Values from namelist file should not depend on locale settings
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007 --- Comment #11 from Alexander Varnin fenixk19 at mail dot ru 2010-12-19 18:04:04 UTC --- I was glad to help.
[Bug fortran/47007] Values from namelist file should not depend on locale settings
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007

--- Comment #12 from Tobias Burnus burnus at gcc dot gnu.org 2010-12-19 21:32:49 UTC ---
Vaguely related: PR 35862 (rounding on input).

Both PR 35862 and this PR seem to require writing a libgfortran-specific version of strtof/strtod/strtold/quadmath_strtopQ - currently, the system's version is called in convert_real.

The only alternatives I see are:

a) To reset the locale for every READ statement ('setlocale(LC_NUMERIC, "C")') - and restore it afterwards.

b) To use localeconv and convert the decimal sign appropriately in the string (as currently done for DECIMAL="comma") - again, this needs to happen for every READ statement.

c) Simply document (gfortran.texi) that messing around with setlocale might produce surprising results. [LC_ALL/LANG environment variables do not have any effect unless setlocale is explicitly called. Cf. http://www.opengroup.org/onlinepubs/007904875/functions/setlocale.html]

I think (a) and (b) are (too) slow and cumbersome, and regarding (c): No one reads the documentation.
[Bug fortran/47007] Values from namelist file should not depend on locale settings
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007

kargl at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kargl at gcc dot gnu.org

--- Comment #1 from kargl at gcc dot gnu.org 2010-12-18 19:38:22 UTC ---
(In reply to comment #0)
> Namelist files can be generated on any platform with any locale settings. And namelist files must be usable among this platforms. But today some data in namelist files depends on locale settings, for example floating point format. So when you try to open namelist file, generated on en_US locale in the environment with ru_RU locale, for example, you get into trouble with reading floating points values. Namelist entry DT=7.302E-011 is read as 7.3E-11 on en_US and as 7 in ru_RU.
>
> Namelist should not depend on locale settings, so we can open namelist file generated with any locale settings on any other locale settings, without changing file and locale.

The standard is fairly clear on the behavior. From the final committee draft of F2008, page 268:

"The datum c (10.11) is any input value acceptable to format specifications for a given type, except for a restriction on the form of input values corresponding to list items of types logical, integer, and character as specified in this subclause. The form of a real or complex value is dependent on the decimal edit mode in effect (10.6)."

F2008, Section 10.6:

"The decimal symbol is the character that separates the whole and fractional parts in the decimal representation of a real number in an internal or external file. When the decimal edit mode is POINT, the decimal symbol is a decimal point. When the decimal edit mode is COMMA, the decimal symbol is a comma."

If a user switches between systems with different locales, then the user should explicitly set the edit descriptor (or edit the namelist file).
[Bug fortran/47007] Values from namelist file should not depend on locale settings
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007

Tobias Burnus burnus at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |burnus at gcc dot gnu.org

--- Comment #2 from Tobias Burnus burnus at gcc dot gnu.org 2010-12-18 20:51:58 UTC ---
(In reply to comment #0)
> But today some data in namelist files depends on locale settings, for example floating point format.

I cannot reproduce this. Using openSUSE Factory with GCC 4.1, 4.3 and with GCC 4.6 I always get a decimal point and never a comma. Additionally, the number seems to be always correctly read - independent of whether I use en_US.UTF-8, ru_RU.UTF-8 or de_DE.UTF-8.

Given that the number of users with non-US locales is quite large, I do not think that this is a common problem.

In order to use a decimal comma (and ";" as separator), one can use DECIMAL="comma" in the OPEN statement. (Won't work with internal list-directed or namelist I/O, which should thus always use a ".".)

As you seemingly have problems when using the ru_RU locale: Can you provide more information about your system (operating system, glibc version [when using a GLIBC-based system], gfortran version (gfortran -v: target triplet and version number)) - and can you create a minimal example which reproduces the problem?
[Bug fortran/47007] Values from namelist file should not depend on locale settings
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007

Jerry DeLisle jvdelisle at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jvdelisle at gcc dot gnu.org

--- Comment #3 from Jerry DeLisle jvdelisle at gcc dot gnu.org 2010-12-18 21:24:02 UTC ---
Please confirm for me: is the difference between EN and RU decimal point vs. decimal comma?

Can you attach a small example of the namelist data file generated with the RU locale?
[Bug fortran/47007] Values from namelist file should not depend on locale settings
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47007

--- Comment #4 from Steve Kargl sgk at troutmask dot apl.washington.edu 2010-12-19 02:14:40 UTC ---
On Sat, Dec 18, 2010 at 08:52:07PM +, burnus at gcc dot gnu.org wrote:
> In order to use a decimal comma (and ";" as separator), one can use DECIMAL="comma" in the OPEN statement. (Won't work with internal list-directed or namelist I/O, which should thus always use a ".".)

The standard disagrees with your 'thus always use a "."'. See my first comment. The standard says namelist IO will use whatever the current decimal separator is. If the locale sets the default to ',', then that is what should be used unless explicitly set by the user.