On 02/02/19 17:32, Pádraig Brady wrote:
> On 01/02/19 16:03, Felix Neuper wrote:
>> Hi,
>>
>>
>> Recently I stumbled upon seq's behaviour of using the floating point
>> separator as defined in the current locale.
>>
>> Regarding portability of scripts and standard practice in most data
>> processing environments, I would kindly suggest to define usage of dots
>> as standard behaviour and loading locale settings only when requested
>> via an option (e.g. -l, --locale).
>>
>> Alternatively one could allow the -f option to define the separator ( -f
>> %1.2f still gave commas for a German locale) or base the output on the
>> input format ( the input issue has been addressed before:
>> https://lists.gnu.org/archive/html/bug-coreutils/2014-02/msg00044.html ).
>>
>> Unfortunately the locale-dependency in seq's behaviour is also not
>> mentioned in any manual, making error tracking a hard time.
>
> There are many aspects of these utilities that are dependent on locale
> settings. Adding another way to control the locale would just confuse
> things at this stage IMHO. What you want is to set LC_NUMERIC=C when
> your script is dependent on that format.
>
>> Apart from that I also noticed odd behaviour with bad locale settings:
>> With LANG=en_US (erroneous) and LC_NUMERIC=de_DE.UTF-8, output format is
>> mixed in specific cases
>>
>> seq 0.1 0.2 1.3
>> 0.1
>> 0.3
>> 0.5
>> 0.7
>> 0.9
>> 1.1
>> 1,3
>>
>> (note the comma in the last line)
>
> Well that's a bug.
> The first set of numbers are output by printf(3) after:
>   setlocale (LC_ALL, "")
> and the last one after
>   setlocale (LC_NUMERIC, "")
>
> Now your first set of numbers should be outputting ',' as the decimal point.
> My glibc-2.24 system does at least. Can you give the output from the locale
> command so that we can double check the values of all env vars that might
> be significant here. Also it would be useful to show the specific values for
> these env vars:
>   LANGUAGE, LC_ALL, LC_NUMERIC, LANG
>
> It sounds like on your system that LANG takes precedence in the first case,
> but not in the second. That's a bug (that we might be able to work around
> if deemed widespread enough). I know also that OpenBSD can only set some
> locales from LC_ALL, so perhaps doing an explicit setlocale (LC_NUMERIC, "")
> at startup is appropriate to handle these systems.
>
> For the record, here's the setlocale output on my system:
>
> $ LANG=en_US LC_NUMERIC=de_DE.UTF-8 ltrace -a40 -e setlocale src/seq 0.1 0.2 1.3 >/dev/null
> seq->setlocale(LC_ALL, "") = "LC_CTYPE=en_US;LC_NUMERIC=de_DE."...
> seq->setlocale(LC_NUMERIC, "C") = "C"
> seq->setlocale(LC_NUMERIC, "") = "de_DE.UTF-8"
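
For scripts that need '.' as the decimal point, the LC_NUMERIC=C advice quoted above can be sketched in shell roughly as follows. This is only an illustration: the wrapper name seq_dot is hypothetical and not part of coreutils. It sets LC_ALL=C rather than LC_NUMERIC alone, on the assumption that LC_ALL might already be set in the caller's environment, in which case it would override LC_NUMERIC.

```shell
#!/bin/sh
# Hypothetical wrapper, not part of coreutils: run seq so that it
# prints '.' as the decimal point whatever the caller's locale is.
seq_dot() {
  # LC_ALL, when set, overrides LC_NUMERIC, so set LC_ALL itself.
  # If LC_ALL is known to be unset, LC_NUMERIC=C alone is enough.
  LC_ALL=C seq "$@"
}

seq_dot 0.1 0.2 0.7
```

The assignment applies only to the seq invocation inside the wrapper, so the rest of the script keeps the user's locale for messages and other output.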
Ah I see. en_US isn't valid at all on your system.
By setting an invalid LANG I was able to repro,
and the attached should address this inconsistency.

cheers,
Pádraig
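
The reproduction described here can be sketched in shell along the following lines. It is an illustration under stated assumptions, not the test from the patch: decimal_points_consistent is a hypothetical helper, the LANG value is assumed to name no installed locale, and de_DE.UTF-8 is assumed to be installed. On a seq with the inconsistency, the check would be expected to report mixed decimal points because only the last number is printed with ','.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the numbers on stdin (one per line)
# use at most one kind of decimal point character ('.' or ',').
decimal_points_consistent() {
  # Drop everything except '.', ',' and newlines, then count the
  # distinct non-empty lines that remain.  grep exits non-zero when
  # the count is 0, hence the "|| true".
  kinds=$(tr -cd '.,\n' | sort -u | grep -c . || true)
  [ "$kinds" -le 1 ]
}

# Assumed setup: LANG names no installed locale, while LC_NUMERIC names
# a locale (here de_DE.UTF-8) that is installed and uses ','.
if LC_ALL= LANG=invalid LC_NUMERIC=de_DE.UTF-8 seq 0.1 0.2 1.3 |
   decimal_points_consistent
then
  echo "decimal points are consistent"
else
  echo "mixed decimal points"
fi
```

Unlike a check based on comparing the first two characters of each line, this variant does not depend on every number having a single-digit integer part.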
From e0f4a209a4442d499d3987e46891db4037bb3287 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= <p...@draigbrady.com>
Date: Sat, 2 Feb 2019 19:16:02 -0800
Subject: [PATCH] seq: output decimal points consistently with invalid locales

* src/seq.c (print_numbers): Only reset the locale if it
was successfully set originally.
* tests/misc/seq-locale.sh: Add a new test.
* tests/local.mk: Reference the new test.
* NEWS: Mention the fix.
---
 NEWS                     |  4 ++++
 src/seq.c                | 11 ++++++++---
 tests/local.mk           |  1 +
 tests/misc/seq-locale.sh | 28 ++++++++++++++++++++++++++++
 4 files changed, 41 insertions(+), 3 deletions(-)
 create mode 100755 tests/misc/seq-locale.sh

diff --git a/NEWS b/NEWS
index 4b6b8bf..cbd8fcb 100644
--- a/NEWS
+++ b/NEWS
@@ -14,6 +14,10 @@ GNU coreutils NEWS                                    -*- outline -*-
   df no longer corrupts displayed multibyte characters on macOS.
   [bug introduced with coreutils-8.18]
 
+  seq no longer outputs inconsistent decimal point characters
+  for the last number, when locales are misconfigured.
+  [bug introduced in coreutils-7.0]
+
   shred, sort, and split no longer falsely report ftruncate errors
   when outputting to less-common file types.  For example, the shell
   command 'sort /dev/null -o /dev/stdout | cat' no longer fails with
diff --git a/src/seq.c b/src/seq.c
index 61d20fe..b591336 100644
--- a/src/seq.c
+++ b/src/seq.c
@@ -42,6 +42,9 @@
 
 #define AUTHORS proper_name ("Ulrich Drepper")
 
+/* True if the locale settings were honored.  */
+static bool locale_ok;
+
 /* If true print all number with equal width.  */
 static bool equal_width;
 
@@ -324,9 +327,11 @@ print_numbers (char const *fmt, struct layout layout,
               long double x_val;
               char *x_str;
               int x_strlen;
-              setlocale (LC_NUMERIC, "C");
+              if (locale_ok)
+                setlocale (LC_NUMERIC, "C");
               x_strlen = asprintf (&x_str, fmt, x);
-              setlocale (LC_NUMERIC, "");
+              if (locale_ok)
+                setlocale (LC_NUMERIC, "");
               if (x_strlen < 0)
                 xalloc_die ();
               x_str[x_strlen - layout.suffix_len] = '\0';
@@ -559,7 +564,7 @@ main (int argc, char **argv)
 
   initialize_main (&argc, &argv);
   set_program_name (argv[0]);
-  setlocale (LC_ALL, "");
+  locale_ok = !!setlocale (LC_ALL, "");
   bindtextdomain (PACKAGE, LOCALEDIR);
   textdomain (PACKAGE);
 
diff --git a/tests/local.mk b/tests/local.mk
index 290e30f..4751886 100644
--- a/tests/local.mk
+++ b/tests/local.mk
@@ -244,6 +244,7 @@ all_tests = \
   tests/misc/seq.pl \
   tests/misc/seq-epipe.sh \
   tests/misc/seq-io-errors.sh \
+  tests/misc/seq-locale.sh \
   tests/misc/seq-long-double.sh \
   tests/misc/seq-precision.sh \
   tests/misc/head.pl \
diff --git a/tests/misc/seq-locale.sh b/tests/misc/seq-locale.sh
new file mode 100755
index 0000000..8a46ab7
--- /dev/null
+++ b/tests/misc/seq-locale.sh
@@ -0,0 +1,28 @@
+#!/bin/sh
+# Test for output with appropriate precision
+
+# Copyright (C) 2019 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <https://www.gnu.org/licenses/>.
+
+. "${srcdir=.}/tests/init.sh"; path_prepend_ ./src
+print_ver_ seq
+
+# With coreutils-8.30 and earlier, the last decimal point would be ','
+# when setlocale(LC_ALL, "") failed, but setlocale(LC_NUMERIC, "") succeeded.
+LC_ALL= LANG=invalid LC_NUMERIC=$LOCALE_FR_UTF8 seq 0.1 0.2 0.7 > out || fail=1
+uniq -w2 out > out-merge || framework_failure_
+test "$(wc -l < out-merge)" = 1 || { fail=1; cat out; }
+
+Exit $fail
-- 
2.9.3