Re: lintian groff-message warning "can't set the locale"
Hi Nick,

On Sun, Oct 25, 2020 at 6:23 PM Nick Black wrote:
> My Salsa CI pipeline is blowing up in the lintian step, with
> lots of warnings of the form:

Following an upgrade of the Salsa runners to bullseye [1] the bug you reported here originally was closed. [2]

Thank you for using Lintian!

Kind regards,
Felix Lechner

[1] https://salsa.debian.org/salsa/support/-/issues/277#note_299577
[2] https://bugs.debian.org/973313
Re: lintian groff-message warning "can't set the locale"
Hi,

There is now Bug#973313. Please comment there going forward. Thanks!

* * *

On Mon, Oct 26, 2020 at 1:53 PM Colin Watson wrote:
> If all else fails then setting MAN_NO_LOCALE_WARNING=1 may be a viable
> workaround.

Colin, thanks for the workaround. We will spend a few more days trying to find the bug first.

Kind regards
Felix Lechner
Re: lintian groff-message warning "can't set the locale"
On Mon, Oct 26, 2020 at 08:16:43PM +, Simon McVittie wrote:
> On Mon, 26 Oct 2020 at 18:35:53 +, Colin Watson wrote:
> > LC_ALL should imply LANG
>
> One thing that it does not imply is LANGUAGE, used for LC_MESSAGES as a
> GNU extension (at a higher precedence than even LC_ALL).

Indeed, though I don't believe it's possible for it to cause the warning message in question here (which results from setlocale (LC_ALL, "") returning NULL).

If all else fails then setting MAN_NO_LOCALE_WARNING=1 may be a viable workaround.

--
Colin Watson (he/him) [cjwat...@debian.org]
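The failing call can be reproduced outside man. A minimal sketch in Python (not the C that man itself uses): a child process is started with a chosen set of locale variables and runs the equivalent of setlocale (LC_ALL, ""). Where the C function would return NULL, Python raises locale.Error, so the child exits non-zero. The helper name `environment_locale_is_usable` is made up for this illustration.

```python
import os
import subprocess
import sys

# Child program: the Python equivalent of setlocale(LC_ALL, "").
# Python raises locale.Error where the C function would return NULL.
PROBE = "import locale; locale.setlocale(locale.LC_ALL, '')"


def environment_locale_is_usable(overrides):
    """Return True if a child process can adopt the locale named by its environment.

    Hypothetical helper for illustration; `overrides` maps variable names
    such as "LC_ALL" or "LANG" to the values to test.
    """
    # Start from the current environment minus every locale variable,
    # so that only the overrides decide the outcome.
    env = {
        key: value
        for key, value in os.environ.items()
        if key not in ("LANG", "LANGUAGE") and not key.startswith("LC_")
    }
    env.update(overrides)
    result = subprocess.run(
        [sys.executable, "-c", PROBE],
        env=env,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

On a system where en_US.UTF-8 has not been generated, `environment_locale_is_usable({"LANG": "en_US.UTF-8"})` would be expected to return False, while `{"LC_ALL": "C"}` should always succeed.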
Re: lintian groff-message warning "can't set the locale"
On Mon, 26 Oct 2020 at 18:35:53 +, Colin Watson wrote:
> LC_ALL should imply LANG

One thing that it does not imply is LANGUAGE, used for LC_MESSAGES as a GNU extension (at a higher precedence than even LC_ALL).

smcv
Re: lintian groff-message warning "can't set the locale"
On Mon, Oct 26, 2020 at 07:57:58AM -0700, Felix Lechner wrote:
> On Mon, Oct 26, 2020 at 5:11 AM Nick Black wrote:
> > C.UTF-8 sounds like the right way to go.
>
> As noted in the issue tracker [1], Lintian already sets LC_ALL to
> C.UTF-8 [2] in a sanitized environment, but we do not currently set
> LANG.

LC_ALL should imply LANG, and as far as I know that works fine in man (which is the program producing the warning message in this case), so this should make no difference.

If somebody can come up with a reduced test environment in which man does not seem to interpret LC_ALL as implying LANG, I'd consider that a bug.

--
Colin Watson (he/him) [cjwat...@debian.org]
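The precedence behind "LC_ALL should imply LANG" can be written down in a few lines. This is a sketch of the POSIX lookup order only (LC_ALL, then the category's own variable, then LANG), not of man's or glibc's actual code. The function name `effective_locale` is hypothetical, and the final fallback to "C" is an assumption matching common practice, since POSIX leaves that default implementation-defined. The GNU LANGUAGE variable is deliberately left out.

```python
def effective_locale(category, env):
    """Return the locale name that applies to `category` given `env`.

    `category` is a variable name such as "LC_CTYPE" or "LC_MESSAGES";
    `env` is a mapping of environment variables. Hypothetical helper.
    """
    # POSIX order: LC_ALL wins, then the category variable, then LANG.
    # A variable that is set but empty counts as unset.
    for variable in ("LC_ALL", category, "LANG"):
        value = env.get(variable)
        if value:
            return value
    # Assumed default when nothing is set (implementation-defined in POSIX).
    return "C"
```

With only LC_ALL=C.UTF-8 set, every category resolves to C.UTF-8, so additionally setting LANG changes nothing under this model.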
Re: lintian groff-message warning "can't set the locale"
Hi Nick,

On Mon, Oct 26, 2020 at 5:11 AM Nick Black wrote:
> C.UTF-8 sounds like the right way to go.

As noted in the issue tracker [1], Lintian already sets LC_ALL to C.UTF-8 [2] in a sanitized environment, but we do not currently set LANG.

That would have been my next step, except these issues do not occur in a clean chroot for unstable and are therefore more likely related to Salsa or Salsa CI.

Kind regards
Felix Lechner

[1] https://salsa.debian.org/salsa-ci-team/pipeline/-/issues/182
[2] https://salsa.debian.org/lintian/lintian/-/blob/master/checks/documentation/manual.pm#L281
Re: lintian groff-message warning "can't set the locale"
On Mon, 26 Oct 2020 11:47:37 +, Simon McVittie wrote:
> Minimal container/chroot environments, and in particular the official
> Debian buildds, will normally only have C and C.UTF-8. See src:gtk+4.0
> for an example of how to generate additional locales on-demand if your
> unit tests need them.

Alternatively, build-depending on locales-all usually also works (benefit: no manual meddling with locales, cost: installation size).

Cheers,
gregor

--
 .''`.  https://info.comodo.priv.at -- Debian Developer https://www.debian.org
 : :' : OpenPGP fingerprint D1E1 316E 93A7 60A8 104D 85FA BB3A 6801 8649 AA06
 `. `'  Member VIBE!AT & SPI Inc. -- Supporter Free Software Foundation Europe
   `-   NP: Davy Graham: Hornpipe For Harpsichord Playes Upon Guitar
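For test suites that would like a particular locale but can live without it, a third option is to probe at run time and fall back to something that is always there. A rough sketch in Python, assuming the test harness is free to call setlocale() itself; the helper name `first_available_locale` is invented for this example.

```python
import locale


def first_available_locale(candidates):
    """Return the first name in `candidates` that setlocale() accepts, else None.

    Hypothetical helper. The process-wide locale is restored before
    returning, so the probe leaves no lasting side effects.
    """
    saved = locale.setlocale(locale.LC_ALL)  # query only, changes nothing
    try:
        for name in candidates:
            try:
                locale.setlocale(locale.LC_ALL, name)
            except locale.Error:
                # Locale not generated or not installed on this system.
                continue
            return name
        return None
    finally:
        locale.setlocale(locale.LC_ALL, saved)
```

A call such as `first_available_locale(["en_US.UTF-8", "C.UTF-8", "C"])` would pick en_US.UTF-8 where it has been generated (or locales-all is installed) and otherwise fall through to C.UTF-8 or C.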
Re: lintian groff-message warning "can't set the locale"
Simon McVittie left as an exercise for the reader:
> If you care about portability to non-Debian systems, note that C.UTF-8 is
> a somewhat popular extension (I think it originated in the Fedora/Red Hat
> family before it was adopted by Debian and other distros) but is far from
> universally available. In particular, I'm aware of Arch Linux specifically
> *not* having it. The glibc maintainers consider the implementation used
> in e.g. Fedora and Debian to be a hack rather than something they want to
> maintain forever, but my understanding is that they would be willing to
> accept a better implementation.

As I "need" this only within the Debian Salsa CI (and only to deal with this groff lintian warning, which it sounds like will be handled another way), a Debian-specific solution would be fine =].

Thanks for the details -- C.UTF-8 sounds like the right way to go.

--
nick black -=- https://www.nick-black.com
to make an apple pie from scratch, you need first invent a universe.
Re: lintian groff-message warning "can't set the locale"
On Mon, 26 Oct 2020 at 00:35:45 -0400, Nick Black wrote:
> Thanks for the quick response, Felix. You say that "[you] will
> probably start setting $LANG in that part of Lintian." what LANG
> will you be using? Attempting to set LANG=en_US.UTF-8 in my
> salsa ci variables resulted in setlocale(3) failing all over the
> place, presumably due to the locale not having been generated.

C.UTF-8 is available on all Debian systems. It's the standard C/POSIX locale, except that in the C locale the meaning of bytes 0x80-0xFF is undefined, while in C.UTF-8 they are assumed/defined to be part of a character encoded in UTF-8.

If you care about portability to non-Debian systems, note that C.UTF-8 is a somewhat popular extension (I think it originated in the Fedora/Red Hat family before it was adopted by Debian and other distros) but is far from universally available. In particular, I'm aware of Arch Linux specifically *not* having it. The glibc maintainers consider the implementation used in e.g. Fedora and Debian to be a hack rather than something they want to maintain forever, but my understanding is that they would be willing to accept a better implementation.

en_US.UTF-8 is indeed not portable. Some OSs (Fedora, I think?) always generate the en_US.UTF-8 locale regardless of any other configuration that might exist, but Debian does not: if you chose a non-English locale like fr_FR.UTF-8 or a non-American English locale like en_GB.UTF-8 during installation, then you will normally only have three locales, your chosen national locale plus the international locales C and C.UTF-8.

Minimal container/chroot environments, and in particular the official Debian buildds, will normally only have C and C.UTF-8. See src:gtk+4.0 for an example of how to generate additional locales on-demand if your unit tests need them.

Third-party software from outside Debian frequently assumes that the en_US.UTF-8 locale does exist - in particular, it's common enough for Steam games to want it to exist that Steam's diagnostic tool now checks for it. This is mostly because it's semi-frequently (ab)used as a way to parse and serialize C-syntax floating point in programming languages or configuration files without getting confused by non-English decimal points (e.g. 1.23 in English locales is 1,23 in French locales, which means a naive implementation might write {"x": 1,23, "y": 4,56} into a JSON file, which is of course a syntax error).

The portable way to read/write configuration files and C-like source code is to avoid the POSIX locale-sensitive functions completely, and use something like GLib's g_ascii_strtod() or CPython's PyOS_string_to_double() (lots of libraries and frameworks will have an equivalent, those are just the ones I'm most familiar with). This also has the advantage of being thread-safe, unlike temporarily switching POSIX locales, which is normally process-wide and therefore not thread-safe.

Another correct way to do this since POSIX.1-2008 is to use POSIX uselocale() and the C locale, but that's unlikely to be portable to Windows or to exotic Unix implementations, so widely-portable software generally ends up having to reinvent something equivalent to g_ascii_strtod() anyway.

smcv
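The JSON failure mode and the locale-independent alternative can be illustrated at the Python level. This is a sketch only; the function names are invented for the example. It contrasts locale.format_string(), which honours LC_NUMERIC, with the json module and float(), which always use the C-syntax decimal point regardless of the process locale.

```python
import json
import locale


def serialize_point_naive(x, y):
    """Build JSON by hand with locale-sensitive number formatting.

    Under a locale whose decimal separator is a comma (and only after the
    program has called setlocale() to activate it), this produces text
    such as {"x": 1,23, "y": 4,56}.
    """
    return '{"x": %s, "y": %s}' % (
        locale.format_string("%.2f", x),
        locale.format_string("%.2f", y),
    )


def serialize_point_portable(x, y):
    """Serialize with json.dumps(), which ignores the process locale."""
    return json.dumps({"x": x, "y": y})


def is_valid_json(text):
    """Return True if `text` parses as JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True
```

Here `is_valid_json('{"x": 1,23, "y": 4,56}')` is False, while the output of `serialize_point_portable(1.23, 4.56)` parses back to the same values whatever LC_NUMERIC is set to, and no temporary locale switch (with its thread-safety problems) is needed.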
Re: lintian groff-message warning "can't set the locale"
Felix Lechner left as an exercise for the reader:
> It's not a problem with your package. Lintian's own pipeline is
> likewise affected, even though our test suite completes fine in an
> unstable chroot. The issue is being tracked here:
>
> https://salsa.debian.org/salsa-ci-team/pipeline/-/issues/182

Thanks for the quick response, Felix. You say that "[you] will probably start setting $LANG in that part of Lintian." what LANG will you be using? Attempting to set LANG=en_US.UTF-8 in my salsa ci variables resulted in setlocale(3) failing all over the place, presumably due to the locale not having been generated.

--
nick black -=- https://www.nick-black.com
to make an apple pie from scratch, you need first invent a universe.
Re: lintian groff-message warning "can't set the locale"
Hi Nick,

On Sun, Oct 25, 2020 at 6:23 PM Nick Black wrote:
> Is this due to having supra-ascii UTF8 characters in my man
> pages?

It's not a problem with your package. Lintian's own pipeline is likewise affected, even though our test suite completes fine in an unstable chroot. The issue is being tracked here:

https://salsa.debian.org/salsa-ci-team/pipeline/-/issues/182

Kind regards
Felix Lechner