Re: lintian groff-message warning "can't set the locale"

2022-03-03 Thread Felix Lechner
Hi Nick,

On Sun, Oct 25, 2020 at 6:23 PM Nick Black wrote:
>
> My Salsa CI pipeline is blowing up in the lintian step, with
> lots of warnings of the form:

Following an upgrade of the Salsa runners to bullseye [1], the bug you
originally reported here was closed [2].

Thank you for using Lintian!

Kind regards,
Felix Lechner

[1] https://salsa.debian.org/salsa/support/-/issues/277#note_299577
[2] https://bugs.debian.org/973313



Re: lintian groff-message warning "can't set the locale"

2020-10-28 Thread Felix Lechner
Hi,

There is now Bug#973313. Please comment there going forward. Thanks!

* * *

On Mon, Oct 26, 2020 at 1:53 PM Colin Watson wrote:
>
> If all else fails then setting MAN_NO_LOCALE_WARNING=1 may be a viable
> workaround.

Colin, thanks for the workaround. We will spend a few more days trying
to find the bug first.

Kind regards
Felix Lechner



Re: lintian groff-message warning "can't set the locale"

2020-10-26 Thread Colin Watson
On Mon, Oct 26, 2020 at 08:16:43PM +0000, Simon McVittie wrote:
> On Mon, 26 Oct 2020 at 18:35:53 +0000, Colin Watson wrote:
> > LC_ALL should imply LANG
> 
> One thing that it does not imply is LANGUAGE, used for LC_MESSAGES as a
> GNU extension (at a higher precedence than even LC_ALL).

Indeed, though I don't believe it's possible for it to cause the warning
message in question here (which results from setlocale (LC_ALL, "")
returning NULL).
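
As a rough illustration of that failure mode (not man's actual code), the
sketch below asks a child Python process to adopt whatever locale its
environment names. Python's locale.setlocale(locale.LC_ALL, "") raises
locale.Error where the C call would return NULL. The helper name
environment_locale_usable is an assumption made for the example, and the
result depends on which locales the host's C library provides.

```python
import os
import subprocess
import sys

# Child program: exit 0 if the locale named by the environment can be set,
# exit 1 if setlocale() rejects it.
PROBE = """
import locale, sys
try:
    locale.setlocale(locale.LC_ALL, "")
except locale.Error:
    sys.exit(1)
"""


def environment_locale_usable(lc_all):
    """Return True if a child process can adopt LC_ALL=<lc_all>."""
    env = dict(os.environ, LC_ALL=lc_all)
    result = subprocess.run([sys.executable, "-c", PROBE], env=env)
    return result.returncode == 0
```

On a glibc system, environment_locale_usable("C") should hold, while a
locale that was never generated (a made-up xx_YY.UTF-8, say) should not.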

If all else fails then setting MAN_NO_LOCALE_WARNING=1 may be a viable
workaround.
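
A minimal sketch of applying that workaround from a wrapper script,
assuming the caller launches man through Python's subprocess module. The
helper name is hypothetical; only the MAN_NO_LOCALE_WARNING variable comes
from the discussion. It silences the message and does nothing to make the
requested locale available.

```python
import os


def env_without_man_locale_warning(base=None):
    """Copy an environment and ask man to skip its locale warning."""
    env = dict(os.environ if base is None else base)
    env["MAN_NO_LOCALE_WARNING"] = "1"
    return env
```

The returned dictionary can be passed as env= to subprocess.run() when
invoking man; the caller's own environment is left untouched.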

-- 
Colin Watson (he/him)  [cjwat...@debian.org]



Re: lintian groff-message warning "can't set the locale"

2020-10-26 Thread Simon McVittie
On Mon, 26 Oct 2020 at 18:35:53 +0000, Colin Watson wrote:
> LC_ALL should imply LANG

One thing that it does not imply is LANGUAGE, used for LC_MESSAGES as a
GNU extension (at a higher precedence than even LC_ALL).
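
To make that precedence concrete, here is a small model of how a GNU
gettext-style lookup picks message languages. It is a sketch written for
this thread, not code from glibc or gettext: the function name is made up,
and it encodes my understanding that LANGUAGE is ignored when the
effective locale is plain C or POSIX. How C.UTF-8 is treated may differ
between implementations, so the sketch does not special-case it.

```python
def message_language_preferences(env):
    """Return the ordered language list a gettext-style lookup would try.

    `env` is a mapping of environment variables; empty values count as unset.
    """
    # Effective LC_MESSAGES locale: LC_ALL, then LC_MESSAGES, then LANG.
    locale_name = (
        env.get("LC_ALL") or env.get("LC_MESSAGES") or env.get("LANG") or "C"
    )
    language = env.get("LANGUAGE")
    # LANGUAGE is a colon-separated priority list that outranks LC_ALL,
    # unless the effective locale is the untranslated C/POSIX locale.
    if language and locale_name not in ("C", "POSIX"):
        return [entry for entry in language.split(":") if entry]
    return [locale_name]
```

With LC_ALL=de_DE.UTF-8 and LANGUAGE=fr:en, the model yields fr and then
en, even though LC_ALL names a German locale.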

smcv



Re: lintian groff-message warning "can't set the locale"

2020-10-26 Thread Colin Watson
On Mon, Oct 26, 2020 at 07:57:58AM -0700, Felix Lechner wrote:
> On Mon, Oct 26, 2020 at 5:11 AM Nick Black wrote:
> > C.UTF-8 sounds like the right way to go.
> 
> As noted in the issue tracker [1], Lintian already sets LC_ALL to
> C.UTF-8 [2] in a sanitized environment, but we do not currently set
> LANG.

LC_ALL should imply LANG, and as far as I know that works fine in man
(which is the program producing the warning message in this case), so
this should make no difference.  If somebody can come up with a reduced
test environment in which man does not seem to interpret LC_ALL as
implying LANG, I'd consider that a bug.
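
The lookup order behind that statement can be written down as a short
model. This is only a sketch of the POSIX precedence rules (LC_ALL, then
the specific LC_* variable, then LANG), with a hypothetical function name.
It says nothing about whether setlocale() would then succeed for the
chosen name.

```python
def effective_locale(category, env):
    """Return the locale name POSIX selects for `category` (e.g. "LC_CTYPE").

    `env` is a mapping of environment variables; empty values count as unset.
    """
    for variable in ("LC_ALL", category, "LANG"):
        value = env.get(variable)
        if value:
            return value
    return "C"  # nothing set: fall back to the C/POSIX locale
```

With only LC_ALL=C.UTF-8 in the environment, every category resolves to
C.UTF-8, so also setting LANG to the same value changes nothing.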

-- 
Colin Watson (he/him)  [cjwat...@debian.org]



Re: lintian groff-message warning "can't set the locale"

2020-10-26 Thread Felix Lechner
Hi Nick,

On Mon, Oct 26, 2020 at 5:11 AM Nick Black wrote:
>
> C.UTF-8 sounds like the right way to go.

As noted in the issue tracker [1], Lintian already sets LC_ALL to
C.UTF-8 [2] in a sanitized environment, but we do not currently set
LANG. That would have been my next step, except these issues do not
occur in a clean chroot for unstable and are therefore more likely
related to Salsa or Salsa CI.
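
For readers unfamiliar with the idea, a sanitized environment of this kind
might look like the following sketch. It is written in Python for
illustration and is not Lintian's code (Lintian is written in Perl); the
function names and the choice to keep only PATH are assumptions.

```python
import os
import subprocess
import sys


def sanitized_env():
    """Build a minimal environment: keep PATH, force LC_ALL, leave LANG unset."""
    return {
        "PATH": os.environ.get("PATH", "/usr/bin:/bin"),
        "LC_ALL": "C.UTF-8",
    }


def run_sanitized(argv):
    """Run a command with only the sanitized environment visible to it."""
    return subprocess.run(
        argv, env=sanitized_env(), capture_output=True, text=True, check=True
    )


def show_child_locale():
    """Report what a child process sees for LC_ALL and whether LANG is set."""
    code = "import os; print(os.environ.get('LC_ALL'), 'LANG' in os.environ)"
    return run_sanitized([sys.executable, "-c", code]).stdout.strip()
```

Calling show_child_locale() should report LC_ALL as C.UTF-8 and LANG as
absent, which matches the setup described above.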

Kind regards
Felix Lechner

[1] https://salsa.debian.org/salsa-ci-team/pipeline/-/issues/182
[2] https://salsa.debian.org/lintian/lintian/-/blob/master/checks/documentation/manual.pm#L281



Re: lintian groff-message warning "can't set the locale"

2020-10-26 Thread gregor herrmann
On Mon, 26 Oct 2020 11:47:37 +0000, Simon McVittie wrote:

> Minimal container/chroot environments, and in particular the official
> Debian buildds, will normally only have C and C.UTF-8. See src:gtk+4.0
> for an example of how to generate additional locales on-demand if your
> unit tests need them.

Alternatively, build-depending on locales-all usually also works
(benefit: no manual meddling with locales, cost: installation size).
 
Cheers,
gregor

-- 
 .''`.  https://info.comodo.priv.at -- Debian Developer https://www.debian.org
 : :' : OpenPGP fingerprint D1E1 316E 93A7 60A8 104D  85FA BB3A 6801 8649 AA06
 `. `'  Member VIBE!AT & SPI Inc. -- Supporter Free Software Foundation Europe
   `-   NP: Davy Graham: Hornpipe For Harpsichord Playes Upon Guitar




Re: lintian groff-message warning "can't set the locale"

2020-10-26 Thread Nick Black
Simon McVittie left as an exercise for the reader:
> If you care about portability to non-Debian systems, note that C.UTF-8 is
> a somewhat popular extension (I think it originated in the Fedora/Red Hat
> family before it was adopted by Debian and other distros) but is far from
> universally available. In particular, I'm aware of Arch Linux specifically
> *not* having it. The glibc maintainers consider the implementation used
> in e.g. Fedora and Debian to be a hack rather than something they want to
> maintain forever, but my understanding is that they would be willing to
> accept a better implementation.

As I "need" this only within the Debian Salsa CI (and only to
deal with this groff lintian warning, which it sounds like will
be handled another way), a Debian-specific solution would be
fine =]. Thanks for the details -- C.UTF-8 sounds like the right
way to go.

-- 
nick black -=- https://www.nick-black.com
to make an apple pie from scratch,
you need first invent a universe.




Re: lintian groff-message warning "can't set the locale"

2020-10-26 Thread Simon McVittie
On Mon, 26 Oct 2020 at 00:35:45 -0400, Nick Black wrote:
> Thanks for the quick response, Felix. You say that "[you] will
> probably start setting $LANG in that part of Lintian." What LANG
> will you be using? Attempting to set LANG=en_US.UTF-8 in my
> salsa ci variables resulted in setlocale(3) failing all over the
> place, presumably due to the locale not having been generated.

C.UTF-8 is available on all Debian systems. It's the standard C/POSIX
locale, except that in the C locale the meaning of bytes 0x80-0xFF is
undefined, while in C.UTF-8 they are assumed/defined to be part of a
character encoded in UTF-8.
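
A loose analogy in Python, using its ascii and utf-8 codecs as stand-ins
for the two character sets. This does not exercise the C library's locale
machinery; it only shows why bytes at or above 0x80 need an agreed
encoding. The helper name is made up for the example.

```python
# The UTF-8 encoding of U+00E9 (e with acute accent): both bytes are >= 0x80.
DATA = b"\xc3\xa9"


def decodes_as(data, encoding):
    """Return True if `data` is valid text in `encoding`."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False
```

Here decodes_as(DATA, "ascii") is False, because ASCII assigns no meaning
to those bytes, while decodes_as(DATA, "utf-8") is True.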

If you care about portability to non-Debian systems, note that C.UTF-8 is
a somewhat popular extension (I think it originated in the Fedora/Red Hat
family before it was adopted by Debian and other distros) but is far from
universally available. In particular, I'm aware of Arch Linux specifically
*not* having it. The glibc maintainers consider the implementation used
in e.g. Fedora and Debian to be a hack rather than something they want to
maintain forever, but my understanding is that they would be willing to
accept a better implementation.

en_US.UTF-8 is indeed not portable. Some OSs (Fedora, I think?) always
generate the en_US.UTF-8 locale regardless of any other configuration
that might exist, but Debian does not: if you chose a non-English locale
like fr_FR.UTF-8 or a non-American English locale like en_GB.UTF-8 during
installation, then you will normally only have three locales: your chosen
national locale plus the international locales C and C.UTF-8.

Minimal container/chroot environments, and in particular the official
Debian buildds, will normally only have C and C.UTF-8. See src:gtk+4.0
for an example of how to generate additional locales on-demand if your
unit tests need them.
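
When generating locales is more than a test needs, another option is to
probe for a usable one and fall back. The sketch below is an assumption
about how a test harness might do that in Python, not what gtk+4.0 does,
and the function name is hypothetical.

```python
import locale


def first_available_locale(candidates):
    """Return the first locale name setlocale() accepts, or None.

    The process locale is restored before returning.
    """
    saved = locale.setlocale(locale.LC_ALL)  # query the current setting
    try:
        for name in candidates:
            try:
                locale.setlocale(locale.LC_ALL, name)
            except locale.Error:
                continue  # not generated on this system, try the next one
            return name
        return None
    finally:
        locale.setlocale(locale.LC_ALL, saved)
```

A harness could call first_available_locale(["en_US.UTF-8", "C.UTF-8",
"C"]) and skip locale-dependent tests when the preferred entry is missing.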

Third-party software from outside Debian frequently assumes that the
en_US.UTF-8 locale does exist - in particular, it's common enough for
Steam games to want it to exist that Steam's diagnostic tool now checks
for it. This is mostly because it's semi-frequently (ab)used as a way
to parse and serialize C-syntax floating point in programming languages
or configuration files without getting confused by non-English decimal
points (e.g. 1.23 in English locales is 1,23 in French locales, which
means a naive implementation might write {"x": 1,23, "y": 4,56} into a
JSON file, which is of course a syntax error).
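
The failure is easy to reproduce without installing a French locale. In
this sketch the "naive" output is simulated by swapping the decimal point
for a comma, which is an assumption standing in for a locale-sensitive
formatter; the function names are made up.

```python
import json


def naive_localized_json(values):
    """Simulate a writer that formats numbers with a decimal comma."""
    items = ", ".join(
        '"{}": {}'.format(key, str(number).replace(".", ","))
        for key, number in values.items()
    )
    return "{" + items + "}"


def is_valid_json(text):
    """Return True if `text` parses as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False
```

For {"x": 1.23, "y": 4.56} the naive writer produces the broken text
quoted above, which is_valid_json() rejects, whereas json.dumps() always
writes a decimal point regardless of locale.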

The portable way to read/write configuration files and C-like source
code is to avoid the POSIX locale-sensitive functions completely,
and use something like GLib's g_ascii_strtod() or CPython's
PyOS_string_to_double() (lots of libraries and frameworks will have an
equivalent; those are just the ones I'm most familiar with). This also
has the advantage of being thread-safe, unlike temporarily switching the
POSIX locale with setlocale(), which is normally process-wide and
therefore not thread-safe.
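
At the Python level, the built-in float() is one such locale-independent
parser (as far as I know, CPython implements it on top of
PyOS_string_to_double()), whereas locale.atof() follows the current
LC_NUMERIC setting. A minimal sketch, with parse_c_float as a hypothetical
wrapper name:

```python
def parse_c_float(text):
    """Parse a number written with '.' as the decimal point.

    The result does not depend on the process locale.
    """
    try:
        return float(text)
    except ValueError:
        raise ValueError("not a C-syntax number: {!r}".format(text)) from None
```

parse_c_float("1.23") gives 1.23 in any locale, and "1,23" is rejected
rather than silently misread. Note that float() accepts a few spellings
beyond strict C syntax, such as surrounding whitespace.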

Another correct way to do this since POSIX.1-2008 is to use POSIX
uselocale() and the C locale, but that's unlikely to be portable
to Windows or to exotic Unix implementations, so widely-portable
software generally ends up having to reinvent something equivalent to
g_ascii_strtod() anyway.

smcv



Re: lintian groff-message warning "can't set the locale"

2020-10-25 Thread Nick Black
Felix Lechner left as an exercise for the reader:
> It's not a problem with your package. Lintian's own pipeline is
> likewise affected, even though our test suite completes fine in an
> unstable chroot. The issue is being tracked here:
> https://salsa.debian.org/salsa-ci-team/pipeline/-/issues/182

Thanks for the quick response, Felix. You say that "[you] will
probably start setting $LANG in that part of Lintian." What LANG
will you be using? Attempting to set LANG=en_US.UTF-8 in my
salsa ci variables resulted in setlocale(3) failing all over the
place, presumably due to the locale not having been generated.

-- 
nick black -=- https://www.nick-black.com
to make an apple pie from scratch,
you need first invent a universe.




Re: lintian groff-message warning "can't set the locale"

2020-10-25 Thread Felix Lechner
Hi Nick,

On Sun, Oct 25, 2020 at 6:23 PM Nick Black wrote:
>
> Is this due to having non-ASCII UTF-8 characters in my man
> pages?

It's not a problem with your package. Lintian's own pipeline is
likewise affected, even though our test suite completes fine in an
unstable chroot. The issue is being tracked here:
https://salsa.debian.org/salsa-ci-team/pipeline/-/issues/182

Kind regards
Felix Lechner