On 2017-01-03 15:13:23, Daniel Kahn Gillmor wrote:
> On Tue 2017-01-03 14:20:43 -0500, anarcat wrote:
>> I'm happy to follow whatever upstream decides, but I'd like to point out
>> that this is not just a feature request ("non-ASCII wordlist", which can
>> be supported fine even if we go back to py2 btw), but an actual bug
>> ("fails to work").
>
> "fails to work when the user explicitly sets LANG=C on a program that
> deals with human-readable text in 2017" :)

I think this is inaccurate: users do not need to explicitly set LANG=C,
it is the default when unset. :)

>> C.UTF-8 is not necessarily available on all Debian systems, let alone the
>> default. In fact, I believe the default locale, on Debian systems, is
>> still C. Having our package fail to work in that locale breaks the
>> Principle Of Least Astonishment.
>
> I don't think this is the case, but i could be wrong.  What makes you
> think that this is true?

primarily out of gut feeling: i commonly get pushback from people not
running unicode locales when i report bugs triggered by my funny name.

but also, i have a debian wheezy VM here created with vmdebootstrap (so
fairly plain). here's the locale:

root@debian:~# echo $LANG


i.e. not set. i believe that a minimal Debian install will not set LANG
if you log in through the console, nor if you log in through
SSH. graphical environments typically set that variable, but we can't
assume the users will have a display manager to set that up.

i believe we may be conflating "having C.UTF-8 *available*" with "having
LANG set to some UTF-8 locale". it may be true that most systems will
have a UTF-8 locale available (although I question even that, given my
experience with this VM), but i am pretty certain that we can't assume
LANG will be properly set.
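
to make the "available" vs "set" distinction concrete, here is a minimal
sketch in python of how the *requested* locale is resolved from the
environment, following the usual POSIX precedence (LC_ALL, then LC_CTYPE,
then LANG, falling back to "C"). the function names are made up for this
mail and are not part of wormhole or click:

```python
def effective_ctype_locale(env):
    """Return the locale name LC_CTYPE resolves to for the mapping `env`.

    POSIX precedence: LC_ALL, then LC_CTYPE, then LANG. Unset or empty
    variables are skipped, and "C" is the fallback when none is set.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"


def looks_utf8(locale_name):
    """Rough check of the codeset part of a name such as en_US.UTF-8."""
    codeset = locale_name.partition(".")[2].partition("@")[0]
    return codeset.lower().replace("-", "") == "utf8"
```

calling effective_ctype_locale(os.environ) on a system like the VM above
would give "C". note that this only tells us what is requested: whether
the named locale has actually been generated on the system is a separate
question that this sketch does not answer.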

> the default for debian systems is to install a task-$LANGUAGE package
> based on the choice made during d-i, and configure a sensible locale.
> C.UTF-8 is always available.

in this (vm)debootstrap-built chroot here, it is not the case:
task-language is not installed, and the C.UTF-8 locale is not
configured. and even if it was, it is not necessarily set.

root@debian:~# apt-cache policy task-english
task-english:
  Installed: (none)
  Candidate: 3.14.1
  Version table:
     3.14.1 0
        500 http://httpredir.debian.org/debian/ wheezy/main amd64 Packages

d-i is not the only way to install debian...

(i'd be curious to see if debirf actually sets that up correctly, btw ;)

> I do note that when LANG is completely unset, we see the same failure,
> even though C.UTF-8 is available.  In that case, i'd recommend that we
> just explicitly set LANG=C.UTF-8 (within wormhole) to work around
> python-click's idiosyncrasies on py3.

i think that's not necessarily a good idea: this is *exactly* the kind
of stuff the python-click warning is there for - to avoid assuming any
sort of encoding or locale, and forcing the user to decide on it.

by setting the locale, we are basically ignoring the warning, and we
might as well just catch the exception and/or silence it (which is
possible with monkeypatching).
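
for reference, the condition behind that warning boils down, as far as i
can tell, to "does the preferred encoding resolve to the ascii codec?".
the following is a rough python sketch of that kind of check, not click's
actual code; the function names are hypothetical, and treating an unknown
encoding name as ASCII is an assumption of mine:

```python
import codecs
import locale


def is_ascii_codec(encoding_name):
    """True when `encoding_name` is an alias of Python's ascii codec."""
    try:
        return codecs.lookup(encoding_name).name == "ascii"
    except LookupError:
        # assumption: an unknown encoding name is treated like plain ASCII
        return True


def preferred_encoding_is_ascii():
    """Apply the check to whatever the current environment resolves to."""
    return is_ascii_codec(locale.getpreferredencoding())
```

under the C locale, glibc reports the codeset as ANSI_X3.4-1968, which
python maps to the ascii codec, so the check fires. a latin1 locale does
not trigger it.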

> But if the user deliberately sets LANG=whatever to something
> non-unicode, i don't think it's unreasonable for wormhole to decline to
> work in that environment if one of its dependencies is dependent on a
> UTF-8 locale.

as we have seen, the problem is not only when a user deliberately
configures a "wrong" locale, but also when no locale is configured,
which is a surprisingly common situation.

>> I still believe the simplest fix, in the short term, is to revert the
>> packaging back to Python2. We could (and should, anyways) provide both
>> python2 and python3 bindings for the magic-wormhole *libraries* and make
>> the binary use the python2 libraries until the click bug is fixed or
>> Debian defaults to a UTF-8 locale.
>
> why not just (a) fix the unset $LANG situation with a small patch, and

because that silences a real issue with python3-click that we do not
want to silence. click needs to be fixed, we shouldn't hide potential
errors like this.

what if the user is running under a latin1 locale that just happens to
work because it's an extension of ASCII? before you tell me how wrong
that sounds, consider that i have done exactly that for about a decade,
over various operating systems...
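
to illustrate why that "just happens to work": pure ASCII text encodes to
the same bytes under ascii, latin-1 and utf-8, and the encodings only
disagree once a non-ASCII character shows up. a small python sketch (the
helper name is made up for illustration):

```python
def encodings_agree(text, encodings=("ascii", "latin-1", "utf-8")):
    """True when `text` encodes to identical bytes under all `encodings`."""
    encoded = set()
    for enc in encodings:
        try:
            encoded.add(text.encode(enc))
        except UnicodeEncodeError:
            # at least one encoding cannot represent the text at all
            return False
    return len(encoded) == 1
```

an ASCII-only string passes this check, while a string containing
"\u00e9" (e with an acute accent) does not: latin-1 encodes that
character as the single byte 0xe9, utf-8 as the two bytes 0xc3 0xa9, and
ascii cannot encode it at all.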

> (b) tag the python-click bug as "affects: magic-wormhole" and leave it
> as is?

that sounds like a good idea in any case...

my bottom line on this bug is that wormhole is a file transfer program
that doesn't, a priori, have to specifically deal with locale
problems. garbage in, garbage out. i agree that if someone has the wrong
locale and/or passes corrupt data to wormhole, it should bail out
preemptively.

but in this case, there is a legitimate use case where no locale is
configured, or, actually, the C locale is configured (by default) and
only ASCII data is passed. we shouldn't bail out in that specific case
and i don't know of anything special in wormhole that should make it
crash in this way (apart, *maybe*, from weird filenames, but then again,
users get what they deserve in that special case).
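
in other words, the failure could be deferred to the point where some
piece of data really cannot be represented, instead of refusing to start.
a minimal python sketch of that policy, with a hypothetical helper name
(this is not how wormhole or click currently behave):

```python
def require_encodable(text, encoding):
    """Return `text` encoded with `encoding`, or raise ValueError.

    ASCII-only input passes under any ASCII-compatible encoding,
    including the plain ascii codec used by the C locale.
    """
    try:
        return text.encode(encoding)
    except UnicodeEncodeError as exc:
        raise ValueError(
            "cannot represent %r in %s (position %d)"
            % (text, encoding, exc.start)
        )
```

with this, require_encodable("hello.txt", "ascii") simply returns the
bytes, and only a filename with a non-ASCII character raises an error.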

a.

-- 
Some believe it is only great power that can hold evil in check, but
that is not what I have found. It is the small everyday deeds of
ordinary folk that keep the darkness at bay. Small acts of kindness and
love.                   - J.R.R. Tolkien
