[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2019-04-07 Thread Nick Coghlan


Nick Coghlan  added the comment:

I'm inclined to leave 3.7 alone unless/until we get an actual bug report for 
the current behaviour - if an embedder finds that it just straight up doesn't 
work for them, then they can either resort to tinkering with the private APIs 
(since they won't change until Python 3.8, and at that point we expect to have 
a public API available), or else stick with Python 3.6 until later this year 
when Python 3.8 is available.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2019-04-05 Thread STINNER Victor


STINNER Victor  added the comment:

I fixed the issue in Python 3.8 with bpo-36443:

commit d929f1838a8fba881ff0148b7fc31f6265703e3d
Author: Victor Stinner 
Date:   Wed Mar 27 18:28:46 2019 +0100

bpo-36443: Disable C locale coercion and UTF-8 Mode by default (GH-12589)

I'm not sure if it would be accceptable to the behavior in Python 3.7.x, 
especially without PEP 587 which targets Python 3.8.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-12-19 Thread Nick Coghlan


Nick Coghlan  added the comment:

I still think the current Python 3.7 behaviour makes the CPython runtime a 
badly behaved C library, since we may end up changing the active libc locale 
even when it isn't the main application, and a change hasn't been explicitly 
requested by the embedding app.

However, the only way to encounter that misbehaviour is to be running a 
Unicode-unfriendly embedding application that leaves the C locale in place on a 
Unicode-unfriendly operating system that doesn't set a more sensible locale in 
the first place.

I've also gone through a few iterations on 
https://github.com/python/cpython/pull/9257 now, and respecting the -E and -I 
options *without* deferring locale coercion until the same point where UTF-8 
mode is checked for turns out to require duplication of an irritatingly large 
amount of argument processing code (as even if all you're checking for is two 
options that don't take arguments, you still need to correctly skip over all 
the other permitted arguments, stop at the arguments that terminate the python 
option list, and find the options of interest even when they're in a combined 
option string like "-vI").

Accordingly, I no longer think it's worth pursuing this change purely to 
assuage my sense of responsibility to developers of applications embedding the 
Python runtime - instead, we can leave the Python 3.7.0 solution in place for 
3.8 as well, until such time as we have embedding application authors actually 
reporting bug reports against the Python 3.7 behaviour.

To be completely honest, I expect the odds of that actually happening in 
practice to be incredibly low, and even if it does happen, our answer may well 
be to point to the in-development multi-phase configuration API rather than 
allowing it to be disabled when using Py_Initialize() and/or Py_Main().

--
resolution:  -> postponed
stage: patch review -> resolved
status: open -> closed
type:  -> behavior

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-11-28 Thread Kubilay Kocak


Change by Kubilay Kocak :


--
nosy: +koobs

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-11-24 Thread STINNER Victor


STINNER Victor  added the comment:

C locale coercion is just one parameter used to select the Python
filesystem enconding. There are many others which are read later during
Python initialization, and they are inter-dependent.

https://github.com/python/cpython/blob/7f4ba4afd47f21f61de9035544809fc67d136f35/Include/coreconfig.h#L69

I don't see which problem you are trying to solve here.

For me, PYTHONCOERCECLOCALE environment variable must be ignored when -I or
-E is used, for security reasons and to get more deterministic behavior. So
I am opposed to read/enable C locale coercion earlier.

That's also why PYTHONMALLOC only change the memory allocator late in the
Python initialisation, whereas it would be simpler to read/apply the option
earlier.

For the FreeBSD CURRENT bug, I recently modified Python to use ASCII if
"force ASCII" mode is enabled. I made this change for... HP-UX! On FreeBSD,
the filesystem encoding was already ASCII in practice (FreeBSD uses Latin1
but announes ASCII!?). I am not surprised to see bugs on specific
configuration with special option (disable UTF-8 mode) :-)

I am fixing Unicode bugs on weird cased before 3.0 and we are still not
done! Recently I fixed a bug in localeconv() when LC_MONETARY uses a
different encoding than LC_CTYPE... Such config sounds like a bad choice,
but well, it "works" on Python2...

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-11-24 Thread Nick Coghlan


Nick Coghlan  added the comment:

If I've understood https://bugs.python.org/issue35290 and its resolution in 
https://github.com/python/cpython/commit/f6e323ce322cf54b1a9e9252b13f93ebc28b5c24
 correctly, that's a concrete example of why I consider the early coercion 
approach that avoids any kind of redecoding dance significantly more robust 
against issues arising from future changes in the CPython startup sequence.

I'm away for AWS Re:Invent until Dec 2, but after I get back, I'll finally sit 
down and take one last pass over this to see if I can come up with a version 
that Victor is actually willing to review and approve.

--
nosy: +vstinner

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-11-06 Thread Nick Coghlan


Nick Coghlan  added the comment:

The discussion with Victor on https://bugs.python.org/issue34914 highlighted 
the fact that it's OK to use 8-bit string comparisons to check for "-E", "-I", 
and a "-X coerce_legacy_c_locale=0" due to the fact that all encodings used as 
locale encodings are sufficiently ASCII compatible for that to work as desired.

So before proceeding with merging https://github.com/python/cpython/pull/9257, 
I'm going to review that possibility, and see how much code it would actually 
add to support an extra internal helper API like:

int _Py_LegacyLocaleCoercionEnabled(int argv, char* argv[]);

That would then inspect the unprocessed 8-bit command line arguments, as well 
as the process environment, to determine whether or not locale coercion should 
be attempted, making the complete dance:

_Py_SetLocaleFromEnv(LC_CTYPE);
if (_Py_LegacyLocaleDetected() && _Py_LegacyLocaleCoercionEnabled(argc, 
argv))
{
_Py_CoerceLegacyLocale(_locale_coercion);
}

If we wanted to officially expose this for embedding apps, the API would need a 
bit more thought, so it probably makes sense to wait and see if the 
nominally-private-but-exposed-to-the-linker approach is good enough in practice.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-21 Thread STINNER Victor


Change by STINNER Victor :


--
nosy:  -vstinner

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-21 Thread Nick Coghlan


Nick Coghlan  added the comment:

Correct - it won't change anything from 3.7.0, and even the original 
discrepancies relative to PEP 538 only affect applications that:

1. Are embedding a CPython runtime
2. *Aren't* already ensuring that they're running in a locale other than the C 
locale
3. Aren't able to set PYTHONCOERCECLOCALE=0 or LC_ALL=C in the environment 
prior to calling Py_Initialize() or Py_Main() in order to disable the locale 
coercion

(Because of the high quality of Victor's work on the UTF-8 mode implementation, 
it's practically impossible to externally distinguish between the behaviour of 
late coercion and the intended early coercion in the CPython CLI itself)

--
priority: release blocker -> normal

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-20 Thread Ned Deily


Ned Deily  added the comment:

Thanks, Victor and Nick, for the work.  With the revert merged (GH-9430), this 
issue is no longer a release blocker for 3.7.1, correct?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-20 Thread Nick Coghlan


Nick Coghlan  added the comment:

(Also, the patch is currently still written on the assumption that it won't be 
backported to 3.7.1. I haven't checked if it would apply cleanly to 3.7.x, but 
I suspect not, given the magnitude of the startup changes targeting 3.8.0 since 
3.7.0 was released)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-20 Thread Nick Coghlan


Nick Coghlan  added the comment:

https://github.com/python/cpython/pull/9257 should be complete now, although I 
may have gone overboard on the documentation updates (as it seems to be 
unclear, at least to Victor, why PYTHONCOERCECLOCALE exists in the first place)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-19 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset 06e7608207daab9fb82d13ccf2d3664535442f11 by Victor Stinner in 
branch 'master':
Revert "bpo-34589: Add -X coerce_c_locale command line option (GH-9378)" 
(GH-9430)
https://github.com/python/cpython/commit/06e7608207daab9fb82d13ccf2d3664535442f11


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-19 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +8847

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-19 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset 95cc3ee00cfa079751ae2bb9a8d3387053b50489 by Victor Stinner in 
branch '3.7':
Revert "[3.7] bpo-34589: Add -X coerce_c_locale option; C locale coercion off 
by default (GH-9379)" (GH-9416)
https://github.com/python/cpython/commit/95cc3ee00cfa079751ae2bb9a8d3387053b50489


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-19 Thread STINNER Victor


STINNER Victor  added the comment:

The issue is now discussed on python-dev. My latest email:
https://mail.python.org/pipermail/python-dev/2018-September/155242.html

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-19 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +8836

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-19 Thread Nick Coghlan


Nick Coghlan  added the comment:

(The other reason this change should be reverted is because it added a new 
feature to Python 3.7.1, even though in the earlier discussion we had agreed to 
leave the as-shipped implementation in 3.7.0 alone for the rest of the Python 
3.7.x releases, and only discuss how we wanted it to work in Python 3.8, 
precisely so that we could have the discussion on a timeline that was amenable 
to me, without impacting the 3.7.1 release schedule)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-19 Thread Nick Coghlan


Nick Coghlan  added the comment:

test_c_locale_coercion *did* test the -E and -I options, by running everything 
in isolated mode. This was only broken by Victor's changes to the test suite 
(which broke locale coercion in isolated mode, and hence broke the tests).

As for why PYTHONCLOCALECOERCION=0 and PYTHONCLOCALECOERCION=warn exist even 
though I don't actually want people to use them, it's because they're a 
potentially necessary debugging tool if you're seeing an issue that you suspect 
may be due to locale coercion working on one platform (e.g Fedora), and not 
working on another (e.g. CentOS 7). You can't easily hide the C.UTF-8 locale 
from the Python interpreter on Fedora, but you *can* tell the interpreter not 
to use it.

UTF-8 mode is different, as folks may want to opt in to that regardless of 
their current locale setting.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-19 Thread Nick Coghlan


Nick Coghlan  added the comment:

(Note: I won't have time to work on this myself until this weekend at the 
earliest)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-19 Thread Nick Coghlan


Nick Coghlan  added the comment:

The only reason this got through my original review was because PEP 540 was 
implemented when I was going through a personal situation that lead to me 
quitting my job cold without a new one to go to, so my review of Victors 
changes to the PEP 538 implementation was pretty cursory.

Now that I've finally reviewed it properly, I can see those changes were 
fundamentally misguided, and completely missed the intent and purpose of the 
PEP - it's deliberately designed to work completely differently from PEP 540, 
but Victor's now turned it into a pale shadow of the latter. (Victor's patch 
even had to change the locale coercion tests because it broke the isolated mode 
behaviour described in the PEP)

I was trying to be nice about it, but given that Victor is starting to just go 
ahead and change things further, I'm going to go with the hardline veto option: 
make PEP 538 work the it is described in the PEP and stop trying to change it 
on an arbitrary whim without writing a new PEP to overrule the original one.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-19 Thread Nick Coghlan


Nick Coghlan  added the comment:

3.7.1 should not ship until PEP 538 is once more implemented as documented, 
without Victor's personal editorialising and feature additions to a maintenance 
release.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-19 Thread Nick Coghlan


Nick Coghlan  added the comment:

Victor, no - you've completely broken PEP 538 now.

Please just give up, and implement the PEP as written, and stop trying to use 
your copious amounts of available development time to railroad me.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-18 Thread Ned Deily


Change by Ned Deily :


--
versions: +Python 3.7

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-17 Thread STINNER Victor


STINNER Victor  added the comment:

test_c_locale_coercion didn't test the -E option: I wrote PR 9382 to test it.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-17 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +8806

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-17 Thread Ned Deily


Ned Deily  added the comment:

I'm marking this as a 3.7.1 "release blocker" until Nick has had a chance to 
review the merged PR and he and Victor are in agreement that this is a 
satisfactory approach for 3.7.1.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-17 Thread Ned Deily


Change by Ned Deily :


--
priority: normal -> release blocker

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-17 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset 144f1e2c6f4a24bd288c045986842c65cc289684 by Victor Stinner in 
branch '3.7':
[3.7] bpo-34589: Add -X coerce_c_locale option; C locale coercion off by 
default (GH-9379)
https://github.com/python/cpython/commit/144f1e2c6f4a24bd288c045986842c65cc289684


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-17 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +8803

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-17 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset dbdee0073cf0b88fe541980ace1f650900f455cc by Victor Stinner in 
branch 'master':
bpo-34589: Add -X coerce_c_locale command line option (GH-9378)
https://github.com/python/cpython/commit/dbdee0073cf0b88fe541980ace1f650900f455cc


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-17 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +8802

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-17 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset 7a0791b6992d420dc52536257f2f093851ed7215 by Victor Stinner in 
branch 'master':
bpo-34589: C locale coercion off by default (GH-9073)
https://github.com/python/cpython/commit/7a0791b6992d420dc52536257f2f093851ed7215


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-17 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset 188ebfa475a6f6aa2d0ea14ca8e1fbe7865b6d27 by Victor Stinner in 
branch 'master':
bpo-34589: Make _PyCoreConfig.coerce_c_locale private (GH-9371)
https://github.com/python/cpython/commit/188ebfa475a6f6aa2d0ea14ca8e1fbe7865b6d27


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-17 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +8794

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-17 Thread STINNER Victor


STINNER Victor  added the comment:

> I'd strongly prefer to just go back to the PEP 538 design. It's much simpler 
> to implement, we don't actually want anyone turning off locale coercion 
> except for debugging purposes (unlike UTF-8 mode), and the only argument 
> against doing this the way PEP 538 describes is a purist one, not a practical 
> one (which was already resolved in favour of practicality when PEP 538 was 
> accepted).
> (...)
> That's quite a bit of code to add for the sake of a flag we don't really want 
> anyone to ever use, though. (If it hadn't been for the 
> debugging-CentOS-7-problems-on-Fedora issue, I doubt I would have included 
> the off switch in PEP 538 at all)

Why did you add PYTHONCOERCECLOCALE=warn if you don't want that people use it?

> The actual functional error is that the following will currently give 
> different outputs on Fedora and CentOS 7, whereas in the original PEP 538 
> implementation it would always print "C", even if locale coercion would 
> otherwise normally work on your current platform:
>
>LC_CTYPE=C PYTHONCOERCECLOCALE=0 python3 -E -c 'import os; 
> print(os.env["LC_CTYPE"])'

You wrote that we need PYTHONCOERCECLOCALE=0 and then that we don't need it, I 
don't understand :-)

IMHO there are use cases where PYTHONCOERCECLOCALE=0 is important, that's why I 
propose to add -X coerce_c_locale=0.

There are also cases where you want the warning, so -X coerce_c_locale=warn 
would help as well, when -E or -I is needed.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-16 Thread Nick Coghlan


Nick Coghlan  added the comment:

I'd strongly prefer to just go back to the PEP 538 design. It's much simpler to 
implement, we don't actually want anyone turning off locale coercion except for 
debugging purposes (unlike UTF-8 mode), and the only argument against doing 
this the way PEP 538 describes is a purist one, not a practical one (which was 
already resolved in favour of practicality when PEP 538 was accepted).

However, if you were willing to do the updates on my PR branch to implement it, 
then I'd accept an alternative that added a pass through the command line 
arguments that checks for at least one of the two following cases after a 
legacy locale has been detected:

- the new option "-X coerce_c_locale=0" (ASCII) is present in the arg list
- neither "-E" nor "-I" (ASCII) are present in the arg list, and 
PYTHONCOERCECLOCALE is not set to zero

The code that sets warn_on_c_locale in the core config would then look for `-X 
coerce_c_locale=warn` in addition to looking for `PYTHONCOERCECLOCALE=warn`.

That's quite a bit of code to add for the sake of a flag we don't really want 
anyone to ever use, though. (If it hadn't been for the 
debugging-CentOS-7-problems-on-Fedora issue, I doubt I would have included the 
off switch in PEP 538 at all)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-13 Thread STINNER Victor


STINNER Victor  added the comment:

> LC_CTYPE=C PYTHONCOERCECLOCALE=0 python3 -E ...

The UTF-8 Mode has the same behavior: PYTHONUTF8 env var is ignored when using 
-E, but enabled by the C locale: you can use -X utf8=0 in that case to ensure 
that the UTF-8 Mode is disabled.

I wanted to propose you to add a new -X option, ex: -X coerce_c_locale=0, which 
would have the priority over the environment variable. I'm not a big fan of env 
vars, it's not always convenient to use them. I prefer an env var, but both are 
even better :-)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-13 Thread Nick Coghlan


Nick Coghlan  added the comment:

Alternate PR is up at https://github.com/python/cpython/pull/9257

There's one setting that remains in CoreConfig: "warn_on_c_locale", which 
_Py_UnixMain also uses to decide whether or not to actually emit the passed in 
locale coercion warning.

The coercion warning information itself moves out of CoreConfig, and into the 
private _PyMain helper struct.

The signature of _Py_CoerceLegacyLocale changed to both return an integer 
indicating whether or not coercion took place, as well as supporting two output 
variables that allow the display of the coercion warning message to be deferred 
(or skipped entirely).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-13 Thread Nick Coghlan


Change by Nick Coghlan :


--
pull_requests: +8689

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-13 Thread Nick Coghlan


Nick Coghlan  added the comment:

The actual functional error is that the following will currently give different 
outputs on Fedora and CentOS 7, whereas in the original PEP 538 implementation 
it would always print "C", even if locale coercion would otherwise normally 
work on your current platform:

LC_CTYPE=C PYTHONCOERCECLOCALE=0 python3 -E -c 'import os; 
print(os.env["LC_CTYPE"])'


As per the comment that was deleted in 
https://github.com/python/cpython/commit/9454060e84a669dde63824d9e2fcaf295e34f687#diff-8c018c3ada66d06c8e101e47a313c2c7,
 that was an intentional behaviour to allow developers testing scripts that are 
invoked using -E and -I to pretend that their local C.UTF-8 locale didn't exist 
when trying to reproducing behaviour otherwise seen only when locale coercion 
fails due to the lack of a suitable coercion target.

I'm working on an alternate PR that restores that aspect of the original 
behaviour, while preserving all the other improvements you made (such as 
ensuring that emitting the warning gets delayed until after the C level stderr 
is properly initialised)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-13 Thread Nick Coghlan


Nick Coghlan  added the comment:

Reviewing the way things are now, I don't think it's worth rearranging this yet 
again for 3.7 - instead, I'll just put it back the way I intended it to be (and 
the way the PEP describes) for 3.8.

The issue is that Victor had good reasons for moving the handling of the 
"PYTHONCOERCECLOCALE=warn" case to be after the low level C stdio streams were 
properly initialised, and retaining that aspect of the changes while still 
moving the C locale coercion back to the beginning of the program means 
changing the signature of _Py_CoerceLegacyLocale to return a stateless static 
string to be printed to stderr later.

--
priority: release blocker -> normal
versions:  -Python 3.7

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-11 Thread STINNER Victor

STINNER Victor  added the comment:

Example of C locale coercion with Python 3.7:

$ env -i PYTHONCOERCECLOCALE=0 ./python -X utf8=0 -c 'import sys; 
print(ascii(sys.argv[1]))' 'hé'
'h\udcc3\udca9'

$ env -i PYTHONCOERCECLOCALE=1 ./python -X utf8=0 -c 'import sys; 
print(ascii(sys.argv[1]))' 'hé'
'h\xe9'

It works as expected no? What's wrong with the current implementation according 
to you, Nick?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-11 Thread STINNER Victor


STINNER Victor  added the comment:

> (...) The documentation could definitely be much clearer on that point 
> though, as even in the PEP it's only implied by the final paragraph in 
> https://www.python.org/dev/peps/pep-0538/#legacy-c-locale-coercion-in-the-standalone-python-interpreter-binary
>  which is mostly talking about the fact that this particular environment 
> variable is still checked, even if you pass the -I or -E command line options.

That's a separated issue: I created bpo-34639.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-11 Thread STINNER Victor


STINNER Victor  added the comment:

> (...) the locale should be coerced before *ANY* calls are made to 
> Py_DecodeLocale, as the whole point of the architecture of the PEP 
> implementation was to ensure that *nothing ever gets decoded incorrectly in 
> the first place*.

See my comment:
https://github.com/python/cpython/pull/9073#issuecomment-420461537

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-11 Thread Nick Coghlan


Nick Coghlan  added the comment:

FWIW, this is also why I don't want PEP 432 to expose any wstr fields in the 
configuration settings: it means embedding applications have to figure out 
which encoding Python is expecting to be used for those fields.

Everything should be much cleaner if the public API requires those settings to 
be provided as Py_Unicode objects, and the core configuration step takes care 
of getting the C API to the point where the embedding application can create 
Py_Unicode objects for passing to later steps.

The fact that CPython itself currently uses wstr to configure these settings 
should be considered a legacy implementation detail that we do our best to hide 
from the public API.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-11 Thread Nick Coghlan


Nick Coghlan  added the comment:

(The one exception to "nothing gets decoded incorrectly" is that 
PYTHONCOERCECLOCALE itself is always interpreted as an ASCII field: the values 
that it checks for are actually ASCII byte sequences, not Unicode code points.

The documentation could definitely be much clearer on that point though, as 
even in the PEP it's only implied by the final paragraph in 
https://www.python.org/dev/peps/pep-0538/#legacy-c-locale-coercion-in-the-standalone-python-interpreter-binary
 which is mostly talking about the fact that this particular environment 
variable is still checked, even if you pass the -I or -E command line options.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-11 Thread Nick Coghlan


Nick Coghlan  added the comment:

The entire change affecting the PEP 538 implementation in 
https://github.com/python/cpython/commit/9454060e84a669dde63824d9e2fcaf295e34f687#diff-8c018c3ada66d06c8e101e47a313c2c7
 needs to be reverted: the locale should be coerced before *ANY* calls are made 
to Py_DecodeLocale, as the whole point of the architecture of the PEP 
implementation was to ensure that *nothing ever gets decoded incorrectly in the 
first place*.

I just initially missed that in Victor's enthusiasm for fixing the incorrect 
decodings that can arise in the absence of locale coercion he'd *introduced* 
incorrect decoding into the locale coercion case.

--
nosy: +ncoghlan, ned.deily
priority: normal -> release blocker

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-05 Thread STINNER Victor


Change by STINNER Victor :


--
keywords: +patch
pull_requests: +8532
stage:  -> patch review

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34589] Py_Initialize() and Py_Main() should not enable C locale coercion

2018-09-05 Thread STINNER Victor


New submission from STINNER Victor :

According to Nick Coghlan, author of the PEP 538, the C locale coercion should 
not be enabled by Py_Initialize() and Py_Main(), only by the python3 program.

Attached PR fix this issue.

--
components: Interpreter Core
messages: 324649
nosy: vstinner
priority: normal
severity: normal
status: open
title: Py_Initialize() and Py_Main() should not enable C locale coercion
versions: Python 3.7, Python 3.8

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com