[Python-Dev] Re: subinterpreters and their possible impact on large extension projects

2021-12-16 Thread Petr Viktorin

On 16. 12. 21 3:41, Jim J. Jewett wrote:

In Python 3.11, Python still implements around 100 types as "static
types", which are not compatible with subinterpreters. I opened
https://bugs.python.org/issue40601 about these static types, but it
seems like changing them may break the C API *and* the stable ABI (maybe
a clever hack will avoid that).


If sub-interpreters each need their own copy of even immutable built-in types, 
then what advantage do they have over separate processes?


They need copies of all *Python* objects. A non-Python library may allow 
several Python wrappers/proxies for a single internal object, 
effectively sharing that object between subinterpreters.
(Which is a problem for removing the GIL -- currently all operations 
done by such wrappers are protected by the GIL.)



[Python-Dev] Re: my plans for subinterpreters (and a per-interpreter GIL)

2021-12-16 Thread Petr Viktorin



On 16. 12. 21 2:54, Guido van Rossum wrote:
(I just realized that we started discussing details of immortal objects 
in the wrong thread -- this is Eric's overview thread, there's a 
separate thread on immortal objects. But oh well, I'll respond here below.)


On Wed, Dec 15, 2021 at 5:05 PM Neil Schemenauer wrote:


On 2021-12-15 2:57 p.m., Guido van Rossum wrote:


But as long as the imbalance is less than 0x_2000_0000, the
refcount will remain in the inclusive range [ 0x_4000_0000 ,
0x_7FFF_FFFF ] and we can test for immortality by testing a single
bit:

if (o->ob_refcnt & 0x_4000_0000)


Could we have a full GC pass reset those counts to make it even more
unlikely to get out of bounds?

Maybe, but so far these are all immutable singletons that aren't linked 
into the GC at all. Of course we could just add extra code to the GC 
code that just resets all these refcounts, but since there are ~260 
small integers that might slow things down more than we'd like. More 
testing is required. Maybe we can get away with doing nothing on 64-bit 
machines but we'll have to slow down a tad for 32-bit -- that would be 
acceptable (since the future is clearly 64-bit).


Allocating immortal objects from a specific memory region seems like
another idea worth pursuing.  It seems mimalloc has the ability to
allocate pools aligned to certain large boundaries. That takes some
platform specific magic.   If we can do that, the test for
immortality is pretty cheap.  However, if you can't allocate them at
a fixed region determined at compile time, I don't think you can
match the performance of the code above. Maybe it helps that you
could determine immortality by looking at the PyObject pointer and
without loading the ob_refcnt value from memory?  You would do
something like:

if (((uintptr_t)o) & _Py_immortal_mask)

The _Py_immortal_mask value would not be known at compile time but
would be a global constant.  So, it would be cached by the CPU.


Immortal objects should be allocated dynamically. AFAIK, determining 
whether something was malloc'd or not would need to be platform-specific.
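
For reference, the address-based test Neil describes could be spelled
like this (a sketch; _Py_immortal_mask is hypothetical and assumes the
masked bits uniquely identify the aligned arena):

    #include <stdint.h>

    /* Hypothetical: set once at startup from the address of the
     * mimalloc-style aligned arena holding immortal objects. */
    extern uintptr_t _Py_immortal_mask;

    static inline int
    _Py_IsImmortalByAddress(const void *o)
    {
        /* Immortality is encoded in the pointer itself, so ob_refcnt
         * never needs to be loaded from memory. */
        return (((uintptr_t)o) & _Py_immortal_mask) != 0;
    }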






[Python-Dev] Re: my plans for subinterpreters (and a per-interpreter GIL)

2021-12-16 Thread Petr Viktorin

On 15. 12. 21 23:57, Guido van Rossum wrote:
On Wed, Dec 15, 2021 at 6:04 AM Antoine Pitrou wrote:


On Wed, 15 Dec 2021 14:13:03 +0100
Antoine Pitrou wrote:

 > Did you try to take into account the envisioned project for adding a
 > "complete" GC and removing the GIL?

Sorry, I was misremembering the details.  Sam Gross' proposal
(posted here on 07/10/2021) doesn't switch to a "complete GC", but it
changes reference counting to a more sophisticated scheme (which
includes immortalization of objects):


https://docs.google.com/document/d/18CXhDb1ygxg-YXNBJNzfzZsDFosB5e6BfnXLlejd9l0/edit




A note about this: Sam's immortalization covers exactly the objects that 
Eric is planning to move into the interpreter state struct: "such as 
interned strings, small integers, statically allocated PyTypeObjects, 
and the True, False, and None objects". (Well, he says "such as" but I 
think so does Eric. :-)


Sam's approach is to use the lower bit of the ob_refcnt field to 
indicate immortal objects. This would not work given the stable ABI 
(which has macros that directly increment and decrement the ob_refcnt 
field). In fact, I think that Sam's work doesn't preserve the stable ABI 
at all. However, setting a very high bit (the bit just below the sign 
bit) would probably work. Say we're using 32 bits. We use the value 
0x_6000_0000 as the initial refcount for immortal objects. The stable 
ABI will sometimes increment this, sometimes decrement it. But as long 
as the imbalance is less than 0x_2000_0000, the refcount will remain in 
the inclusive range [ 0x_4000_0000 , 0x_7FFF_FFFF ] and we can test for 
immortality by testing a single bit:


if (o->ob_refcnt & 0x_4000_0000)

I don't know how long that would take, but I suspect that a program that 
just increments the refcount relentlessly would have to run for hours 
before hitting this range. On a 64-bit machine the same approach would 
require years to run before a refcount would exceed the maximum 
allowable imbalance. (These estimates are from Mark Shannon.)


But does the sign bit need to stay intact, and do we actually need to 
rely on the immortal bit to always be set for immortal objects?
If the refcount rolls over to zero, an immortal object's dealloc could 
bump it back and give itself another few minutes.
Allowing such rollover would mean having to deal with negative 
refcounts, but that might be acceptable.
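
A sketch of that idea (hypothetical; it assumes an immortal object's
type gets a dealloc slot that is actually reached when the count hits
zero):

    #define IMMORTAL_INITIAL_REFCNT 0x60000000  /* the 32-bit scheme above */

    static void
    immortal_dealloc(PyObject *op)
    {
        /* Stable-ABI decrements dragged the count to zero; instead of
         * freeing, bump it back up and keep the object alive. */
        Py_SET_REFCNT(op, IMMORTAL_INITIAL_REFCNT);
    }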


Another potential issue is that there may be some applications that take 
refcounts at face value (perhaps obtained using sys.getrefcount()). 
These would find that immortal objects have a very large refcount, which 
might surprise them. But technically a very large refcount is totally 
valid, and the kinds of objects that we plan to immortalize are all 
widely shared -- who cares if the refcount for None is 5000 or 
1610612736? As long as the refcount of *mortal* objects is the same as 
it was before, this shouldn't be a problem.


A very small refcount would be even more surprising, but the same logic 
applies: who cares if the refcount for None is 5000 or -5000?
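
Putting the scheme above into code, the test might be wrapped like
this (a sketch assuming the 32-bit layout discussed earlier; the
helper names are hypothetical):

    #define _Py_IMMORTAL_BIT 0x40000000  /* bit just below the sign bit */

    static inline int
    _Py_IsImmortal(PyObject *op)
    {
        return (Py_REFCNT(op) & _Py_IMMORTAL_BIT) != 0;
    }

New-style code could skip refcount updates for immortal objects
entirely, while old stable-ABI extensions keep incrementing and
decrementing ob_refcnt directly -- which is exactly why the scheme has
to tolerate an imbalance of up to 0x_2000_0000.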





[Python-Dev] Re: Explicit markers for special C-API situations

2021-12-10 Thread Petr Viktorin

On 10. 12. 21 11:55, Christian Heimes wrote:

On 10/12/2021 03.08, Jim J. Jewett wrote:

Christian Heimes wrote:

On 09/12/2021 19.26, Petr Viktorin wrote:



If the code is the authoritative source of truth, we need a proper
parser to extract the information.  ... unfortunately I don't trust it
enough to let it define the API. Bugs in the parser could result in
the API definition silently changing.



There are other options than writing a new parser. GCC and Clang are
flexible. For example GCC can be extended with plugins and custom
attributes.


But they have the same problem ... it can be difficult to know if 
there is a subtle bug in someone's understanding of how the plugin 
interacts with, for example, nested ifndef.


The failure mode for an explicitly manually maintained text file is 
that something doesn't get added when it should, and the more 
conservative API consumers wait an extra release before using it.



Macros and ifndefs are not a problem.


They are: we want to find PyErr_SetExcFromWindowsErr on all systems, and 
include it in the docs.



A GCC plugin for user-defined 
attributes hooks into the build process at a late stage. By the time the 
plugin hook is invoked, the preprocessor has resolved all macros and 
ifdefs, and the C code has been parsed. The plugin operates on the same 
intermediate code as the compiler.



The approach would allow us to make the headers the authoritative source 
for most API and ABI symbols. I don't think that we can use it for 
macros. We can even include additional metadata in the custom attribute, 
e.g. version added:


   PyAPI_ABI_FUNC("3.2", PyObject *) PyLong_FromLong(long);


This looks a bit awkward already, and if/when we start including e.g. 
"version removed" for PyAPI_ABI_ONLY, it'll get worse.



We can convert Misc/stable_abi.txt into an auto-generated file. The file 
should still stay in git, so we can use it to verify the stable ABI in CI.


We can, but genuinely I think it works better as a source of truth than 
a generated artifact. Changes to it should be deliberate.
I get that not everyone will agree with that. But it's also *much* 
easier to maintain the current "best-effort" checks (which can punt on a 
few edge cases) than add an all-encompassing, tested parser-based generator.


Not everything needs to be automated :)



Eric said:

The tooling is a secondary concern to my point.  Mostly, I wish the 
declarations in the header files had the extra classifications, rather than 
having to remember to refer to a separate text file.


This part sounds like a good idea.


[Python-Dev] Re: Explicit markers for special C-API situations (re: Clarification regarding Stable ABI and _Py_*)

2021-12-09 Thread Petr Viktorin
I'll not get back to CPython until Tuesday, but I'll add a quick note
for now. It's a bit blunt for lack of time; please don't be offended.

If the code is the authoritative source of truth, we need a proper
parser to extract the information. But we can't really use an existing
parser (e.g. we need to navigate various #ifdef combinations), and
writing a correct (=tested) custom C parser is pretty expensive. C
declarations being "deterministically discoverable by tools" is a
myth.
I know you wrote a parser (kudos!), but unfortunately I don't trust it
enough to let it define the API. Bugs in the parser could result in
the API definition silently changing.

That's why the info is in a separate version-controlled file, which
must be explicitly modified. That file is the source of truth (or at
least intent).
There are also checks to ensure the code matches the manifest, so if
you break things the CI should let you know.
See the rationale in PEP 652:
https://www.python.org/dev/peps/pep-0652/#rationale

As for the types you mentioned:
* PyAPI_ABI_INDIRECT, PyAPI_ABI_ONLY - these should get a comment. I
don't think adding machine-readable metadata (and tooling for it)
would be worth it, but I won't block it.
* PyAPI_ABI_ACCIDENTAL - could be deprecated in the Limited API, and
later removed from it, becoming "PyAPI_ABI_ONLY".
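
If such markers were adopted, one cheap way to define them is as plain
wrappers around PyAPI_FUNC, so nothing changes in the build and the
names exist purely for readers and tools (a sketch; these definitions
are hypothetical):

    /* Expand exactly like PyAPI_FUNC; the distinct spelling is the
     * point -- it is grep-able and visible at the declaration site. */
    #define PyAPI_ABI_FUNC(RTYPE)        PyAPI_FUNC(RTYPE)
    #define PyAPI_ABI_INDIRECT(RTYPE)    PyAPI_FUNC(RTYPE)
    #define PyAPI_ABI_ONLY(RTYPE)        PyAPI_FUNC(RTYPE)

    /* Example: a symbol kept only for stable-ABI compatibility. */
    PyAPI_ABI_ONLY(PyObject *) _PyExampleFunction(void);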

On Thu, Dec 9, 2021 at 6:41 PM Eric Snow  wrote:
>
> (replying to 
> https://mail.python.org/archives/list/python-dev@python.org/message/OJ65FPCJ2NVUFNZDXVNK5DU3R3JGLL3J/)
>
> On Wed, Dec 8, 2021 at 10:06 AM Eric Snow  wrote:
> > What about the various symbols listed in Misc/stable_abi.txt that were
> > accidentally added to the limited API?  Can we move toward dropping
> > them from the stable ABI?
>
> tl;dr We should consider making classifications related to the stable
> ABI harder to miss.
>
> 
>
> Knowing what is in the limited API is fairly straightforward. [1]
> However, it's clear that identifying what is part of the stable ABI,
> and why, is not so easy.  Currently, we must rely on
> Misc/stable_abi.txt [2] (and the associated
> Tools/scripts/stable_abi.py).  Documentation (C-API docs, PEPs,
> devguide) help too.
>
> Yet, there's a concrete disconnect here: the header files are by
> definition the authoritative single-source-of-truth for the C-API and
> it's too easy to forget about supplemental info in another file or
> document.  This out-of-sight-out-of-mind situation is part of how we
> accidentally added things to the limited API for a while. [3]
>
> The stable ABI isn't the only area where we must identify different
> subsets of the C-API.  However, in those other cases we use different
> structural/naming conventions to explicitly group things.  Most
> importantly, each of those conventions makes the grouping unavoidable
> when reading the code. [4]  For example:
>
> * closely related declarations go in the same header file (and then
> also exposed via Include/Python.h)
> * prefixes (e.g. Py_, PyDict_) provides similar grouping
> * an additional underscore prefix identifies "private" C-API
> * symbols are explicitly identified as part of the C-API via macros
> (PyAPI_FUNC, PyAPI_DATA) [5]
> * relatively recently, different directories correspond to different
> API layers (Include, Include/cpython, Include/internal) [3]
>
> 
>
> Could we take a similar explicit, coupled-to-the-code approach to
> identify when the different stable ABI situations apply?  Here's the
> specific approach I had in mind, with macros similar to PyAPI_FUNC:
>
> * PyAPI_ABI_FUNC - in stable ABI when it wouldn't be normally (e.g.
> underscore prefix, in Include/internal)
> * PyAPI_ABI_INDIRECT - exposed in stable ABI due to a macro
> * PyAPI_ABI_ONLY - it only exists for ABI compatibility and isn't
> actually used any more
> * PyAPI_ABI_ACCIDENTAL - unintentionally added to limited API,
> probably not used there
>
> (...or perhaps use a PyABI_ prefix, though that's a bit easy to miss
> when reading.)
>
> As a reader I would find markers like this helpful in recognizing
> those special situations, as well as the constraints those situations
> impose on modification.  At the least such macros would indicate
> something different is going on, and the macro name would be something
> I could look up if I needed more info.  I expect others reading the
> code would get comparable value.  I also expect tools like
> Tools/scripts/stable_abi.py would benefit.
>
> -eric
>
>
> [1] in Include/*.h and not #ifndef Py_LIMITED_API (sadly also making
> it easy to accidentally add things to the limited API, see [3])
> [2] Before that you had to  rely on comments or external documents or,
> in the worst case, work it out through careful study of the code,
> commit history, and mailing list archives.
> [3] The addition of Include/cpython and Include/internal helped us
> stop accidentally adding to the limited API.
> [4] It also makes the groupings deterministically discoverable by tools.
> [5] explicit use of "extern" 

[Python-Dev] Re: Clarification regarding Stable ABI and _Py_*

2021-12-09 Thread Petr Viktorin

On 08. 12. 21 18:06, Eric Snow wrote:

On Wed, Dec 8, 2021 at 2:23 AM Petr Viktorin  wrote:

That really depends on what function we'd want to remove. There are
usually alternatives to deleting things, but the options depend on the
function. If we run out of other options we can make the function always
fail or make it leak memory.
And the regular backwards compatibility policy gives us 2 years to
figure something out :)


What about the various symbols listed in Misc/stable_abi.txt that were
accidentally added to the limited API?  Can we move toward dropping
them from the stable ABI?

Most notably, there are quite a few functions listed there that are in
the stable ABI but no longer in the limited API.  This implies that
either they were already deprecated in the limited API (and removed)
or they were just removed.  At least in some cases they were moved to
header files in Include/cpython or Include/internal.  So I would not
expect extensions to be using them.  This subset of those symbols
seems entirely appropriate to remove from the stable ABI.  Is that
okay?  Do we even need to bother deprecating them?  What about just
the "private" ones?

For example, I went to change/remove _PyThreadState_Init() (internal
API declared in Include/internal/pycore_pystate.h) and found that it
is in the stable ABI but not the limited API.  It's highly unlikely
anyone is using it and plan on double-checking.  As far as I can tell,
the function was accidentally exposed in the limited API and stable
ABI and later removed from the limited API.


It's possible to remove them just like _PyObject_GC_Malloc was removed, 
but check that it was unusable (e.g. not called from public macros) in 
all versions of Python from 3.2 up to now.
Could you check if this PR makes things clear? 
https://github.com/python/devguide/pull/778




[Python-Dev] Re: Clarification regarding Stable ABI and _Py_*

2021-12-08 Thread Petr Viktorin

On 07. 12. 21 19:28, Guido van Rossum wrote:
On Tue, Dec 7, 2021 at 12:58 AM Petr Viktorin <mailto:encu...@gmail.com>> wrote:


On 06. 12. 21 21:50, Guido van Rossum wrote:

[...]

 > Also, it looks like Mark is proposing to *remove* _PyObject_GC_Malloc
 > from stable_abi.txt in https://github.com/python/cpython/pull/29879
 > Is that allowed? If it's being used by a macro it means code using
 > that macro will fail unless recompiled for 3.11.

Generally, that's not allowed. In this particular case, Victor's
analysis is right: if you trawl through the history from 3.2 on, you can
see that you can't call _PyObject_GC_Malloc via macros in the limited
API. So yes, this one can be removed.


Okay, that's very subtle, so thanks for confirming.

I'll also note that removing things that are "allowed" to go is not
nice
to people who relied on PEP 384, which says that defining
Py_LIMITED_API
"will hide all definitions that are not part of the ABI" -- even though
that's incompatible with the part where it says "All functions starting
with _Py are not available to applications".


I don't actually really follow what you are trying to say here. Probably 
because I've never paid much attention to PEP 384. I guess the API is 
confusing because the "right" way to do it (having to define some symbol 
to *expose* extra stuff rather than to *hide* stuff) was not possible 
for backwards compatibility reasons. But the extra negative will forever 
make this confusing. Also, "All functions starting with _Py are not 
available" sounds like a clumsy way to say "No functions starting with 
_Py are available" (and you left out whether Py_LIMITED_API affects that 
availability, whether it was intended to affect it, whether it did in 
practice affect it in all cases, etc.).


It's hard to say what PEP 384 was meant to say. My interpretation, PEP 
652, is hopefully more consistent. But someone who had a different 
interpretation of PEP 384 might feel that it broke some promise.



I assume it would be insensitive to ask whether we could just get rid of 
the stable ABI altogether and focus on the limited API? Just tell 
everyone they have to rebuild binary wheels for every Python feature 
release. Presumably the deprecation of the stable ABI itself would 
require a two-release waiting period. But maybe it would be worth it, 
given how subtle it is to do the historical research about even a single 
function.


An honest question wouldn't be insensitive. Thanks for asking!

The part where you don't need to rebuild extensions (not just wheels) is 
the main reason for both Stable ABI and the Limited API.
Without it, there might be some reduced API to focus on, but it wouldn't 
be this feature.



[Python-Dev] Re: PEP 674: Disallow using macros as l-value

2021-12-08 Thread Petr Viktorin

On 07. 12. 21 17:54, Joao S. O. Bueno wrote:

Sorry for stepping in - but I am seeing too many arguments in favour
of the rules because "they are the rules", and just Victor arguing with
what is met in the "real world".


OTOH, coming up with rules and then blatantly ignoring them is silly at 
best.


If the rules are bad, they should definitely be changed. And if a case 
is exceptional enough, we should make an exception -- but to make an 
exception we need a very good understanding of why the rules are the way 
they are (and in this case, I don't think any single person has the 
proper understanding).



One of the roles the backwards compatibility policy serves is a promise 
to our users. They can expect to not run into problems if they only 
upgrade to every second Python version and fix deprecation warnings 
(except for "extreme situations such as dangerously broken or insecure 
features or features no one could reasonably be depending on").

That is, IMO, a pretty good reason to consider sticking to the rules.


[Python-Dev] Re: PEP 674: Disallow using macros as l-value

2021-12-08 Thread Petr Viktorin




On 08. 12. 21 1:47, Victor Stinner wrote:

For me, HPy is the only valid stable API and stable ABI in the long
term which is efficient on any Python implementation. Its design is
very different than the C API: HPy avoids all C API design mistakes,
it doesn't leak any implementation detail.

HPy can already be used today on CPython, even if it's not directly
provided by CPython.

Providing HPy as a first-class citizen in CPython, as already done in
PyPy, would be great to promote HPy! However, HPy evolves quickly and
so needs to be released more frequently than CPython. At least, we
could promote it more in the C API documentation, as we already
promote Cython.

Promoting the HPy usage doesn't solve any issue listed in PEP 620, 670
and 674 since CPython still has to continue supporting the C API. We
will only be fully free to make any change in Python internals without
having to care about breaking the C API once the *LAST* C extensions
using the C API will disappear... Look at Python 2.7 which is still
used in 2021. I bet that C extensions using the C API are not doing to
disappear soon.

For me, the question is:

=> Is it ok to no longer be able to make any change in Python
internals because of the public C API?

The sub-question is:

=> Is it ok to have a slow deprecation process and wait 5 to 10 years
until it will be possible again to evolve the Python internals?


No, they should be evolved when they *need* to be evolved.
This PEP calls for breaking things because they *might* need evolving in 
the future, but doesn't present any immediate benefit. In that case, I 
think it's better to document that we don't like some usage, but only do 
the removals when they're helpful.


Especially in cases where there can't be a proper deprecation period.


[Python-Dev] Re: Clarification regarding Stable ABI and _Py_*

2021-12-08 Thread Petr Viktorin




On 07. 12. 21 20:58, Guido van Rossum wrote:
On Tue, Dec 7, 2021 at 11:02 AM Christian Heimes wrote:


On 07/12/2021 19.28, Guido van Rossum wrote:
 > I assume it would be insensitive to ask whether we could just get
 > rid of the stable ABI altogether and focus on the limited API? Just
 > tell everyone they have to rebuild binary wheels for every Python
 > feature release. Presumably the deprecation of the stable ABI itself
 > would require a two-release waiting period. But maybe it would be
 > worth it, given how subtle it is to do the historical research about
 > even a single function.

The stable ABI is useful for Python packages that ship binary wheels.

Take PyCA cryptography [1] as an example. Alex and Paul already build,
upload, and ship 12 abi3 wheels for each release and combinations of
CPU
arch, platform, and libc ABI. Without a stable ABI they would have to
create a total of 60 binary abi3 wheels for Python 3.6 to 3.10. The
number will only increase over time. Python 3.6 is very common on
LTS/Enterprise Linux distros.


Thanks, that's a very useful example.

If the current stable ABI makes performance improvements too complex
then we should consider to define a new stable ABI with less symbols.


But then we will run into backwards compatibility concerns. Suppose we 
want to delete *one* function from the stable ABI. How many releases do 
we have to wait before we can actually delete (as opposed to just 
deprecate) it? It sounds like you're saying it would take 5 releases, 
i.e. if we deprecate it in 3.11, we can delete it in 3.16. It would 
probably be easier to just not bother with the deprecation.


That really depends on what function we'd want to remove. There are 
usually alternatives to deleting things, but the options depend on the 
function. If we run out of other options we can make the function always 
fail or make it leak memory.
And the regular backwards compatibility policy gives us 2 years to 
figure something out :)
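
For example, "make the function always fail" could keep the exported
symbol while gutting the behavior, so old binaries still load (a
sketch with a hypothetical function name):

    /* Kept only so extensions built against an old stable ABI still
     * link and import; any actual call fails cleanly. */
    PyObject *
    _Py_RemovedFunction(void)
    {
        PyErr_SetString(PyExc_RuntimeError,
                        "this function was removed; rebuild the "
                        "extension against a current CPython");
        return NULL;
    }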



It is possible that we'll need a new stable ABI for nogil, though, since 
refcounting is one of the few areas where even the stable ABI uses 
direct struct access rather than functions.
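
That last point is visible in the headers: even with Py_LIMITED_API
defined, Py_INCREF boils down to a direct field access, so the offset
of ob_refcnt is baked into every stable-ABI extension (simplified from
the 3.10-era object.h):

    static inline void _Py_INCREF(PyObject *op)
    {
        op->ob_refcnt++;   /* direct struct access, even in the
                              limited API */
    }
    #define Py_INCREF(op) _Py_INCREF(_PyObject_CAST(op))

Any nogil design that changes how ob_refcnt is stored therefore breaks
binaries relying on this inlined code.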



[Python-Dev] Re: PEP 674: Disallow using macros as l-value

2021-12-07 Thread Petr Viktorin

On 30. 11. 21 19:52, Victor Stinner wrote:

On Tue, Nov 30, 2021 at 7:34 PM Guido van Rossum  wrote:

How about *not* asking for an exception and just following the PEP 387 process? 
Is that really too burdensome?


The Backward Compatibility section gives an explanation:

"This change does not follow the PEP 387 deprecation process. There is no
known way to emit a deprecation warning when a macro is used as a
l-value, but not when it's used differently (ex: r-value)."

Apart from compiler warnings, one way to implement the PEP 387
"deprecation process" would be to announce the change in two "What's
New in Python 3.X?" documents. But I expect that it will not be
efficient. Extract of the Rejected Idea section:

"(...) only few developers read the documentation, and only a minority
is tracking changes of the Python C API documentation."

In my experience, even if a DeprecationWarning is emitted at runtime,
developers miss or ignore it. See the recent "[Python-Dev] Do we need
to remove everything that's deprecated?" discussion and complains
about recent removal of deprecated features, like:

* collections.MutableMapping was deprecated for 7 Python versions
(deprecated in 3.3) -- removed in 3.9 alpha, reverted in 3.9 beta,
removed again in 3.11
* the "U" open() flag was deprecated for 10 Python versions
(deprecated in 3.0) -- removed in 3.9 alpha, reverted in 3.9 beta,
removed again in 3.11

For this specific PEP's changes, I consider that the number of impacted
projects is low enough to skip a deprecation process: only 4 projects
are known to be impacted. One year ago (Python 3.10), 16 were
impacted, and 12 have already been updated in the meanwhile. I'm
talking especially about Py_TYPE() and Py_SIZE() changes which, again,
have been approved by the Steering Council.



The current version of the PEP looks nice, but I don't think the 
rationale is strong enough.

I believe we should:
- Mark the l-value usage as deprecated in the docs,
- And then do nothing until we find an actual case where this issue 
blocks development (or is actively dangerous for users).


Specifically, for each of the Rationale parts:


## Using a macro as a l-value

The practice was often discouraged (e.g. by GET in many of the names), 
yet it was also relatively widely and successfully used (e.g. Py_TYPE 
before Python 3.9).


If we would deprecate using Py_REFCNT as l-value in the docs, but wait 
with the conversion until it was *actually* needed, we would not lose 
anything:
- Users would *still* have no period of visible compiler warnings or 
DeprecationWarning.
- There would be more time for users to react to the documentation 
warning. Or even come up with a linter, or try compiling their favorite 
extensions with HPy/nogil and fixing the issues *on their own schedule*.
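
For reference, fixing such a misuse is usually a one-line switch to
the Py_SET_* functions that exist since Python 3.9 (a sketch):

    #include <Python.h>

    static void
    fix_example(PyVarObject *op)
    {
        /* Old, l-value style (accepted up to Python 3.9):
         *     Py_SIZE(op) = 0;
         *     Py_REFCNT(op) = 1;
         * Replacement available since Python 3.9: */
        Py_SET_SIZE(op, 0);
        Py_SET_REFCNT((PyObject *)op, 1);
    }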



## CPython nogil fork

In CPython, we cannot change structs that are part of the stable ABI -- 
such as PyObject.ob_refcnt. IMO, getting rid of the macros that access 
ob_refcnt is a relatively small part of this issue.


AFAICS, the technical change is trivial compared to nogil, and can be 
easily made in the nogil fork -- if it is actually necessary in the end.



## HPy project

There is no reason for "Searching and replacing Py_SET_SIZE()".
If this change was not made, and Py_SIZE was semi-mechanically replaced 
by HPy_Length(), then misuses of Py_SIZE (using it as l-value) can be 
detected very easily: just compile against HPy.
And if the autoreplacer is smart enough to see when it should use 
HPyTupleBuilder/HPyListBuilder, then it can surely see when Py_SIZE is 
(mis)used as l-value!


There will always be some changes necessary when porting extensions to HPy.
CPython should definitely indicate what the best practice is, so that 
HPy adopters have an easy time convincing projects to take their pull 
requests -- but it should not break code for people who don't care about 
HPy yet.



## GraalVM Python

This PEP is not enough to get rid of wrappers in GraalVM, yet it forces 
users of CPython to adapt. Is it a part of a *plan* to remove wrappers, 
or just a minor step in what looks like the general direction?
I do agree this PEP looks like a good step towards a long-term goal. But 
even so, it should be made when it *actually benefits* existing users, 
or allows some concrete good thing -- an optimization, a more 
maintainable implementation, something that outweighs the need for churn 
in code that worked up to now.




Overall, the Rationale seems hollow to me. It's breaking existing code 
-- however bad that code is -- in the name of ideals rather than 
concrete improvements.
Until disallowing macros as l-values allows concrete improvements in 
CPython, it should be the job of linters.



FWIW, I do encourage alternative implementations to just not support 
l-value macros. There are only few projects doing this, and the fix is 
often (but not always!) easy. This should be a very small part of 
porting something to a different Python implementation (but I could 

[Python-Dev] Re: Clarification regarding Stable ABI and _Py_*

2021-12-07 Thread Petr Viktorin

On 06. 12. 21 21:50, Guido van Rossum wrote:
On Mon, Dec 6, 2021 at 12:12 PM Petr Viktorin wrote:


On 06. 12. 21 20:29, Guido van Rossum wrote:
 > Hi Petr,
 >
 > In PEP 384 it is written that no functions starting with an underscore
 > are part of the stable ABI:
 >
 > PEP 384 -- Defining a Stable ABI | Python.org
 > <https://www.python.org/dev/peps/pep-0384/#excluded-functions>
 >  > All functions starting with _Py are not available to applications
 >
 > OTOH there's a data file in the repo, Misc/stable_abi.txt, which lists
 > many functions starting with _Py_, for example _PyObject_GC_Malloc. Then
 > again, that function is not listed in Doc/data/stable_abi.dat. (I didn't
 > check other functions, but maybe there are others.)
 >
 > So is Misc/stable_abi.txt just out of date? Or how can the discrepancy
 > be explained?

These are not part of the limited API, so extension authors can't use
them in the C source. But they typically are (or have been) called by
macros from the limited API. So, they are part of the stable ABI; they
need to be exported.

Misc/stable_abi.txt says "abi_only" for all of these. They don't
show up
in the user-facing docs.


Thanks, that helps. It's too bad that there's no comment at the top 
explaining the format (in fact it appears to discourage reading the file?).


You can read it, but I want to discourage people from relying on the 
format: Tools/scripts/stable_abi.py should be the only consumer.

I will add a comment though.


Also, it looks like Mark is proposing to *remove* _PyObject_GC_Malloc 
from stable_abi.txt in https://github.com/python/cpython/pull/29879
Is that allowed? If it's being used by a macro it means code using that 
macro will fail unless recompiled for 3.11.


Generally, that's not allowed. In this particular case, Victor's 
analysis is right: if you trawl through the history from 3.2 on, you can 
see that you can't call _PyObject_GC_Malloc via macros in the limited 
API. So yes, this one can be removed.


I'll also note that removing things that are "allowed" to go is not nice 
to people who relied on PEP 384, which says that defining Py_LIMITED_API 
"will hide all definitions that are not part of the ABI" -- even though 
that's incompatible with the part where it says "All functions starting 
with _Py are not available to applications".
PEP 384 is a historical document, but before 3.10 it was the best 
available documentation. PEP 652 sort of changed the rules mid-course 
(ref. https://www.python.org/dev/peps/pep-0652/#backwards-compatibility).



But for _PyObject_GC_Malloc specifically, IMO the speedup is worth it. 
Go ahead and remove it.



[Python-Dev] Re: Clarification regarding Stable ABI and _Py_*

2021-12-06 Thread Petr Viktorin

On 06. 12. 21 20:29, Guido van Rossum wrote:

Hi Petr,

In PEP 384 it is written that no functions starting with an underscore 
are part of the stable ABI:


PEP 384 -- Defining a Stable ABI | Python.org
<https://www.python.org/dev/peps/pep-0384/#excluded-functions>

 > All functions starting with _Py are not available to applications

OTOH there's a data file in the repo, Misc/stable_abi.txt, which lists 
many functions starting with _Py_, for example _PyObject_GC_Malloc. Then 
again, that function is not listed in Doc/data/stable_abi.dat. (I didn't 
check other functions, but maybe there are others.)


So is Misc/stable_abi.txt just out of date? Or how can the discrepancy 
be explained?


These are not part of the limited API, so extension authors can't use 
them in the C source. But they typically are (or have been) called by 
macros from the limited API. So, they are part of the stable ABI; they 
need to be exported.


Misc/stable_abi.txt says "abi_only" for all of these. They don't show up 
in the user-facing docs.




[Python-Dev] Re: PEP 670: Convert macros to functions in the Python C API

2021-11-24 Thread Petr Viktorin




On 24. 11. 21 15:32, Victor Stinner wrote:

On Wed, Nov 24, 2021 at 2:18 PM Petr Viktorin  wrote:

The "Backwards Compatibility" section is very small. Can you give a list
of macros which lost/will lose "return values"?


https://bugs.python.org/issue45476 lists many of them. See also:
https://github.com/python/cpython/pull/28976


Also, this PR is about preventing the use of some macros as l-values,
which you say is out of scope for the PEP. I'm confused.


Oh right, now I'm also confused :-) I forgot about the details.

"Py_TYPE(obj) = new_type;" was used in 3rd party C extensions when
defining static types to work around linker issues on Windows.
Changing Py_TYPE() to disallow using it as an l-value is an
incompatible change.

From what I saw in bpo-45476, the functions that I propose to change
are not used as l-value. Technically, it's an incompatible change. In
practice, it should not impact any 3rd party project.

For example, PyFloat_AS_DOUBLE() is used to read a float value (ex:
"double x = PyFloat_AS_DOUBLE(obj);"), but not to set a float value
(ex: "PyFloat_AS_DOUBLE(obj) = 1.0;").

Ok, I should clarify that in the PEP.


Yes. *Each* incompatible change should be listed, even if you believe it 
won't affect any project. The PEP reader should be allowed to judge if 
your assumptions are correct.


e.g. I've seen projects actually use "Py_TYPE(obj) = new_type;" to 
change an object's type after it was given to Python code. It would be 
great to document why that's wrong *and* what to do instead, both in the 
PEP that introduced the change and in the "What's New" entry.
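
Something like the following would do as documentation, as a sketch
(MyType and the module are hypothetical; the old workaround is shown
in a comment):

    #include <Python.h>

    static PyTypeObject MyType = {
        PyVarObject_HEAD_INIT(NULL, 0)  /* ob_type left NULL: MSVC can't
                                           statically reference the
                                           DLL-imported &PyType_Type */
        .tp_name = "example.My",
        .tp_basicsize = sizeof(PyObject),
    };

    static struct PyModuleDef examplemodule = {
        PyModuleDef_HEAD_INIT, "example", NULL, -1, NULL,
    };

    PyMODINIT_FUNC
    PyInit_example(void)
    {
        /* The old workaround: Py_TYPE(&MyType) = &PyType_Type;
         * The supported spelling since Python 3.9: */
        Py_SET_TYPE((PyObject *)&MyType, &PyType_Type);
        if (PyType_Ready(&MyType) < 0)
            return NULL;
        return PyModule_Create(&examplemodule);
    }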






Wait, so this PEP is about converting macros to functions, but not about
converting Py_SIZE to a function? I'm confused. Why is Py_SIZE listed in
the PEP?


Py_SIZE() is already converted to a static inline function. Later, it
can be converted to a regular function if it makes sense.

It's listed in the PEP to show macros which are already converted, to
help to estimate how many 3rd party applications would be affected by
the PEP.


Is such an estimate available?



Py_REFCNT(), Py_TYPE() and Py_SIZE() are special because they were
used as l-value on purpose. As far as I know, they were the only 3
macros used as l-value, no?


Who knows? If there's a list of what to change, someone can go through 
it and answer this for each macro.



[Python-Dev] Re: PEP 670: Convert macros to functions in the Python C API

2021-11-24 Thread Petr Viktorin




On 24. 11. 21 15:22, Victor Stinner wrote:

On Wed, Nov 24, 2021 at 10:59 AM Petr Viktorin  wrote:

Since this is about converting existing macros (and not writing new
ones), can you talk about which of the "macro pitfalls" apply to the
macros in CPython that were/will be changed?


The PEP 670 lists many pitfalls affecting existing macros. Some
pitfalls are already worked around in the current implementations, but
the point is that it's easy to miss pitfalls when reviewing code
adding new macros or modifying macros.

Erlend did an analysis in: https://bugs.python.org/issue43502

For macros reusing arguments (known as "Duplication of side effects"
in GCC Macro Pitfalls), see his list:
https://bugs.python.org/file49877/macros-that-reuse-args.txt


That's a nice list. Could you link to it in the PEP, so the next person 
won't have to ask?
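
For readers who haven't seen the pitfall: a macro that expands its
argument twice also evaluates its side effects twice, which a static
inline function cannot do (a sketch with a made-up macro):

    /* A macro that reuses its argument: */
    #define MY_ABS(x) ((x) < 0 ? -(x) : (x))

    /* MY_ABS(*p++) expands to ((*p++) < 0 ? -(*p++) : (*p++)) and
     * advances p twice -- the classic "duplication of side effects".
     * The equivalent function evaluates its argument exactly once: */
    static inline int my_abs(int x) { return x < 0 ? -x : x; }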



Meanwhile, I think I found a major source of my confusion with the PEP: 
I'm not clear on what it actually proposes. Is it justification for 
changes that were already done, or a plan for more changes, or a policy 
change ("don't write a public macro if it can be a function"), or all of 
those?



[Python-Dev] Re: PEP 670: Convert macros to functions in the Python C API

2021-11-24 Thread Petr Viktorin

On 24. 11. 21 13:20, Victor Stinner wrote:

On Wed, Nov 24, 2021 at 10:59 AM Petr Viktorin  wrote:

Are there more macros that are yet to be converted to macros,


I suppose that you mean "to be converted to functions". Yes, there are
many, it's the purpose of the PEP.

I didn't provide a list. I would prefer to do it on a case by case
basis, as I did previously.

To answer your question: it's basically all macros, especially the
ones defined by the public C API, except the ones excluded by the PEP:
https://www.python.org/dev/peps/pep-0670/#convert-macros-to-static-inline-functions



other than the ones in GH-29728?


The purpose of this PR is only to run benchmarks to compare the
performance of macros versus static inline functions. The PR title is
"Convert static inline to macros": it converts existing Python 3.11
static inline functions back to Python 3.6/3.7 macros. It's basically
the opposite of the PEP ;-)



The "Backwards Compatibility" section is very small. Can you give a list
of macros which lost/will lose "return values"?


https://bugs.python.org/issue45476 lists many of them. See also:
https://github.com/python/cpython/pull/28976


Can you put this in the PEP? If things should be evaluated on a 
case-by-case basis, we should know about the cases.


Also, this PR is about preventing the use of some macros as l-values, 
which you say is out of scope for the PEP. I'm confused.



Can you add the fact that some macros now can't be used as l-values?


If you are are talking about my merged change preventing using
Py_TYPE() as an l-value, this is out of the scope of the PEP on
purpose.

Py_TYPE(), Py_REFCNT() and Py_SIZE() could be used as an l-value in
Python 3.9, but that's no longer the case in Python 3.11. Apart from that,
I'm not aware of other macros which could be "abused" as l-value.


Wait, so this PEP is about converting macros to functions, but not about 
converting Py_SIZE to a function? I'm confused. Why is Py_SIZE listed in 
the PEP?




There are macros which can be "abused" ("used") to access structure
members and object internals. For example, PyTuple_GET_ITEM(tuple, 0)
and PyList_GET_ITEM(list, 0) can be "abused" to get direct access to an
array of PyObject* (PyObject** type) and so modify a tuple/list
directly. I would like to change that (disallow it), but it's out of
the scope of the PEP. See https://bugs.python.org/issue41078 for my
previous failed attempt (it broke too many things). But this is more
in the scope of PEP 620, which is a different PEP.
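
The "abuse" in question looks roughly like this (a sketch; it relies
on the 3.9-era macro expanding to an l-value array access):

    #include <Python.h>

    static void
    mutate_tuple_in_place(PyObject *tup, PyObject *item)
    {
        /* PyTuple_GET_ITEM expands to an array element, so taking its
         * address yields a writable PyObject** into the tuple's own
         * storage, silently bypassing tuple immutability: */
        PyObject **items = &PyTuple_GET_ITEM(tup, 0);
        Py_INCREF(item);
        Py_XDECREF(items[0]);
        items[0] = item;
    }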


Are there any other issues that break existing code?


I listed all known backward incompatibles changes in the Backward
Compatibility section. I'm not aware of other backward incompatible
changes caused by the PEP.

Converting macros to static inline functions or regular functions
didn't change the API for the macros already converted, the ones
listed in the PEP.


It did for e.g. Py_SIZE, which no longer behaves like in 3.9, nor as it 
was documented in 3.8: 
https://docs.python.org/3.8/c-api/structures.html#c.Py_SIZE
Yet Py_SIZE is listed in the PEP as "Macros converted to static inline 
functions", so clearly it is in scope.

Same for Py_TYPE. Are there others?



The "Cast to PyObject*" section talks about adding new private functions
like _Py_TYPE, which are type-safe, but keeping old names (like Py_TYPE)
as macros that do a cast.
Could the newly added functions be made public from the start? (They
could use names like Py_Type.) This would allow languages that don't
have macros to use them directly, and if the non-typesafe macros are
ever discouraged/deprecated/removed, this would allow writing compatible
code now.


I don't want to increase the size of the C API and so I chose to make
the inner function accepting PyObject* private.

I see the addition of a hypothetical Py_Type() function as an
increase of the maintenance burden: we would have to maintain it,
document it, maybe add it to the limited C API / stable ABI, write
tests, etc.

I prefer to restrict the scope of the PEP. If you want to add variants
only accepting PyObject*, that's fine, but I suggest to open a
separated issue / PEP. Also, it can be discussed on a case by case
basis (function per function).


Since functions like _Py_TYPE will need to be maintained as part of the 
stable ABI, I'd like to do this right from the start. If you don't, can 
you add this to Rejected ideas?



I'm still interested in:

Since this is about converting existing macros (and not writing new ones), can you talk 
about which of the "macro pitfalls" apply to the macros in CPython that 
were/will be changed?

Is that just a theoretical issue?


[Python-Dev] Re: PEP 670: Convert macros to functions in the Python C API

2021-11-24 Thread Petr Viktorin

On 23. 11. 21 18:00, Victor Stinner wrote:

I completed the PEP: https://python.github.io/peps/pep-0670/



What I don't like about this PEP is that it documents changes that were 
already pushed, not planned ones. But, what's done is done...
Are there more macros that are yet to be converted to macros, other than 
the ones in GH-29728? If so, can you give a list?


Since this is about converting existing macros (and not writing new 
ones), can you talk about which of the "macro pitfalls" apply to the 
macros in CPython that were/will be changed?


The "Backwards Compatibility" section is very small. Can you give a list 
of macros which lost/will lose "return values"?
Can you add the fact that some macros now can't be used as l-values? 
(and list which ones?) This change is also breaking existing code.
Are there any other issues that break existing code? (Even code that, 
for example, shouldn't work according to Python documentation, but still 
works fine in practice.)



The "Cast to PyObject*" section talks about adding new private functions 
like _Py_TYPE, which are type-safe, but keeping old names (like Py_TYPE) 
as macros that do a cast.
Could the newly added functions be made public from the start? (They 
could use names like Py_Type.) This would allow languages that don't 
have macros to use them directly, and if the non-typesafe macros are 
ever discouraged/deprecated/removed, this would allow writing compatible 
code now.
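
Concretely, the pattern being discussed looks something like this (a
sketch of the idea, not the exact CPython code):

    /* Type-safe inner function: only accepts PyObject*. */
    static inline PyTypeObject *
    _Py_TYPE(PyObject *ob)
    {
        return ob->ob_type;
    }

    /* The public macro keeps accepting any object pointer by inserting
     * a cast, so existing callers compile unchanged: */
    #define Py_TYPE(ob) _Py_TYPE(_PyObject_CAST(ob))

A public, cast-free variant (spelled e.g. Py_Type, as suggested above)
would simply expose the inner function without the macro.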



[Python-Dev] Re: Do we need to remove everything that's deprecated?

2021-11-23 Thread Petr Viktorin

On 19. 11. 21 22:15, Mike Miller wrote:


This is the point where the pricey support contract comes in. It would 
give options to those who need it and provide some revenue.


Not really; a pricey support contract would need to freeze things 
for even longer -- *and* make it an actual contract :)



Changing working code just to make it continue to work with a newer 
Python version is boring. Companies might pay money to not have to do 
that. Or they might pay their employees to do the work. Either way it's 
money that could be spent on better things. (And hopefully, in some 
cases those things will be investing into Python and its ecosystem.)


But it's similar with volunteer authors and maintainers of various tools 
and libraries, who "pay" with their time that could be spent building 
something useful (or something fun). I believe that each time we force 
them to do pointless updates in their code, we sap some joy and 
enthusiasm from the ecosystem.
Of course, we need to balance that with the joy and enthusiasm (and yes, 
corporate money) that core devs pour into improving Python itself. But 
it's the users that we're making Python for.



Otherwise, the "there's no such thing as a free lunch," factor takes 
precedence.


That cuts both ways: deleting old ugly code is enjoyable, but it isn't 
free ;)




Full disclosure: I do work for Red Hat, which makes money on pricey 
support contracts. But Victor Stinner also works here.


This thread was motivated by watching rebuilds of Fedora packages with 
Python 3.11 (https://bugzilla.redhat.com/show_bug.cgi?id=2016048), and 
asking myself if all the work we're expecting people to do is worth it.





[Python-Dev] Re: Do we need to remove everything that's deprecated?

2021-11-18 Thread Petr Viktorin
On Wed, Nov 17, 2021 at 12:49 AM Terry Reedy  wrote:
>
> On 11/16/2021 7:43 AM, Petr Viktorin wrote:
> > On 16. 11. 21 1:11, Brett Cannon wrote:
>
> >> I think the key point with that approach is if you wanted to maximize
> >> your support across supported versions, this would mean there wouldn't
> >> be transition code except when the SC approves of a shorter
> >> deprecation. So a project would simply rely on the deprecated approach
> >> until they started work towards Python 3.13, at which point they drop
> >> support for the deprecated approach and cleanly switch over to the new
> >> approach as all versions of Python at that point will support the new
> >> approach as well.
> >
> > That sounds like a reasonable minimum for minor cleanups -- breakage
> > that doesn't block improvements.
> >
> > The current 'two years' minimum (and SC exceptions) is, IMO, appropriate
> > for changes that do block improvements -- e.g. if removing old Unicode
> > APIs allows reorganizing the internals to get a x% speedup, it should be
> > removed after the 2-years of warnings (*if* the speedup is also made in
> > that version -- otherwise the removal can be postponed).
> > Even better if there's some alternate API for the affected use cases
> > which works on all supported Python versions.
>
> I agree that the yearly releases make 2 releases with warnings a bit
> short.  Remove when a distributed replacement works in all supported
> releases seems pretty sensible.
>
>
> > And then there are truly trivial removals like the "failUnless" or
> > "SafeConfigParser" aliases. I don't see a good reason to remove those --
> > they could stay deprecated forever.
>
> This part I do not agree with.  In 3.10, there are 15 fail* and assert*
> aliases with a messy overlap pattern.
> https://docs.python.org/3/library/unittest.html#deprecated-aliases
> This is 15 unneeded names that appear in the doc, the index, vars(),
> dir(), TestCase.__dict__ listings, completion lists, etc.

Well, dir(), vars(), __dict__ and similar are already unpleasant --
ever since they started listing dunder methods, quite a long time ago.
But we could improve completion, docs and other cases that can filter the list.
How about adding a __deprecated__ attribute with a list of names that
tab completion should skip?

>
> If not used, there is no need to keep them.  If kept 'forever', they
> will be used, making unittest code harder to read.
>
> There was a recent proposal to add permanent _ aliases for all stdlib
> camelCase names: assert_equal, assert_true, etc.  After Guido gave a
> strong No, the proposal was reduced to doing so for logging and unittest
> only.  If permanent aliases are blessed as normal, the proposal will
> recur and it would be harder to say no.
>
> I expect that there would be disagreements as to what is trivial enough.
>
> > The only danger that API posed to
> > users is that it might be removed in the future (and that will break
> > their code), or that they'll get a warning or a linter nag.
>
> Python is nearly 30 years old.  I am really glad it is not burdened with
> 30 years of old names.  I expect someone reading this may write some
> version of Python 50 years from now.  I would not want they to have to
> read about names deprecated 60 years before such a time.

If the dedicated section is too distracting, they could be moved to a
subpage, reachable mainly by people searching for a particular name.
And the instructions on how to modernize code could be right next to
them.


[Python-Dev] Re: Do we need to remove everything that's deprecated?

2021-11-18 Thread Petr Viktorin



On 16. 11. 21 20:13, Brett Cannon wrote:



On Tue, Nov 16, 2021 at 4:46 AM Petr Viktorin wrote:


On 16. 11. 21 1:11, Brett Cannon wrote:
 >
 >
 > On Sun, Nov 14, 2021 at 3:01 PM Victor Stinner wrote:
 >
 >     On Sun, Nov 14, 2021 at 6:34 PM Eric V. Smith wrote:
 >      > On second thought, I guess the existing policy already does
 >      > this. Maybe we should make it more than 2 versions for
 >      > deprecations? I've written libraries where I support 4 or 5
 >      > released versions. Although maybe I should just trim that back.
 >
 >     If I understood correctly, the problem is more for how long is
 >     the new way available?
 >
 >
 > I think Eric was suggesting more along the lines of PEP 387 saying that
 > deprecations should last as long as there is a supported version of
 > Python that *lacks* the deprecation. So for something that's deprecated
 > in 3.10, we wouldn't remove it until 3.10 is the oldest Python version
 > we support. That would be October 2025 when Python 3.9 reaches EOL and
 > Python 3.13 comes out as at that point you could safely rely on the
 > non-deprecated solution across all supported Python versions (or if you
 > want a full year of overlap, October 2026 and Python 3.14).
 >
 > I think the key point with that approach is if you wanted to maximize
 > your support across supported versions, this would mean there wouldn't
 > be transition code except when the SC approves of a shorter deprecation.
 > So a project would simply rely on the deprecated approach until they
 > started work towards Python 3.13, at which point they drop support for
 > the deprecated approach and cleanly switch over to the new approach as
 > all versions of Python at that point will support the new approach as well.

That sounds like a reasonable minimum for minor cleanups -- breakage
that doesn't block improvements.


The current 'two years' minimum (and SC exceptions) is, IMO,
appropriate
for changes that do block improvements -- e.g. if removing old Unicode
APIs allows reorganizing the internals to get an x% speedup, it should be
removed after the 2-years of warnings (*if* the speedup is also made in
that version -- otherwise the removal can be postponed).
Even better if there's some alternate API for the affected use cases
which works on all supported Python versions.


If enough people come forward supporting this idea then you could 
propose to the SC that PEP 387 get updated with this guidance.


Yes, this thread is the first step :)



And then there are truly trivial removals like the "failUnless" or
"SafeConfigParser" aliases. I don't see a good reason to remove
those --
they could stay deprecated forever. The only danger that API posed to
users is that it might be removed in the future (and that will break
their code), or that they'll get a warning or a linter nag.


If deprecations ever become permanent, then there will have to be a 
cleaning of the stdlib first before we lock the team into this level of 
support contract.


I'm not looking for a contract, rather a best practice.
I think we should see Python's benign warts as nice gestures to the 
users: signs that we're letting them focus on issues that matter to 
them, rather than forcing them to join a quest for perfection.
If a wart turns out to be a tumor, we should be able to remove it after 
the 2 years of warnings (or less with an exception). That's fine as a 
contract. But I don't like "spring cleaning" -- removing everything the 
contract allows us to remove.


Ensuring more perfect code should be a job for linters, not the 
interpreter/stdlib.


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/NNKXTOSYXT2YH2FWZRFRMEFHYPN4BF66/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Do we need to remove everything that's deprecated?

2021-11-16 Thread Petr Viktorin

On 16. 11. 21 1:11, Brett Cannon wrote:



On Sun, Nov 14, 2021 at 3:01 PM Victor Stinner wrote:


On Sun, Nov 14, 2021 at 6:34 PM Eric V. Smith <e...@trueblade.com> wrote:
 > On second thought, I guess the existing policy already does this.
Maybe
 > we should make it more than 2 versions for deprecations? I've written
 > libraries where I support 4 or 5 released versions. Although maybe I
 > should just trim that back.

If I understood correctly, the problem is more for how long is the new
way available?


I think Eric was suggesting more along the lines of PEP 387 saying that 
deprecations should last as long as there is a supported version of 
Python that *lacks* the deprecation. So for something that's deprecated 
in 3.10, we wouldn't remove it until 3.10 is the oldest Python version 
we support. That would be October 2025 when Python 3.9 reaches EOL and 
Python 3.13 comes out as at that point you could safely rely on the 
non-deprecated solution across all supported Python versions (or if you 
want a full year of overlap, October 2026 and Python 3.14).


I think the key point with that approach is if you wanted to maximize 
your support across supported versions, this would mean there wouldn't 
be transition code except when the SC approves of a shorter deprecation. 
So a project would simply rely on the deprecated approach until they 
started work towards Python 3.13, at which point they drop support for 
the deprecated approach and cleanly switch over to the new approach as 
all versions of Python at that point will support the new approach as well.


That sounds like a reasonable minimum for minor cleanups -- breakage 
that doesn't block improvements.


The current 'two years' minimum (and SC exceptions) is, IMO, appropriate 
for changes that do block improvements -- e.g. if removing old Unicode 
APIs allows reorganizing the internals to get an x% speedup, it should be 
removed after the 2-years of warnings (*if* the speedup is also made in 
that version -- otherwise the removal can be postponed).
Even better if there's some alternate API for the affected use cases 
which works on all supported Python versions.




And then there are truly trivial removals like the "failUnless" or 
"SafeConfigParser" aliases. I don't see a good reason to remove those -- 
they could stay deprecated forever. The only danger that API posed to 
users is that it might be removed in the future (and that will break 
their code), or that they'll get a warning or a linter nag.



___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/THXWZ53X53I6GGLIDHO5T3Q4ZKALVGCP/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Remove asyncore, asynchat and smtpd modules

2021-11-16 Thread Petr Viktorin

On 12. 11. 21 13:09, Victor Stinner wrote:

It was decided to start deprecating the asyncore, asynchat and smtpd
modules in Python 3.6 released in 2016, 5 years ago. Python 3.10 emits
DeprecationWarning.


Wait, only Python 3.10?
According to the policy, the warning should be there for *at least* two
releases. (That's a minimum, for removing entire modules it might make
sense to give people even more time.)


The PEP 387 says "Similarly a feature cannot be removed without notice
between any two consecutive releases."

It is the case here. The 3 modules are marked as deprecated for 4
releases in the documentation: Python 3.6, 3.7, 3.9 and 3.10. Example:
https://docs.python.org/3.6/library/asyncore.html


PEP 387 also contains a detailed process for making incompatible 
changes, which calls for warnings to appear in at least two releases.


Do you think the process section can be ignored? We should remove it 
from the PEP if that's the case.


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KF34VL46ZAX6FWEFMXKZASPKF65USOZT/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Preventing Unicode-related gotchas (Was: pre-PEP: Unicode Security Considerations for Python)

2021-11-15 Thread Petr Viktorin

On 15. 11. 21 9:25, Stephen J. Turnbull wrote:

Christopher Barker writes:

  > Would a proposal to switch the normalization to NFC only have any hope of
  > being accepted?

Hope, yes.  Counting you, it's been proposed twice. :-)  I don't know
whether it would get through.  We know this won't affect the stdlib,
since that's restricted to ASCII.  I suppose we could trawl PyPI and
GitHub for "compatibles" (the Unicode term for "K" normalizations).
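
(For context: Python already applies the "K" normalization, NFKC, to 
identifiers at parse time -- a quick sketch of what that means in practice:)

import unicodedata
print(unicodedata.normalize("NFKC", "ﬁle"))   # -> 'file': the ligature folds
ﬁle_count = 1
print(file_count)   # -> 1: identifiers are NFKC-normalized while parsing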


I don't think PyPI/GitHub are good resources to trawl.

Non-ASCII identifiers were added for the benefit of people who use 
non-English languages. But both PyPI and GitHub overwhelmingly host 
projects written in English -- especially if you look at the more 
popular projects.
It would be interesting to reach out to the target audience here... but 
they're not on this list, either. Do we actually know anyone using this?



I do teach beginners in a non-English language, but tell them that they 
need to learn English if they want to do any serious programming. Any 
code that's to be shared more widely than a country effectively has to 
be in English. It seems to me that at the level where you worry about 
supply chain attacks and you're doing code audits, something like 
CPython's policy (ASCII only except proper names and Unicode-related 
tests) is a good idea.
Or not? I don't know anyone who actually uses non-ASCII identifiers for 
a serious project.

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/AVCLMBIXWPNIIKRFMGTS5SETUCGAONLK/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Do we need to remove everything that's deprecated?

2021-11-12 Thread Petr Viktorin

On 12. 11. 21 14:18, Victor Stinner wrote:

For me, deprecated functions cause me a lot of thinking when I meet
them as a Python maintainer and as a Python user. Why is it still
there? What is its purpose? Is there a better alternative? It's
related to the Chesterton's fence principle. Sometimes, reading the
doc is enough. Sometimes, I have to dig into the bug tracker and the
Git history.


Could you just add a comment when you find the answer? And a note in the 
docs, for the users?



In Python, usually, there is a better alternative. A recent example is
the asyncore module that I'm proposing to remove. This module has
multiple design flaws which cause bugs in corner cases. It's somehow
dangerous to use this module. Deprecating the module doesn't help
users who continue to use it and may get bugs in production. Removing
the module forces user to think about why they chose asyncore and if
they can switch to a better alternative. It's supposed to help users
to avoid bugs.


Right. If something's an attractive-looking trap, that's a reasonable 
reason to think about removing it.

But I'm not talking about that there.



The gray area is more about "deprecated aliases" and having two ways
to do the same things, but one way is deprecated. One example is the
removal of collections.MutableMapping: you must now use
collections.abc.MutableMapping. Another example is the removal the "U"
mode in the open() function: the flag was simply ignored since Python
3.0. So far, the trend is to remove these "aliases" and force users to
upgrade this code. Not removing these aliases has been discussed, and
it seems like each time, it was decided to remove them. Usually, the
"old way" is deprecated for many Python versions, like 5 years if not
longer.

Using deprecated functions is a threat in terms of technical debt. An
application using multiple deprecated functions will break with a
future Python version. 


But "will break with a future Python version" just means that people's 
code breaks because *we break it*. If we stopped doing that (in the 
simple cases of name aliases or functions that are older but not 
dangerous), then their code wouldn't break.




It's safe to avoid deprecated functions
whenever possible. Some deprecated functions have been removed but
then restored for 1 or 2 more Python releases, to give more time to
users to upgrade their code. At the end, the deprecated code is
removed.

We can warn developers to pay attention to DeprecationWarning
warnings, but sadly, in my experience, the removal is the only trigger
which works for everybody.

Do you have to repeat "You should check for DeprecationWarning in your
code" in every "What's New in Python X.Y?" document? Python 3.9 has
such section:
https://docs.python.org/dev/whatsnew/3.9.html#you-should-check-for-deprecationwarning-in-your-code


Clearly, that's not working. Python users want to write commits that 
either bring value, or that are fun. Mass-replacing "failUnless" with 
"assertTrue" just because someone decided it's a better name is neither.
Same with a forced move to the latest version of a function, if you 
don't use the bells and whistles it added.
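
Projects that do want to catch these early can turn the warnings into 
errors in their test runs -- a minimal sketch:

import warnings
warnings.simplefilter("error", DeprecationWarning)

# or, equivalently, from the command line / CI configuration:
#   python -W error::DeprecationWarning -m unittest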


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/IRBNWJ23KTL7YCSHQ6IMNHQPNDS3AU63/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Do we need to remove everything that's deprecated?

2021-11-12 Thread Petr Viktorin

On 12. 11. 21 13:51, Victor Stinner wrote:

The current backwards compatibility policy (PEP 387) sets a *minimum*
timeline for deprecations and removals -- "deprecation period must last
at least two years."


About the PEP 387 process and the 3 examples.

On Fri, Nov 12, 2021 at 11:58 AM Petr Viktorin  wrote:

AttributeError: module 'configparser' has no attribute
'SafeConfigParser'. Did you mean: 'RawConfigParser'?
(bpo-45173)


SafeConfigParser was not even documented, was deprecated since Python
3.2, and emitted a DeprecationWarning.


ImportError: cannot import name 'formatargspec' from 'inspect'
(bpo-45320)


It was deprecated in the doc since Python 3.5, and emitted a DeprecationWarning.


AttributeError: '[...]Tests' object has no attribute 'failUnless'
(bpo-45162)


Deprecated in the doc since Python 3.1. It emitted a DeprecationWarning.


But it seems like it's not treated as a minimum
(...)
Note that I am criticizing the *process*


On these examples, the functions were deprecated for way longer than a
minimum of 2 Python versions, no?


Yes. And as far as I know, they haven't really caused problems in all 
that time.


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/S3BDHURSCE3ODZBFOKEW3P2UWIZPEU7R/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Do we need to remove everything that's deprecated?

2021-11-12 Thread Petr Viktorin
We're rebuilding many popular projects with Python 3.11 alpha, and I see 
many failures like:


  AttributeError: module 'configparser' has no attribute 
'SafeConfigParser'. Did you mean: 'RawConfigParser'?

(bpo-45173)

  ImportError: cannot import name 'formatargspec' from 'inspect'
(bpo-45320)

  AttributeError: '[...]Tests' object has no attribute 'failUnless'
(bpo-45162)

Are these changes necessary?
Does it really cost us that much in maintainer effort to keep a 
well-tested backwards compatibility alias name, or a function that has a 
better alternative?


I think that rather than helping our users, changes like these are 
making Python projects painful to maintain.
If we remove them to make Python easier for us to develop, is it now 
actually that much easier to maintain?


The current backwards compatibility policy (PEP 387) sets a *minimum* 
timeline for deprecations and removals -- "deprecation period must last 
at least two years."
But it seems like it's not treated as a minimum: if any contributor 
sends an issue/PR to remove deprecated functionality, it's merged 
without much discussion. And it's very easy to "solve" these "issues", 
since everything is already marked for deletion; I fear we get the same 
kind of bikeshed/powerplant problem https://bikeshed.com/ for changes 
that explains for discussion. It's just so much easier to do "spring 
cleaning" than solve other problems.


Note that I am criticizing the *process*; the examples I gave have some 
people's names attached, and I have no doubt the people acted with best 
intentions.
I'm also not talking about code that's buggy, insecure, or genuinely 
hard to maintain.



If deprecation now means "we've come up with a new way to do things, and 
you have two years to switch", can we have something else that means 
"there's now a better way to do things; the old way is a bit worse but 
continues to work as before"?


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/AYJOQL36SK3EK5VMEKT5L5BVH25HVY4G/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Remove asyncore, asynchat and smtpd modules

2021-11-12 Thread Petr Viktorin

On 11. 11. 21 13:31, Victor Stinner wrote:

Hi,

The asyncore module is a very old module of the Python stdlib for
asynchronous programming, usually to handle network sockets
concurrently. It's a common event loop, but its design has many flaws.

The asyncio module was added to Python 3.4 with a well designed
architecture. Twisted developers, who have like 10 to 20 years of
experience in asynchronous programming, helped to design the asyncio
API. By design, asyncio doesn't have flaws which would be really hard
to fix in asyncore and asynchat.

It was decided to start deprecating the asyncore, asynchat and smtpd
modules in Python 3.6 released in 2016, 5 years ago. Python 3.10 emits
DeprecationWarning.


Wait, only Python 3.10?
According to the policy, the warning should be there for *at least* two 
releases. (That's a minimum, for removing entire modules it might make 
sense to give people even more time.)




asynchat and smtpd are implemented with asyncore.
Open issues in asyncore, asynchat and smtpd have been closed as "wont
fix" because these modules are deprecated. These modules are basically
no longer maintained.

I propose to remove asyncore, aynchat and smtpd in Python 3.11 to
reduce the Python maintenance burden, while asyncio remains available
in stdlib and is maintained:

* asyncore and asynchat can be replaced with asyncio
* smtpd can be replaced with aiosmtpd which is based on asyncio:
https://aiosmtpd.readthedocs.io/

If someone wants to continue using asyncore, asynchat or smtpd, it's
trivial to copy Python 3.10 asyncore.py, asynchat.py and smtpd.py to
their project, and maintain these files there. Someone is also free to
continue maintaining these modules as third-party projects on PyPI.

The removal is discussed at:
https://bugs.python.org/issue28533

I wrote a PR to remove the 3 modules:
https://github.com/python/cpython/pull/29521


... in short, the intent is to move the asyncore, asynchat and smtpd
maintenance outside the Python project ;-) (if anyone still uses them)


Victor


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/2Q22LYRMTYRDSCXXLM2DMVTT3VVRQF5B/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Proposal: Allow non-default after default arguments

2021-11-09 Thread Petr Viktorin




On 09. 11. 21 10:50, Chris Angelico wrote:

On Tue, Nov 9, 2021 at 8:38 PM Sebastian Rittau  wrote:


Currently, Python doesn't allow non-default arguments after default
arguments:

  >>> def foo(x=None, y): pass
    File "<stdin>", line 1
      def foo(x=None, y): pass
                      ^
SyntaxError: non-default argument follows default argument

I believe that at the time this was introduced, no use cases for this
were known and this is supposed to prevent a source of bugs. I have
two use cases for this, one fringe, but valid, the other more important:

The fringe use case: Suppose you have a function that takes a 2D
coordinate value as separate "x" and "y" arguments. The "x" argument is
optional, the "y" argument isn't. Currently there are two ways to do
this, none of them particularly great:

def foo(y, x):  # reverse x and y arguments, confusing
  ...
def foo(x, y=None):  # treat the x argument as y if only one argument is provided
  if y is None:
      x, y = y, x
  ...

To me, the "natural" solution looks like this:

def foo(x=None, y): ...
# Called like this:
foo(1, 2)
foo(y=2)

This could also be useful when evolving APIs. For example there is a
function "bar" that takes two required arguments. In a later version,
the first argument gains a useful default, the second doesn't. There is
no sensible way to evolve the API at the moment.



What would this mean, though:

foo(2)

Is that legal? If it is, it has to be the same as foo(y=2), by your
definition. But that would mean that it's hard to get your head around
the mapping of arguments and parameters.

foo(1, 2) # x=1, y=2
foo(1) # x=None, y=1

There are a very very few functions in Python that have this sort of
odd behaviour (range() being probably the only one most programmers
will ever come across), and it's not something to encourage.


A more extreme case is functions with an optional *group*. In curses, 
the first two arguments to addch are optional. In `help()` it's 
documented as `window.addch([y, x,] ch[, attr=...])` and you can call it 
as one of:


window.addch(ch)
window.addch(ch, attr)
window.addch(y, x, ch)
window.addch(y, x, ch, attr)

see: https://docs.python.org/3/library/curses.html#curses.window.addch

Supporting this was a headache for Argument Clinic (PEP 436), and AFAIK 
it still isn't possible to express this as an inspect.Signature (PEP 362).
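
In pure Python, such a signature can only be emulated by dispatching on 
*args -- a rough sketch, not the real curses implementation:

def addch(*args):
    # Dispatch window.addch's optional leading (y, x) group by arity.
    if len(args) in (1, 2):            # addch(ch[, attr])
        y = x = None
        ch, *rest = args
    elif len(args) in (3, 4):          # addch(y, x, ch[, attr])
        y, x, ch, *rest = args
    else:
        raise TypeError("addch() takes 1 to 4 arguments")
    attr = rest[0] if rest else 0
    return (y, x, ch, attr)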


Allowing non-default arguments after default arguments would mean 
introspection tools (and code that uses them) would need to be changed 
to prepare for the new possibilities. It's not free.



And for the "encoding" case: IMO, varying the return type based on an 
optional "encoding" argument" is a holdover from the pre-typing era, 
when return types were only specified in the documentation -- just like 
"addch" is a holdover from the days when function signatures were only 
described in the docs. Nowadays, I'd consider it bad API design. The 
@overloads are ugly but they work -- just like the API itself. IMO we 
shouldn't add special cases to encourage more of it.



I would instead recommend making the parameters keyword-only, which
would allow any of them to have defaults or not have defaults. In
terms of useful API design, this is usually more helpful than having
an early parameter omitted.


+1. I'm not sure if it's possible to mark args as keyword-only in the 
type stubs while keeping actual implementation backwards-compatible, but 
if it is, it might be a good option.
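
For illustration, keyword-only parameters already permit a non-default 
after a default -- a minimal sketch:

def foo(*, x=None, y):   # legal today: keyword-only defaults may come in any order
    return x, y

foo(y=2)        # x=None, y=2
foo(x=1, y=2)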

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/JIK7EJLN4WG4EJE4UVYVLUORGQB6XQWR/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: pre-PEP: Unicode Security Considerations for Python

2021-11-03 Thread Petr Viktorin



On 03. 11. 21 12:33, Serhiy Storchaka wrote:

03.11.21 12:36, Petr Viktorin wrote:

On 03. 11. 21 2:58, Kyle Stanley wrote:

I'd suggest both: briefer, easier to read write up for average user in
docs, more details/semantics in informational PEP. Thanks for working
on this, Petr!


Well, this is the brief write-up :)
Maybe it would work better if the  info was integrated into the relevant
parts of the docs, rather than be a separate HOWTO.

I went with an informational PEP because it's quicker to publish.


What is the supposed target audience of this document?


Good question! At this point it looks like it's linter authors.


If it is core
Python developers only, then PEP is the right place to publish it. But I
think that it rather describes potential issues in arbitrary Python
project, and as such, it will be more accessible as a part of the Python
documentation (as a HOW-TO article perhaps). AFAIK all other
informational PEPs are about developing Python, not developing in Python
(even if they are (mis)used (e.g. PEP 8) outside their scope).


There's a bunch of packaging PEPs, or a PEP on what the 
/usr/bin/python command should be. I think PEP 672 is in good company 
for now.

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/UTNIZZVWL56G7KSYSS67PYYZ2YPE7NX3/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: pre-PEP: Unicode Security Considerations for Python

2021-11-03 Thread Petr Viktorin

On 03. 11. 21 12:37, Chris Angelico wrote:

On Wed, Nov 3, 2021 at 10:22 PM Steven D'Aprano  wrote:


On Wed, Nov 03, 2021 at 11:21:53AM +1100, Chris Angelico wrote:


TBH, I'm not entirely sure how valid it is to talk about *security*
considerations when we're dealing with Python source code and variable
confusions, but that's a term that is well understood.


It's not like Unicode is the only way to write obfuscated code,
malicious or otherwise.



But to the extent that it is a security concern, it's not one that
linters can really cope with. I'm not sure how a linter would stop
someone from publishing code on PyPI that causes confusion by its
character encoding, for instance.


Do we require that PyPI prevents people from publishing code that causes
confusion by its poorly written code and obfuscated and confusing
identifiers?

The linter is to *flag the issue* during, say, code review or before
running the code, like other code quality issues.

If you're just running random code you downloaded from the internet
using pip, then Unicode confusables are the least of your worries.

I'm not really sure why people get so uptight about Unicode confusables,
while being blasé about the opportunities to smuggle malicious code into
pure ASCII code.



Right, which is why I was NOT talking about confusables. I don't
consider them to be a particularly Unicode-related threat, although
the larger range of available characters does make it more plausible
than in ASCII.

But I do see a problem with code where most editors misrepresent the
code, where abuse of a purely ASCII character encoding for purely
ASCII code can cause all kinds of tooling issues. THAT is a more
viable attack vector, since code reviewers will be likely to assume
that their syntax highlighting is correct.

And yes, I'm aware that Python can't be expected to cope with poor
tools, but when *many* well-known tools have the same problem, one
must wonder who should be solving the issue.


This is a very good point. Let's not point fingers, but figure out how 
to make users' lives easier together :)



This was the first time I was "in" on an embargoed "issue", and let me 
tell you, I was surprised by the amount of time spent on polishing the 
messaging. Now, you can't reasonably twist all this into a "Python is 
insecure" or "Company X products are insecure" headline, which is good, 
but with that out of the way we can focus on *what* could be improved 
over *where* the improvement could be and who should do it.

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/FNUZCNDF7K2LLHRYRDYY3ZZYISRCI4XJ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Preventing Unicode-related gotchas (Was: pre-PEP: Unicode Security Considerations for Python)

2021-11-03 Thread Petr Viktorin
We seem to agree that this is work for linters. That's reasonable; I'd 
generalize it to "tools and policies". But even so, discussing what we'd 
expect linters to do is on topic here.
Perhaps we can even find ways for the language to support linters -- 
type checking is also for external tools, but has language support.


For example: should the parser emit a lightweight audit event if it 
finds a non-ASCII identifier? (See below for why ASCII is special.)

Or for encoding declarations?
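
If such an event were added, tooling could subscribe to it via 
sys.addaudithook. A sketch -- the event name here is hypothetical, no 
such event exists today:

import sys

def hook(event, args):
    if event == "compile.non_ascii_identifier":   # hypothetical event name
        print("non-ASCII identifier:", args)

sys.addaudithook(hook)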

On 03. 11. 21 6:26, Stephen J. Turnbull wrote:

Serhiy Storchaka writes:

  > All control characters except CR, LF, TAB and FF are banned outside
  > comments and string literals. I think it is worth to ban them in
  > comments and string literals too.

+1

  > > For homoglyphs/confusables, should there be a SyntaxWarning when an
  > > identifier looks like ASCII but isn't?
  >
  > It would virtually ban Cyrillic.

+1 (for the comment and for the implied -1 on SyntaxWarning, let's
keep the Cyrillic repertoire in Python!)


I don't think this would actually ban Cyrillic/Greek.
(My suggestion is not vanilla confusables detection; it might require 
careful reading: "should there be a [linter] warning when an identifier 
looks like ASCII but isn't?")
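
A toy sketch of that check; a real implementation would use the 
confusables data from Unicode TS #39 rather than this tiny hand-made map:

CONFUSABLE = {"а": "a", "е": "e", "о": "o", "с": "c"}   # Cyrillic lookalikes

def looks_ascii_but_isnt(name):
    folded = "".join(CONFUSABLE.get(ch, ch) for ch in name)
    return not name.isascii() and folded.isascii()

print(looks_ascii_but_isnt("scоpe"))   # True: the "о" here is Cyrillic
print(looks_ascii_but_isnt("всего"))   # False: plainly non-ASCII, so it passes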


I am not a native speaker, but I did try a bit to find an actual 
ASCII-like word in a language that uses Cyrillic. I didn't succeed; I 
think they might be very rare.
Even if there was such a word -- or a one-letter abbreviation used as a 
variable name -- it would be confusing to use. Removing the possibility 
of confusion could *help* Cyrillic users. (I can't speak for them; this 
is just a brainstorming idea.)


Steven adds:
Let's not enshrine as a language "feature" that non Western European 
languages are dangerous second-class citizens.


That would be going too far, yes, but the fact is that non-English 
languages *are* second-class citizens. Code that uses Python keywords 
and stdlib must use English, and possibly another language. It is the 
mixing of languages that can be dangerous/confusing, not the languages 
themselves.





  > It is a work for linters,

+1

Aside from the reasons Serhiy presents, I'd rather not tie
this kind of rather ambiguous improvement in Unicode handling to the
release cycle.

It might be worth having a pep module/script in Python (perhaps
more likely, PyPI but maintained by whoever does the work to make
these improvements + Petr or somebody Petr trusts to do it), that
lints scripts specifically for confusables and other issues.


If I have any say in it, the name definitely won't include a PEP number ;)
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/LB4O3YVDNVVNLYPMNH236QXGGUYG4BVI/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: pre-PEP: Unicode Security Considerations for Python

2021-11-03 Thread Petr Viktorin

On 03. 11. 21 2:58, Kyle Stanley wrote:
I'd suggest both: briefer, easier to read write up for average user in 
docs, more details/semantics in informational PEP. Thanks for working on 
this, Petr!


Well, this is the brief write-up :)
Maybe it would work better if the  info was integrated into the relevant 
parts of the docs, rather than be a separate HOWTO.


I went with an informational PEP because it's quicker to publish.



On Tue, Nov 2, 2021 at 2:07 PM David Mertz, Ph.D. <david.me...@gmail.com> wrote:


This is an amazing document, Petr. Really great work!

I think I agree with Marc-André that putting it in the actual Python
documentation would give it more visibility than in a PEP.

On Tue, Nov 2, 2021, 1:06 PM Marc-Andre Lemburg <m...@egenix.com> wrote:

On 01.11.2021 13:17, Petr Viktorin wrote:
 >> PEP: 
 >> Title: Unicode Security Considerations for Python
     >> Author: Petr Viktorin <encu...@gmail.com>
 >> Status: Active
 >> Type: Informational
 >> Content-Type: text/x-rst
 >> Created: 01-Nov-2021
 >> Post-History:

Thanks for writing this up. I'm not sure whether a PEP is the
right place
for such documentation, though. Wouldn't it be more visible in
the standard
Python documentation ?

-- 
Marc-Andre Lemburg

eGenix.com

Professional Python Services directly from the Experts (#1, Nov
02 2021)
 >>> Python Projects, Coaching and Support ... https://www.egenix.com/
 >>> Python Product Development ... https://consulting.egenix.com/


::: We implement business ideas - efficiently in both time and
costs :::

    eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
     D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
            Registered at Amtsgericht Duesseldorf: HRB 46611
https://www.egenix.com/company/contact/
https://www.malemburg.com/



___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/6OET4CKEZIA34PAXIJR7BUDKT2DPX2DG/
Code of Conduct: http://python.org/psf/codeofconduct/


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZNANXZ7VP6CVDAGWEFXHKYFO6AR3MZXQ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Preventing Unicode-related gotchas (Was: pre-PEP: Unicode Security Considerations for Python)

2021-11-02 Thread Petr Viktorin



On 01. 11. 21 18:32, Serhiy Storchaka wrote:

This is excellent!

01.11.21 14:17, Petr Viktorin wrote:

CPython treats the control character NUL (``\0``) as end of input,
but many editors simply skip it, possibly showing code that Python
will not
run as a regular part of a file.


It is an implementation detail and we will get rid of it. It only
happens when you read the Python script from a file. If you import it as
a module or run with runpy, the NUL character is an error.


That brings us to possible changes in Python in this  area, which is an 
interesting topic.


As for \0, can we ban all ASCII & C1 control characters except 
whitespace? I see no place for them in source code.
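
What such a linter-style check might look like -- a sketch (the allowed 
set here is the usual TAB/LF/CR/FF whitespace):

import unicodedata

def find_control_chars(source):
    allowed = {"\t", "\n", "\r", "\f"}
    for lineno, line in enumerate(source.splitlines(keepends=True), 1):
        for ch in line:
            if unicodedata.category(ch) == "Cc" and ch not in allowed:
                yield lineno, "U+%04X" % ord(ch)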



For homoglyphs/confusables, should there be a SyntaxWarning when an 
identifier looks like ASCII but isn't?


For right-to-left text: does anyone actually name identifiers in 
Hebrew/Arabic? AFAIK, we should allow a few non-printing 
"joiner"/"non-joiner" characters to make it possible to use all Arabic 
words. But it would be great to consult with users/teachers of the 
languages.
Should Python run the bidi algorithm when parsing and disallow reordered 
tokens? Maybe optionally?

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/TGB377QWGIDPUWMAJSZLT22ERGPNZ5FZ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: pre-PEP: Unicode Security Considerations for Python

2021-11-02 Thread Petr Viktorin

On 01. 11. 21 13:17, Petr Viktorin wrote:

Hello,
Today, an attack called "Trojan source" was revealed, where a malicious 
contributor can use Unicode features (left-to-right text and homoglyphs) 
to code that, when shown in an editor, will look different from how a 
computer language parser will process it.

See https://trojansource.codes/, CVE-2021-42574 and CVE-2021-42694.

This is not a bug in Python.
As far as I know, the Python Security Response team reviewed the report 
and decided that it should be handled in code editors, diff viewers, 
repository frontends and similar software, rather than in the language.


I agree: in my opinion, the attack is similar to abusing any other 
"gotcha" where Python doesn't parse text as a non-expert human would. 
For example: `if a or b == 'yes'`, mutable default arguments, or a 
misleading typo.


Nevertheless, I did do a bit of research about similar gotchas in 
Python, and I'd like to publish a summary as an informational PEP, 
pasted below.



Thanks for the comments, everyone! I've updated the document and sent it 
to https://github.com/python/peps/pull/2129
A rendered version is at 
https://github.com/encukou/peps/blob/pep-0672/pep-0672.rst




Toshio Kuratomi wrote:

  `Unicode`_ is a system for handling all kinds of written language.
It aims to allow any character from any human natural language (as
well as a few characters which are not from natural languages) to be
used. Python code may consist of almost all valid Unicode characters.


Thanks! That's a nice summary; I condensed it a bit more and used it.
(I'm not joining the conversation on glyphs, characters, codepoints and 
encodings -- that's much too technical for this document. Using the 
specific technical terms unfortunately doesn't help understanding, so I 
use the vague ones like "character" and "letter".)



Jim J. Jewett wrote:

"The East Asian symbol for *ten* looks like a plus sign, so ``十= 10`` is a complete 
Python statement."


Normally, an identifier must begin with a letter, and numbers can only be used in the 
second and subsequent positions.  (XID_CONTINUE instead of XID_START)  The fact that some 
characters with numeric values are considered letters (in this case, category Lo, Other 
Letters) is a different problem than just looking visually confusable with "+", 
and it should probably be listed on its own.


I'm not a native speaker, but as I understand it, "十" is closer to a 
single-letter word than a single-digit number. It translates better as 
"ten" than "10". (And it appears in "十四", "fourteen", just like "four" 
appears in "fourteen".)



Patrick Schultz wrote:

- The Unicode consortium has a list of confusables, in case useful


Yup, and it's linked from the documents that describe how to use it. I 
link to those rather than just the list.

But thank you!


Terry Reedy wrote:

Bidirectional Text
--

Some scripts, such as Hebrew or Arabic, are written right-to-left.


[Suggested addition, subject to further revision.]

There are at least three levels of handling r2l chars: none, local (contiguous 
sequences are properly reversed), and extended (see below).  The handling 
depends on the display software and may depend on the quoting.  Tk and hence 
tkinter (and IDLE) text widgets do local handling.  Windows Notepad++ does local 
handling of unquoted code but extended handling of quoted text.  Windows 
Notepad currently does extended handling even without quotes.


I'd like to leave these details out of the document. The examples should 
render convincingly in browsers. The text should now describe the 
behavior even if you open it in an editor that does things differently, 
and acknowledge that such editors exist. (The behavior of specific 
editors/toolkits might well change in the future.)



For example, with ``encoding: unicode_escape``, characters like
quotes or braces can be hidden in an (f-)string, with many tools (syntax
highlighters, linters, etc.) considering them part of the string.
For example::


I don't see the connection between the text above and the example that follows.


# For writing Japanese, you don't need an editor that supports
# UTF-8 source encoding: unicode_escape sequences work just as well.

[etc]


Let me know if it's clear in the newest version, with this note:


Here, ``encoding: unicode_escape`` in the initial comment is an encoding
declaration. The ``unicode_escape`` encoding instructs Python to treat
``\u0027`` as a single quote (which can start/end a string), ``\u002c`` as
a comma (punctuator), etc.
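
The decoding step itself is easy to demonstrate -- a minimal sketch:

payload = b"print(\\u0027hello\\u0027)"
print(payload.decode("unicode_escape"))   # -> print('hello')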



Steven D'Aprano wrote:

Before the age of computers, most mechanical typewriters lacked the keys 
for the digits ``0`` and ``1``


I'm not sure that "most" is justifed here. One of the most popular 
typewriters in history, the Underwood #5 (from 1900 to 1920), lacked 
the 1 key b

[Python-Dev] pre-PEP: Unicode Security Considerations for Python

2021-11-01 Thread Petr Viktorin

Hello,
Today, an attack called "Trojan source" was revealed, where a malicious 
contributor can use Unicode features (left-to-right text and homoglyphs) 
to code that, when shown in an editor, will look different from how a 
computer language parser will process it.

See https://trojansource.codes/, CVE-2021-42574 and CVE-2021-42694.

This is not a bug in Python.
As far as I know, the Python Security Response team reviewed the report 
and decided that it should be handled in code editors, diff viewers, 
repository frontends and similar software, rather than in the language.


I agree: in my opinion, the attack is similar to abusing any other 
"gotcha" where Python doesn't parse text as a non-expert human would. 
For example: `if a or b == 'yes'`, mutable default arguments, or a 
misleading typo.


Nevertheless, I did do a bit of research about similar gotchas in 
Python, and I'd like to publish a summary as an informational PEP, 
pasted below.





PEP: 
Title: Unicode Security Considerations for Python
Author: Petr Viktorin 
Status: Active
Type: Informational
Content-Type: text/x-rst
Created: 01-Nov-2021
Post-History:

Abstract


This document explains possible ways to misuse Unicode to write Python
programs that appear to do something else than they actually do.

This document does not give any recommendations and solutions.


Introduction


Python code is written in `Unicode`_ – a system for encoding and
handling all kinds of written language.
While this allows programmers from all around the world to express themselves,
it also allows writing code that is potentially confusing to readers.

It is possible to misuse Python's Unicode-related features to write code that
*appears* to do something else than what it does.
Evildoers could take advantage of this to trick code reviewers into
accepting malicious code.

The possible issues generally can't be solved in Python itself without
excessive restrictions of the language.
They should be solved in code editors and review tools
(such as *diff* displays), by enforcing project-specific policies,
and by raising awareness of individual programmers.

This document purposefully does not give any solutions
or recommendations: it is rather a list of things to keep in mind.

This document is specific to Python.
For general security considerations in Unicode text, see [tr36]_ and [tr39]_.


Acknowledgement
===

Investigation for this document was prompted by [CVE-2021-42574],
*Trojan Source Attacks* reported by Nicholas Boucher and Ross Anderson,
which focuses on Bidirectional override characters in a variety of languages.


Confusing Features
==

This section lists some Unicode-related features that can be surprising
or misusable.


ASCII-only Considerations
-

ASCII is a subset of Unicode

While issues with the ASCII character set are generally well understood,
they're presented here to help better understand the non-ASCII cases.

Confusables and Typos
'

Some characters look alike.
Before the age of computers, most mechanical typewriters lacked the keys for
the digits ``0`` and ``1``: users typed ``O`` (capital o) and ``l``
(lowercase L) instead. Human readers could tell them apart by context only.
In programming languages, however, the distinction between digits and letters is
critical -- and most fonts designed for programmers make it easy to tell them
apart.

Similarly, the uppercase “I” and lowercase “l” can look similar in fonts
designed for human languages, but programmers' fonts make them noticeably
different.

However, what is “noticeably” different always depends on the context.
Humans tend to ignore details in longer identifiers: the variable name
``accessibi1ity_options`` can still look indistinguishable from
``accessibility_options``, while they are distinct for the compiler.

The same can be said for plain typos: most humans will not notice the typo in
``responsbility_chain_delegate``.

Control Characters
''

Python generally considers all ``CR`` (``\r``), ``LF`` (``\n``), and ``CR-LF``
pairs (``\r\n``) as end-of-line characters.
Most code editors do as well, but there are editors that display “non-native”
line endings as unknown characters (or nothing at all), rather than ending
the line, displaying this example::

# Don't call this function:
fire_the_missiles()

as a harmless comment like::

# Don't call this function:⬛fire_the_missiles()

CPython treats the control character NUL (``\0``) as end of input,
but many editors simply skip it, possibly showing code that Python will not
run as a regular part of a file.

Some characters can be used to hide/overwrite other characters when source is
listed in common terminals:

* BS (``\b``, Backspace) moves the cursor back, so the character after it
  will overwrite the character before.
* CR (``\r``, carriage return) moves the cursor to the start of line,
  s

[Python-Dev] Re: PEP 670: Convert macros to functions in the Python C API

2021-10-20 Thread Petr Viktorin

On 20. 10. 21 3:15, Victor Stinner wrote:

Extra info that I didn't put in the PEP to keep the PEP short.

Since Python 3.8, multiple macros have already been converted,
including Py_INCREF() and Py_TYPE() which are very commonly used and
so matter for Python performance.

Macros converted to static inline functions:

* Py_INCREF(), Py_DECREF(), Py_XINCREF(), Py_XDECREF(): Python 3.8
* PyObject_INIT(), PyObject_INIT_VAR(): Python 3.8
* Private functions: _PyObject_GC_TRACK(), _PyObject_GC_UNTRACK(),
_Py_Dealloc(): Python 3.8
* Py_REFCNT(): Python 3.10
* Py_TYPE(), Py_SIZE(): Python 3.11

Macros converted to regular functions in Python 3.9:

* PyIndex_Check()
* PyObject_CheckBuffer()
* PyObject_GET_WEAKREFS_LISTPTR()
* PyObject_IS_GC()
* PyObject_NEW(): alias to PyObject_New()
* PyObject_NEW_VAR(): alias to PyObjectVar_New()

To keep best performances on Python built without LTO, fast private
variants were added as static inline functions to the internal C API:

* _PyIndex_Check()
* _PyObject_IS_GC()
* _PyType_HasFeature()
* _PyType_IS_GC()

--

Many of these changes have been made to prepare the C API to make
these structure opaque:

* PyObject: https://bugs.python.org/issue39573
* PyTypeObject: https://bugs.python.org/issue40170

Don't access structure members at the ABI level, but abstract them
through a function call.

Some functions are still static inline functions (and so still access
structure members at the ABI level), since the performance impact of
converting them to regular functions was not measured yet.


I think this info should be in the PEP.

If the PEP is rejected, would all these previous changes need to be 
reverted? Or just the ones done in 3.11?


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/FQB2Z3A757SUTOCMAWB3BFKTP5ISQJWS/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Documenting Python versioning and stability expectations

2021-10-15 Thread Petr Viktorin

Hello,
I heard not everyone is Discourse, so I'm re-posting here as well.



Information about Python's versioning and stability expectations is currently 
scattered over the FAQs, active 
PEPs (387, 602), an active-but-severely-outdated PEP (6) and the Devguide.


I would like to consolidate as much of this as possible into user-facing 
reference documentation.


* Does that sound like a good idea?
* Where in the docs should such a page live?

Here is a draft of the docs; if added they’d be accompanied by more changes:

* The relevant FAQs would be replaced by links to these docs
* PEP 6 would be retired in favor of 602, 387 and these docs
* Parts of PEP 387/602 might be replaced by links here (or this document 
should link to the PEP rather than duplicating it)
* The examples of “backwards-incompatible APIs” from PEP 387 would only 
live in one place, and be linked from the other.
* C-API stability docs should link back to this more general document. 
(Details specific to the C-API should still be there.)


Most of this is, hopefully, just capturing existing tribal knowledge, but:

* the “Unstable API” section contains several additional examples that 
PEP 387 doesn’t have. I’m proposing them as an update to the PEP; all of 
this will need SC approval anyway.
* “Most planned changes (such as removal of deprecated features) are 
done in alpha releases.” – this is also a newly proposed rule; I intend 
to formalize it further for the devguide. Combined with our deprecation 
periods, this rule would allow more time for testing the actual effect 
of removals.



A rendered (and up-to-date) version is at:
https://github.com/encukou/cpython/blob/cpython-version-docs/Doc/stability-docs/index.rst







.. _python-versioning:

==
Python's Versioning and Stability Expectations
==


Python Versions
===

The Python language is developed together with its reference
:ref:`implementation `, *CPython*.  Both share the same
release schedule and versioning scheme.

Production-ready Python versions are numbered with three numbers,
``major.minor.micro``.

* New *major* versions are exceptional, and are planned very long in 
advance.

* New *minor* versions are feature releases; they get released annually.
* New *micro* versions are *bugfix* releases, which get released roughly
  every 2 months for 5 years after a minor release; or *security* releases
  which are made irregularly afterwards.

We also publish non-final *pre-release* versions with an additional
qualifier: *Alpha* (``a``), *Beta* (``b``) and *release candidates* 
(``rc``).
These versions are not for production use; they're aimed at testers and
maintainers of third-party libraries.

The version number is combined into a single string, for example:


   +----------------------+-------+-------+-------+------------------------------+
   | Version              | Major | Minor | Micro | Prerelease                   |
   +======================+=======+=======+=======+==============================+
   | Python ``3.6.3``     | 3     | 6     | 3     | Final (production-ready)     |
   +----------------------+-------+-------+-------+------------------------------+
   | Python ``3.7.0a3``   | 3     | 7     | 0     | Third *alpha*                |
   +----------------------+-------+-------+-------+------------------------------+
   | Python ``3.9.4b1``   | 3     | 9     | 4     | First *beta*                 |
   +----------------------+-------+-------+-------+------------------------------+
   | Python ``2.7.14rc2`` | 2     | 7     | 14    | Second *release candidate*   |
   +----------------------+-------+-------+-------+------------------------------+

When discussing features that do not change in *micro* or *minor* releases,
or ones that are new in `x.y.0` or `x.0.0` versions,
it is common to only specify the relevant numbers:

   +-+---+---+---+
   | Version | Major | Minor | Micro |
   +=+===+===+===+
   | Python ``3.10`` | 3 | 10| any   |
   +-+---+---+---+
   | Python ``3``| 3 | any   | any   |
   +-+---+---+---+


See also the documentation for :data:`sys.version_info`,
:data:`sys.hexversion`, :data:`sys.version`, and :ref:`apiabiversion`,
which expose version numbers in different formats.
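
For example, at runtime::

   import sys
   print(sys.version_info)        # e.g. sys.version_info(major=3, minor=10, ...)
   if sys.version_info < (3, 9):  # a common minimum-version check
       raise RuntimeError("Python 3.9 or newer is required")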


.. _python-releases:

Python Releases
===

All releases, including pre-releases, are available
from https://www.python.org/downloads/.  New releases are announced on the
comp.lang.python and comp.lang.python.announce newsgroups and on the Python
home page at `python.org`_; an RSS feed of news is available.

.. _python.org: https://python.org


.. _python-stability:

Versioning details and Stability Expectations
=

This section documents stability expectations for the various types of Python
releases. It is intended for users of Python, 

[Python-Dev] Re: PEP 654 except* formatting

2021-10-06 Thread Petr Viktorin



On 06. 10. 21 15:34, Łukasz Langa wrote:


On 6 Oct 2021, at 12:06, Larry Hastings wrote:


It seems like, for this to work, "group" would have to become a keyword.


No, just like `match` and `case` didn't have to.


This would play havoc with a lot of existing code.

Extraordinary claims require extraordinary evidence, Larry. I maintain 
this will be entirely backwards compatible.


Even making it a soft keyword, a la "await" in 3.5, would lead to 
ambiguity:


group = KeyboardInterrupt

try:
    while True:
    print("thou can only defeat me with Ctrl-C")
except group as error:
    print("lo, thou hast defeated me")


Two things:

1. This is a convoluted example, I bet $100 you won't find such an 
`except group` statement in any code predating my e-mail :) Sure, 
sometimes (very rarely) it's useful to gather exceptions in a variable. 
But I'm pretty sure `group` won't be the name chosen for it.


2. While non-obvious, the example is not ambiguous. There can only be 
one parsing rule fitting this:


'except' expression 'as' NAME ':'

Note how this is different from:

'except' 'group' expression 'as' NAME ':'

There could be confusion if except-star, whatever its name is going to 
be, supported an empty "catch all" variant like `except:`. Thankfully, 
this is explicitly listed as a no-go in PEP 654. So `except group:` 
remains unambiguous.


What about this:

group = (KeyboardInterrupt, MemoryError)
other_group = (KeyError, IndexError)

try:
   ...
except group + other_group as error:
   ...
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KH7T6VDRYENBLLFNY7CAXFEVH4IILXZ7/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Worried about Python release schedule and lack of stable C-API

2021-10-05 Thread Petr Viktorin

On 05. 10. 21 8:59, Nick Coghlan wrote:
On Tue, 28 Sep 2021, 6:55 am Brett Cannon wrote:




On Sun, Sep 26, 2021 at 3:51 AM Phil Thompson via Python-Dev
<python-dev@python.org> wrote:


However the stable ABI is still a second class citizen as it is still
not possible (AFAIK) to specify a wheel name that doesn't need to
explicitly include each supported Python version (rather than a minimum
stable ABI version).


Actually you can do this. The list of compatible wheels for a
platform starts at CPython 3.2 when the stable ABI was introduced
and goes forward to the version of Python you are running. So you
can build a wheel file that targets the oldest version of CPython
that you are targeting and its version of the stable ABI and it is
considered forward compatible. See `python -m pip debug --verbose`
for the complete list of wheel tags that are supported for an
interpreter.


I think Phil's point is a build side one: as far as I know, the process 
for getting one of those more generic file names is still to build a 
wheel with an overly precise name for the stable ABI declarations used, 
and then rename it.


The correspondence between "I used these stable ABI declarations in my 
module build" and "I can use this more broadly accepted wheel name" is 
currently obscure enough that I couldn't tell you off the top of my head 
how to do it, and I contributed to the design of both sides of the equation.


Actually improving the build ergonomics would be hard (and outside 
CPython's own scope), but offering a table in the stable ABI docs giving 
suggested wheel tags for different stable ABI declarations should be 
feasible, and would be useful to both folks renaming already built 
wheels and anyone working on improving the build automation tools.



Indeed, thinking about proper wheel tags, and adding support for them in 
both pip/installer and setuptools/build/poetry/etc., would be a very 
helpful way to contribute to the stable ABI.

I don't think I will be able to get to this any time soon.

The current `abi3` tag doesn't encode the minimum ABI version. AFAIK 
that info should go in the "Requires-Python" wheel metadata, but there's 
no automation or clear guides for that. Putting it in the wheel tag 
might be a good idea.
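
For reference, the tags an interpreter accepts can be listed with the 
third-party "packaging" library -- a sketch, assuming packaging is 
installed:

from packaging.tags import sys_tags

abi3_tags = [str(t) for t in sys_tags() if t.abi == "abi3"]
print(abi3_tags[:3])   # e.g. cp311-abi3-..., cp310-abi3-..., back to cp32-abi3-...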


There are vague ideas floating around about removing old stable ABI 
features (hopefully after they're deprecated for 5-10 years); if there's 
a new wheel tag scheme it should be made with that possibility in mind.

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/WSAF6H4P2GDORL6KOU6ZIBVUITDYAIBA/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Making code object APIs unstable

2021-09-02 Thread Petr Viktorin



On 01. 09. 21 22:28, Guido van Rossum wrote:

I apologize, I keep making the same mistake.

The PyCode_New[WithPosArgs] functions are *not* in the stable ABI or in 
the limited API, so there's no need to petition the SC, nor do I need 
Petr's approval.


We may be bound by backwards compatibility for the *cpython* API, but I 
think that if Cython is okay with us just breaking this, we should be fine. 
Users of the CPython API are expected to recompile for each new version, 
and if someone were to be using these functions with the old set of 
parameters the compiler would give them an error.


The cpython C API is still covered by the backwards compatibility policy 
(PEP 387). You do need to ask the SC to skip the two-year deprecation 
period.


I don't see an issue with the exception being granted, but I do think it 
should be rubber-stamped as a project-wide decision.



So let's just choose (E) and d*mn backwards compatibility for these two 
functions.


That means:
- Get rid of PyCode_NewWithPosArgs altogether
- PyCode_New becomes unstable (and gets a new posonlyargcount argument)


... but still remains available and documented, just with a note that it 
may change in minor versions. Right?



On Wed, Sep 1, 2021 at 11:52 AM Guido van Rossum wrote:


(context)

Guido van Rossum wrote on 13.08.21 at 19:24:
 > In 3.11 we're changing a lot of details about code objects.
Part of this is
 > the "Faster CPython" work, part of it is other things (e.g.
PEP 657 -- Fine
 > Grained Error Locations in Tracebacks).
 >
 > As a result, the set of fields of the code object is
changing. This is
 > fine, the structure is part of the internal API anyway.
 >
 > But there's a problem with two public API functions,
PyCode_New() and
 > PyCode_NewWithPosArgs(). As we have them in the main (3.11)
branch, their
 > signatures are incompatible with previous versions, and they
have to be
 > since the set of values needed to create a code object is
different. (The
 > types.CodeType constructor signature is also changed, and so
is its
 > replace() method, but these aren't part of any stable API).
 >
 > Unfortunately, PyCode_New() and PyCode_NewWithPosArgs() are
part of the PEP
 > 387 stable ABI. What should we do?
 >
 > A. We could deprecate them, keep (restore) their old
signatures, and create
 > crippled code objects (no exception table, no endline/column
tables,
 > qualname defaults to name).
 >
 > B. We could deprecate them, restore the old signatures, and
always raise an
 > error when they are called.
 >
 > C. We could just delete them.
 >
 > D. We could keep them, with modified signatures, and to heck
with ABI
 > compatibility for these two.
 >
 > E. We could get rid of PyCode_NewWithPosArgs(), update
PyCode() to add the
 > posonlyargcount (which is the only difference between the
two), and d*mn
 > the torpedoes.
 >
 > F. Like (E), but keep PyCode_NewWithPosArgs() as an alias for
PyCode_New()
 > (and deprecate it).
 >
 > If these weren't part of the stable ABI, I'd choose (E). [...]


On Tue, Aug 31, 2021 at 7:07 PM Stefan Behnel wrote:

I also vote for (E). The creation of a code object is tied to
interpreter
internals and thus shouldn't be (or have been) declared stable.


I think you're one of the few people who call those functions, and
if even you think it's okay to break backward compatibility here, I
think we should just talk to the SC to be absolved of having these
two in the stable ABI. (Petr, do you agree? Without your backing I
don't feel comfortable even asking for this.)

I think the only problem with that argument is that code objects
are
required for frames. You could argue the same way about frames,
but then it
becomes really tricky to, you know, create frames for non-Python
code.


Note there's nothing in the stable ABI to create frames. There are
only functions to *get* an existing frame, to inspect a frame, and
to eval it. In any case even if there was a stable ABI function to
create a frame from a code object, one could argue that it's
sufficient to be able to get an existing code object from e.g. a
function object.
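
(In the regular C API that is PyFunction_GetCode(), which returns a borrowed 
reference; whether the limited API would need its own spelling is a separate 
question:)

    /* Borrow the code object backing an existing function object. */
    PyObject *code = PyFunction_GetCode(func);   /* borrowed reference */
    if (code == NULL) {
        /* func was not a function object; SystemError is set */
    }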

Since we're discussing this in the context of PEP 657, I wonder
if there's
a better way to create tracebacks from C code, other than
creating fake
frames with fake code objects.

Cython uses code objects and frames for the following use cases:

 

[Python-Dev] Re: Should PEP 8 be updated for Python 3 only?

2021-08-26 Thread Petr Viktorin

On 26. 08. 21 9:54, Marc-Andre Lemburg wrote:

On 26.08.2021 06:07, Christopher Barker wrote:

I'm working on a PR now. It seems there is little support for keeping the
python2 content in the docs, so I'm re-writing it as though it was never there.
If someone wants to add a note about Python 2, of course that can be added 
later.

Note that "moving the Python 2 content to a section at the end" is not all that
straightforward, as it is pretty mixed in with the text at this point.

But now a question -- the current text reads:

"Code in the core Python distribution should always use UTF-8"

and then:

"In the standard library, non-default encodings should be used only for
test purposes or when a comment or docstring needs to mention an author
name that contains non-ASCII characters ..."

I *think* that's a remnant of the Py2 ASCII encoding days -- but I wanted to
make sure, a bit later on, it says:

"The following policy is prescribed for the
standard library ... In addition, string literals and comments must also be in
ASCII."


For Python 2 code we mandated ASCII for the stdlib, with some exceptions
using the source code encoding for testing purposes or in case e.g.
Martin von Löwis or Marc-André Lemburg wanted to put his name into the code
without escaping part of it ;-)

Note that Python 2 defaults to ASCII as source code encoding.

With UTF-8 as standard source code encoding, this is no longer
necessary.

So the second quote can be changed to "In the standard library, non-default
source code encodings should be used only for test purposes ...".


Is that still correct for string literals and comments? And what about 
docstrings?

It seems to me that if we really are utf-8, then there is no need for those
"textual" elements to be ASCII. e.g they can still contain non-ascii characters,
and escaping those makes things less readable, not more.

So I think that section should now read:

"""
Source File Encoding


Code in the core Python distribution should always use UTF-8, and should not
have an encoding declaration.

In the standard library, non-UTF-8 encodings should be used only for
test purposes.


I think the above should be limited to Python code. In C or other
source files you may well still need a source code encoding.


The following policy is prescribed for the standard library (see PEP
3131): All identifiers in the Python standard library MUST use
ASCII-only identifiers, and SHOULD use English words wherever feasible
(in many cases, abbreviations and technical terms are used which aren't
English). In comments and docstrings, authors whose names are not
based on the Latin alphabet (latin-1, ISO/IEC 8859-1 character set)
MUST provide a transliteration of their names in this character set.

Open source projects with a global audience are encouraged to adopt a
similar policy.
"""

But maybe we do want to keep comments, docstrings and literals as ASCII with
escapes?


No need for the stdlib, since UTF-8 is widely accepted by now
and why should people with non-ASCII names not be able to write
their true name?

You may have noted that I rarely do... the reason is that in the
past, the accent on the "e" caused me too many problems. Perhaps
one of these days, I'll go back to adding it again :-)


I would drop the weirdly specific "(latin-1, ISO/IEC 8859-1 character 
set)" note, and only keep "based on the Latin alphabet".
The Ł in Łukasz's name is not in latin-1, and I don't think it needs 
different treatment than German or French names. (As opposed to a 
Russian or Chinese name, where an average English speaker isn't able 
to type an approximation of the name on their keyboard.)


- Peťa Viktorin

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/E6B6INCC5IH5477XF5BGXPC3GPIEER5R/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Stable ABI – PEP 667: Consistent views of namespaces

2021-08-24 Thread Petr Viktorin

On 23. 08. 21 5:07, Guido van Rossum wrote:
On Sat, Aug 21, 2021 at 8:52 PM Nick Coghlan wrote:

[...]

Code that uses PyEval_GetLocals() will NOT continue to operate
safely under PEP 667: all such code will raise an exception at
runtime, and need to be rewritten to use a new API with different
refcounting semantics.


Yeah, I did a double take too when I read what Mark wrote. He uses 
"safe" in a very technical sense, meaning that you get a Python 
exception but not a crash, and no leak or writing freed memory. And it's 
true, any caller to PyEval_GetLocals() should check for errors, and 
there are several error conditions that may occur. (The docs are 
incomplete, they say "returns NULL if no frame is executing" but they 
fail to mention that it sets an exception in that case and in other cases.)


But PyEval_GetLocals() is in the Stable ABI (though I have no idea why),


This was a case of "now is better than never" – a line had to be drawn 
somewhere, and having a clear line is better than spending years to get 
the ideal line.



For these PEPs, I think the discussion should stick to the desired 
semantics first; backwards compatibility for the Stable ABI can be 
bolted on to whatever solution comes up.
"Regular" backwards compatibility is another matter – IMO it's important 
to keep things like debuggers working as much as possible.



we have essentially two options: keep it working, or make it return an 
error. We can't delete it. And it returns a borrowed reference, which 
makes it problematic to let it return a "f_locals proxy object" since 
those proxies are not cached on the frame.


From PEP 652 "Maintaining the Stable ABI":


Future Python versions may deprecate some members of the Stable ABI. Deprecated 
members will still work, but may suffer from issues like reduced performance 
or, in the most extreme cases, memory/resource leaks.


There are many things that can be done:

- I believe we can add an extra pointer on frame objects, lazily 
populated, just for backward compatibility.

- The bad old API can introduce a reference cycle.
- We can incref the "borrowed" reference and introduce a leak (see the sketch below).
- The bad old API can start always raising an exception. (Last on the 
list, since if you can't fix the source and recompile an extension, 
there's no workaround.)
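
For concreteness, the incref-and-leak option could look roughly like this 
(PyFrame_GetPEP667Locals is a made-up stand-in for whatever accessor PEP 667 
ends up providing for the proxy):

    #include "Python.h"

    PyObject *
    PyEval_GetLocals(void)
    {
        PyFrameObject *frame = PyEval_GetFrame();   /* borrowed reference */
        if (frame == NULL) {
            PyErr_SetString(PyExc_SystemError, "frame does not exist");
            return NULL;
        }
        PyObject *locals = PyFrame_GetPEP667Locals(frame);  /* new reference */
        /* Deliberately never decref'd: callers expect a borrowed
           reference, so the extra strong reference is leaked. */
        return locals;
    }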



In all cases, extension authors can fix things by moving to the new API.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/LQARGPG4ROL6MJOQ4Y7CNT57TT7ON22S/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Deprecate Py_TRASHCAN_SAFE_BEGIN/END in 3.10?

2021-08-17 Thread Petr Viktorin



On 17. 08. 21 12:00, Łukasz Langa wrote:

Hi everybody,
I'd like to revive this thread as I feel like we have to do something 
here but some consensus is needed first.


To recap, the current state of things is as follows:
- *in March 2000* (d724b23420f) Christian Tismer contributed the 
"trashcan" patch that added Py_TRASHCAN_SAFE_BEGIN 
and Py_TRASHCAN_SAFE_END macros which allow destroying nested objects 
non-recursively.
- *in May 2019* (GH-11841 of BPO-35983) Antoine Pitrou merged a change 
by Jeroen Demeyer which made Py_TRASHCAN_SAFE_BEGIN/END 
(unintentionally?) backwards incompatible; this was released in Python 
3.8.0.
- by the way, GH-11841 introduced a new pair of macros (because they 
have different signatures) called simply  Py_TRASHCAN_BEGIN and 
Py_TRASHCAN_END.
- by that time there was already a follow-up PR open (GH-12607) to 
improve backwards compatibility of the macros, as well as introduce 
tests for them; this was never merged.
- *in Feb 2020* (0fa4f43db08) Victor Stinner removed the trashcan 
mechanism from the limited C API (note: not ABI, those are macros) since 
it accesses fields of structs not exposed in the limited C API; this was 
released in Python 3.9.0.
- *in May 2020* Irit noticed that the backwards incompatibility 
(BPO-40608) causes segfaults for C API code that worked fine with Python 
3.7. Using the new macros requires code changes but doesn't crash.


Now, there are a couple of things we can do here:
*Option 1*: Finish GH-12607 to fix the old macros, keeping in mind this 
will restore compatibility lost with Python 3.8 - 3.10 only for users of 
3.11+
*Option 2*: Review and merge GH-20104 that reverts the macro changes 
that make old client code segfault -- unclear what else this needs and 
again, that would only fix it for users of 3.11+
*Option 3*: Abandon GH-12607 and GH-20104, instead declaring the old 
macros deprecated for 3.11 and remove them in 3.13


I personally agree with Irit, voting +1 for Option 3 since the old 
macros were soft-deprecated already by introducing new macros in 3.8, 
and more importantly made incompatible with pre-3.8 usage.


+1.
The deprecation should follow PEP 387, which means emitting a warning. I 
think a compiler warning is appropriate; that could be done by having 
the macro call a deprecated function. Depending on how broken the macros 
are, a compiler warning could be backported to older versions.
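
Roughly, as a sketch (the helper name is made up; the real macro bodies would 
stay as they are):

    #include "Python.h"

    /* A no-op function carrying a deprecation attribute: any macro that
       expands to a call to it makes each use site emit a compile-time
       warning (-Wdeprecated-declarations), with runtime behavior
       unchanged. */
    Py_DEPRECATED(3.11) static inline void
    _Py_trashcan_safe_macro_used(void) {}

    /* The old macros would keep their existing expansions and merely
       gain this one extra statement at the start, e.g.:
       #define Py_TRASHCAN_SAFE_BEGIN(op) \
           _Py_trashcan_safe_macro_used(); ...existing body...
    */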




___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KWFX6XX3HMZBQ2BYBVL7G74AIOPWO66Y/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Making code object APIs unstable

2021-08-17 Thread Petr Viktorin
I'm late to the thread, and as I read it I see everything I wanted to 
say was covered already :)

So just a few clarifications.

The stable ABI is not defined by PEP 384 or PEP 652 or by the header 
something is defined in, but by the docs:

- https://docs.python.org/dev/c-api/stable.html
and changes to it are covered in the devguide:
- https://devguide.python.org/c-api/


We have 3 API layers:

1. Internal API, guarded by Py_BUILD_CORE, can break *even in point 
releases*. (Py_BUILD_CORE means just that: things like `_PyCode_New` can 
only be used safely if you build/embed CPython yourself.)
2. Regular C-API, covered by PEP 387 (breaking changes need deprecation 
for 2 releases, or an exception from the SC); `PyCode_New*` is here now

3. Stable ABI, which is hard to change, and thankfully isn't involved here.

I can see that having a `.replace()` equivalent in the C API would be 
"worth the effort of [its users] keeping up with CPython 
internal changes" (to quote Patrick).
Looks like we could use something between layers 1 and 2 above for 
"high-maintenance" users (like Cython): API that will work for all of 
3.11.x, but can freely break for 3.12. I don't think this needs an 
explicit API layer, though: just a note in the docs that a new 
`PyCode_NewWithAllTheBellsAndWhistles` is expected to change in 
point releases. But...


Guido:

  [struct rather than N arguments] is the
  API style that _PyCode_New() uses (thanks to Eric who
  IIRC pushed for this and implemented it). You gave me an idea now:
  the C equivalent to .replace() could use the same input structure;
  one can leave fields NULL that should be copied from the original
  unmodified.


From a usability point of view, that's a much better idea than a 
function that's expected to change. It would probably also be easier to 
implement than an entirely separate public API.
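
Something like this, for concreteness (all names here are invented for 
illustration; only _PyCode_New's input struct actually exists today, in the 
internal API):

    /* Zeroed/NULL members mean "copy this field from the original". */
    typedef struct {
        PyObject *code;         /* NULL: keep original co_code */
        PyObject *consts;       /* NULL: keep original co_consts */
        PyObject *names;        /* NULL: keep original co_names */
        int flags;              /* 0: keep original co_flags */
        /* ... one member per code object field ... */
    } PyCodeReplaceArgs;

    PyAPI_FUNC(PyObject *)
    PyCode_Replace(PyCodeObject *orig, const PyCodeReplaceArgs *args);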


Nick:

  P.S. Noting an idea that won't work, in case anyone else reading
  the thread was thinking the same thing: a "PyType_FromSpec"
  style API won't help here, as the issue is that the compiler is
  now doing more work up front and recording that extra info in
  the code object for the interpreter to use. There is no way to
  synthesise that info if it isn't passed to the constructor, as
  it isn't intrinsically recorded in the opcode sequence.


I guess it might be possible to add a flag that says a code 
object has exception handling and so it needs an exception table, and 
have the old API raise when the flag is on.

It's probably not worth the effort, though.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/T32UC25S3R7MTAKTZRF22ZJ26K6WMFGO/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Heads up: `make` in Doc now creates a venv

2021-08-04 Thread Petr Viktorin

Hi,
A recent change makes "make html" in the Doc directory create a venv if one 
wasn't there before. If you don't want to download sphinx and other 
dependencies from PyPI, you'll need to adjust your workflow.



If you already have all the dependencies, the following command (in the 
CPython directory, not Doc) will build docs for you:

 sphinx-build Doc Doc/build/

The issue that added this is: https://bugs.python.org/issue44756
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/MGPNI7OSA7UXNOTVDVW2I2GUMXV25FRS/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: enum in the stable ABI (Was: PEP 558: Defined semantics for locals)

2021-07-27 Thread Petr Viktorin



On 24. 07. 21 1:58, Nick Coghlan wrote:



On Sat, 24 Jul 2021, 9:37 am Larry Hastings wrote:



On 7/23/21 7:38 AM, Petr Viktorin wrote:

(In both C & C++, the size of an `enum` is implementation-defined.
That's unlikely to be a problem in practice, but one more point
against enum.)



True, but there's always the old trick of sticking in a value that
forces it to be at least 32-bit:

typedef enum {
     INVALID = 0,
     RED = 1,
     BLUE = 2,
     GREEN = 3,

     UNUSED = 1073741824
} color_t;


//arry/


My current inclination is to define the enum as "_PyLocals_KindValues", 
and then typedef "PyLocals_Kind" itself as an int. The frame API would 
then return the former, while the stable query API would return the latter.


However, I'll make a full survey of the enums currently in the stable 
ABI before making a decision, as there may be an existing approach that 
I like better.


Here's a survey: https://bugs.python.org/issue44727#msg398071

I do agree Petr's right to be cautious about this, as compilers can get 
up to some arcane shenanigans in the presence of formally undefined 
code: https://queue.acm.org/detail.cfm?id=3468263


The fact that the behaviour in this case is likely to be well-defined at 
the time of compilation would protect us from the weirder potential 
outcomes, but it still makes sense for us to define the query API in a 
way that tells both compilers and humans not to assume that the values 
returned by the current version of Python are the only values that will 
ever be returned by all future versions of Python.


If you ask me, I don't think C provides that much type safety to make 
enum worth it, even for the version-specific API.
But I like Larry's "old trick" better than having two different APIs. 
Thanks for that! It's a new trick for me!

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/JIBBPKHA5URUBKF43XWMUANP5ESQB4JM/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] enum in the stable ABI (Was: PEP 558: Defined semantics for locals)

2021-07-23 Thread Petr Viktorin

On 22. 07. 21 12:41, Nick Coghlan wrote:



On Thu, 22 Jul 2021, 6:01 pm Petr Viktorin wrote:




On 21. 07. 21 14:18, Nick Coghlan wrote:
 > On Mon, 19 Jul 2021 at 21:32, Petr Viktorin wrote:
 >> The proposal assumes that in the future, ``PyLocals_Get``, and thus
 >> ``locals()``, will never gain another kind of return value, however
 >> unlikely that is.
 >> AFAICS, code that uses this will usually check for a single
special case
 >> and fall back (or error) for the other(s), so I think it'd be
reasonable
 >> to make this an "enum" with two values. e.g.:
 >>
 >> int PyLocals_GetReturnBehavior();  # better name?
 >> #define PyLocals_DIRECT_REFERENCE 0
 >> #define PyLocals_SHALLOW_COPY 1
 >
 > After looking at PyUnicode_Kind, PySendResult, and other already
 > public enums for inspiration, my proposed spelling is as follows:
 >
 > 
 > typedef enum {
 >      PyLocals_UNDEFINED = -1;
 >      PyLocals_DIRECT_REFERENCE = 0,
 >      PyLocals_SHALLOW_COPY = 1
 > } PyLocals_Kind;
 >
 > PyLocals_Kind PyLocals_GetKind(void);
 > PyLocals_Kind PyFrame_GetLocalsKind(PyFrameObject *);
 > 
 >
 > The PyLocals_UNDEFINED case comes from PyLocals_GetKind() needing an
 > error value to return when the query API is called with no active
 > thread state.
 >
 > I've updated the draft reference implementation to use this API, and
 > added the associated PEP changes to the review PR at
 > https://github.com/python/peps/pull/2038/files

Please don't put the enum in the stable ABI. If we would add another
value and then an older extension would receive it, we'd get undefined
behavior.


Hmm, I was copying an example that is already in the stable ABI 
(PySendResult).


I think it's new in 3.10, though, so it should still be possible to fix 
that.


After researching a bit more, I see that casting unknown values to enum 
is only undefined/unspecified behavior in C++. But we do support C++ 
extensions, and so I'll try to get enums out of the stable ABI.


(In both C & C++, the size of an `enum` is implementation-defined. 
That's unlikely to be a problem in practice, but one more point against 
enum.)



NB. I don't have access to the actual standards; feel free to check this 
if you do!

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/MRNSY5BWP7LBOA2MXHADSHM3WDNODI5O/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 558: Defined semantics for locals()

2021-07-22 Thread Petr Viktorin



On 22. 07. 21 15:03, Ethan Furman wrote:

On 7/22/21 1:01 AM, Petr Viktorin wrote:
 > On 21. 07. 21 14:18, Nick Coghlan wrote:

 >> 
 >> typedef enum {
 >>  PyLocals_UNDEFINED = -1;
 >>  PyLocals_DIRECT_REFERENCE = 0,
 >>  PyLocals_SHALLOW_COPY = 1
 >> } PyLocals_Kind;
 >>
 >> PyLocals_Kind PyLocals_GetKind(void);
 >> PyLocals_Kind PyFrame_GetLocalsKind(PyFrameObject *);
 >> 
 >
 > Please don't put the enum in the stable ABI. If we would add another 
value and then

 > an older extension would receive it, we'd get undefined behavior.

Probably a stupid question, but wouldn't the same thing happen if we 
didn't use an enum, added another option later, and an older extension 
received that newer value?


No.
Consider code like:
if (PyLocals_GetKind() == PyLocals_DIRECT_REFERENCE) {
 ...
}
where PyLocals_GetKind() is defined as above, but returns 4.

Technically it's undefined behavior, so the compiler could decide to 
wipe your disk or eat your pets in this case, but that's not very realistic.
More realistically, the compiler is free to only look at the last two 
bits of the value, so 4 will be equivalent to 0, and the comparison will 
be true. There are definitely architectures/compilers that do tricks 
similar to that.


But if PyLocals_GetKind() returns an int, 4 != 0 isn't a problem.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/HFH5MG3LXYEI4YPO3MTDVXTVO7XCUXEA/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 558: Defined semantics for locals()

2021-07-22 Thread Petr Viktorin




On 21. 07. 21 14:18, Nick Coghlan wrote:

On Mon, 19 Jul 2021 at 21:32, Petr Viktorin  wrote:

The proposal assumes that in the future, ``PyLocals_Get``, and thus
``locals()``, will never gain another kind of return value, however
unlikely that is.
AFAICS, code that uses this will usually check for a single special case
and fall back (or error) for the other(s), so I think it'd be reasonable
to make this an "enum" with two values. e.g.:

int PyLocals_GetReturnBehavior();  # better name?
#define PyLocals_DIRECT_REFERENCE 0
#define PyLocals_SHALLOW_COPY 1


After looking at PyUnicode_Kind, PySendResult, and other already
public enums for inspiration, my proposed spelling is as follows:


typedef enum {
 PyLocals_UNDEFINED = -1;
 PyLocals_DIRECT_REFERENCE = 0,
 PyLocals_SHALLOW_COPY = 1
} PyLocals_Kind;

PyLocals_Kind PyLocals_GetKind(void);
PyLocals_Kind PyFrame_GetLocalsKind(PyFrameObject *);


The PyLocals_UNDEFINED case comes from PyLocals_GetKind() needing an
error value to return when the query API is called with no active
thread state.

I've updated the draft reference implementation to use this API, and
added the associated PEP changes to the review PR at
https://github.com/python/peps/pull/2038/files


Please don't put the enum in the stable ABI. If we would add another 
value and then an older extension would receive it, we'd get undefined 
behavior.

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/HNTBYVL3PF53HEO75ELBUCMG46HVMYV6/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 558: Defined semantics for locals()

2021-07-19 Thread Petr Viktorin

Thanks, Nick! This looks wonderful.
I do have a nitpick, below:

On 18. 07. 21 7:59, Nick Coghlan wrote:
[...]

Changes to the stable C API/ABI
---

Unlike Python code, extension module functions that call in to the Python C API
can be called from any kind of Python scope. This means it isn't obvious from
the context whether ``locals()`` will return a snapshot or not, as it depends
on the scope of the calling Python code, not the C code itself.

This means it is desirable to offer C APIs that give predictable, scope
independent, behaviour. However, it is also desirable to allow C code to
exactly mimic the behaviour of Python code at the same scope.

To enable mimicking the behaviour of Python code, the stable C ABI would gain
the following new functions::

 PyObject * PyLocals_Get();
 int PyLocals_GetReturnsCopy();

``PyLocals_Get()`` is directly equivalent to the Python ``locals()`` builtin.
It returns a new reference to the local namespace mapping for the active
Python frame at module and class scope, and when using ``exec()`` or ``eval()``.
It returns a shallow copy of the active namespace at
function/coroutine/generator scope.

``PyLocals_GetReturnsCopy()`` returns zero if ``PyLocals_Get()`` returns a
direct reference to the local namespace mapping, and a non-zero value if it
returns a shallow copy. This allows extension module code to determine the
potential impact of mutating the mapping returned by ``PyLocals_Get()`` without
needing access to the details of the running frame object.


Since this goes in the stable ABI, I'm thinking about how extensible 
this will be in the future.


The proposal assumes that in the future, ``PyLocals_Get``, and thus 
``locals()``, will never gain another kind of return value, however 
unlikely that is.
AFAICS, code that uses this will usually check for a single special case 
and fall back (or error) for the other(s), so I think it'd be reasonable 
to make this an "enum" with two values. e.g.:


int PyLocals_GetReturnBehavior();  # better name?
#define PyLocals_DIRECT_REFERENCE 0
#define PyLocals_SHALLOW_COPY 1

Other values may be added in future versions of Python, if/when the 
Python ``locals()`` builtin is changed to return a different kind of value.


(and same for PyFrame_GetLocalsReturnsCopy)
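
For illustration, a caller would check for one known value and treat 
everything else generically (a sketch against the spelling above, assuming an 
error is reported as a negative value):

    int kind = PyLocals_GetReturnBehavior();
    if (kind < 0) {
        /* error (e.g. no active thread state); an exception is set */
    }
    else if (kind == PyLocals_DIRECT_REFERENCE) {
        /* mutating the mapping affects the running frame */
    }
    else {
        /* PyLocals_SHALLOW_COPY today; possibly other kinds later */
    }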
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BTQUBHIVE766RPIWLORC5ZYRCRC4CEBL/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Making PY_SSIZE_T_CLEAN not mandatory.

2021-06-22 Thread Petr Viktorin


On June 22, 2021 11:18:46 AM GMT+02:00, Henk-Jaap Wagenaar wrote:
>On Tue, 22 Jun 2021 at 10:06, Petr Viktorin  wrote:
>
>> On 21. 06. 21 20:20, Guido van Rossum wrote:
>> > Okay, I think your evidence can then be discounted. Really, any app that
>> > relies on the publicly installed Python runs a serious risk of breaking
>> > when that Python gets updated, regardless of whether the ABI changes or not.
>>
>> Unfortunately, this includes scripts for any extensible software
>> (Mayavi, GIMP, etc) -- even if Python is bundled with that software,
>> upgrading the software risks breaking the scripts.
>>
>>
>I'm confused by what you mean by this, or why it is a problem?

Not necessarily a problem, I just want to point out that there are situations 
where you need to depend on Python managed by someone else.

>If I upgrade GIMP (and it vendors some version/variant of Python), it is
>not unreasonable that this would break a script that I have written in
>GIMPython? (GIMP should probably mention that it has changed its Python and
>how in the changelog/release notes)
>
>If I upgrade my OS, and I use the system Python, scripts I have written
>might break too.
>
>(Of course, GIMP is a placeholder here, I do not actually know what it does
>in terms of Python (vendoring), if at all.)

GIMP itself doesn't, but it's sometimes distributed in flatpak/appimage with 
its own bundled/pinned Python. (AFAIK it's usually Python 2.7, for a few 
reasons; one of them being that upgrading would break scripts)

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ABK447E74NUXN6JUSKH6Y7WIQNWHQPTS/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Making PY_SSIZE_T_CLEAN not mandatory.

2021-06-22 Thread Petr Viktorin

On 21. 06. 21 20:20, Guido van Rossum wrote:
Okay, I think your evidence can then be discounted. Really, any app that 
relies on the publicly installed Python runs a serious risk of breaking 
when that Python gets updated, regardless of whether the ABI changes or not.


Unfortunately, this includes scripts for any extensible software 
(Mayavi, GIMP, etc) -- even if Python is bundled with that software, 
upgrading the software risks breaking the scripts.



On Mon, Jun 21, 2021 at 2:46 AM Baptiste Carvello wrote:


Hi,

On 18/06/2021 at 21:00, Guido van Rossum wrote:
 > Can you elaborate on that use case? Which two applications are you
 > thinking of, and what was your goal in driving them? This sounds
 > interesting but I haven’t encountered this myself.

Well, I'm not sure the case I was thinking of is still relevant to
anything: that was plotting 3D crystal models using crystallography
library CCTBX [1] and visualization application Mayavi [2], some 15-20
years ago. BTW, I misremembered a bit: only CCTBX insisted on using a
vendored python ("libtbx.python"), Mayavi used the system python.
Anyway, it was more pain to make Mayavi use libtbx.python, than to make
CCTBX work with the system python.

Also, I must admit that even applications embedding the system python
can have some limitations. For example, GIMP and GDB can execute python
scripts, but their API can't be "imported" from the outside. Which means
no arguments passed to the script over the command line ("sys.argv"), no
venvs, no REPL. But at least you can install additional packages (pip /
distro package manager) and limitations can be more or less hacked
around. For a sophisticated example, the debugger extension Voltron [3]
provides REPL access to GDB objects over a client-server connection.

Cheers,
Baptiste

[1] https://cci.lbl.gov/docs/cctbx/ 
[2] https://docs.enthought.com/mayavi/mayavi/

[3] https://github.com/snare/voltron 

 > On Fri, Jun 18, 2021 at 09:44 Baptiste Carvello wrote:
 >
 >     On 18/06/2021 at 08:50, Paul Moore wrote:
 >     >
 >     > IMO it doesn't. However for certain applications (the sort of thing I
 >     > was referring to) - where the user is writing their own scripts and
 >     > the embedding API is used merely to expose an interface to the Python
 >     > language, dynamically linking to whatever version of Python the user
 >     > has installed can be precisely the right thing to do - the user gets
 >     > access to the version of the language they expect, the installed
 >     > packages they expect to see, etc.
 >
 >     As a user, I second this. When trying to drive applications from the
 >     outside (as opposed to extending them through plugins), it is annoying
 >     when two applications won't work together because each one insists on
 >     using its own vendored python.
 >
 >     Of course, there are often real blockers, such as incompatible event
 >     loops. But not always…
 >
 >     Cheers,
 >     Baptiste
 >
 > --
 > --Guido (mobile)
 >

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KP3SE6UWSV3VDCJOWCXUZIBPDWFJHRLU/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Making PY_SSIZE_T_CLEAN not mandatory.

2021-06-09 Thread Petr Viktorin

On 09. 06. 21 13:09, Paul Moore wrote:

On Wed, 9 Jun 2021 at 11:36, Inada Naoki  wrote:

If I am wrong, can we stop keeping stable ABI at Python 3.12?
Python 4.0 won't come in foreseeable future. Stable ABI blocks Python evolution.


Conversely, the stable ABI allows projects to build cross-version
binary wheels. Not many projects do that yet, but it's definitely
something we'd like to see more of. Needing new binary builds every
version blocks users from testing new versions of Python in advance of
the release. [...]

But I do agree that we should either start keeping to the commitments
that we made around the stability of the stable ABI, or we should
abandon it properly and declare it no longer supported. Having
something that sort of works except when we accidentally broke it
doesn't help anyone.


I don't think we made actual commitments regarding the API. The docs do 
say: "we recommend testing an extension with all minor Python versions 
it supports".
Also, when the API breaks, you get a Python exception; if the ABI does, 
you get segfaults.


So breaking the API is much less severe, but still -- please think about 
the effect on users who just want their compiled extensions to keep working.



On 09. 06. 21 13:09, Paul Moore wrote:

Also, I often use the stable ABI when embedding, so that
I can replace the Python interpreter without needing to recompile my
application and redeploy new binaries everywhere (my use case is
pretty niche, though, so I wouldn't like to claim I represent a
typical user...).


I hope this use case becomes non-niche. I would love it if embedders 
tell people to just use any Python they have lying around, instead of 
vendoring it (or more realistically, embedding JS or Lua instead).

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/3VYTTABB2UB6HVQHASYONSYQDBHDL3OU/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: New pythoncapi_compat project adding Python 3.10 support to your C extensions without losing Python 2.7-3.9 support

2021-06-08 Thread Petr Viktorin

On 03. 06. 21 1:43, Victor Stinner wrote:

Hi,

What do you think of promoting the pythoncapi_compat project that I'm
introducing below in the "C API: Porting to Python 3.10" section of
What's New In Python 3.10?

Should this project be moved under the GitHub psf organization to have
a more "future proof" URL?


If you ask me, no.

I think we should aim to not break the C API as often. Rather than 
promoting a tool to solve problems, I would much rather not create the 
problems in the first place.


Sure, supporting HPy or per-interpreter GIL will be good when they're 
ready to use. But at this point, they are *experiments*. I do not think 
it's worth breaking the API for existing users, which get no benefit 
from the changes, to run these experiments.




I would like to promote this project to prepare C extensions
maintainers for the following incompatible C API change (Py_TYPE) that
I would like to push into Python 3.11:
https://github.com/python/cpython/pull/26493
(Py_REFCNT was already converted to a static inline function in Python 3.10.)

I already made this Py_TYPE change in Python 3.10, but I had to revert
it since it broke too many projects. Since last year, I upgraded most
of these broken projects, created the pythoncapi_compat project, and
successfully used the upgrade_pythoncapi.py script and copied the
pythoncapi_compat.h header file into multiple C extensions.

C extensions written with Cython are not affected. I already fixed
Cython last year to emit code compatible with my incoming incompatible
change. If it's not done yet, you only have to regenerate the C files
using a recent Cython version.

--

I wrote a new script which adds Python 3.10 support to your C
extensions without losing Python 2.7 support:
https://github.com/pythoncapi/pythoncapi_compat

To add Python 3.10 support to your C extension, go to its source
directory and run:

   /path/to/upgrade_pythoncapi.py .

It upgrades all C files (.c) in the current directory and
subdirectories. For example, it replaces "op->ob_type" with
"Py_TYPE(op)". It creates an ".old" copy of patched files.

Use the -o option to select operations:

* -o Py_TYPE: only replace "obj->ob_type" with "Py_TYPE(obj)".
* -o all,-PyMem_MALLOC: run all operations, but don't replace
PyMem_MALLOC(...) with PyMem_Malloc(...).

--

The upgrade_pythoncapi.py script relies on the pythoncapi_compat.h
header file that I wrote to provide recent Python 3.9-3.11 C functions
on old Python versions. Examples: Py_NewRef() and
PyThreadState_GetFrame(). Functions are implemented as simple static
inline functions to avoid requiring to link your extension to a
dynamic library.

You can already use the new Py_NewRef() and Py_IsNone() Python 3.10
functions in your projects without losing support for Python 2.7-3.9!
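
For instance, this compiles unchanged from 2.7 through 3.10 once the header
is in place:

    #include "pythoncapi_compat.h"

    /* Return a new strong reference to a cached object. */
    static PyObject *
    get_default(PyObject *module, PyObject *args)
    {
        (void)module; (void)args;
        return Py_NewRef(Py_None);  /* Py_INCREF(Py_None); return Py_None; */
    }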

--

The script also replaces "frame->f_back" with "_PyFrame_GetBackBorrow(frame)".

The _PyFrame_GetBackBorrow() function doesn't exist in the Python C
API, it's only provided by pythoncapi_compat.h to ease the migration
of C extensions. I advise you to replace _PyFrame_GetBackBorrow()
(borrowed reference) with PyFrame_GetBack() (strong reference).

--

This project is related to my PEP 620 "Hide implementation details
from the C API" which tries to make the C API more abstract to later
allow to implement new optimization in CPython and to make other
Python implementations like PyPy faster when running C extensions.

Article on the creation of the pythoncapi project:
https://vstinner.github.io/pythoncapi_compat.html

The main drawback of this project is that it uses regular expressions
to parse C code. Such "parser" can miss C code which has to be patched
manually. In my experience, additional manual changes are really rare
and take less than 1 minute on a very large C extension like numpy.
--

This project only targets extension modules written in C by using
directly the "Python.h" API. I advise you to use Cython or HPy to no
longer be bothered with incompatible C API changes at every Python
release ;-)

* https://cython.org/
* https://hpy.readthedocs.io/

I hope that my script will facilitate migration of C extensions to HPy.

Victor


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/SY4X4MCIKIHDNBQPJ2JIBCP42D3LFPNP/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Making PY_SSIZE_T_CLEAN not mandatory.

2021-06-08 Thread Petr Viktorin

On 08. 06. 21 10:05, Serhiy Storchaka wrote:

On 07.06.21 08:41, Hai Shi wrote:

There is another question. There are many C API functions defined under 
PY_SSIZE_T_CLEAN, for example: _PyArg_Parse_SizeT().
Could we remove or merge them after making PY_SSIZE_T_CLEAN not mandatory?


We should support binary compatibility to some degree, so there should
be several steps:

* Make macro PyArg_Parse an alias of _PyArg_Parse_SizeT. Keep function
PyArg_Parse.


One more thing about the stable ABI: in the future, I'd like to make it 
more useful in languages other than C. This usually means avoiding macros.
Would it make sense to expose _PyArg_Parse_SizeT as a public function, 
like PyArgT_Parse?
(The macro redirecting PyArg_Parse to this function could of course 
stay, to help C users.)
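
Roughly (a sketch of the idea; PyArgT_Parse is just the name floated above, 
not an existing function):

    /* A real exported function with Py_ssize_t semantics for '#' formats,
       plus the macro kept around for C callers. */
    PyAPI_FUNC(int) PyArgT_Parse(PyObject *args, const char *format, ...);

    #define PyArg_Parse PyArgT_Parse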



* Make the function PyArg_Parse always raise an exception.


This would break extensions that use the stable ABI.
(Yes, even starting to raise RuntimeError in 3.10 broke things. And yes, 
it's not strictly an ABI issue, but it has the same effect for users: 
they still need to recompile extensions to keep them working.)



* Remove function PyArg_Parse.
* [Optionally] Now we can re-add function PyArg_Parse as an alias of
_PyArg_Parse_SizeT and remove macro PyArg_Parse.
* [Optionally in 4.0 or 5.0] Remove _PyArg_Parse_SizeT.

But we can squash the last several steps into 4.0, which does not need to support
binary compatibility with 3.x.

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZLS2AUQDGT3NHSBM63XEPN3TEUGRAP4Z/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Proposal: declare "unstable APIs"

2021-06-04 Thread Petr Viktorin



On 04. 06. 21 10:25, Serhiy Storchaka wrote:

On 03.06.21 20:10, Guido van Rossum wrote:

This is not a complete thought yet, but it occurred to me that while we
have deprecated APIs (which will eventually go away), and provisional
APIs (which must mature a little before they're declared stable), and
stable APIs (which everyone can rely on), it might be good to also have
something like *unstable* APIs, which will continually change without
ever going away or stabilizing. Examples would be the ast module (since
the AST structure changes every time the grammar changes) and anything
to do with code objects and bytecode (since we sometimes think of better
ways to execute Python).

So maybe the docs should grow a standard way of saying "this is an
unstable API"?


There is already a way to specify the stable ABI (see
Doc/tools/extensions/c_annotations.py). But unfortunately this feature
is not used in the documentation. It just needs some amount of work,
and nobody has done it.


It is used, and I started the work :)
See e.g. 
https://docs.python.org/3.10/c-api/sequence.html#c.PySequence_Concat




After marking all of the stable ABI, we can extend this feature to support
halftones: provisional API, unstable API for Cython, etc.


I don't think that's necessary for the C API; the three-tier structure 
we have now (see https://devguide.python.org/c-api/ ) is, IMO, sufficient.


I don't think it can be easily adapted for the Python API, though.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/4QINROTO26H5USYKUSQLRY4XTZOSV2FS/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: name for new Enum decorator

2021-05-28 Thread Petr Viktorin

On 28. 05. 21 5:24, Ethan Furman wrote:

Greetings!

The Flag type in the enum module has had some improvements, but I find 
it necessary to move one of those improvements into a decorator instead, 
and I'm having a hard time thinking up a name.


What is the behavior?  Well, a name in a flag type can be either 
canonical (it represents one thing), or aliased (it represents two or 
more things).  To use Color as an example:


     class Color(Flag):
     RED = 1    # 0001
     GREEN = 2  # 0010
     BLUE = 4   # 0100
     PURPLE = RED | BLUE    # 0101
     WHITE = RED | GREEN | BLUE # 0111

The flags RED, GREEN, and BLUE are all canonical, while PURPLE and WHITE 
are aliases for certain flag combinations.  But what if we have 
something like:


     class Color(Flag):
     RED = 1    # 0001
     BLUE = 4   # 0100
     WHITE = 7  # 0111

As you see, WHITE is an "alias" for a value that does not exist in the 
Flag (0010, or 2).  That seems like it's probably an error.  But what 
about this?


     class FlagWithMasks(IntFlag):
     DEFAULT = 0x0

     FIRST_MASK = 0xF
     FIRST_ROUND = 0x0
     FIRST_CEIL = 0x1
     FIRST_TRUNC = 0x2

     SECOND_MASK = 0xF0
     SECOND_RECALC = 0x00
     SECOND_NO_RECALC = 0x10

     THIRD_MASK = 0xF00
     THIRD_DISCARD = 0x000
     THIRD_KEEP = 0x100

Here we have three flags (FIRST_MASK, SECOND_MASK, THIRD_MASK) that are 
aliasing values that don't exist, but it seems intentional and not an 
error.


So, like the enum.unique decorator that can be used when duplicate names 
should be an error, I'm adding a new decorator to verify that a Flag has 
no missing aliased values that can be used when the programmer thinks 
it's appropriate... but I have no idea what to call it.
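
A rough sketch of the check itself (the decorator name here is a 
placeholder, which is exactly the problem):

    def _check_flag_aliases(enumeration):   # placeholder name
        """Error on aliases that use bits no single-bit flag defines."""
        canonical = 0
        for member in enumeration.__members__.values():
            value = member.value
            if value and not value & (value - 1):   # exactly one bit set
                canonical |= value
        for name, member in enumeration.__members__.items():
            if member.value & ~canonical:
                raise ValueError(
                    f'{name}: value {member.value:#x} uses unnamed bits')
        return enumeration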


Any nominations?


Are you looking for a decorator for the whole Enum, or a way to mark 
individual *values* as masks?

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/NDQ55INDDQMGERVYUHUYNDZ572IPD4UY/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: The repr of a sentinel

2021-05-21 Thread Petr Viktorin

On 21. 05. 21 3:23, Eric V. Smith wrote:

On 5/20/2021 3:24 PM, Ronald Oussoren via Python-Dev wrote:



On 20 May 2021, at 19:10, Luciano Ramalho wrote:


I'd like to learn about use cases where `...` (a.k.a. `Ellipsis`) is
not a good sentinel. It's a picklable singleton testable with `is`,
readily available, and extremely unlikely to appear in a data stream.
Its repr is "Ellipsis".

If you don't like the name for this purpose, you can always define a
constant (that won't fix the `repr`, obviously, but helps with source
code readability).

SENTINEL = ...

I can't think of any case where I'd rather have my own custom
sentinel, or need a special API for sentinels. Probably my fault, of
course. Please enlighten me!


One use case for a sentinel that is not a predefined (builtin) 
singleton is APIs where an arbitrary user specified value can be used.


One example of this is the definition of dataclasses.field:

dataclasses.field(*, default=MISSING, default_factory=MISSING, repr=True, 
hash=None, init=True, compare=True, metadata=None)


Here the “default” and “default_factory” can be an arbitrary value, 
and any builtin singleton could be used. Hence the use of a custom 
module-private sentinel that cannot clash with values used by users of 
the module (unless those users poke at private details of the module, 
but then all bets are off anyway).


That’s why I don’t particularly like the proposal of using Ellipsis as 
the sanctioned sentinel value. It would be weird at best that the 
default for a dataclass field can be any value, except for the builtin 
Ellipsis value.


Completely agree. I'm opposed to Ellipsis as a sentinel for this reason, 
at least for dataclasses. I can easily see wanting to store an Ellipsis 
in a field of a dataclass that's describing a function's parameters. And 
I can even see it being the default= value. Not so much 
default_factory=, but they may as well be the same.



And this argument also works for any other single value.
Including the original None.

(It just might not be obvious at first, before that single value starts 
being used in lots of different contexts.)

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/IA4BEPTEJXVK5UO2L7ZDQJG2Z3OYQ3VX/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: The repr of a sentinel

2021-05-14 Thread Petr Viktorin

On 14. 05. 21 10:55, Victor Stinner wrote:

Hi Tal,

Would it make sense to have a unique singleton for such a sentinel, a
built-in singleton like None or Ellipsis? I propose the name
"Sentinel".

Sentinel would be similar to None, but the main property would be that
"Sentinel is None" is false :-)


If you need your Sentinel to be different from one particular sentinel 
(None), you'll usually want it to be different from all the other ones 
as well.


A sentinel for an optional parameter shouldn't really be used at all 
outside of the function it's defined for. That's why it's usually 
defined as a private module-level variable.
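
The usual shape, for reference:

    _MISSING = object()   # module-private sentinel

    def find(items, predicate, default=_MISSING):
        for item in items:
            if predicate(item):
                return item
        if default is _MISSING:
            raise LookupError('no matching item')
        return default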


Perhaps it would be beneficial to provide a common base class or 
factory, so we get a good repr. But I don't think another common value 
like None and Ellipsis would do much good.




The stdlib contains tons of sentinels:

* _collections_abc: __marker__
* cgitb.__UNDEF__
* configparser: _UNSET
* dataclasses: _HAS_DEFAULT_FACTORY, MISSING, KW_ONLY
* datetime.timezone._Omitted
* fnmatch.translate() STAR
* functools.lru_cache.sentinel (each @lru_cache creates its own sentinel object)
* functools._NOT_FOUND
* heapq: temporary sentinel in nsmallest() and nlargest()
* inspect._sentinel
* inspect._signature_fromstr() invalid
* plistlib._undefined
* runpy._ModifiedArgv0._sentinel
* sched: _sentinel
* traceback: _sentinel

There are different but similar use cases:

* Optional parameter: distinguish between func() and func(arg=value),
a sentinel is useful to distinguish func() from func(arg=None)
* Look up a value in a data structure and store the result in a
variable; distinguish whether the 'result' variable was set ("result is not None"
doesn't work since None is a value). Quick example: "missing =
object(); tmsg = self._catalog.get(message, missing); if tmsg is
missing: ..."

Special cases:

* dataclases._EMPTY_METADATA = types.MappingProxyType({})
* string._sentinel_dict = {}
* enum: _auto_null = object()

Victor

On Thu, May 13, 2021 at 7:40 PM Tal Einat  wrote:


On Thu, May 13, 2021 at 7:44 PM Ethan Furman  wrote:


Consider me complaining.  ;-)


+1


An actual Sentinel class would be helpful:

  >>> class Sentinel:
  ... def __init__(self, repr):
  ... self.repr = repr
  ... def __repr__(self):
  ... return self.repr
  ...

  >>> MISSING = Sentinel('MISSING')
  >>> MISSING
  MISSING

  >>> implicit = Sentinel('<implicit>')
  >>> implicit
  <implicit>


Here is my suggestion (also posted on the related bpo-44123), which is
also simple, ensures a single instance is used, even considering
multi-threading and pickling, and has a better repr:

class Sentinel:
 def __new__(cls, *args, **kwargs):
 raise TypeError(f'{cls.__qualname__} cannot be instantiated')

class MISSING(Sentinel):
 pass

- Tal
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/URFRF634732GRICGLRPGJEJON2BYQZM4/
Code of Conduct: http://python.org/psf/codeofconduct/





___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/SJ45ED57TNPLFCWXAREUGRKSPTPPJYJI/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: The repr of a sentinel

2021-05-13 Thread Petr Viktorin

On 13. 05. 21 11:45, Antoine Pitrou wrote:


On 13/05/2021 at 11:40, Irit Katriel wrote:



On Thu, May 13, 2021 at 10:28 AM Antoine Pitrou wrote:



  I agree that  is a reasonable spelling.


I initially suggested , but now I'm not sure because it 
doesn't indicate what happens when you don't provide it (as in, what 
is the default value).  So now I'm with  or .


"" makes think of a derived class, and leaves me confused. 
"" is a bit better, but doesn't clearly say what the default 
value is, either.  So in all cases I have to read the docstring in 
addition to the function signature.




Is  the term you're looking for?
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/POF7BUF5EGU37DB5F34DOVT7E6LVERX4/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Using FutureWarning for last version before deletion.

2021-05-11 Thread Petr Viktorin




On 11. 05. 21 11:08, Inada Naoki wrote:

On Tue, May 11, 2021 at 5:30 PM Petr Viktorin  wrote:


Test tools should treat DeprecationWarning as error by default [0][1].
So even if end users don't really see it, I don't consider it "hidden".



*should* is not *do*. For example, nosetests don't show DeprecationWarning.
And there are many scripts without tests.

So it is hidden for some people.



Sadly, there's not much we can do for users of nose. Nose itself is only 
tested with Python 3.5 and below.


I'm aware that there are scripts without tests. But maybe letting them 
suddenly break is the right balance between letting people know and 
annoying everyone with unactionable warnings.


If DeprecationWarning is not enough, then we should be having a wider 
discussion, and PEP 387 should change. This particular issue should not 
be an exception to the process.


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/FWFZWMSWWV5KNM3LSU5NQCSDO7YLUARU/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Using FutureWarning for last version before deletion.

2021-05-11 Thread Petr Viktorin

On 10. 05. 21 10:53, Inada Naoki wrote:

Hi, folks.

Now Python 3.11 development is open and I am removing some deprecated
stuffs carefully.

I am considering `configparser.ParseError.filename` property that is
deprecated since Python 3.2.
https://github.com/python/cpython/blob/8e8307d70bb9dc18cfeeed3277c076309b27515e/Lib/configparser.py#L315-L333

My random thoughts about it:

* It has been deprecated long enough.
* But the maintenance burden is low enough.
* If we don't remove long deprecated stuff like this, Python 4.0 will
be a big breaking change.

My proposal:

* Change DeprecationWarning to FutureWarning and wait one more version.
   * DeprecationWarning is suppressed by default to hide noise from end users.
   * But sudden breaking change is more annoying to end users.

I am not proposing to change PEP 387 "Backwards Compatibility Policy".
This is just a new convention.


Test tools should treat DeprecationWarning as error by default [0][1].
So even if end users don't really see it, I don't consider it "hidden".
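
(The recommended filter settings amount to something like this sketch, 
which a test suite can also apply by hand:

    import warnings

    # Turn DeprecationWarning into an error, so deprecated calls fail
    # tests instead of passing silently.
    warnings.simplefilter("error", DeprecationWarning)

)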

Waiting one more release sounds reasonable to me, but for a slightly 
different reason: the warning should list the version the feature will 
be removed in: "3.12" rather than "future versions".
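
A sketch of what that could look like, modeled on the deprecated 
configparser.ParsingError.filename property (the exact wording is 
illustrative):

    import warnings

    class ParsingError(Exception):
        def __init__(self, source):
            super().__init__(f'Source contains parsing errors: {source!r}')
            self.source = source

        @property
        def filename(self):
            # Name the concrete removal version, not "future versions".
            warnings.warn(
                "The 'filename' attribute will be removed in Python 3.12. "
                "Use 'source' instead.",
                DeprecationWarning, stacklevel=2,
            )
            return self.source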



Another idea: would it be worth it to create "What's new" pages for 3.12 
and 3.13 already and fill them with planned removals? (Of course they'd 
need to be de-emphasized in the table of contents.)



[0]: 
https://www.python.org/dev/peps/pep-0565/#recommended-filter-settings-for-test-runners
[1]: 
https://docs.pytest.org/en/latest/how-to/capture-warnings.html#deprecationwarning-and-pendingdeprecationwarning

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/L73YDIYZLA5JVAP3FYWTJJH6NMBZL5OJ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] PEP 652: Python 3.10 will have explicit Limited C API & Stable ABI

2021-04-29 Thread Petr Viktorin

Hello,
I've merged the main part of PEP-652 implementation. The Limited C API 
(introduced in PEP 384 and used for extension modules that work across 
Python versions without recompilation) is now explicitly defined and 
better tested.


When changing/extending the limited API:
- Stop and think! This API will need to be supported forever (or until 
Python 3 reaches end of life, if that comes sooner). A checklist of what 
to think about is being added to the devguide [0].

- Add/change an entry in `Misc/stable_abi.txt`.
- Run `make regen-limited-abi`.

You can run related checks with `make check-limited-abi`. If that or the 
"Check if generated files are up to date" CI test gives you errors, 
please let me know!


More tests & docs are coming up.

[0]: https://cpython-devguide--682.org.readthedocs.build/c-api/
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/W5QG2I2IQXQDPMCFMRDUMGCEX5MX6PHA/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Let's Fix Class Annotations -- And Maybe Annotations Generally

2021-04-24 Thread Petr Viktorin

On 24. 04. 21 9:52, Larry Hastings wrote:

I've hit a conceptual snag in this.

What I thought I needed to do: set __annotations__ = {} in the module 
dict, and set __annotations__ = {} in user class dicts.  The latter was 
more delicate than the former but I think I figured out a good spot for 
both.  I have this much working, including fixing the test suite.


But now I realize (*head-slap* here): if *every* class is going to have 
annotations, does that mean builtin classes too?  StructSequence classes 
like float? Bare-metal type objects like complex?  Heck, what about type 
itself?!


My knee-jerk initial response: yes, those too.  Which means adding a new 
getsetdef to the type object.  But that's slightly complicated.  The 
point of doing this is to preserve the existing best-practice of peeking 
in the class dict for __annotations__, to avoid inheriting it.  If I'm 
to preserve that, the get/set for __annotations__ on a type object would 
need to get/set it on tp_dict if tp_dict was not NULL, and use internal 
storage somewhere if there is no tp_dict.


It's worth noticing that builtin types don't currently have 
__annotations__ set, and you can't set them. (Or, at least, float, 
complex, and type didn't have them set, and wouldn't let me set 
annotations on them.)  So presumably people using current best 
practice--peek in the class dict--aren't having problems.


So I now suspect that my knee-jerk answer is wrong.  Am I going too far 
down the rabbit hole?  Should I /just/ make the change for user classes 
and leave builtin classes untouched?  What do you think?


Beware of adding mutable state to bulit-in (C static) type objects: 
these are shared across interpreters, so changing them can “pollute” 
unwanted contexts.


This has been so for a long time [0]. There are some subinterpreter 
efforts underway that might eventually lead to making __annotations__ on 
static types easier to add, but while you're certainly welcome to 
explore the neighboring rabbit hole as well, I do think you're going in 
too far for now :)
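
(For reference, the "peek in the class dict" best practice mentioned 
above looks roughly like this -- a sketch with an illustrative class:

    class SomeClass:
        x: int

    # Look in the class dict directly, so an __annotations__ inherited
    # from a base class is not picked up by accident.
    cls_annotations = SomeClass.__dict__.get('__annotations__', {})
    print(cls_annotations)   # {'x': <class 'int'>}

)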


[0] 
https://mail.python.org/archives/list/python-dev@python.org/message/KLCZIA6FSDY3S34U7A72CPSBYSOMGZG3/

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/3GIZO2R2IRIN47THXRWAZKEQ5JBFRITP/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 652 Accepted -- Maintaining the Stable ABI

2021-04-06 Thread Petr Viktorin

On 05. 04. 21 21:46, Pablo Galindo Salgado wrote:

Hi Petr,

Thank you for submitting PEP 652 (Maintaining the Stable ABI). After 
evaluating
the situation and discussing the PEP, the Steering Council is happy with 
the PEP
and hereby accepts it. The Steering council thinks that this is a great 
step forward
in order to have a clear definition of what goes into the Stable ABI and 
what guarantees
the Python core team offers regarding the stable ABI while offering at 
the same time a

plan to improve the maintenance and stability of the stable ABI.

We would also like to see some improvements in the official 
documentation (not only on the
devguide) regarding this topic and what guarantees do we offer 
(currently we only have a small
section about this in https://docs.python.org/3/c-api/stable.html, but 
there is a lot of information
and clarifications in the PEP that we would like to be also in the 
documentation).


Congratulations, Petr!

With thanks from the whole Python Steering Council,
Pablo Galindo Salgado



Thanks to you and the SC for consideration!
I'm definitely planning to update the documentation prose as well.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/OPBBMHVEXJ2YZKKIM45TGL3BMISLSU56/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 654 -- Exception Groups and except* : request for feedback for SC submission

2021-02-23 Thread Petr Viktorin

On 2/23/21 1:24 AM, Irit Katriel via Python-Dev wrote:


Hi all,

We would like to request feedback on PEP 654 -- Exception Groups and 
except*.


https://www.python.org/dev/peps/pep-0654/ 




Thank you for this PEP!

Also, thank you Nathaniel (and possibly other author(s) of Trio – sadly, 
I'm not following closely enough to track contributions) for doing 
things right even though it's hard (and slow), and documenting your work 
beautifully. I'm glad to see the ideas assimilated into Python & asyncio!


The PEP reminds me of PEP 380 (yield from): it looks like syntax sugar 
for code you could already write, but once you look closer, it turns out 
that there are so many details and corner cases to keep track of, 
getting it correct is very hard.



I kept notes as I read the PEP, then deleted most as I went through 
Rejected Ideas. These remained:



> The `ExceptionGroup` class is final, i.e., it cannot be subclassed.

What's the rationale for this?


> It is possible to catch the ExceptionGroup type with except, but not 
with except* because the latter is ambiguous


What about `except *(TypeError, ExceptionGroup):`?


> Motivation: Errors in wrapper code

This use case sticks out a bit: it's the only one where ExceptionGroup 
doesn't represent joining equivalent tasks.

Consider code similar to bpo-40857:

  import os
  from tempfile import TemporaryDirectory

  try:
      with TemporaryDirectory() as tempdir:
          os.rmdir(tempdir)
          n = 1 / 0
  except ArithmeticError:
      # that error can be safely ignored!
      pass

Instead of a FileNotFoundError with ArithmeticError for context you'd 
now get an ExceptionGroup. Neither is handled by `except 
ArithmeticError`. Where is the win?



> Motivation: Multiple failures when retrying an operation

This is somewhat similar to the TemporaryDirectory, except there's no 
`with` block that feels like it should be "transparent" w.r.t. user errors.

If I currently have:

try:
    create_connection(*addresses)
except (Timeout, NetworkNotConnected):
    # that's OK, let's try later
    pass

what should happen after Python 3.10? Apart from adding a new function, 
I can see two possible changes:
- create_connection() starts always raising ExceptionGroup on error, 
breaking backwards compatibility in the error case
- create_connection() starts raising ExceptionGroup only for 2+ 
errors, breaking backwards compatibility in the 2+ errors case


Both look like heisenbug magnets. IMO, the second one is worse; "code 
which is now *potentially* raising ExceptionGroup" (as mentioned in the 
Backwards Compatibility section; emphasis mine) should be discouraged.
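
To make the hazard concrete, a minimal sketch (using the ExceptionGroup 
type the PEP proposes):

    # An error that arrives wrapped in an ExceptionGroup is no longer
    # caught by a plain `except` naming the original type:
    try:
        raise ExceptionGroup("retries failed", [TimeoutError("t1")])
    except TimeoutError:
        pass   # never reached -- the group is not a TimeoutError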


Arguably, this here is a problem with the create_connection function: 
the PEP adds a better way how it could have been designed, and that is 
virtuous. Still, having it in Motivation might be misleading.




> long term plan to replace `except` by `catch`

Whoa! Is that a real plan?



--

Also, one of the examples has such a missed opportunity to use 
print(f'{e1 = }')!

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/LFGOEAX4A46BVQHBX6IECOAW63KQAFGU/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: It is really necessary to check that a object has a Py_TPFLAGS_HEAPTYPE flag?

2021-02-16 Thread Petr Viktorin

On 2/16/21 11:50 AM, Андрей Казанцев wrote:

It seems technically possible to override attributes/methods of
built-in types, but the question is more if it's desirable?

The problem is that you cannot override the method not only in built-in 
types but also, for example, in `lxml.etree` classes. I wrote a module 
that changes the `type_setattro` method to mine, which does not have 
this check. And I'm wondering if there are any problems in this solution 
(in addition to philosophical ones) or everything will work as it should 
(and not as inheritance from built-in types).

Thank you for participating in the discussion.



As with built-in types, lxml.etree classes and all other static (i.e. 
non-heap) types are shared between all interpreters in a process. 
Changing them has the same issues as with built-in types.

The check for the Py_TPFLAGS_HEAPTYPE flag is correct.
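
For illustration, here is what that check prevents, seen from Python (a 
minimal sketch; the error message varies between versions):

    # Static (non-heap) types reject attribute assignment:
    try:
        int.frobnicate = lambda self: None
    except TypeError as exc:
        print(exc)   # can't set attributes of built-in/extension type 'int'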
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VITJX3QT2YG3AN5CY4FB7OP2VLSSP4UZ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Signature discrepancies between documentation and implementation

2021-02-09 Thread Petr Viktorin

On 2/9/21 9:15 PM, Serhiy Storchaka wrote:

09.02.21 12:22, Erlend Aasland пише:

What's the recommended approach with issues like 
https://bugs.python.org/issue43094? Change the docs or the implementation? I 
did a quick search on bpo, but could not find similar past issues.


If the documentation and the C implemented function contradict about
parameter name, we are free to treat the parameter as positional-only.
User cannot pass the argument as keyword because the documented name
does not work, and the real name is not exposed to the user.


It is. Python will correct you if you try to use the documented name:

>>> import sqlite3
>>> connection = sqlite3.connect(':memory:')
>>> connection.create_function('testfunc', num_params=1, func=lambda arg: None)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: function missing required argument 'narg' (pos 2)
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/UHDNIUTPS6FESF2VRK5AJMHUZIDKZL2N/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 637 - Support for indexing with keyword arguments: request for feedback for SC submission

2021-02-03 Thread Petr Viktorin

On 2/2/21 12:36 PM, Stefano Borini wrote:

Hi all,

I would like to request feedback by python-dev on the current
implementation of PEP 637 - Support for indexing with keyword
arguments.

https://www.python.org/dev/peps/pep-0637/

The PEP is ready for SC submission and it has a prototype
implementation ready, available here (note, not reviewed, but
apparently fully functional)

https://github.com/python/cpython/compare/master...stefanoborini:PEP-637-implementation-attempt-2

(note: not sure if there's a preference for the link to be to the diff
or to the branch, let me know if you prefer I change the PEP link)

Thank you for your help.


+1 from me. This looks quite useful in certain areas and natural to use. 
I like how this makes the dunder implementations (usually libraries), 
rather than users, deal with most of the corner cases -- but still 
allows libraries that don't need this to not care.



The PEP does lack a "How to teach" section.


"Corner case 3" concludes that "best practice suggests that keyword 
subscripts should be flagged as keyword-only when possible":


def __getitem__(self, index, *, direction='north'):

If the PEP is accepted, this should be mentioned in the 
__(get|set|del)item__ documentation and shown in all relevant examples.


Looking at corner case 1, it would also be useful to nudge people to use 
positional-only arguments whenever they accept arbitrary keyword ones. 
(The same goes for function definitions, but tutorials for those are 
already written):


def __getitem__(self, index, /, **named_axes):

It would be great if what gets copied to StackOverflow is examples of 
good practices :)
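
Putting both recommendations together, a sketch (Grid is an illustrative 
name; the subscript call with keywords is the proposed PEP 637 syntax, 
not valid in current Python):

    class Grid:
        # index positional-only; keyword subscripts keyword-only
        def __getitem__(self, index, /, *, direction='north'):
            return index, direction

    g = Grid()
    g[0, 1]                       # -> ((0, 1), 'north')
    # g[0, 1, direction='south']  # PEP 637 syntax -> ((0, 1), 'south')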

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/2CN5A7JTZUCXTP6OMPAXWH2ABOPX6SIS/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Bumping minimum Sphinx version to 3.2 for cpython 3.10?

2021-01-13 Thread Petr Viktorin

On 1/13/21 8:24 PM, Brett Cannon wrote:
On Wed, Jan 13, 2021 at 7:25 AM Serhiy Storchaka wrote:


12.01.21 22:38, Julien Palard via Python-Dev пише:
 > During the development of cpython 3.10, Sphinx was bumped to 3.2.1.
 >
 > Problem is Sphinx 3 have some incompatibilities with Sphinx 2,
some that
 > we could work around, some are bit harder, so we may need to bump
 > `needs_sphinx = '3.2'` (currently it is 1.8).

Sphinx version in the current Ubuntu LTS (20.04) is 1.8.5. Would not it
cause problems with builting documentation on Ubuntu?


Why can't contributors install from PyPI? The venv created by the 
Docs/Makefile does a pip install, so I don't see why what version of 
Sphinx is packaged via APT is a critical blocker in upgrading Sphinx.


I trust the CPython core devs, and I trust my distro's processes and 
packagers, but not necessarily the PyPI maintainers of Sphinx's 
requirements.

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/3YHMI555EEYAQXO3NBITAGZLMKQVAN4E/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Heap types (PyType_FromSpec) must fully implement the GC protocol

2021-01-12 Thread Petr Viktorin




On 1/12/21 8:23 PM, Neil Schemenauer wrote:

On 2021-01-12, Pablo Galindo Salgado wrote:

One worry that I have in general with this move is the usage of
_PyType_GetModuleByDef to get the type object from the module
definition. This normally involves getting a TLS in every instance
creation, which can impact notably performance for some
perf-sensitive types or types that are created a lot.


I would say _PyType_GetModuleByDef is the problem.  Why do we need
to use such an ugly approach (walking the MRO) when Python defined
classes don't have the same performance issue?  E.g.

 class A:
 def b():
 pass
 A.b.__globals__

IMHO, we should be working to make types and functions defined in
extensions more like the pure Python versions.

Related, my "__namespace__" idea[1] might be helpful in reducing the
differences between pure Python modules and extension modules.
Rather than functions having a __globals__ property, which is a
dict, they would have a __namespace__, which is a module object.
Basically, functions and methods known which global namespace
(module) they have been defined in.  For extension modules, when you
call a function or method defined in the extension, it could be
passed the module instance, by using the __namespace__ property.

Maybe I'm missing some details on why this approach wouldn't work.
However, at a high level, I don't see why it shouldn't.  Maybe
performance would be an issue?  Reducing the number of branches in
code paths like CALL_FUNCTION should help.


The main difference between Python and C functions is that in C, you 
need type safety. You can't store C state in a mutable dict (or module) 
accessible from Python, because when users invalidate your C invariants, 
you get a segfault rather than a nice AttributeError.


Making methods "remember" their context does work though, and has 
already been implemented -- see PEP 573!
It uses the *defining class* instead of __namespace__, but you can get 
the module from that quite easily.


The only place it doesn't work are slot methods, which have a fixed C 
API. For example:


PyObject *tp_repr(PyObject *self);
int tp_init(PyObject *self, PyObject *args, PyObject *kwds);

There is no good way to pass the method, module object, globals() or the 
defining class to such functions.




1. https://github.com/nascheme/cpython/tree/frame_no_builtins

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/TZVSCCCTUISV32U2OTE5LY7F3X5QAVCX/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Heap types (PyType_FromSpec) must fully implement the GC protocol

2021-01-12 Thread Petr Viktorin

On 1/12/21 7:48 PM, Pablo Galindo Salgado wrote:

One worry that I have in general with this move
is the usage of _PyType_GetModuleByDef to get the type object
from the module definition. This normally involves getting a TLS in 
every instance creation,


Not TLS, it's walking the MRO.


which can impact notably performance for some perf-sensitive types or types
that are created a lot.


But yes, that's right. _PyType_GetModuleByDef should not be used in 
perf-sensitive spots, at least not without profiling.
There's often an alternative, though. Do you have any specific cases 
you're concerned about?



On Tue, 12 Jan 2021 at 18:21, Neil Schemenauer wrote:


On 2021-01-12, Victor Stinner wrote:
 > It seems like a safer approach is to continue the work on
 > bpo-40077: "Convert static types to PyType_FromSpec()".

I agree that trying to convert static types is a good idea.  Another
possible bonus might be that we can gain some performance by
integrating garbage collection with the Python object memory
allocator.  Static types frustrate that effort.

Could we have something easier to use than PyType_FromSpec(), for
the purposes of converting existing code?  I was thinking of
something like:

     static PyTypeObject Foo_TypeStatic = {
     };
     static PyTypeObject *Foo_Type;

     PyInit_foo(void)
     {
         Foo_Type = PyType_FromStatic(&Foo_TypeStatic);
     }


The PyType_FromStatic() would return a new heap type, created by
copying the static type.  The static type could be marked as being
unusable (e.g. with a type flag).
___
Python-Dev mailing list -- python-dev@python.org

To unsubscribe send an email to python-dev-le...@python.org

https://mail.python.org/mailman3/lists/python-dev.python.org/

Message archived at

https://mail.python.org/archives/list/python-dev@python.org/message/RPG2TRQLONM2OCXKPVCIDKVLQOJR7EUU/


Code of Conduct: http://python.org/psf/codeofconduct/



___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/HOCGUW3S6AXBSQ5BWX5KYPFVXEGWQJ6H/
Code of Conduct: http://python.org/psf/codeofconduct/


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/QYXDVMTI5CBKQOGYC557ER45IZZLJZGS/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Heap types (PyType_FromSpec) must fully implement the GC protocol

2021-01-12 Thread Petr Viktorin

On 1/12/21 7:16 PM, Neil Schemenauer wrote:

On 2021-01-12, Victor Stinner wrote:

It seems like a safer approach is to continue the work on
bpo-40077: "Convert static types to PyType_FromSpec()".


I agree that trying to convert static types is a good idea.  Another
possible bonus might be that we can gain some performance by
integrating garbage collection with the Python object memory
allocator.  Static types frustrate that effort.

Could we have something easier to use than PyType_FromSpec(), for
the purposes of converting existing code?  I was thinking of
something like:

 static PyTypeObject Foo_TypeStatic = {
 };
 static PyTypeObject *Foo_Type;

 PyInit_foo(void)
 {
     Foo_Type = PyType_FromStatic(&Foo_TypeStatic);
 }


The PyType_FromStatic() would return a new heap type, created by
copying the static type.  The static type could be marked as being
unusable (e.g. with a type flag).


Unfortunately, it's not just the creation that needs to be changed.
You also need to decref Foo_Type somewhere.

Your example is for "single-phase init" modules (pre-PEP 489). Those 
don't have a dealloc hook, so they will leak memory (e.g. in multiple 
Py_Initialize/Py_Finalize cycles).


Multi-phase init (PEP 489) allows multiple module instances of extension 
modules. Assigning PyType_FromStatic's result to a static pointer would 
mean that every instance of the module will create a new type, and 
overwrite any existing one. And the deallocation will either leave a 
dangling pointer or NULL the pointer for other module instances.


So, you need to make the type part of the module state, so that the 
module has proper ownership of the type. And that means you need to 
access the type from the module state any time you need to use it.


At that point, IMO, PyType_FromStatic saves you so little work that it's 
not worth supporting a third variation of type creation code.

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/WAKYSLYIUZN7NPCE6G6SRRCJK5RELJQ3/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Heap types (PyType_FromSpec) must fully implement the GC protocol

2021-01-12 Thread Petr Viktorin




On 1/12/21 4:34 PM, Antoine Pitrou wrote:

On Tue, 12 Jan 2021 15:22:36 +0100
Petr Viktorin  wrote:

On 1/11/21 5:26 PM, Victor Stinner wrote:

Hi,

There are multiple PEPs covering heap types. The latest one refers to
other PEPs: PEP 630 "Isolating Extension Modules" by Petr Viktorin.
https://www.python.org/dev/peps/pep-0630/#motivation

The use case is to embed multiple Python instances (interpreters) in
the same application process, or to embed Python with multiple calls
to Py_Initialize/Py_Finalize (sequentially, not in parallel). Static
types are causing different issues for these use cases.


If a type is immutable and has no references to heap-allocated objects,
it could stay as a static type.
The issue is that very many types don't fit that. For example, if some
method needs to raise a module-specific exception, that's a reference to
a heap-allocated type, because custom exceptions generally aren't static.


Aren't we confusing two different things here?

- a mutable *type*, i.e. a type with mutable state attached to itself
   (not to instances)

- a mutable *instance*, where the mutable state is per-instance

While it's very common for custom exceptions to have mutable instance
state (e.g. a backend-specific error number), I can't think of any
custom exception that has mutable state attached to the exception
*type*.


You're right, exception types *could* generally be static. However, the 
most common API for creating them, PyErr_NewException[WithDoc], creates 
heap types.

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZSQ4HGBIMFBOVXBUKB7C7UPIJPW76OLA/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Heap types (PyType_FromSpec) must fully implement the GC protocol

2021-01-12 Thread Petr Viktorin



On 1/12/21 4:09 PM, Victor Stinner wrote:

On Tue, Jan 12, 2021 at 3:28 PM Petr Viktorin  wrote:

If a type is immutable and has no references to heap-allocated objects,
it could stay as a static type.
The issue is that very many types don't fit that. For example, if some
method needs to raise a module-specific exception, that's a reference to
a heap-allocated type, because custom exceptions generally aren't static.
(...)
I don't see why we would need to destroy immutable static objects. They
don't need to be freed.


I'm not sure of your definition of "immutable" here. At the C level,
many immutable Python objects are mutable. For example, a str instance
*can* be modified at the C level, and computing hash()
modifies the object as well (the internal cached hash value).

Any type contains at least one Python object: the __mro__ tuple. Most
types also contain a __subclasses__ dictionary (by default, it's NULL).
These objects are created at Python startup, but not destroyed at
Python exit. See also tp_bases (tuple) and tp_dict (dict).


Ah, right. __subclasses__ is the reason these need to be heap types (if 
they allow subclassing).
If __mro__ is a tuple of static types, it could probably be made static 
as well; hashes could be protected by a lock.




I tried once to "finalize" static types, but it didn't go well:

* https://github.com/python/cpython/pull/20763
* https://bugs.python.org/issue1635741#msg371119

It doesn't look to be safe to clear static types. Many functions rely
on the fact that static types are "always there" and are never
finalized. Also, only a few static types are cleared by my PR: many
static types are left unchanged. For example, static types of the _io
module. It seems like a safer approach is to continue the work on
bpo-40077: "Convert static types to PyType_FromSpec()".


Yes, seems so. And perhaps this has enough subtle details to want a PEP?
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/3TLWPQT76RZ2Q6HEUKFURB3J45AXYWFE/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Heap types (PyType_FromSpec) must fully implement the GC protocol

2021-01-12 Thread Petr Viktorin

On 1/11/21 5:26 PM, Victor Stinner wrote:

Hi,

There are multiple PEPs covering heap types. The latest one refers to
other PEPs: PEP 630 "Isolating Extension Modules" by Petr Viktorin.
https://www.python.org/dev/peps/pep-0630/#motivation

The use case is to embed multiple Python instances (interpreters) in
the same application process, or to embed Python with multiple calls
to Py_Initialize/Py_Finalize (sequentially, not in parallel). Static
types are causing different issues for these use cases.


If a type is immutable and has no references to heap-allocated objects, 
it could stay as a static type.
The issue is that very many types don't fit that. For example, if some 
method needs to raise a module-specific exception, that's a reference to 
a heap-allocated type, because custom exceptions generally aren't static.




Also, it's not possible to destroy static types at Python exit, which
goes against the on-going effort to destroy all Python objects at exit
(bpo-1635741).


I don't see why we would need to destroy immutable static objects. They 
don't need to be freed.

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/E6NSBPPMCJV5KPZCZJOLDAO74VUK25X6/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Distro packagers: PEP 615 and the tzdata dependency

2020-11-25 Thread Petr Viktorin

On 11/24/20 7:50 PM, Brett Cannon wrote:
If enough people were interested we could create a "Distributors" 
category on discuss.python.org.


I'd join :)



On Tue, Nov 24, 2020 at 9:08 AM Tianon Gravi wrote:


 > I'd love to have an easy way to keep them in the loop.

I'm one of the maintainers on
https://github.com/docker-library/python

(which is what results in https://hub.docker.com/_/python), and I'd
love to have an easy way to keep myself in the loop too! O:)

Is there a lower-frequency mailing list where things like this are
normally posted that I could follow?
(I don't want to be a burden, although we'd certainly really love to
have more upstream collaboration on that repo -- we do our best to
represent upstream as correctly/accurately as possible, but we're not
experts!)

 > would it make sense to add a packaging section to our
documentation or
 > to write an informational PEP?

FWIW, I love the idea of an explicit "packaging" section in the docs
(or a PEP), but I've maintained that for other projects before and
know it's not always easy or obvious. :)

♥,
- Tianon
   4096R / B42F 6819 007F 00F8 8E36  4FD4 036A 9C25 BF35 7DD4

PS. thanks doko for giving me a link to this thread! :D

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/66HPNHT576JKSFOQXJTCACX5JRNERMWV/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Speeding up CPython

2020-10-21 Thread Petr Viktorin



On 10/21/20 1:40 PM, Mark Shannon wrote:

Hi Petr,

On 21/10/2020 11:49 am, Petr Viktorin wrote:
Let me explain an impression I'm getting. It is *just one aspect* of 
my opinion, one that doesn't make sense to me. Please tell me where it 
is wrong.



In the C API, there's a somewhat controversial refactoring going on, 
which involves passing around tstate arguments. I'm not saying [the 
first discussion] was perfect, and there are still issues, but, 
however flawed the "do-ocracy" process is, it is the best way we found 
to move forward. No one who can/wants to do the work has a better 
solution.


Later, Mark says there is an even better way – or at least, a less 
intrusive one! In [the second discussion], he hints at it vaguely 
(from that limited info I have, it involves switching to C11 and/or 
using compiler-specific extensions -- not an easy change to do). But 
frustratingly, Mark doesn't reveal any actual details, and a lot of 
the complaints are about churn and merge conflicts.
And now, there's news -- the better solution won't be revealed unless 
the PSF pays for it!


There's no secret. C thread locals are well documented.
I even provided a code example last time we discussed it.

You reminded me of it yesterday ;)
https://godbolt.org/z/dpSo-Q


At the risk of going off topic: That's for GCC. As far as I know, MSVC 
uses something like __declspec(thread).

What are the options for generic C99 compilers, other than staying slow?

The "even faster" solution I mentioned yesterday, is as I stated 
yesterday to use an aligned stack.

If you wanted more info, you could have asked :)

First, you ensure that the stack is in a 2**N aligned block.
Assuming that the C stack grows down from the top, then the threadstate 
struct goes at the bottom. It's probably a good idea to put a guard page 
between the C stack and the threadstate struct.


The struct's address can then be found by masking off the bottom N bits 
from the stack pointer.
This approach uses 0 registers and cost 1 ALU instruction. Can't get 
cheaper than that :)


It's not portable and probably a pain to implement, but it is fast.

But it doesn't matter how it's implemented. The implementation is hidden 
behind `PyThreadState_GET()`, it can be changed to use a thread local,

or to some fancy aligned stack, without the rest of the codebase changing.
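
(To spell out the masking arithmetic, a sketch with illustrative values:

    N = 20                            # stack blocks aligned to 2**N bytes
    mask = ~((1 << N) - 1)
    stack_pointer = 0x7f1234567890    # illustrative address
    threadstate_base = stack_pointer & mask
    print(hex(threadstate_base))      # 0x7f1234500000

)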


Not portable and hard to implement is a pain for maintenance – 
especially porting CPython to new compilers/platforms/situations.


The alternative is changing the codebase, which (it seems from the 
discussions) would give us comparable performance, everywhere, and the 
result can be maintained by many more people.



That's a very bad situation to be in for having discussions: 
basically, either we disregard Mark and go with the not-ideal 
solution, or virtually all work on changing the C API and internal 
structures is blocked.


The existence of multiple interpreters should be orthogonal to speeding 
up those interpreters, provided the separation is clean and well designed.

But it should be clean and well designed anyway, IMO.


+1

I sense a similar thing happening here: 
https://github.com/ericsnowcurrently/multi-core-python/issues/69 -- 


The title of that issue is 'Clarify what is a "sub-interpreter" and what 
is an "interpreter"'?


there's a vague proposal to do things very differently, but I find it 


This?
https://github.com/ericsnowcurrently/multi-core-python/issues/69#issuecomment-712837899 


I'll continue there.

hard to find anything actionable. I would like to change my plans to 
align with Mark's fork, or to better explain some of the 
non-performance reasons for recent/planned changes. But I can't, 
because details are behind a paywall.


Let's make this very clear.
My objections to the way multiple interpreters are being implemented have 
very little to do with speeding up the interpreter and entirely to do with 
long-term maintenance and the ultimate success of the project.


Obviously, I would like it if multiple interpreters didn't slow down 
CPython.

But that has always been the case.



Thank you for clearing my doubts!


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/W5JF77HVMIJ3Q5RSL3R2TOJGZ4JEWJRS/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Speeding up CPython

2020-10-21 Thread Petr Viktorin
Let me explain an impression I'm getting. It is *just one aspect* of my 
opinion, one that doesn't make sense to me. Please tell me where it is 
wrong.



In the C API, there's a somewhat controversial refactoring going on, 
which involves passing around tstate arguments. I'm not saying [the 
first discussion] was perfect, and there are still issues, but, however 
flawed the "do-ocracy" process is, it is the best way we found to move 
forward. No one who can/wants to do the work has a better solution.


Later, Mark says there is an even better way – or at least, a less 
intrusive one! In [the second discussion], he hints at it vaguely (from 
that limited info I have, it involves switching to C11 and/or using 
compiler-specific extensions -- not an easy change to do). But 
frustratingly, Mark doesn't reveal any actual details, and a lot of the 
complaints are about churn and merge conflicts.
And now, there's news -- the better solution won't be revealed unless 
the PSF pays for it!


That's a very bad situation to be in for having discussions: basically, 
either we disregard Mark and go with the not-ideal solution, or 
virtually all work on changing the C API and internal structures is blocked.


I sense a similar thing happening here: 
https://github.com/ericsnowcurrently/multi-core-python/issues/69 -- 
there's a vague proposal to do things very differently, but I find it 
hard to find anything actionable. I would like to change my plans to 
align with Mark's fork, or to better explain some of the non-performance 
reasons for recent/planned changes. But I can't, because details are 
behind a paywall.



[the first discussion]: 
https://mail.python.org/archives/list/python-dev@python.org/thread/PQBGECVGVYFTVDLBYURLCXA3T7IPEHHO/#Q4IPXMQIM5YRLZLHADUGSUT4ZLXQ6MYY


[the second discussion]: 
https://mail.python.org/archives/list/python-dev@python.org/thread/KGBXVVJQZJEEZD7KDS5G3GLBGZ6XNJJX/#WOKAUQYDJDVRA7SJRJDEAHXTRXSVPNMO



On 10/20/20 2:53 PM, Mark Shannon wrote:

Hi everyone,

CPython is slow. We all know that, yet little is done to fix it.

I'd like to change that.
I have a plan to speed up CPython by a factor of five over the next few 
years. But it needs funding.


I am aware that there have been several promised speed ups in the past 
that have failed. You might wonder why this is different.


Here are three reasons:
1. I already have working code for the first stage.
2. I'm not promising a silver bullet. I recognize that this is a 
substantial amount of work and needs funding.
3. I have extensive experience in VM implementation, not to mention a 
PhD in the subject.


My ideas for possible funding, as well as the actual plan of 
development, can be found here:


https://github.com/markshannon/faster-cpython

I'd love to hear your thoughts on this.

Cheers,
Mark.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/RDXLCH22T2EZDRCBM6ZYYIUTBWQVVVWH/ 


Code of Conduct: http://python.org/psf/codeofconduct/

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/7DKURFZ3JEZTKCUAUDCPR527FUBYMY7N/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Taking over xxlimited for PEP 630

2020-09-08 Thread Petr Viktorin

Hello,
The "xxlimited" module (Modules/xxlimited.c) was added as part of PEP 
384 (Defining a Stable ABI), and is undocumented. As far as I can tell, 
it was added partly to test the stable ABI, and partly as an example of 
how to write a module (like "xx" from xxmodule.c).
In the last few years the module has not seen much maintenance, and I 
believe it's no longer a good example to follow: it works, but there are 
now better ways to do things.


I would like to take over maintenance of the module and make it into an 
example of how to write a low-level C extension with isolated module 
state, as described in PEP 630 (Isolating Extension Modules) -- an 
informational PEP that I plan to convert to a HOWTO doc when everything 
is ready.


Please let me know if you think this isn't a good idea, or if there's 
something I'm missing.

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/FO3YPG3YLG2XF5FKHICJHNINSPY4OHEL/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Plan to remove Py_UNICODE APis except PEP 623.

2020-07-02 Thread Petr Viktorin

On 2020-07-02 14:57, Victor Stinner wrote:

Le jeu. 2 juil. 2020 à 14:44, Barry Scott  a écrit :

It's not obvious to me why the latin1 encoding is in this list as it's just one 
of all the 8-bit char sets.
Why is it needed?


The Latin-1 (ISO 8859-1) charset is kind of special: it maps bytes
0x00-0xFF to Unicode characters U+0000-U+00FF, and decoding from latin1
cannot fail.


This apparently makes it useful for not-quite-text, not-quite-bytes 
protocols like HTTP. In particular, WSGI (PEP 3333) uses latin-1 for 
headers.
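
A quick sketch of that property:

    # Latin-1 maps bytes 0x00-0xFF straight to U+0000-U+00FF, so
    # decoding can never fail and any byte string round-trips:
    data = bytes(range(256))
    text = data.decode('latin-1')
    assert text.encode('latin-1') == data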




It was commonly used as the locale encoding in Europe 10 years ago,
but nowadays most Linux distributions use UTF-8 as the locale
encoding.

I'm also fine with restricting the list to 3 encodings: ASCII, UTF-8
and Windows ANSI code page.


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/DQI2UW5WOQ3EMHRP5VEGDG3MIU364I6K/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 620: Hide implementation details from the C API

2020-06-30 Thread Petr Viktorin

On 2020-06-30 02:46, Victor Stinner wrote:

You missed the point of the PEP: "It becomes possible to experiment
with more advanced optimizations in CPython than just
micro-optimizations, like tagged pointers."


I don't think experiments are a good motivation.

When the C API is broken, everyone that uses it pays the price -- they 
have to update their code. They pay the price even if the experiment 
fails, or if it's never started in the first place.


Can we treat the C API not as a place for experiments, but as a stable 
foundation to build on?


For example, could we only deprecate the bad parts, but not remove them 
until the experiments actually show that they are preventing a 
beneficial change?


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/FVATSACB5QPTZS6YLSH5YCHHODJNBLA6/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 622: Structural Pattern Matching

2020-06-26 Thread Petr Viktorin

On 2020-06-26 16:54, Stéfane Fermigier wrote:
[...]



Here's one example:

https://github.com/clojure/core.match (in particular: 
https://github.com/clojure/core.match/wiki/Understanding-the-algorithm ).


Alson some insights from 
https://softwareengineering.stackexchange.com/questions/237023/pattern-matching-in-clojure-vs-scala



In this video I
watched recently, Rich Hickey comments that he likes the
destructuring part of languages like Scala, but not so much the
pattern matching part, and he designed Clojure accordingly. That
probably explains why the pattern matching is in a library and not
as robust, although the kind of problems seen in the post you
mentioned are clearly bugs.

What Rich Hickey mentions as an alternative to pattern matching is
multimethods. Most languages let
you do polymorphic dispatch based on type. Some languages let you
also do it based on a value. Using multimethods, Clojure lets you do
it based on any arbitrary function. That's a pretty powerful concept.

It comes down to the principle that programmers using a language
should use the language's own best idioms. Trying to write
Scala-like code in Clojure is going to have its difficulties, and
vice versa.


It does look like the PEP tries to do two different things: "switch" 
instead of if/elif, and destructuring.


Would it be useful to introduce an operator for "isinstance", if it's so 
commonly used? Are the calls to it (in the relevant codebase) actually 
used in complex code that needs destructuring, or could we live with 
this (IS_A being a placeholder for bikeshedding, of course):


if shape IS_A Point:
    x, y = shape
    ...
elif shape IS_A Rectangle:
    x, y, w, h = shape
    ...
elif shape IS_A Line:
    x, y = shape.start
    if shape.start == shape.end:
        print(f"Zero length line at {x}, {y}")

or:

queue: Union[Queue[int], Queue[str]]
if queue IS_A IntQueue:
    # Type-checker detects unreachable code
    ...


There aren't many convincing examples for destructuring in the PEP, IMO.

The "mapping pattern" one could be rewritten as:

if route := config.get('route'):
    process_route(route)
if subconfig := config.pop(constants.DEFAULT_PORT):
    process_config(subconfig, config)

Sequence destructuring examples ([_] for "short sequence") don't seem 
too useful. Would they actually improve lots of existing code?


Complex object/tree destructuring (like the is_tuple) is painful in 
Python, but then again, the new syntax also becomes quite inscrutable 
for complex cases.

Is code like the is_tuple example in the Rationale actually common?

The "Sealed classes as algebraic data types" example looks like a good 
candidate for a dump() method or PEP 443 single dispatch, both of which 
should be amenable to static analysis.
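
For the record, the PEP 443 alternative could look like this sketch 
(Leaf/Branch are illustrative stand-ins for the PEP's sealed classes):

    from functools import singledispatch

    class Node: ...
    class Leaf(Node):
        def __init__(self, value):
            self.value = value
    class Branch(Node):
        def __init__(self, left, right):
            self.left, self.right = left, right

    @singledispatch
    def dump(node):
        raise TypeError(f"unsupported node: {node!r}")

    @dump.register
    def _(node: Leaf):
        return str(node.value)

    @dump.register
    def _(node: Branch):
        return f"({dump(node.left)} {dump(node.right)})"

    print(dump(Branch(Leaf(1), Leaf(2))))   # (1 2)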

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/57RTPCDQX56WZKQC42KXYS2EHQWLALCN/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: New optional dependency - TkDND

2020-06-23 Thread Petr Viktorin

On 2020-06-23 12:14, Elisha Paine wrote:

Hi all,

I am looking at getting TkDND support into tkinter, and have created
issue40893. However, we are currently considering the practicalities of
adding a new optional dependency to Python and I was hoping someone
could help answer the question of: is there a procedure for this?

The problem is that third-party distributors need to be notified of the
change and edit the package details accordingly. The current idea is
just to make it very clear on the “what’s new” page, however this would
not guarantee it would be seen, so I am very open to ideas/opinions.


Making it clear on “what’s new” would work best for me (as a Fedora 
packager). I don't think there are many other choices.


Obligatory question: What's the advantage of having TkDND bindings in 
the standard library, as opposed to making it a third-party package?
(And perhaps, in the stdlib, making it easier to wrap Tk packages in 
general?)



(FWIW, I read linux-sig but didn't get to respond before the python-dev 
post. So I'm replying here.)

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/YA2QQMAKGREPJYPPMRDD3V7Y5JMVHA2G/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 620: Hide implementation details from the C API

2020-06-23 Thread Petr Viktorin




On 2020-06-22 14:10, Victor Stinner wrote:

Hi,

PEP available at: https://www.python.org/dev/peps/pep-0620/


This PEP is the result of 4 years of research work on the C API:
https://pythoncapi.readthedocs.io/

It's the third version. The first version (2017) proposed to add a
"new C API" and advised C extensions maintainers to opt-in for it: it
was basically the same idea as PEP 384 limited C API but in a
different color. Well, I had no idea of what I was doing :-) The
second version (April 2020) proposed to add a new Python runtime built
from the same code base as the regular Python runtime but in a
different build mode, the regular Python would continue to be fully
compatible.

I wrote the third version, the PEP 620, from scratch. It now gives an
explicit and concrete list of incompatible C API changes, and has
better motivation and rationale sections. The main PEP novelty is the
new pythoncapi_compat.h header file distributed with Python to provide
new C API functions to old Python versions, the second novelty is the
process to reduce the number of broken C extensions.

Whereas PEPs are usually implemented in a single Python version, the
implementation of this PEP is expected to be done carefully over
multiple Python versions. The PEP lists many changes which are already
implemented in Python 3.7, 3.8 and 3.9. It defines a process to reduce
the number of broken C extensions when introducing the incompatible C
API changes listed in the PEP. The process dictates the rhythm of
these changes.



PEP: 620
Title: Hide implementation details from the C API
Author: Victor Stinner 
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 19-June-2020
Python-Version: 3.10

Abstract
========

Introduce C API incompatible changes to hide implementation details.

Once most implementation details will be hidden, evolution of CPython
internals would be less limited by C API backward compatibility issues.
It will be way easier to add new features.

It becomes possible to experiment with more advanced optimizations in CPython
than just micro-optimizations, like tagged pointers.

Define a process to reduce the number of broken C extensions.

The implementation of this PEP is expected to be done carefully over
multiple Python versions. It already started in Python 3.7 and most
changes are already completed. The `Process to reduce the number of
broken C extensions`_ dictates the rhythm.


Motivation
==========

The C API blocks CPython evolutions
-----------------------------------

Adding or removing members of C structures is causing multiple backward
compatibility issues.

Adding a new member breaks the stable ABI (PEP 384), especially for
types declared statically (e.g. ``static PyTypeObject MyType =
{...};``).


PyTypeObject is explicitly not part of the stable ABI, see PEP 384: 
https://www.python.org/dev/peps/pep-0384/#structures
I don't know why Py_TPFLAGS_HAVE_FINALIZE was added, but it wasn't for 
the PEP 384 stable ABI.


Can you find a different example, so users are not misled?


In Python 3.4, the PEP 442 "Safe object finalization" added
the ``tp_finalize`` member at the end of the ``PyTypeObject`` structure.
For ABI backward compatibility, a new ``Py_TPFLAGS_HAVE_FINALIZE`` type
flag was required to announce if the type structure contains the
``tp_finalize`` member. The flag was removed in Python 3.8 (bpo-32388).

The ``PyTypeObject.tp_print`` member, deprecated since Python 3.0
released in 2009, has been removed in the Python 3.8 development cycle.
But the change broke too many C extensions and had to be reverted before
3.8 final release. Finally, the member was removed again in Python 3.9.

C extensions rely on the ability to access directly structure members,
indirectly through the C API, or even directly.


I think you want to remove a "directly" from that sentence.


Modifying structures
like ``PyListObject`` cannot be even considered.

The ``PyTypeObject`` structure is the one which evolved the most, simply
because there was no other way to evolve CPython than modifying it.

In the C API, all Python objects are passed as ``PyObject*``: a pointer
to a ``PyObject`` structure. Experimenting tagged pointers in CPython is
blocked by the fact that a C extension can technically dereference a
``PyObject*`` pointer and access ``PyObject`` members. Small "objects"
can be stored as a tagged pointer with no concrete ``PyObject``
structure.


I think this would be confusing to people who don't already know what 
you mean. May I suggest:
A C extension can technically dereference a ``PyObject*`` pointer and 
access ``PyObject`` members. This prevents experiments like tagged 
pointers (storing small values as ``PyObject*`` which does not point to 
a valid ``PyObject`` structure).
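
(To illustrate the idea, the tagging arithmetic sketched in Python; real 
tagged pointers would live at the C level:

    # Store a small int in the "pointer" itself, flagged by the low bit,
    # instead of allocating a PyObject for it.
    def tag_int(n):
        return (n << 1) | 1

    def is_tagged(p):
        return p & 1

    def untag(p):
        return p >> 1

    p = tag_int(42)
    assert is_tagged(p) and untag(p) == 42

)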



Replacing Python garbage collector with a tracing garbage collector
would also need to remove ``PyObject.ob_refcnt`` reference counter,
whereas currently ``Py_INCREF()`` and 

[Python-Dev] Re: Cython and incompatible C API changes

2020-06-17 Thread Petr Viktorin



On 2020-06-17 12:03, Victor Stinner wrote:

re: [Python-Dev] When can we remove wchar_t* cache from string?

Le mar. 16 juin 2020 à 21:19, Steve Dower  a écrit :

On 16Jun2020 1641, Inada Naoki wrote:

* This change doesn't affect to pure Python packages.
* Most of the rest uses Cython.  Since I already report an issue to Cython,
regenerating with new Cython release fixes them.


The precedent set in our last release with tp_print was that
regenerating Cython releases was too much to ask.

Unless we're going to overrule that immediately, we should leave
everything there and give users/developers a full release cycle with
updated Cython version to make new releases without causing any breakage.


I already made changes in Python 3.10 which require again to
regenerate C code generated by Cython:
https://docs.python.org/dev/whatsnew/3.10.html#id2

Py_TYPE(), Py_REFCNT() and Py_SIZE() can no longer be used as l-value.
These changes also broke numpy. I helped to fix Cython and numpy (and
they are already fixed).


Those are not all the projects that were broken by the change -- they're 
just the most popular ones. Right?



You can expect further incompatible changes in the C API. For example,
I would like to make the PyThreadState structure opaque, whereas
Cython currently accesses directly to PyThreadState members.

There is an ongoing discussion about always requiring to run Cython
when installing a C extension which uses Cython.


Do you have a link to that discussion?


Maybe we can find a way to use pre-generated C files for Python up to
version N, but require to run Cython for Python newer than version N?
It would prevent to require running Cython on stable Python versions,
but help to upgrade to newer Python and also test the "next Python"
(current master branch).

Note: if the Py_TYPE() & cie changes are causing too many issues, we
can also reconsider to postpone/revert these changes. IMO it's
important that we remain able to push incompatible changes to the C
API, because there are multiple known flaws in the C API.


If PEP 387 (Backwards Compatibility Policy) is accepted, all the 
incompatible changes will require a two-year deprecation period. 
Right?

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/FWJPUTHFPOSNYXRNDYIO3VUNFPGWK5QW/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Should we be making so many changes in pursuit of PEP 554?

2020-06-17 Thread Petr Viktorin

On 2020-06-16 20:28, Mark Shannon wrote:


On 16/06/2020 1:24 pm, Nick Coghlan wrote:
Multiprocessing serialisation overheads are abysmal. With enough OS 
support you can attempt to mitigate that via shared memory mechanisms 
(which Davin added to the standard library), but it's impossible to 
get the overhead of doing that as low as actually using the address 
space of one OS process.


What does "multiprocessing serialisation" even mean? I assume you mean 
the overhead of serializing objects for communication between processes.


The cost of serializing an object has absolutely nothing to do with 
which process the interpreter is running in.


Separate interpreters within a single process will still need to 
serialize objects for communication.


The overhead of passing data through shared memory is the same for 
threads and processes. It's just memory.


Can we please stick to facts and not throw around terms like "abysmal" 
with no data whatsoever to back it up.



I'd like to get back to the facts. Let me quote the original mail from 
this thread:


On 2020-06-05 16:32, Mark Shannon wrote:

While I'm in favour of PEP 554, or some similar model for parallelism in 
Python, I am opposed to the changes we are currently making to support it.


Which changes?
There are several efforts in this general space. Personally, I also 
don't agree with them all. And I think the reason I wasn't able to 
formulate too many replies to you is that we don't have a common 
understanding of what is being discussed, and of the motivations behind 
the changes.




You seem to be trying to convince everyone that multiple processes are better (at 
isolation, and at performance) than multiple interpreters in one 
process. And I see the point: if you can live with the restriction of 
multiple processes, they probably are a better choice!
But I don't think PEPs 554, 489, 573, etc. are about choosing between 
multiprocessing and multiple interpreters; they're about making multiple 
interpreters better than they currently are.

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/MTPO3VZRTRR6R23X462QXJ2X74E5YX22/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Should we be making so many changes in pursuit of PEP 554?

2020-06-17 Thread Petr Viktorin

On 2020-06-16 19:20, Guido van Rossum wrote:
Has anybody brought up the problem yet that if one subinterpreter 
encounters a hard crash (say, it segfaults due to a bug in a C extension 
module), all subinterpreters active at that moment in the same process 
are likely to lose all their outstanding work, without a chance of recovery?


(Of course once we have locks in shared memory, a crashed process 
leaving a lock behind may also screw up everybody else, though perhaps 
less severely.)


Not really.
Asyncio has the same problem; has anyone brought this issue up there? 
(Granted, asyncio probably didn't uncover too many issues in extension 
modules, but if it did, I assume they would get fixed.)


If you're worried about segfaults, then you should use multiple 
processes. That will always give you better isolation. But I don't think 
it's a reason to stop improving interpreter isolation.

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/AGMIWNEGH2PJN473EGGB7J4LY4RYFQA5/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Improving inspect capabilities for classes

2020-06-17 Thread Petr Viktorin

On 2020-06-17 09:02, Serhiy Storchaka wrote:

16.06.20 21:02, Guido van Rossum пише:
It would certainly be much easier to get through the review process. 
Adding a `__filename__` (why not `__file__`?) attribute to classes is 
a major surgery, presumably requiring a PEP, and debating the pros and 
cons and performance implications and fixing a bunch of tests that 
don't expect this attribute, and so on. Adding an imperfect solution 
to inspect.getsource() would only require the cooperation of whoever 
maintains the inspect module.


If we add the file name of the sources as a class attribute, we also need
to add the line number (or the range of line numbers) of the class
definition. Otherwise inspect.getsource() will still be ambiguous.


That, or even the entire __code__ of the internal function that sets up 
the class. That has all the needed information.


I did a small experiment with this, and indeed it breaks tests that 
either don't expect the attribute or expect that anything with __code__ is
a function:
https://github.com/python/cpython/commit/3fddc0906f2e7b92ea0f7ff040560a10372f91ec


You can actually do this in pure Python, just to see what breaks. See 
the attachment.



Also,
specifying the file name does not help in the case of the REPL or compiling a
string, so maybe you need to attach the full source text to a class?


You get the same problem with functions, already. But Jupyter Notebook 
apparently works around this issue.
import builtins

orig_build_class = builtins.__build_class__

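# Wrap the class-creation hook so that every class created from now on
# keeps a reference to the code object of its class body; inspect can
# then locate the source via co_filename and co_firstlineno.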
def build_class(func, *args, **kwds):
    result = orig_build_class(func, *args, **kwds)
    result.__code__ = func.__code__
    return result

builtins.__build_class__ = build_class


##
# Demo:

import inspect

class C():
    def hello(self):
        print('Hello world!')

print(inspect.getsource(C))

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/Q3EZXN6KLA6WPWEN5HV3T5NGC5PMFB62/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Removal of _Py_ForgetReference from public header in 3.9 issue

2020-06-15 Thread Petr Viktorin

On 2020-06-14 22:10, cpyt...@nicwatson.org wrote:

Please excuse if this is the wrong mailing list. I couldn't find one for module 
maintainers.


This is relevant to capi-...@python.org; let's continue here.


I maintain an open source Python module in C. I'm trying to verify for the first time 
that the module still works with cpython 3.9. This module does *not* use the 
"limited" C API.

In building my module against 3.9b3, I'm getting a missing declaration warning 
on _Py_ForgetReference. My module builds and passes tests fine; this is just a 
compiler warning issue.


What does the _Py_ForgetReference function do? The [documentation] says 
it's only for use in the interpreter core, so I'd assume it's .


[documentation]: https://docs.python.org/3/c-api/refcounting.html



The change that caused this was made in:

 commit f58bd7c1693fe041f7296a5778d0a11287895648
 Author: Victor Stinner 
 Date:   Wed Feb 5 13:12:19 2020 +0100

 bpo-39542: Make PyObject_INIT() opaque in limited C API (GH-18363)
 ...

I definitely need the _Py_ForgetReference call for a particularly hairy error 
condition (https://github.com/jnwatson/py-lmdb/blob/master/lmdb/cpython.c#L888 
if you're curious). In fact, my tests will segfault if trace refs is enabled
and I don't have that call.


I can't follow the reasoning behind the code easily. Why do you use 
_Py_ForgetReference and PyObject_Del, instead of Py_DECREF(self)?



Should I put an #ifdef Py_TRACE_REFS around the call? Ignore it? What do you 
think is the proper resolution?

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/4EOCN7P4HI56GQ74FY3TMIKDBIPGKL2G/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: My take on multiple interpreters (Was: Should we be making so many changes in pursuit of PEP 554?)

2020-06-10 Thread Petr Viktorin




On 2020-06-10 04:43, Inada Naoki wrote:

On Tue, Jun 9, 2020 at 10:28 PM Petr Viktorin  wrote:


Relatively recently, there is an effort to expose interpreter creation &
finalization from Python code, and also to allow communication between
them (starting with something rudimentary, sharing buffers). There is
also a push to explore making the GIL per-interpreter, which ties in to
moving away from process-global state. Both are interesting ideas, but
(like banishing global state) not the whole motivation for
changes/additions.



Some changes for a per-interpreter GIL don't help subinterpreters much.
For example, isolating the memory allocator, including free lists and
constants, between subinterpreters makes each subinterpreter fatter.
I assume Mark is talking about such changes.

Now Victor is proposing to move the dict free list into the per-interpreter
state, and the code looks good to me.  This is a change for the
per-interpreter GIL, but not for subinterpreters.
https://github.com/python/cpython/pull/20645

Should we commit this change to the master branch?
Or should we create another branch for such changes?
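
For illustration, the general shape of such a change, as a sketch (the
struct and field names here are hypothetical, not the actual patch):

    /* Before: process-global state, implicitly shared by all interpreters. */
    static PyDictObject *free_list[80];
    static int numfree = 0;

    /* After: one copy per interpreter, embedded in its state. */
    struct _dict_freelist {
        PyDictObject *items[80];
        int numfree;
    };

    /* Call sites then look up the current interpreter first, e.g.
       (dict_freelist being a hypothetical field):
           PyInterpreterState *interp = PyInterpreterState_Get();
           struct _dict_freelist *fl = &interp->dict_freelist;
    */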


I think that most of all, the changes aimed at breaking up the GIL need 
a PEP, so that everyone knows what the changes are actually about -- and 
especially so that everyone knows the changes are happening.


Note that neither PEP 554 (which itself isn't accepted yet) nor PEP 573 
is related to breaking up the GIL.

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/I4DURF74SJZ3PEILBWDVR2XHOZQKRZRH/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] My take on multiple interpreters (Was: Should we be making so many changes in pursuit of PEP 554?)

2020-06-09 Thread Petr Viktorin

On 2020-06-05 16:32, Mark Shannon wrote:

Hi,

There have been a lot of changes both to the C API and to internal 
implementations to allow multiple interpreters in a single O/S process.


These changes cause backwards compatibility changes, have a negative 
performance impact, and cause a lot of churn.


While I'm in favour of PEP 554, or some similar model for parallelism in 
Python, I am opposed to the changes we are currently making to support it.



What are sub-interpreters?
--------------------------

A sub-interpreter is a logically independent Python process which 
supports inter-interpreter communication built on shared memory and 
channels. Passing of Python objects is supported, but only by copying, 
not by reference. Data can be shared via buffers.


Here's my biased take on the subject:

Interpreters are contexts in which Python runs. They contain 
configuration (e.g. the import path) and runtime state (e.g. the set of 
imported modules). An interpreter is created at Python startup 
(Py_InitializeEx), and you can create/destroy additional ones with 
Py_NewInterpreter/Py_EndInterpreter.

This is long-standing API that is used, most notably by mod_wsgi.

Many extension modules and some stdlib modules don't play well with the 
existence of multiple interpreters in a process, mainly because they use 
process-global state (C static variables) rather than some more granular 
scope.
This tends to result in nasty bugs (C-level crashes) when multiple 
interpreters are started in parallel (Py_NewInterpreter) or in sequence 
(several Py_InitializeEx/Py_FinalizeEx cycles). The bugs are similar in 
both cases.


Whether Python interpreters run sequentially or in parallel, having them 
work will enable a use case I would like to see: allowing me to call 
Python code from wherever I want, without thinking about global state. 
Think calling Python from a utility library that doesn't care about the 
rest of the application it's used in. I personally call this "the Lua 
use case", because light-weight, worry-free embedding is an area where 
Python loses to Lua. (And JS as well—that's a relatively recent 
development, but much more worrying.)


The part I have been involved in is moving away from process-global 
state. Process-global state can be made to work, but it is much safer to 
always default to module-local state (roughly what Python-language's 
`global` means), and treat process-global state as exceptions one has to 
think through. The API introduced in PEPs 384, 489, 573 (and future 
planned ones) aims to make module-local state possible to use, then 
later easy to use, and the natural default.


Relatively recently, there is an effort to expose interpreter creation & 
finalization from Python code, and also to allow communication between 
them (starting with something rudimentary, sharing buffers). There is 
also a push to explore making the GIL per-interpreter, which ties in to 
moving away from process-global state. Both are interesting ideas, but 
(like banishing global state) not the whole motivation for 
changes/additions. It's probably possible to do similar things with 
threads or subprocesses, sure, but if these efforts went away, the other 
issues would remain.


I am not too fond of the term "sub-interpreters", because it implies 
some kind of hierarchy. Of course, if interpreter creation is exposed to 
Python, you need some kind of "parent" to start the "child" and get its 
result when done. Also, due to some practical issues you might (sadly, 
currently) need some notion of "the main interpreter". But ideally, we 
can make interpreters entirely independent to allow the "Lua use case".
In the end-game of these efforts, I see Py_NewInterpreter transparently 
calling Py_InitializeEx if global state isn't set up yet, and similarly, 
Py_EndInterpreter turning the lights off if it's the last one out.
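
For context, a minimal sketch of how these calls combine today (error
handling omitted; not a recommendation for any particular embedding):

    #include <Python.h>

    int
    main(void)
    {
        Py_Initialize();                  /* global state + main interpreter */
        PyThreadState *main_ts = PyThreadState_Get();

        PyThreadState *sub = Py_NewInterpreter();  /* new interpreter, now current */
        PyRun_SimpleString("print('hello from a second interpreter')");
        Py_EndInterpreter(sub);           /* its tstate must be current here */

        PyThreadState_Swap(main_ts);      /* back to the main interpreter */
        Py_FinalizeEx();
        return 0;
    }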

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/NLITVUIZQSUJ2F6XDTPMD7IP7FGTMNBA/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: [RELEASE] Python 3.9.0a6 is now available for testing

2020-04-29 Thread Petr Viktorin

On 2020-04-29 16:34, ro...@reportlab.com wrote:

While testing 3.9a6 in the reportlab package I see this difference from 3.8.2; 
I built from source using the standard configure make dance. Is this a real 
change?


Hi,
This is a known issue, currently discussed in 
https://bugs.python.org/issue40246


Thanks for reporting it, though!


robin@minikat:~/devel/reportlab/REPOS/reportlab/tests
$ python
Python 3.8.2 (default, Apr  8 2020, 14:31:25)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> norm=lambda m: m+(m and(m[-1]!='\n'and'\n'or'')or'\n')


robin@minikat:~/devel/reportlab/REPOS/reportlab/tests
$ python39
Python 3.9.0a6 (default, Apr 29 2020, 07:46:29)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> norm=lambda m: m+(m and(m[-1]!='\n'and'\n'or'')or'\n')

   File "", line 1
 norm=lambda m: m+(m and(m[-1]!='\n'and'\n'or'')or'\n')
  ^
SyntaxError: invalid string prefix



robin@minikat:~/devel/reportlab/REPOS/reportlab/tests
$ python39 -X oldparser
Python 3.9.0a6 (default, Apr 29 2020, 07:46:29)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> norm=lambda m: m+(m and(m[-1]!='\n'and'\n'or'')or'\n')

   File "", line 1
 norm=lambda m: m+(m and(m[-1]!='\n'and'\n'or'')or'\n')
^
SyntaxError: invalid string prefix



robin@minikat:~/devel/reportlab/REPOS/reportlab/tests
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/PCQD2REYQ7GT6GVY2FLYEASVKRS756HO/
Code of Conduct: http://python.org/psf/codeofconduct/


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/QLQXGGOHFQXISPXZONYBLWN4VPCJN3BA/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Adding a "call_once" decorator to functools

2020-04-28 Thread Petr Viktorin

On 2020-04-28 00:26, Steve Dower wrote:

On 27Apr2020 2311, Tom Forbes wrote:
Why not? It's a decorator, isn't it? Just make it check for number of 
arguments at decoration time and return a different object.


It's not that it's impossible, but I don't think the current 
implementation makes it easy.


This is the line I'd change: 
https://github.com/python/cpython/blob/cecf049673da6a24435acd1a6a3b34472b323c97/Lib/functools.py#L763 



At this point, you could inspect the user_function object and choose a 
different wrapper than _lru_cache_wrapper if it takes zero arguments. 
Though you'd likely still end up with a lot of the code being replicated.


Making a stdlib function completely change behavior based on a function 
signature feels a bit too magic to me.
I know lots of libraries do this, but I always thought of it as a cool 
little hack, good for debugging and APIs that lean toward being simple 
to use rather than robust. The explicit `call_once` feels more like API 
that needs to be supported for decades.



You're probably right to go for the C implementation. If the Python 
implementation is correct, then best to leave the inefficiencies there 
and improve the already-fast version.


Looking at 
https://github.com/python/cpython/blob/master/Modules/_functoolsmodule.c 
it seems the fast path for no arguments could be slightly improved, but 
it doesn't look like it'd be much. (I'm deliberately not saying how I'd 
improve it in case you want to do it anyway as a learning exercise, and 
because I could be wrong :) )


Equally hard to say how much more efficient a new API would be, so 
unless it's written already and you have benchmarks, that's probably not 
the line of reasoning to use. An argument that people regularly get this 
wrong and can't easily get it right with what's already there is most 
compelling - see the recent removeprefix/removesuffix discussions if you 
haven't.


Cheers,
Steve

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/V3R7DZDPCO4WZPRMZXZAGNA5VXU7OKF5/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 554 for 3.9 or 3.10?

2020-04-24 Thread Petr Viktorin

On 2020-04-22 08:05, Glenn Linderman wrote:

On 4/21/2020 10:26 PM, Greg Ewing wrote:

And if I understand correctly, you won't get any nice "This
module does not support subinterpreters" exception if you
import an incompatible module -- just an obscure crash,
probably of the core-dumping variety. 


This sounds fixable: modules that support subinterpreters should set a 
flag saying so, and then either the load of a non-flagged module while 
subinterpreters are in use, or the creation of a subinterpreter while a 
non-flagged module is loaded, should raise.



There was talk about making an orthogonal flag just for opting into 
subinterpreter support. But what you use now as such a flag is 
multi-phase initialization: 
https://docs.python.org/3/c-api/module.html?highlight=pymodule_fromdefandspec#multi-phase-initialization


(Though I like the PEP 489 wording, "expected to support subinterpreters 
and multiple Py_Initialize/Py_Finalize cycles correctly", better than 
what ended up in the docs.)



The simplest strategy to support subinterpreters correctly is to refuse 
to create the extension module more than once per process (with an 
appropriate error, of course).
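
For illustration, a sketch tying the two points together: a module opting
into multi-phase (PEP 489) initialization, with an exec slot that refuses
to run more than once per process (the module name and guard variable are
made up):

    #include <Python.h>

    static int spam_loaded = 0;   /* deliberately process-global */

    static int
    spam_exec(PyObject *module)
    {
        if (spam_loaded) {
            PyErr_SetString(PyExc_ImportError,
                            "spam can only be loaded once per process");
            return -1;
        }
        spam_loaded = 1;
        return 0;
    }

    static PyModuleDef_Slot spam_slots[] = {
        {Py_mod_exec, spam_exec},
        {0, NULL},
    };

    static struct PyModuleDef spam_def = {
        PyModuleDef_HEAD_INIT,
        .m_name = "spam",
        .m_size = 0,
        .m_slots = spam_slots,
    };

    PyMODINIT_FUNC
    PyInit_spam(void)
    {
        /* Multi-phase init: return the definition, not a module object. */
        return PyModuleDef_Init(&spam_def);
    }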

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/5J5XTTDUKZWEVDSU3V67W3FYOY7VA7UO/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP 554 for 3.9 or 3.10?

2020-04-21 Thread Petr Viktorin

On 2020-04-21 11:01, Antoine Pitrou wrote:

On Mon, 20 Apr 2020 19:21:21 -0600
Eric Snow  wrote:

Honest question: how many C extensions have process-global state that
will cause problems under subinterpreters?  In other words, how many
already break in mod_wsgi?


A slightly tricky question is what happens if a PyObject survives
longer than the subinterpreter that created it.

For example, in PyArrow, we allow passing a Python buffer-like object
as a C++ buffer to C++ APIs.  The C++ buffer could conceivably be kept
around by C++ code for some time.  When the C++ buffer is destroyed,
Py_DECREF() is called on the Python object (I figure that we would have
to switch to the future interpreter-aware PyGILState API -- when will
it be available?). 




But what happens if the interpreter which created
the Python object is already destroyed at this point?


That's a good question. What happens today if someone calls Py_Finalize 
while the buffer is still around?


I don't think you need to treat *sub*interpreters specially.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/AVJO43PRTX4UCMLI2PNHMZWWBEGZS353/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Moving threadstate to thread-local storage.

2020-04-02 Thread Petr Viktorin

On 2020-03-24 16:31, Mark Shannon wrote:

Hi,

As an experiment, I thought I would try moving the thread state (what 
you get from _PyThreadState_GET() ) to TLS.


https://github.com/python/cpython/compare/master...markshannon:threadstate_in_tls 



It works, passing all the tests, and seems sound.

It is a small patch (< 50 lines) and doesn't increase the overall code 
size.


My branch is GCC/Clang only, so will need a bit of extra code for 
Windows. It should only need a few more lines; I haven't done it as I 
don't have a Windows machine to test it on.


What about other compilers?

AFAIK, __thread is a non-standard spelling of a C11 feature 
(_Thread_local). Is there a good way to do this in C99?
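
Strictly speaking, C99 has no standard spelling for thread-local storage,
so some compiler-specific shim seems unavoidable. A sketch (the macro name
is made up; _Thread_local is the C11 spelling, __thread the GCC/Clang
extension, __declspec(thread) the MSVC one):

    #include <Python.h>

    #if defined(_MSC_VER)
    #  define PY_THREAD_LOCAL __declspec(thread)
    #elif defined(__GNUC__) || defined(__clang__)
    #  define PY_THREAD_LOCAL __thread
    #else
    #  define PY_THREAD_LOCAL _Thread_local   /* C11 and later */
    #endif

    /* The thread state would then be one pointer per OS thread: */
    static PY_THREAD_LOCAL PyThreadState *current_tstate = NULL;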

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ER5FMM2ZDSK2CT7OMDLFPIGIDKG4YHPJ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Accepting PEP 573 (Module State Access from C Extension Methods)

2020-03-25 Thread Petr Viktorin

On 2020-03-23 17:43, Stefan Behnel wrote:

As (first-time) BDFL delegate, I accept PEP 573 for Python 3.9,
"Module State Access from C Extension Methods"

https://www.python.org/dev/peps/pep-0573/

Petr, Nick, Eric and Marcel, thank you for your work and intensive
discussions on this PEP, and also to everyone else who got involved on
mailing lists, sprints and conferences.

It was a long process with several iterations, much thinking, rethinking
and cutting down along the way, Python 3.7 *and* 3.8 being missed, but 3.9
now finally being hit. Together with several other improvements to the
C-API in the upcoming release, this will help make extension modules less
"different" and easier to adapt for subinterpreters.


That's great news! Thank you!
I'll schedule some time in the coming weeks to get the implementation 
ready for review.

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/PLNR6FZPDWY6AXLZNLVL445MSHTJOBPW/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Last call for comments on PEP 573 (Module State Access from C Extension Methods)

2020-03-11 Thread Petr Viktorin

On 2020-03-10 19:21, Stefan Behnel wrote:

Hi Petr!

Petr Viktorin schrieb am 14.01.20 um 14:37:

It also includes a more drastic change: it removes the MRO walker from the
proposal.
Reflecting on the feedback, it became clear to me that a MRO walker, as it
was described, won't give correct results in all cases: specifically, if a
slot is overridden by setting a special method from Python code, the walker
won't be able to find the module. Think something like:
     c_add = Class.__add__  # where nb_add uses the MRO walker
     Class.__add__ = lambda *args: "evil"
     c_add(Class(), 0)  # Exception: Defining type has not been found.

This can be solved, but it will need a different design and more
discussion. I'd rather defer it to the future.
Meanwhile, extension authors can use their own MRO walker if they're OK
with some surprising edge cases.


I read the last update. I can't say I'm happy about the removal since I was
seeing the MRO walker function as a way to hide internals so that extension
authors can start using it and CPython can adapt the internals later. But I
do see that there are issues with it, and I accept your choice to keep the
PEP even more minimal than it already was.

Are there any more points to discuss? 


Not to my knowledge.


If not, I would soon like to accept
the PEP, so that we can focus more on the implementation and usage.


Thank you!
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/5UKDYM3JMZ4ZRDNTXRZSYGUVXLHAVBM5/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Unpacking native bools in the struct module: Is Python relying on undefined behavior?

2020-02-27 Thread Petr Viktorin

On 2020-02-27 17:14, Serge Guelton wrote:

On Thu, Feb 27, 2020 at 10:51:39AM -0500, Charalampos Stratakis wrote:

Hello folks,

I recently observed a failure on the s390x fedora rawhide buildbot, on the 
clang builds, when clang got updated to version 10:
 https://bugs.python.org/issue39689

The call:
 struct.unpack('>?', b'\xf0')
means to unpack a "native bool", i.e. native size and alignment. Internally, 
this does:

 static PyObject *
 nu_bool(const char *p, const formatdef *f)
 {
     _Bool x;
     memcpy((char *)&x, p, sizeof x);
     return PyBool_FromLong(x != 0);
 }

i.e., copies "sizeof x" (1 byte) of memory to a temporary buffer x, and then 
treats that as _Bool.

While I don't have access to the C standard, I believe it says that assigning a true 
value to _Bool coerces it to a unique "true" representation. It seems that if a char 
doesn't hold the exact bit pattern for true or false, reading it as _Bool is undefined 
behavior. Is that correct?

Clang 10 on s390x seems to take advantage of this: it probably only looks at 
the last bit(s) so a _Bool with a bit pattern of 0xf0 turns out false.
But the tests assume that 0xf0 should unpack to True.


I don't think it's specific to Clang 9, or to the s390x arch. Have a look at

 https://godbolt.org/z/3n-LqN

clang indeed just checks the lowest bit. Is it correct? I think so. _Bool
can only hold two values, 0 and 1, [0] which is different from an int, whose
truth value depends on whether it is different from or equal to 0. GCC and
Clang agree on that:

 https://godbolt.org/z/koc4Pb

So yeah, according to that rule, the value stored at `p` didn't come from a
_Bool if it has the value 0xf0. So you're re-interpreting memory between two
different types (type punning), and that's UB.

Quick and obvious fix:

  static PyObject *
  nu_bool(const char *p, const formatdef *f)
  {
      char x;
      memcpy((char *)&x, p, sizeof x);
      return PyBool_FromLong(x != 0);
  }


(This assumes size of _Bool is the same as size of char, which I guess 
is also UB? But I guess we can add a build-time assertion for that, and 
say we don't support platforms where that's not the case.)
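
Such a build-time assertion could look like this (C11's _Static_assert; on
C99 compilers the usual negative-array-size trick serves the same purpose):

    /* Fail the build if the size assumption behind the fix is ever wrong: */
    _Static_assert(sizeof(_Bool) == sizeof(char),
                   "_Bool and char must have the same size");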



So thanks! I'm left with a question for CPython's struct experts, which 
is better kept to the bug tracker: 
https://bugs.python.org/issue39689#msg362815

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/364VZPYLOTVTXD6SXH4T4E36K25WM4B2/
Code of Conduct: http://python.org/psf/codeofconduct/

