[issue32973] Importing the same extension module under multiple names breaks non-reinitialisable extension modules

2018-08-25 Thread Stefan Behnel


Stefan Behnel  added the comment:

FYI, I've updated Cython's module import checks to include an interpreter 
check. This (multi-file) test shows the new behaviour, which is to raise an 
ImportError on module creation when it detects a different interpreter than 
during the initial import:

https://github.com/cython/cython/blob/master/tests/run/reimport_from_subinterpreter.srctree

The checks are implemented here (and called a bit further down in the module 
create function):
https://github.com/cython/cython/blob/4ce754271ff4cfbd8df2b278e812154fb1b02319/Cython/Utility/ModuleSetupCode.c#L909-L932

I also added a test that should match the problem discussed here. It suggests 
that what I described is a viable solution (or work-around):

https://github.com/cython/cython/blob/master/tests/run/reimport_from_package.srctree

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com




2018-08-25 Thread Stefan Behnel

Stefan Behnel  added the comment:

Well, first of all, it's better than a crash. :)

Secondly, I'm sure NumPy doesn't currently support subinterpreters, just like 
most other extension modules. If I'm not mistaken, an interpreter switch can be 
detected through the interpreter state pointer [1] in the thread state, and 
extension modules that lack subinterpreter support can consider a change an 
error for them, since then something is trying to re-import the module into a 
different interpreter. That's not entirely safe since addresses can be reused, 
which I guess was the reason for adding an ID [2] in Py3.7, but that's only 
available in Py3.7+, not in Py3.5. So, the interpreter address is probably as 
good as it gets for Py<3.7.

[1] https://docs.python.org/3/c-api/init.html#c.PyThreadState
[2] https://docs.python.org/3/c-api/init.html#c.PyInterpreterState_GetID
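
The check described above can be modelled in a few lines. This is a pure-Python sketch of the idea, not Cython's actual C code: `InterpreterGuard` and the integer tokens are hypothetical stand-ins for the interpreter state pointer (or, on Py3.7+, the interpreter ID) that a C implementation would compare.

```python
class InterpreterGuard:
    """Remembers which interpreter first imported the module (hypothetical helper)."""

    def __init__(self):
        self._first_interpreter = None

    def check(self, current_interpreter):
        # First import: record the interpreter and allow it.
        if self._first_interpreter is None:
            self._first_interpreter = current_interpreter
            return
        # Later imports: only the same interpreter is accepted.
        if current_interpreter != self._first_interpreter:
            raise ImportError(
                "module was already imported into a different interpreter"
            )


guard = InterpreterGuard()
guard.check(1)  # initial import in "interpreter 1"
guard.check(1)  # re-import from the same interpreter is fine
```

As the message notes, comparing addresses is not entirely safe because an address can be reused after an interpreter is destroyed. The model has the same weakness if tokens are recycled.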

Note: I'm not trying to keep anyone from implementing subinterpreter support 
here – just showing a way to keep things working and improving gradually as 
long as there is no full support for PEP 489, extension module reloading and 
subinterpreters, so that users don't have to go all the way in one step.

--


2018-08-25 Thread Petr Viktorin


Petr Viktorin  added the comment:

That's quite problematic, since then you're sharing a mutable object across 
interpreters.
The user can store any attribute on module objects, including e.g. Python 
functions that reference their original interpreter's global state, but become 
callable in other interpreters.
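
A rough illustration of that concern, using two plain dicts as stand-ins for the global namespaces of two interpreters (an assumption for the sake of the sketch; real subinterpreters are not involved here). A function stored on the shared module keeps resolving names in the namespace where it was defined:

```python
import types

# One module object shared by both "interpreters" (modelled as namespaces).
shared = types.ModuleType("shared_ext")

# "Interpreter A" defines a function and stores it on the shared module.
interp_a_globals = {"owner": "interpreter A"}
exec("def whoami():\n    return owner\n", interp_a_globals)
shared.hook = interp_a_globals["whoami"]

# "Interpreter B" finds the function on the module and calls it.
interp_b_globals = {"owner": "interpreter B", "shared": shared}
result = eval("shared.hook()", interp_b_globals)

# The call ran against A's globals, not B's.
print(result)
```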

--


2018-08-25 Thread Stefan Behnel


Stefan Behnel  added the comment:

I think the best work-around for now is to implement a bit of PEP 489, 
including a module create function that always returns the same static module 
reference instead of creating a new one after the first call, and a module exec 
function that simply returns if the module has already been initialised. 
Keeping a global pointer to the module instance around should work. This is 
what the current Cython master branch does (also to make use of some of the 
nice features that PEP 489 brings).
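
The same create/exec split exists for Python-level loaders, so the pattern can be sketched with `importlib`. This is only an analogy to the C-level work-around, not Cython's implementation; `SingletonLoader`, `load` and the module names are hypothetical.

```python
import importlib.abc
import importlib.util
import types


class SingletonLoader(importlib.abc.Loader):
    """Always hands out the same module object and initialises it once."""

    def __init__(self):
        self._module = None        # plays the role of the global module pointer
        self._initialised = False
        self.init_runs = 0         # counts how often real initialisation ran

    def create_module(self, spec):
        # Return the existing module instead of creating a new one.
        if self._module is None:
            self._module = types.ModuleType(spec.name)
        return self._module

    def exec_module(self, module):
        # Simply return if the module has already been initialised.
        if self._initialised:
            return
        module.state = {"counter": 0}
        self._initialised = True
        self.init_runs += 1


def load(name, loader):
    """Create and execute a module under the given name (no sys.modules entry)."""
    spec = importlib.util.spec_from_loader(name, loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)
    return module


loader = SingletonLoader()
first = load("pkg.ext", loader)
second = load("ext", loader)  # same extension under a second name
```

Both names end up bound to one module object, and the initialisation body runs only once.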

--


2018-03-04 Thread Thomas Wouters

Thomas Wouters  added the comment:

Re: Petr: we can't expect extension module authors to retroactively fix 
released modules. We can't even expect everyone to fix this for future 
releases; moving away from globals (which may not be specific to the Python 
extension module) may be a lot of effort. Just look at how much work it's 
taking to move CPython itself to stop using globals in so many places. And 
while it may be necessary for sub-interpreters (which is only the case for 
global state that's Python objects), many people just don't care about 
sub-interpreters.

Re: Stefan: the init function pointer isn't known until much later in the 
current process, and is calculated from the module name. There is not currently a 
way to import a module with a different init function pointer. I don't think 
this warrants that much of a rewrite.

--


2018-03-02 Thread Stefan Behnel

Stefan Behnel  added the comment:

> change the extension module cache to key on filename and init function name

... or on the pointer to the PyInit function. If that's the same, we obviously 
have the same extension module. If it differs, even for the same module name, 
then other globals of the modules will probably also be distinct.
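
A small model of that suggestion, with Python callables standing in for PyInit function pointers. `InitKeyedCache` and the two init functions are hypothetical, and this is not how CPython's import machinery is structured.

```python
def init_multiarray():
    # Stand-in for a PyInit function; returns a fresh "module".
    return {"name": "multiarray"}


def init_other():
    return {"name": "other"}


class InitKeyedCache:
    """Caches one module per init function, regardless of the import name."""

    def __init__(self):
        self._by_init = {}
        self.init_calls = 0

    def load(self, init_func):
        if init_func not in self._by_init:
            self.init_calls += 1
            self._by_init[init_func] = init_func()
        return self._by_init[init_func]


cache = InitKeyedCache()
a = cache.load(init_multiarray)  # e.g. imported as numpy.core.multiarray
b = cache.load(init_multiarray)  # e.g. imported as multiarray
c = cache.load(init_other)       # a different init function
```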

--


2018-03-02 Thread Stefan Behnel

Change by Stefan Behnel :


--
components: +Extension Modules
nosy: +scoder


2018-03-02 Thread Petr Viktorin

Petr Viktorin  added the comment:

Well, PEP 489 basically punts this to module authors: generally, C globals are 
bad, but if you do have global state, please manage it, keeping in mind that 
multiple module objects can be created from the extension.

That's required to make everything work with subinterpreters.

See: 
https://www.python.org/dev/peps/pep-0489/#subinterpreters-and-interpreter-reloading

CCing Marcel, who's working on PEP 489-related stuff now.

--
nosy: +Dormouse759


2018-03-02 Thread Eric Snow

Eric Snow  added the comment:

PEP 489 ("Multi-phase extension module initialization") is relevant here, so 
I've nosied Petr.

--
nosy: +encukou


2018-02-28 Thread Gregory P. Smith

Change by Gregory P. Smith :


--
versions: +Python 3.6, Python 3.7, Python 3.8


2018-02-28 Thread Gregory P. Smith

Change by Gregory P. Smith :


--
nosy: +gregory.p.smith


2018-02-28 Thread Thomas Wouters

New submission from Thomas Wouters :

This is a continuation, of sorts, of issue16421; adding most of that issue's 
audience to the nosy list.

When importing the same extension module under multiple names that share the 
same basename, Python 3 will call the extension module's init function multiple 
times. With extension modules that do not support re-initialisation, this 
causes them to trample all over their own state. In the case of numpy, this 
corrupts CPython internal data structures, like builtin types.

Simple reproducer:
% python3.6 -m venv numpy-3.6
% numpy-3.6/bin/python -m pip install numpy
% PYTHONPATH=./numpy-3.6/lib/python3.6/site-packages/numpy/core/ ./numpy-3.6/bin/python -c "import numpy.core.multiarray, multiarray; u'' < 1"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
Segmentation fault

(The corruption happens because PyInit_multiarray initialises subclasses of 
builtin types, which causes them to share some data (e.g. tp_as_number) with 
the base class: 
https://github.com/python/cpython/blob/master/Objects/typeobject.c#L5277. 
Calling it a second time then copies data from a different class into that 
shared data, corrupting the base class: 
https://github.com/python/cpython/blob/master/Objects/typeobject.c#L4950. The 
Py_TPFLAGS_READY flag is supposed to protect against this, but 
PyInit_multiarray resets the tp_flags value. I ran into this because we have 
code that vendors numpy and imports it in two different ways.)

The specific case of numpy is somewhat convoluted and exacerbated by dubious 
design choices in numpy, but it is not hard to show that calling an extension 
module's PyInit function twice (if the module doesn't support reinitialisation 
through PEP 3121) is bad: any C globals initialised in the PyInit function will 
be trampled on.

This was not a problem in Python 2 because the extension module cache worked 
based purely on filename. It was changed in response to issue16421, but the 
intent there appears to be to call *different* PyInit methods in the same 
module. However, because PyInit functions are based off of the *basename* of 
the module, not the full module name, a different module name does not mean a 
different init function name.
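
To make the name collision concrete, here is a minimal sketch of how the init function name follows from the module name. `init_function_name` is a hypothetical helper and only covers ASCII module names.

```python
def init_function_name(fullname):
    """Return the init symbol looked up for an ASCII-named extension module."""
    # Only the last dotted component (the basename) is used.
    basename = fullname.rpartition(".")[2]
    return "PyInit_" + basename


as_submodule = init_function_name("numpy.core.multiarray")
as_toplevel = init_function_name("multiarray")
```

Both names map to `PyInit_multiarray`, so importing under either name calls the same init function.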

I think the right approach is to change the extension module cache to key on 
filename and init function name, although this is a little tricky: the init 
function name is calculated much later in the process. Alternatively, key it on 
filename and module basename, rather than full module name.
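
The difference between the two keying schemes can be sketched with a toy cache. This models only the lookup key, not the real import machinery; `ExtensionCache`, the key functions and the file name are hypothetical.

```python
class ExtensionCache:
    """Toy extension module cache that counts init calls."""

    def __init__(self, key_func):
        self.key_func = key_func
        self.modules = {}
        self.init_calls = 0

    def load(self, filename, fullname):
        key = self.key_func(filename, fullname)
        if key not in self.modules:
            self.init_calls += 1          # the init function would run here
            self.modules[key] = object()  # stand-in for the module object
        return self.modules[key]


def key_on_full_name(filename, fullname):
    # Keyed on filename and full module name.
    return (filename, fullname)


def key_on_basename(filename, fullname):
    # Alternative: keyed on filename and module basename.
    return (filename, fullname.rpartition(".")[2])


SO_FILE = "multiarray.so"  # placeholder file name

by_full_name = ExtensionCache(key_on_full_name)
by_full_name.load(SO_FILE, "numpy.core.multiarray")
by_full_name.load(SO_FILE, "multiarray")

by_basename = ExtensionCache(key_on_basename)
by_basename.load(SO_FILE, "numpy.core.multiarray")
by_basename.load(SO_FILE, "multiarray")
```

With full-name keys the init function runs twice for the same file. With basename keys it runs once.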

--
messages: 313064
nosy: Arfrever, amaury.forgeotdarc, asvetlov, brett.cannon, eric.snow, eudoxos, 
ncoghlan, pitrou, r.david.murray, twouters, vstinner
priority: normal
severity: normal
status: open
title: Importing the same extension module under multiple names breaks 
non-reinitialisable extension modules
type: behavior
