New submission from Daniel <danie...@fixedit.ai>:

The documentation 
(https://docs.python.org/3/c-api/init.html#c.Py_NewInterpreter) states:


For modules using single-phase initialization, e.g. PyModule_Create(), the 
first time a particular extension is imported, it is initialized normally, and 
a (shallow) copy of its module’s dictionary is squirreled away. When the same 
extension is imported by another (sub-)interpreter, a new module is initialized 
and filled with the contents of this copy; the extension’s init function is not 
called. Objects in the module’s dictionary thus end up shared across 
(sub-)interpreters, which might cause unwanted behavior (see Bugs and caveats 
below).
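
To make the quoted wording concrete, here is a rough pure-Python model of what a shallow copy of a module dictionary implies. It is only an illustration, not CPython's implementation; the names module_dict, saved and other, and the extra "counter" entry, are hypothetical.

```python
# Hypothetical model of the documented behaviour, not CPython code.
module_dict = {"test": {}, "counter": 0}  # state after the init function ran
saved = dict(module_dict)                 # shallow copy kept for later imports

# Dictionary handed to another (sub-)interpreter on import.
other = dict(saved)

other["test"]["key"] = "value"  # mutates the inner dict, which is shared
other["counter"] = 1            # rebinds a name in `other` only

print(module_dict["test"])     # {'key': 'value'}
print(module_dict["counter"])  # 0
```

Mutating an object reachable from the copy is visible everywhere, while rebinding a top-level name is not.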


This, however, seems to have changed (sometime between 3.6.9 and 3.10.0). 
Consider the following code:


#include <Python.h>

/*
 * Create a module "my_spam" that uses single-phase initialization
 */
static PyModuleDef EmbModule = {
      PyModuleDef_HEAD_INIT, "my_spam", NULL, -1,
      NULL,
      NULL, NULL, NULL, NULL
};

/*
 * According to the docs, this function is called only once when dealing with
 * subinterpreters; on later imports a shallow copy of the initial module
 * dictionary is used instead. This does not seem to be the case in
 * Python 3.10.0.
 */
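static PyObject* PyInit_emb(void) {
  PyObject *module = PyModule_Create(&EmbModule);
  if (module == NULL) {
    return NULL;
  }
  // Add a mutable global so that sharing between interpreters is observable
  PyModule_AddObject(module, "test", PyDict_New());

  printf("Init my_spam module %p\n", (void *)module);
  return module;
}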

/*
 * Main program
 */

int main(int argc, char *argv[]) {
  PyImport_AppendInittab("my_spam", &PyInit_emb);
  Py_Initialize();

  // Save the main state
  PyThreadState *mainstate = PyThreadState_Get();

  // Create two new interpreters
  PyThreadState *inter1 = Py_NewInterpreter();
  PyThreadState *inter2 = Py_NewInterpreter();

  // Import the my_spam module into the first subinterpreter
  // and change the global variable of it
  PyThreadState_Swap(inter1);
  PyRun_SimpleString("import sys; print(sys.version_info)");
  PyRun_SimpleString("import my_spam; print('my_spam.test: ', my_spam.test)");
  PyRun_SimpleString("my_spam.test[1]=1; print('my_spam.test: ', my_spam.test)");

  // Import the my_spam module into the second subinterpreter
  // and change the global variable of it
  PyThreadState_Swap(inter2);
  PyRun_SimpleString("import sys; print(sys.version_info)");
  PyRun_SimpleString("import my_spam; print('my_spam.test: ', my_spam.test)");
  PyRun_SimpleString("my_spam.test[2]=2; print('my_spam.test: ', my_spam.test)");

  // Close the subinterpreters
  Py_EndInterpreter(inter2);
  PyThreadState_Swap(inter1);
  Py_EndInterpreter(inter1);

  // Swap back to the main state and finalize python
  PyThreadState_Swap(mainstate);
  if (Py_FinalizeEx() < 0) {
      exit(120);
  }

  return 0;
}



Compiled against Python 3.6.9, this behaves according to the documentation:


$ gcc test_subinterpreters.c \
    -I/home/daniel/.pyenv/versions/3.6.9/include/python3.6m \
    -L/home/daniel/.pyenv/versions/3.6.9/lib -lpython3.6m && \
  LD_LIBRARY_PATH=/home/daniel/.pyenv/versions/3.6.9/lib ./a.out
sys.version_info(major=3, minor=6, micro=9, releaselevel='final', serial=0)
Init my_spam module 0x7ff7a63d1ef8
my_spam.test:  {}
my_spam.test:  {1: 1}
sys.version_info(major=3, minor=6, micro=9, releaselevel='final', serial=0)
my_spam.test:  {1: 1}
my_spam.test:  {1: 1, 2: 2}


But compiled with 3.10.0 the module is reinitialized and thus objects in the 
module are not shared between the subinterpreters:


$ gcc test_subinterpreters.c \
    -I/home/daniel/.pyenv/versions/3.10.0/include/python3.10 \
    -L/home/daniel/.pyenv/versions/3.10.0/lib -lpython3.10 && \
  LD_LIBRARY_PATH=/home/daniel/.pyenv/versions/3.10.0/lib ./a.out
sys.version_info(major=3, minor=10, micro=0, releaselevel='final', serial=0)
Init my_spam module 0x7f338a9a9530
my_spam.test:  {}
my_spam.test:  {1: 1}
sys.version_info(major=3, minor=10, micro=0, releaselevel='final', serial=0)
Init my_spam module 0x7f338a9a96c0
my_spam.test:  {}
my_spam.test:  {2: 2}
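
The difference between the two transcripts above can be sketched in pure Python. This is only a model under stated assumptions, not how CPython implements imports: run(), init() and the reinit_per_interpreter flag are hypothetical names, and the loop stands in for the two subinterpreters.

```python
def run(reinit_per_interpreter):
    """Model two subinterpreters importing my_spam (hypothetical sketch)."""
    init_calls = 0
    saved = None   # stands in for the saved shallow copy of the module dict
    states = []    # snapshots of my_spam.test, in the order they are printed

    def init():
        nonlocal init_calls
        init_calls += 1
        return {"test": {}}

    for key in (1, 2):  # one iteration per subinterpreter
        if saved is None or reinit_per_interpreter:
            module_dict = init()
            saved = dict(module_dict)  # shallow copy
        else:
            module_dict = dict(saved)  # init is not called again
        states.append(dict(module_dict["test"]))  # right after the import
        module_dict["test"][key] = key
        states.append(dict(module_dict["test"]))  # after the assignment
    return init_calls, states


print(run(reinit_per_interpreter=False))  # (1, [{}, {1: 1}, {1: 1}, {1: 1, 2: 2}])
print(run(reinit_per_interpreter=True))   # (2, [{}, {1: 1}, {}, {2: 2}])
```

The first call matches the 3.6.9 output (one init call, shared dict), the second matches the 3.10.0 output (one init call per subinterpreter, separate dicts).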


To me the new behavior seems nicer, but at the very least the documentation 
should be updated. It also seems like this could break integrations, although 
it is an unlikely "feature" to rely on.

----------
components: C API
messages: 408209
nosy: daniel-falk
priority: normal
severity: normal
status: open
title: Single-phase initialized modules gets initialized multiple times in 
3.10.0
type: behavior
versions: Python 3.10

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue46036>
_______________________________________