Nick Coghlan <ncogh...@gmail.com> added the comment:

As Petr notes, as long as all subinterpreters share the GIL, and share str 
instances, then the existing `_Py_IDENTIFIER` mechanism will work fine for both 
single-phase and multi-phase initialisation.

However, that constraint also goes the other way: as long as we have modules 
that use the existing `_Py_IDENTIFIER` mechanism, then subinterpreters *must* 
share str instances, and hence *must* share the GIL.

Hence the "enhancement" classification: there's nothing broken right now, but 
if we're ever going to achieve the design goal of using subinterpreters to 
exploit multiple CPU cores without the overhead of running multiple full 
interpreter processes, we're going to need to design a different way of 
handling this.

Something to keep in mind with `_Py_IDENTIFIER` and any replacement API: the 
baseline for performance comparisons is 
https://docs.python.org/3/c-api/unicode.html#c.PyUnicode_InternFromString

The reason multi-phase initialisation makes this more complicated is that it 
means we can't use the memory addresses of C process globals as unique 
identifiers any more, since more than one module object may be created from the 
same C shared library.
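
As a rough illustration of that addressing problem, here is a toy model in plain C. It deliberately does not use the CPython headers, and `ToyModule`, `toy_module_create` and `toy_cache_is_mine` are hypothetical names invented for this sketch. The point it tries to show: a file-scope variable exists once per loaded shared library, so two module objects created from that library end up looking at the same slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a module object; NOT a CPython type. */
typedef struct {
    int interpreter_id;   /* which (hypothetical) interpreter owns it */
} ToyModule;

/* One process-wide slot, analogous to a static identifier cache. */
static const ToyModule *cached_owner = NULL;

/* Each call models one run of a multi-phase "create" step. */
ToyModule *toy_module_create(int interpreter_id)
{
    ToyModule *mod = malloc(sizeof *mod);
    if (mod == NULL) {
        return NULL;
    }
    mod->interpreter_id = interpreter_id;
    /* The first caller claims the shared slot; later callers find it taken. */
    if (cached_owner == NULL) {
        cached_owner = mod;
    }
    return mod;
}

/* Returns 1 if the process-wide slot refers to this module instance. */
int toy_cache_is_mine(const ToyModule *mod)
{
    assert(mod != NULL);
    return cached_owner == mod;
}
```

In this model, creating two modules leaves the slot pointing at the first one only, so the address of the global no longer identifies "the" module.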

However, if we assume we've moved to per-module state storage (to get unique 
memory addresses back), then we can largely re-use the existing 
`_Py_IDENTIFIER` machinery to make the lookup as fast as possible, while still 
avoiding conflicts between subinterpreters.
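
A minimal sketch of that idea, again as a plain C toy model rather than real CPython code: `ToyState`, `toy_get_identifier` and `toy_state_clear` are hypothetical names, and a heap-allocated C string stands in for an interned str object. In an actual extension module the state block would presumably be the one returned by `PyModule_GetState`; that mapping is an assumption of this sketch, not something the comment above specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-module state; NOT a CPython structure. */
typedef struct {
    char *name_cache;    /* lazily created copy of the attribute name */
    int   slow_lookups;  /* counts how often the slow path ran */
} ToyState;

/* Slow path: stands in for creating (or interning) a string object. */
static char *toy_make_string(const char *text)
{
    size_t len = strlen(text) + 1;
    char *copy = malloc(len);
    if (copy != NULL) {
        memcpy(copy, text, len);
    }
    return copy;
}

/* After the first call this is a single pointer load from module state. */
const char *toy_get_identifier(ToyState *state, const char *text)
{
    assert(state != NULL);
    if (state->name_cache == NULL) {
        state->name_cache = toy_make_string(text);
        state->slow_lookups++;
    }
    return state->name_cache;
}

/* Models module teardown: release whatever this instance cached. */
void toy_state_clear(ToyState *state)
{
    assert(state != NULL);
    free(state->name_cache);
    state->name_cache = NULL;
}
```

Two `ToyState` instances each take the slow path once and then hold their own distinct cached pointer, which is the property needed to keep subinterpreters from sharing objects.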

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue39465>
_______________________________________