[Python-ideas] The future of Python parallelism. The GIL. Subinterpreters. Actors.
In the past I have personally viewed Python as difficult to use for
parallel applications, which need to do multiple things simultaneously
for increased performance:

* The old Threads, Locks, & Shared State model is inefficient in Python
  due to the GIL, which limits CPU usage to only one thread at a time
  (ignoring certain functions implemented in C, such as I/O).

* The Actor model can be used with some effort via the “multiprocessing”
  module, but it doesn’t seem that streamlined and forces there to be a
  separate OS process per line of execution, which is relatively
  expensive.

I was thinking it would be nice if there were a better way to implement
the Actor model, with multiple lines of execution in the same process,
yet avoiding contention from the GIL. This implies a separate GIL for
each line of execution (to eliminate contention) and a controlled way to
exchange data between different lines of execution.

So I was thinking of proposing a design for implementing such a system,
or at least getting interested parties thinking about such a system.

With some additional research I notice that [PEP 554] (“Multiple
subinterpreters in the stdlib”) appears to be putting forward a design
similar to the one I described. I notice however it mentions that
subinterpreters currently share the GIL, which would seem to make them
unusable for parallel scenarios due to GIL contention.

I'd like to solicit some feedback on what might be the most efficient
way to make forward progress on efficient parallelization in Python
inside the same OS process. The most promising areas appear to be:

1. Make the current subinterpreter implementation in Python have more
   complete isolation, sharing almost no state between subinterpreters,
   and in particular not sharing the GIL. The "Interpreter Isolation"
   section of PEP 554 enumerates areas that are currently shared, some
   of which probably shouldn't be.

2. Give up on making things work inside the same OS process and rather
   focus on implementing better abstractions on top of the existing
   multiprocessing API so that the actor model is easier to program
   against. For example, providing some notion of Channels to
   communicate between lines of execution, a way to monitor the number
   of Messages waiting in each channel for throughput profiling and
   diagnostics, Supervision, etc. In particular I could do this by using
   an existing library like Pykka or Thespian and extending it where
   necessary.

Thoughts?

[PEP 554]: https://www.python.org/dev/peps/pep-0554/

--
David Foster | Seattle, WA, USA

___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
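
A minimal sketch of the "Channels on top of multiprocessing" idea from
option 2, assuming one OS process per actor and a pair of
multiprocessing.Queue objects as its channels. The names counter_actor,
run_counter and the None stop sentinel are hypothetical and are not taken
from Pykka, Thespian or PEP 554.

```python
import multiprocessing as mp

_STOP = None  # assumed sentinel message that tells the actor to shut down


def counter_actor(inbox, outbox):
    """Hypothetical actor: sum the integers it receives, then report the total."""
    total = 0
    while True:
        message = inbox.get()   # blocks until a message arrives
        if message is _STOP:
            break
        total += message
    outbox.put(total)


def run_counter(values):
    """Start one actor process, send it `values`, and return its reply."""
    inbox = mp.Queue()    # channel: caller -> actor
    outbox = mp.Queue()   # channel: actor -> caller
    actor = mp.Process(target=counter_actor, args=(inbox, outbox))
    actor.start()
    for value in values:
        inbox.put(value)
    inbox.put(_STOP)
    result = outbox.get()  # read the reply before joining the process
    actor.join()
    return result
```

Called as run_counter([1, 2, 3]) from under an `if __name__ == "__main__":`
guard, this should return 6. The actor shares no state with the caller, so
the GIL is not contended, but each actor costs a whole process, which is
the expense the post objects to. For the channel-depth monitoring
mentioned above, multiprocessing.Queue.qsize() gives only an approximate
count and raises NotImplementedError on some platforms such as macOS.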
Re: [Python-ideas] Calling python from C completely statically
Hey there,

Yes, the part about having the .pyd modules built into the library is
already done. I followed the instructions in the README. What I would
like to know now is how to embed the non-frozen Python (.py) modules.

Can you please point me in the right direction?

Thank you

On Sun, Jul 8, 2018 at 7:07 AM Nick Coghlan wrote:
> On 8 July 2018 at 21:34, Paul Moore wrote:
> > This question is probably more appropriate for python-list, but yes,
> > you certainly can do this. The "Embedded" distributions of Python for
> > Windows essentially do this already. IIRC, they are only available for
> > Python 3.x, so you may find you have some hurdles to overcome if you
> > have to remain on Python 2.7, but in principle it's possible.
> >
> > One further point you may need to consider - a proportion of the
> > standard library is in the form of shared C extensions (PYDs in
> > Windows - I don't know if you're using Windows or Unix from what you
> > say above). You can't (without significant extra work) load a binary
> > extension direct from a zip file, so you'll need to either do that
> > extra work (which is platform specific and fragile, I believe) or be
> > prepared to ship supporting DLLs alongside your application (this is
> > the approach the embedded distribution takes).
>
> That's the part of the problem that Alberto's static linking solves -
> all of the standard library's extension modules are top level ones (at
> least as far as I am aware), so we support building them as statically
> linked builtin modules instead (we just don't do it routinely, because
> we don't have any great desire to make the main executable even larger
> than it already is).
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia

--
Alberto García Illera
GPG Public Key: https://goo.gl/twKUUv
Re: [Python-ideas] Calling python from C completely statically
On 8 July 2018 at 21:34, Paul Moore wrote:
> This question is probably more appropriate for python-list, but yes,
> you certainly can do this. The "Embedded" distributions of Python for
> Windows essentially do this already. IIRC, they are only available for
> Python 3.x, so you may find you have some hurdles to overcome if you
> have to remain on Python 2.7, but in principle it's possible.
>
> One further point you may need to consider - a proportion of the
> standard library is in the form of shared C extensions (PYDs in
> Windows - I don't know if you're using Windows or Unix from what you
> say above). You can't (without significant extra work) load a binary
> extension direct from a zip file, so you'll need to either do that
> extra work (which is platform specific and fragile, I believe) or be
> prepared to ship supporting DLLs alongside your application (this is
> the approach the embedded distribution takes).

That's the part of the problem that Alberto's static linking solves -
all of the standard library's extension modules are top level ones (at
least as far as I am aware), so we support building them as statically
linked builtin modules instead (we just don't do it routinely, because
we don't have any great desire to make the main executable even larger
than it already is).

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
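
One way to check which modules ended up statically linked into a given
interpreter build is to look at sys.builtin_module_names. The following
is a small sketch under that assumption; the helper name module_origin is
hypothetical, and the result depends entirely on how the interpreter
running it was built.

```python
import importlib.util
import sys


def module_origin(name):
    """Classify how the running interpreter would load top-level module `name`."""
    if name in sys.builtin_module_names:
        return "built-in"      # compiled into the interpreter binary itself
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "missing"       # no importer on sys.path can find it
    if spec.origin == "frozen":
        return "frozen"        # bytecode embedded in the binary
    return spec.origin         # usually a path to a .py, .so or .pyd file


if __name__ == "__main__":
    for candidate in ("sys", "random", "zlib"):
        print(candidate, "->", module_origin(candidate))
```

On a default build, extension modules such as zlib typically report a
path to a shared library, while on a fully static build they would be
expected to report "built-in".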
Re: [Python-ideas] Calling python from C completely statically
On 8 July 2018 at 09:01, Alberto Garcia wrote:
> Hi,
>
> I've been working for a while on having the entire Python interpreter,
> with all its modules, statically linked into a binary that can be called
> with an arbitrary argument, which is passed to the Python interpreter.
>
> I've been able to statically compile Python 2.7 into a single binary
> with no dependencies, and I'm using this code to call the interpreter
> with arbitrary code:
>
> #define Py_NO_ENABLE_SHARED
> #include <Python.h>
>
> int main(int argc, char** argv)
> {
>     Py_NoSiteFlag = 1;
>     Py_InitializeEx(0);
>     PyRun_SimpleString(argv[1]);
>     Py_Finalize();
> }
>
> This program compiles and does not rely on any dependency, but the
> "non frozen" (.py) modules are not loaded, so when I do a simple
> `import random` I get:
>
> Traceback (most recent call last):
>   File "<string>", line 1, in <module>
> ImportError: No module named random
>
> I've created a bug issue (https://bugs.python.org/issue34057) about it,
> and Nick Coghlan pointed out:
>
> " cx_freeze is an illustrative example to look at in that regard, as it
> preconfigures the interpreter to be able to find the cx_freeze generated zip
> archive that has the program's Python modules in it:
> https://github.com/anthony-tuininga/cx_Freeze/blob/master/source/bases/Common.c
>
> The technique that cx_freeze doesn't use yet is to combine the statically
> linked Python binary and the generated zip archive into a single file
> (similar to what zipapp does), and adjust the sys.path definition inside the
> binary to refer back to the executable itself (since executable files can
> have arbitrary content appended, while zip files can have arbitrary content
> *pre*pended). "
>
> As I understand it, cx_freeze collects the dependencies of a specific
> Python program into a zip file. My question is, could I create a zip
> file with every standard module (plus some extras that I may install
> with pip) and use that zip file from C? If so, how can I do that?
>
> I'm happy to work on this if I get some pointers.
>
> BTW: I do not want to convert a Python code snippet to an exe or
> anything like that; I want to call Python from C in a completely
> standalone build.

This question is probably more appropriate for python-list, but yes,
you certainly can do this. The "Embedded" distributions of Python for
Windows essentially do this already. IIRC, they are only available for
Python 3.x, so you may find you have some hurdles to overcome if you
have to remain on Python 2.7, but in principle it's possible.

One further point you may need to consider - a proportion of the
standard library is in the form of shared C extensions (PYDs in
Windows - I don't know if you're using Windows or Unix from what you
say above). You can't (without significant extra work) load a binary
extension direct from a zip file, so you'll need to either do that
extra work (which is platform specific and fragile, I believe) or be
prepared to ship supporting DLLs alongside your application (this is
the approach the embedded distribution takes).

Hope this helps,
Paul
[Python-ideas] Calling python from C completely statically
Hi,

I've been working for a while on having the entire Python interpreter,
with all its modules, statically linked into a binary that can be called
with an arbitrary argument, which is passed to the Python interpreter.

I've been able to statically compile Python 2.7 into a single binary
with no dependencies, and I'm using this code to call the interpreter
with arbitrary code:

    #define Py_NO_ENABLE_SHARED
    #include <Python.h>

    int main(int argc, char** argv)
    {
        Py_NoSiteFlag = 1;
        Py_InitializeEx(0);
        PyRun_SimpleString(argv[1]);
        Py_Finalize();
    }

This program compiles and does not rely on any dependency, but the
"non frozen" (.py) modules are not loaded, so when I do a simple
`import random` I get:

    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    ImportError: No module named random

I've created a bug issue (https://bugs.python.org/issue34057) about it,
and Nick Coghlan pointed out:

" cx_freeze is an illustrative example to look at in that regard, as it
preconfigures the interpreter to be able to find the cx_freeze generated
zip archive that has the program's Python modules in it:
https://github.com/anthony-tuininga/cx_Freeze/blob/master/source/bases/Common.c

The technique that cx_freeze doesn't use yet is to combine the
statically linked Python binary and the generated zip archive into a
single file (similar to what zipapp does), and adjust the sys.path
definition inside the binary to refer back to the executable itself
(since executable files can have arbitrary content appended, while zip
files can have arbitrary content *pre*pended). "

As I understand it, cx_freeze collects the dependencies of a specific
Python program into a zip file. My question is, could I create a zip
file with every standard module (plus some extras that I may install
with pip) and use that zip file from C? If so, how can I do that?

I'm happy to work on this if I get some pointers.

BTW: I do not want to convert a Python code snippet to an exe or
anything like that; I want to call Python from C in a completely
standalone build.

Thank you

--
Alberto García Illera
GPG Public Key: https://goo.gl/twKUUv
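
The zip-file mechanism asked about here can be sketched in Python
itself: an archive of .py files becomes importable as soon as its path
is an entry on sys.path. This is only an illustration of that mechanism,
not the cx_freeze implementation. The names build_module_zip,
import_from_zip, bundle.zip and hello_zip_demo are hypothetical.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def build_module_zip(zip_path, modules):
    """Write a {module_name: source_text} mapping into a zip of .py files."""
    with zipfile.ZipFile(zip_path, "w") as archive:
        for name, source in modules.items():
            archive.writestr(name + ".py", source)


def import_from_zip(zip_path, module_name):
    """Put the archive on sys.path (if needed) and import a module from it."""
    if zip_path not in sys.path:
        sys.path.insert(0, zip_path)
    return importlib.import_module(module_name)


# Build a throwaway archive holding one pure-Python module, then import it.
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "bundle.zip")
build_module_zip(archive_path, {"hello_zip_demo": "GREETING = 'hello from the zip'\n"})
demo = import_from_zip(archive_path, "hello_zip_demo")
print(demo.GREETING, "loaded from", demo.__file__)
```

In an embedding program like the one above, the same effect would need
the archive to be on sys.path before the first import, for example by
running a `sys.path.insert(...)` statement through PyRun_SimpleString
first; that step is an assumption and is not shown in the thread. Only
pure-Python modules can be loaded this way, since compiled extension
modules cannot be imported directly from a zip archive.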