Re: [Python-Dev] Adding a threadlocal to the Python interpreter
On 18 May 2016 at 23:20, Daniel Holth wrote:
> I would like to take another stab at adding a threadlocal "str(bytes) raises
> an exception" to the Python interpreter, but I had a very hard time
> understanding both how to add a threadlocal value to either the interpreter
> state or the threadlocal dict that is part of that state, and then how to
> access the same value from both Python and CPython code. The structs were
> there but it was just hard to understand. Can someone explain it to me?

Christian covered the C aspects of the API, while the general purpose Python
aspects live in the threading module.

However, the Python level thread-local API doesn't provide direct access to
the thread state dict. Instead, it provides access to subdicts stored under
per-object keys in that dict, keyed as "thread.local.":

* Key definition in local_new:
https://hg.python.org/cpython/file/tip/Modules/_threadmodule.c#l705
* Subdict creation in _ldict:
https://hg.python.org/cpython/file/tip/Modules/_threadmodule.c#l810

Getting access to state stored that way from C is *possible*, but
significantly less convenient than accessing the thread state directly.

What that means is that any time we want to expose thread local state to
both C and Python code, it will generally be the responsibility of the C
code both to manage the key in the thread state dict (or the field in the
thread state struct), and to provide a Python API for accessing that state.

Cheers, Nick.

-- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
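[Editor's note: the per-thread behaviour Nick describes can be seen from pure
Python. A minimal sketch (variable and thread names are illustrative only)
showing that each thread gets its own independent view of a threading.local
instance:]

```python
import threading

# Each threading.local instance keeps an independent set of attributes
# per thread, backed by a subdict in that thread's thread state dict.
state = threading.local()
state.flag = "main"

results = {}

def worker(name):
    # The attribute set in the main thread is not visible here until
    # this thread assigns its own value.
    results[name] = hasattr(state, "flag")
    state.flag = name
    results[name + "-value"] = state.flag

t = threading.Thread(target=worker, args=("t1",))
t.start()
t.join()

assert results["t1"] is False       # main thread's value is not shared
assert results["t1-value"] == "t1"  # per-thread assignment works
assert state.flag == "main"         # main thread's value is untouched
```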
Re: [Python-Dev] PyGC_Collect ignores state of `enabled`
On 19 May 2016 at 05:04, Ethan Furman wrote:
> On 05/18/2016 11:52 AM, Neil Schemenauer wrote:
>> The whole finalize/shutdown logic of the CPython interpreter could
>> badly use some improvement. Currently it is a set of ugly hacks
>> piled on top of each other. Now that we have PEP 3121,
>>
>> Extension Module Initialization and Finalization
>> https://www.python.org/dev/peps/pep-3121/
>>
>> we should be able to clean up this mess.

PEP 3121 is insufficient, since a lot of extension modules can't adopt it
(or at least haven't adopted it) in practice.
https://www.python.org/dev/peps/pep-0489/ has some more background on that
(since it was the first step towards tackling the problem in a different
way that extension module authors may be more likely to actually adopt).

>> PyImport_Cleanup() is the main area of trouble. I don't think we
>> should be clearing sys.modules, and we should certainly not be
>> clearing module dicts.
>>
>> If there is some whippersnapper out there who wants to get their
>> hands dirty with Python internals, fixing PyImport_Cleanup() would
>> be a juicy project.
>
> Is there an issue filed for it?

It isn't really any one issue, since PyImport_Cleanup aims to tolerate
misbehaving modules across multiple Py_Initialize/Finalize cycles within a
single process, and hence tries as hard as it can to forcibly break
reference cycles and clean up resource allocations.

Switching it over to the suggested PyGC_CollectIfEnabled() API should be
fine though - it will just need to be documented that calling Py_Initialize
again in the same process is unsupported if the GC was disabled during a
previous call to Py_Finalize.

Cheers, Nick.

-- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
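[Editor's note: the semantics under discussion can be sketched in pure
Python. This is an illustrative model of the proposed C-level
PyGC_CollectIfEnabled() behaviour, not the actual implementation:]

```python
import gc

def collect_if_enabled():
    """Run a full collection only if automatic GC is enabled.

    Models the intent of the proposed PyGC_CollectIfEnabled(): an
    explicit gc.collect() still works with GC disabled, but the
    interpreter's own shutdown-time collections would be skipped.
    """
    if gc.isenabled():
        return gc.collect()
    return 0

gc.disable()
skipped = collect_if_enabled()   # GC disabled -> collection is skipped
gc.enable()
ran = collect_if_enabled()       # GC enabled -> full collection runs

assert skipped == 0
assert ran >= 0
```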
Re: [Python-Dev] Adding Type[C] to PEP 484
FYI, a few people gave useful feedback on my draft text, and I've now
pushed it to PEP 484. All the new text is in one section:
https://www.python.org/dev/peps/pep-0484/#the-type-of-class-objects

Next I'm going to implement it. Subscribe to this issue if you want to
follow along: https://github.com/python/typing/issues/107

--
--Guido van Rossum (python.org/~guido)
Re: [Python-Dev] Adding a threadlocal to the Python interpreter
On 2016-05-18 15:20, Daniel Holth wrote:
> I would like to take another stab at adding a threadlocal "str(bytes)
> raises an exception" to the Python interpreter, but I had a very hard
> time understanding both how to add a threadlocal value to either the
> interpreter state or the threadlocal dict that is part of that state,
> and then how to access the same value from both Python and CPython code.
> The structs were there but it was just hard to understand. Can someone
> explain it to me?

Python has two important states related to threads. The PyInterpreterState
contains the state of an interpreter instance: the sys module, loaded
modules and a couple of additional settings. Usually there is just one
interpreter state in a Python process. Additional interpreter states are
used to implement subinterpreters.

Each C thread that wants to run Python code must have a PyThreadState. The
thread state contains a reference to a PyInterpreterState. Each
PyThreadState has a PyObject *dict member. You can stick Python objects
into the dict. The interpreter cleans up the dict when it reaps a thread.

How performance critical is your code? Does the interpreter have to check
the value of the thread local frequently? In that case you should add a new
member to typedef struct _ts PyThreadState in pystate.h, right before /*
XXX signal handlers should also be here */. Otherwise you can simply use
PyThreadState_GetDict(). It returns a Python dict object that is local to
the current thread. You can simply use a fixed key, like in
Modules/_decimal/_decimal.c.

Christian
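[Editor's note: the fixed-key pattern Christian mentions can be modelled in
pure Python. This sketch stands in for PyThreadState_GetDict(): one dict per
thread, with module state stored under a fixed key. The key name and helper
functions are hypothetical, invented for illustration; the real _decimal
module implements this in C with its own key:]

```python
import threading

# Hypothetical fixed key, analogous to what a C module would use in the
# dict returned by PyThreadState_GetDict().
_KEY = "mymodule.thread_state"

_per_thread = threading.local()

def _get_thread_dict():
    # Stands in for PyThreadState_GetDict(): returns a dict that is
    # local to the current thread, creating it on first use.
    d = getattr(_per_thread, "dict", None)
    if d is None:
        d = _per_thread.dict = {}
    return d

def set_flag(value):
    _get_thread_dict()[_KEY] = value

def get_flag(default=None):
    return _get_thread_dict().get(_KEY, default)

# Usage: each thread sees only its own value under the fixed key.
results = {}

def worker():
    set_flag("worker")
    results["worker"] = get_flag()

set_flag("main")
t = threading.Thread(target=worker)
t.start()
t.join()

assert get_flag() == "main"
assert results["worker"] == "worker"
```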
Re: [Python-Dev] Speeding up CPython 5-10%
No problem, I did not think you were attacking me or find your response
rude.

On Wed, May 18, 2016, at 01:06 PM, Cesare Di Mauro wrote:
> If you feel like I've attacked you, I apologize: it wasn't my intention.
> Please, don't take it personally: I only reported my honest opinion,
> albeit after a re-read it looks too rude, and I'm sorry for that.
>
> Regarding the post-bytecode optimization issues, they are mainly
> represented by the constant folding code, which is still in the peephole
> stage. Once it's moved to the proper place (ASDL/AST), such issues with
> the stack calculations disappear, whereas the remaining ones can be
> addressed by a fix of the current stackdepth_walk function.
>
> And just to be clear, I've nothing against your code. I simply think
> that, due to my experience, it doesn't fit in CPython.
>
> Regards
> Cesare
>
> 2016-05-18 18:50 GMT+02:00:
>> Your criticisms may very well be true. IIRC though, I wrote that pass
>> because what was available was not general enough. The stackdepth_walk
>> function made assumptions that, while true of code generated by the
>> current cpython frontend, were not universally true. If a goal is to
>> move this calculation after any bytecode optimization, something along
>> these lines seems like it will eventually be necessary.
>>
>> Anyway, just offering things already written. If you don't feel it's
>> useful, no worries.
>>
>> On Wed, May 18, 2016, at 11:35 AM, Cesare Di Mauro wrote:
>>> 2016-05-17 8:25 GMT+02:00 :
>>>> In the project https://github.com/zachariahreed/byteasm I mentioned
>>>> on the list earlier this month, I have a pass that computes stack
>>>> usage for a given sequence of bytecodes. It seems to be a fair bit
>>>> more aggressive than cpython. Maybe it's more generally useful. It's
>>>> pure python rather than C though.
>>>
>>> IMO it's too big, resource hungry, and slower, even if you convert it
>>> to C.
>>>
>>> If you take a look at the current stackdepth_walk function which
>>> CPython uses, it's much smaller (not even 50 lines of simple C code)
>>> and quite efficient.
>>>
>>> Currently the problem is that it doesn't return the maximum depth of
>>> the tree, but it updates the intermediate/current maximum, and *then*
>>> it uses it for the subsequent calculations. So, the depth artificially
>>> grows, like in the reported cases.
>>>
>>> It doesn't require a complete rewrite, just some time spent
>>> fine-tuning it.
>>>
>>> Regards
>>> Cesare
Re: [Python-Dev] PyGC_Collect ignores state of `enabled`
On 05/18/2016 11:52 AM, Neil Schemenauer wrote:
> Benjamin Peterson wrote:
>> Adding PyGC_CollectIfEnabled() and calling it in Py_Finalize is
>> probably fine. I don't think the contract of PyGC_Collect itself (or
>> gc.collect() for that matter) should be changed. You might want to
>> disable GC but invoke it yourself.
>
> Yes, that sounds okay to me.
>
> I poked around at the calls to PyGC_Collect() and _PyGC_CollectNoFail().
> The cyclic garbage collector gets invoked at least three times during
> shutdown. Once by Py_FinalizeEx() and two times by PyImport_Cleanup().
> That seems a bit excessively expensive to me. The collection time can be
> significant for programs with a lot of "container" objects in memory.
>
> The whole finalize/shutdown logic of the CPython interpreter could badly
> use some improvement. Currently it is a set of ugly hacks piled on top
> of each other. Now that we have PEP 3121,
>
> Extension Module Initialization and Finalization
> https://www.python.org/dev/peps/pep-3121/
>
> we should be able to clean up this mess.
>
> PyImport_Cleanup() is the main area of trouble. I don't think we should
> be clearing sys.modules, and we should certainly not be clearing module
> dicts.
>
> If there is some whippersnapper out there who wants to get their hands
> dirty with Python internals, fixing PyImport_Cleanup() would be a juicy
> project.

Is there an issue filed for it?

--
~Ethan~
Re: [Python-Dev] PyGC_Collect ignores state of `enabled`
Benjamin Peterson wrote:
> Adding PyGC_CollectIfEnabled() and calling it in Py_Finalize is probably
> fine. I don't think the contract of PyGC_Collect itself (or gc.collect()
> for that matter) should be changed. You might want to disable GC but
> invoke it yourself.

Yes, that sounds okay to me.

I poked around at the calls to PyGC_Collect() and _PyGC_CollectNoFail().
The cyclic garbage collector gets invoked at least three times during
shutdown. Once by Py_FinalizeEx() and two times by PyImport_Cleanup().
That seems a bit excessively expensive to me. The collection time can be
significant for programs with a lot of "container" objects in memory.

The whole finalize/shutdown logic of the CPython interpreter could badly
use some improvement. Currently it is a set of ugly hacks piled on top of
each other. Now that we have PEP 3121,

Extension Module Initialization and Finalization
https://www.python.org/dev/peps/pep-3121/

we should be able to clean up this mess.

PyImport_Cleanup() is the main area of trouble. I don't think we should be
clearing sys.modules, and we should certainly not be clearing module dicts.

If there is some whippersnapper out there who wants to get their hands
dirty with Python internals, fixing PyImport_Cleanup() would be a juicy
project.

Neil
Re: [Python-Dev] Speeding up CPython 5-10%
If you feel like I've attacked you, I apologize: it wasn't my intention.
Please, don't take it personally: I only reported my honest opinion,
albeit after a re-read it looks too rude, and I'm sorry for that.

Regarding the post-bytecode optimization issues, they are mainly
represented by the constant folding code, which is still in the peephole
stage. Once it's moved to the proper place (ASDL/AST), such issues with
the stack calculations disappear, whereas the remaining ones can be
addressed by a fix of the current stackdepth_walk function.

And just to be clear, I've nothing against your code. I simply think that,
due to my experience, it doesn't fit in CPython.

Regards
Cesare

2016-05-18 18:50 GMT+02:00:
> Your criticisms may very well be true. IIRC though, I wrote that pass
> because what was available was not general enough. The stackdepth_walk
> function made assumptions that, while true of code generated by the
> current cpython frontend, were not universally true. If a goal is to
> move this calculation after any bytecode optimization, something along
> these lines seems like it will eventually be necessary.
>
> Anyway, just offering things already written. If you don't feel it's
> useful, no worries.
>
> On Wed, May 18, 2016, at 11:35 AM, Cesare Di Mauro wrote:
>> 2016-05-17 8:25 GMT+02:00 :
>>> In the project https://github.com/zachariahreed/byteasm I mentioned
>>> on the list earlier this month, I have a pass that computes stack
>>> usage for a given sequence of bytecodes. It seems to be a fair bit
>>> more aggressive than cpython. Maybe it's more generally useful. It's
>>> pure python rather than C though.
>>
>> IMO it's too big, resource hungry, and slower, even if you convert it
>> to C.
>>
>> If you take a look at the current stackdepth_walk function which
>> CPython uses, it's much smaller (not even 50 lines of simple C code)
>> and quite efficient.
>>
>> Currently the problem is that it doesn't return the maximum depth of
>> the tree, but it updates the intermediate/current maximum, and *then*
>> it uses it for the subsequent calculations. So, the depth artificially
>> grows, like in the reported cases.
>>
>> It doesn't require a complete rewrite, just some time spent
>> fine-tuning it.
>>
>> Regards
>> Cesare
Re: [Python-Dev] Speeding up CPython 5-10%
Your criticisms may very well be true. IIRC though, I wrote that pass
because what was available was not general enough. The stackdepth_walk
function made assumptions that, while true of code generated by the
current cpython frontend, were not universally true. If a goal is to move
this calculation after any bytecode optimization, something along these
lines seems like it will eventually be necessary.

Anyway, just offering things already written. If you don't feel it's
useful, no worries.

On Wed, May 18, 2016, at 11:35 AM, Cesare Di Mauro wrote:
> 2016-05-17 8:25 GMT+02:00:
>> In the project https://github.com/zachariahreed/byteasm I mentioned on
>> the list earlier this month, I have a pass that computes stack usage
>> for a given sequence of bytecodes. It seems to be a fair bit more
>> aggressive than cpython. Maybe it's more generally useful. It's pure
>> python rather than C though.
>
> IMO it's too big, resource hungry, and slower, even if you convert it
> to C.
>
> If you take a look at the current stackdepth_walk function which
> CPython uses, it's much smaller (not even 50 lines of simple C code)
> and quite efficient.
>
> Currently the problem is that it doesn't return the maximum depth of
> the tree, but it updates the intermediate/current maximum, and *then*
> it uses it for the subsequent calculations. So, the depth artificially
> grows, like in the reported cases.
>
> It doesn't require a complete rewrite, just some time spent
> fine-tuning it.
>
> Regards
> Cesare
Re: [Python-Dev] Speeding up CPython 5-10%
2016-05-17 8:25 GMT+02:00:
> In the project https://github.com/zachariahreed/byteasm I mentioned on
> the list earlier this month, I have a pass that computes stack usage
> for a given sequence of bytecodes. It seems to be a fair bit more
> aggressive than cpython. Maybe it's more generally useful. It's pure
> python rather than C though.

IMO it's too big, resource hungry, and slower, even if you convert it to C.

If you take a look at the current stackdepth_walk function which CPython
uses, it's much smaller (not even 50 lines of simple C code) and quite
efficient.

Currently the problem is that it doesn't return the maximum depth of the
tree, but it updates the intermediate/current maximum, and *then* it uses
it for the subsequent calculations. So, the depth artificially grows, like
in the reported cases.

It doesn't require a complete rewrite, just some time spent fine-tuning it.

Regards
Cesare
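[Editor's note: the kind of pass being discussed can be sketched from pure
Python with dis.stack_effect. This is a deliberately simplified walker: it
processes instructions in order and ignores jump targets, so unlike the
real stackdepth_walk it is only meaningful for branch-free bytecode:]

```python
import dis

def max_stack_depth(func):
    """Naive maximum-stack-depth estimate for branch-free bytecode.

    Sums dis.stack_effect() over the instructions in order, tracking the
    running maximum. CPython's stackdepth_walk also follows jump targets;
    this sketch does not, so it only applies to straight-line code.
    """
    depth = max_depth = 0
    for instr in dis.get_instructions(func):
        depth += dis.stack_effect(instr.opcode, instr.arg)
        max_depth = max(max_depth, depth)
    return max_depth

def add(a, b):
    return a + b

# For simple straight-line code this matches the compiler's own figure.
assert max_stack_depth(add) == add.__code__.co_stacksize
```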
[Python-Dev] Adding a threadlocal to the Python interpreter
I would like to take another stab at adding a threadlocal "str(bytes)
raises an exception" to the Python interpreter, but I had a very hard time
understanding both how to add a threadlocal value to either the interpreter
state or the threadlocal dict that is part of that state, and then how to
access the same value from both Python and CPython code. The structs were
there but it was just hard to understand. Can someone explain it to me?

Thanks,
Daniel Holth