Stefan Behnel <stefan...@behnel.de> added the comment:

Well … yes.

The exception fields are performance critical, and we try hard to make them 
visible to the C compiler so that swapping around exception state eats up as 
little CPU time as possible.

You could argue that profiling and tracing are less critical, but every 
nanosecond saved while a function is not being traced makes the rest of the 
program faster, so I'd argue that that's performance critical, too. 
Profiling definitely is, because it should have as little impact on the code 
profile as possible. There is a huge difference between having the CPU 
pre-fetch a pointer and looking at the value, compared to calling into a C 
function and guessing what the result might be.
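
To make the difference concrete, here is a minimal sketch. Everything in it is 
hypothetical: "toy_tstate" is a stand-in struct, not CPython's real 
PyThreadState, and the names are made up for illustration. The macro variant 
lets the compiler see a plain field load that it can inline and schedule; the 
accessor variant would be an opaque call if it lived behind a library boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a thread state; NOT CPython's PyThreadState. */
typedef struct {
    int use_tracing;   /* non-zero while a trace/profile hook is installed */
    long trace_calls;  /* how often the (slow) tracing path was taken */
} toy_tstate;

/* Direct field access: a plain load that is visible to the C compiler. */
#define TOY_IS_TRACING(ts) ((ts)->use_tracing != 0)

/* Accessor function: the same information, but behind a function call.
 * In a separate shared library the compiler could not look inside it. */
int toy_get_use_tracing(const toy_tstate *ts) {
    return ts->use_tracing;
}

/* Slow path, only reached when tracing is switched on. */
static void toy_trace_hook(toy_tstate *ts) {
    ts->trace_calls++;
}

/* A "call helper" whose common case should cost only one field check. */
long toy_call_fast(toy_tstate *ts, long x) {
    assert(ts != NULL);
    if (TOY_IS_TRACING(ts)) {
        toy_trace_hook(ts);
    }
    return x + 1;
}
```

Within a single file an optimising compiler may well inline the accessor, too. 
The cost I mean shows up when the accessor is an exported function in another 
binary.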

The trashcan is only used during deallocation, so … well, I guess it could be 
replaced by a different API, but that's a bit tricky due to the bracket nature 
of the current macros.
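
By "bracket nature" I mean that the begin macro opens a block which only the 
end macro closes, so the two cannot simply be turned into a pair of ordinary 
function calls. A rough sketch of that shape, with made-up "TOY_" names and a 
made-up nesting limit (this is not the real trashcan implementation):

```c
#include <assert.h>

/* Hypothetical state; the real trashcan keeps its state elsewhere. */
int toy_nesting = 0;   /* current deallocation nesting depth */
int toy_deferred = 0;  /* deallocations postponed at the limit */
#define TOY_NEST_LIMIT 50

/* Opens a do-block and an if-block; only TOY_TRASHCAN_END closes them. */
#define TOY_TRASHCAN_BEGIN \
    do { \
        if (toy_nesting < TOY_NEST_LIMIT) { \
            ++toy_nesting;

/* Closes what TOY_TRASHCAN_BEGIN opened; the else branch is the deferral. */
#define TOY_TRASHCAN_END \
            --toy_nesting; \
        } else { \
            ++toy_deferred; \
        } \
    } while (0)

/* Pretend deallocator for a chain of 'depth' nested objects.
 * Returns how many objects were freed immediately. */
int toy_dealloc(int depth) {
    int freed = 0;
    TOY_TRASHCAN_BEGIN
        freed = 1;
        if (depth > 0) {
            freed += toy_dealloc(depth - 1);
        }
    TOY_TRASHCAN_END;
    return freed;
}
```

The user code sits syntactically inside the block that the two macros build, 
which is why any replacement API needs a different structure at every call 
site.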

I also just noticed that "Py_EnterRecursiveCall" and "Py_LeaveRecursiveCall" 
are on your list. We use them in our inlined call helper functions, which 
mostly duplicate CPython functionality. Looking at these macros now, I find it 
a bit annoying that they call "PyThreadState_GET()" directly, rather than 
accepting one as input. Looking up the current thread-state is a non-local, 
atomic operation that can be surprisingly costly, and I've put quite a bit of 
work into reducing these lookups in Cython. Although it's probably not too bad 
around a call into an external function…
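
A small sketch of what I mean by accepting the thread state as input. The 
"demo_" names, the struct and the lookup counter are all hypothetical; 
"demo_get_tstate()" merely plays the role of "PyThreadState_GET()" so that the 
number of lookups can be counted.

```c
#include <assert.h>

/* Hypothetical stand-in for the thread state. */
typedef struct {
    int recursion_depth;
} demo_tstate;

#define DEMO_RECURSION_LIMIT 1000

demo_tstate demo_current = {0};
int demo_lookup_count = 0;  /* counts simulated thread-state lookups */

/* Plays the role of PyThreadState_GET(): the potentially costly lookup. */
demo_tstate *demo_get_tstate(void) {
    demo_lookup_count++;
    return &demo_current;
}

/* Style A: enter/leave look up the thread state themselves. */
int demo_enter_lookup(void) {
    demo_tstate *ts = demo_get_tstate();
    if (ts->recursion_depth >= DEMO_RECURSION_LIMIT) return -1;
    ts->recursion_depth++;
    return 0;
}
void demo_leave_lookup(void) {
    demo_get_tstate()->recursion_depth--;
}

/* Style B: the caller passes in a thread state it already has. */
int demo_enter_with(demo_tstate *ts) {
    if (ts->recursion_depth >= DEMO_RECURSION_LIMIT) return -1;
    ts->recursion_depth++;
    return 0;
}
void demo_leave_with(demo_tstate *ts) {
    ts->recursion_depth--;
}

/* Call helper, style A: two lookups per guarded call. */
int demo_call_style_a(int x) {
    if (demo_enter_lookup() != 0) return -1;
    int result = x * 2;  /* stands in for the actual function call */
    demo_leave_lookup();
    return result;
}

/* Call helper, style B: one lookup, reused for enter and leave. */
int demo_call_style_b(int x) {
    demo_tstate *ts = demo_get_tstate();
    if (demo_enter_with(ts) != 0) return -1;
    int result = x * 2;  /* stands in for the actual function call */
    demo_leave_with(ts);
    return result;
}
```

A helper that already holds the thread state for other reasons would not even 
need the one remaining lookup in style B.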

So, yeah, we do care about the thread state being readily available. :)

Could you explain what benefit you are expecting from hiding the thread state?

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue35949>
_______________________________________