STINNER Victor <vstin...@python.org> added the comment:

I wrote a quick & dirty local patch to define _Py_NewReference() and 
_Py_Dealloc() as inline functions again in object.h, before Py_DECREF(). I 
failed to see a clear win in terms of performance.

Microbenchmark:

./python -m pyperf timeit --duplicate=4096 'o=object(); o=None' -l 512
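
For readers without pyperf installed, here is a rough stdlib-only sketch of the
same idea: repeat the statement many times inside the timed body so the loop
overhead is amortized, then report the time per statement. This is not what
pyperf does internally (pyperf calibrates, spawns worker processes and reports
mean and standard deviation); the function name measure_ns_per_stmt and the
smaller duplicate count are my own choices, so the numbers are only indicative.

```python
import timeit

STMT = "o=object(); o=None"
DUPLICATE = 64   # assumption: far smaller than pyperf's --duplicate=4096, to stay quick
LOOPS = 512      # mirrors the -l 512 option of the command above


def measure_ns_per_stmt(stmt=STMT, duplicate=DUPLICATE, loops=LOOPS, repeat=5):
    """Return the best observed time per statement, in nanoseconds."""
    # Duplicate the statement in the timed body to amortize loop overhead.
    body = "\n".join([stmt] * duplicate)
    timer = timeit.Timer(body)
    # Each run executes the body `loops` times; keep the fastest run.
    best_seconds = min(timer.repeat(repeat=repeat, number=loops))
    return best_seconds / (loops * duplicate) * 1e9


if __name__ == "__main__":
    print(f"{measure_ns_per_stmt():.1f} ns per statement (best of 5 runs)")
```

To compare two builds, run the script with each ./python binary and compare the
printed values.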

Result if _Py_Dealloc() is inlined again:

Mean +- std dev: [opaque] 69.3 ns +- 1.5 ns -> [dealloc] 67.5 ns +- 1.5 ns: 
1.03x faster (-3%)

Result if _Py_Dealloc() and _Py_NewReference() are inlined again:

Mean +- std dev: [opaque] 69.3 ns +- 1.5 ns -> [dealloc_newref] 66.1 ns +- 1.3 
ns: 1.05x faster (-5%)
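
The ratios and percentages in the two results above can be re-derived from the
reported means. A minimal sketch (the helper name compare is mine, and the
rounding only approximates how pyperf formats its output):

```python
# Mean timings reported above, in nanoseconds.
OPAQUE_NS = 69.3
DEALLOC_NS = 67.5
DEALLOC_NEWREF_NS = 66.1


def compare(ref_ns, changed_ns):
    """Return (ns saved, speedup factor, percent change) versus the reference."""
    saved = round(ref_ns - changed_ns, 1)
    speedup = round(ref_ns / changed_ns, 2)
    percent = round((changed_ns - ref_ns) / ref_ns * 100)
    return saved, speedup, percent


if __name__ == "__main__":
    print(compare(OPAQUE_NS, DEALLOC_NS))         # (1.8, 1.03, -3)
    print(compare(OPAQUE_NS, DEALLOC_NEWREF_NS))  # (3.2, 1.05, -5)
```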

It's a matter of 3.2 nanoseconds. Honestly, I don't think it's worth bothering 
with that. I expect far more significant speedups from more advanced 
optimizations, such as a tracing GC or tagged pointers, and those 
optimizations require hiding implementation details better.

_Py_Dealloc() was converted to a regular function by mistake when I moved code 
to cpython/object.h, and nobody noticed.

For all these reasons, I wrote PR 18361 to remove the unused _Py_Dealloc() 
macro, rather than trying to inline it again.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue39543>
_______________________________________