STINNER Victor <vstin...@python.org> added the comment:
I wrote a quick & dirty local patch to define _Py_NewReference() and _Py_Dealloc() as inline functions again in object.h, before Py_DECREF(). I failed to see a clear win in terms of performance.

Microbenchmark:

./python -m pyperf timeit --duplicate=4096 'o=object(); o=None' -l 512

Result if _Py_Dealloc() is inlined again:

Mean +- std dev: [opaque] 69.3 ns +- 1.5 ns -> [dealloc] 67.5 ns +- 1.5 ns: 1.03x faster (-3%)

Result if _Py_Dealloc() and _Py_NewReference() are inlined again:

Mean +- std dev: [opaque] 69.3 ns +- 1.5 ns -> [dealloc_newref] 66.1 ns +- 1.3 ns: 1.05x faster (-5%)

It's a matter of 3.2 nanoseconds. Honestly, I don't think that it's worth bothering with that. I expect far more significant speedups from more advanced optimizations like using a tracing GC or tagged pointers, and these optimizations require hiding implementation details better.

_Py_Dealloc() was converted to a regular function by mistake when I moved code to cpython/object.h, and nobody noticed.

For all these reasons, I wrote PR 18361 to remove the unused _Py_Dealloc() macro, rather than trying to inline it again.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue39543>
_______________________________________
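
The ratios and percentages that pyperf reports in the results above can be re-derived from the mean timings alone. A minimal Python sketch, assuming only the three means quoted in this comment; the helper name speedup() is hypothetical and is not part of pyperf:

```python
# Mean timings in nanoseconds, copied from the pyperf results quoted above.
opaque_ns = 69.3          # _Py_Dealloc() and _Py_NewReference() as regular functions
dealloc_ns = 67.5         # _Py_Dealloc() inlined again
dealloc_newref_ns = 66.1  # both functions inlined again


def speedup(baseline_ns, changed_ns):
    """Hypothetical helper: return (ratio, percent_change, saved_ns)."""
    ratio = baseline_ns / changed_ns
    percent = (changed_ns - baseline_ns) / baseline_ns * 100.0
    saved = baseline_ns - changed_ns
    return ratio, percent, saved


for label, mean_ns in [("dealloc", dealloc_ns),
                       ("dealloc_newref", dealloc_newref_ns)]:
    ratio, percent, saved = speedup(opaque_ns, mean_ns)
    print(f"[{label}] {ratio:.2f}x faster ({percent:+.0f}%), {saved:.1f} ns saved")
```

The difference between the opaque build and the build with both functions inlined is the 3.2 ns mentioned above, which is roughly twice the reported standard deviation of about 1.5 ns.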