STINNER Victor added the comment:

pyobject_fastcall-4.patch combines a lot of changes. I wrote it to experiment
with _PyObject_FastCall().

I will rewrite these changes as a patch series with smaller changes.

The design of PyObject_FastCall*() is to be short and simple.

Changes:

* Add _PyObject_FastCall(), _PyFunction_FastCall(), _PyCFunction_FastCall():
similar to _PyObject_FastCallKeywords() but without keyword arguments.
Without keyword arguments, the checks on keyword arguments can be removed, and
the stack usage is reduced: one less parameter to pass to C functions.

* Move the slow path out of _PyObject_FastCallDict() and
_PyObject_FastCallKeywords() into a new subfunction which is tagged with
_Py_NO_INLINE to reduce the stack consumption of _PyObject_FastCallDict() and
_PyObject_FastCallKeywords() in their fast paths, which are the most common
code paths.

* Move all "call" functions into a new Objects/call.c file. This change
should help code placement to improve the usage of the CPU caches (especially
the CPU L1 instruction cache). In my long experience with benchmarking last
year, I noticed huge performance differences caused by code placement. See my
blog post:
https://haypo.github.io/analysis-python-performance-issue.html
Sadly, _Py_HOT_FUNCTION didn't fix the issue completely.

* _PyObject_FastCallDict() and _PyObject_FastCallKeywords() "call"
_PyObject_FastCall(). In fact, _PyObject_FastCall() is inlined manually in
these functions, again to minimize the stack usage. Similar change in
_PyCFunction_FastCallDict() and PyCFunction_Call().

* Since the slow path is moved to a subfunction, I removed _Py_NO_INLINE from
_PyStack_AsTuple() to allow the compiler to inline it if it wants to ;-) Since
the stack usage is better with the patch, it seems like this change has no
negative impact on the stack usage.

* Optimize PyMethodDescr_Type (call _PyMethodDescr_FastCallKeywords()) in 
_PyObject_FastCallKeywords()

* I moved Py_EnterRecursiveCall() closer to the final function call, to
simplify error handling. It would be nice to fix issue #29306 first :-/

* I had to copy/paste the null_error() function from Objects/abstract.c. I 
don't think that it's worth it to add a _PyErr_NullError() function shared by 
abstract.c and call.c. Compilers are even able to merge duplicated functions ;-)

* A few other changes.
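
To illustrate the fast path / slow path split described in the list above, here
is a minimal sketch in plain C. It does not use the real CPython API: NO_INLINE,
kwarg, fast_func, sum_args(), call_fast(), call_keywords() and
call_keywords_slow() are all hypothetical stand-ins, and the way keyword values
are appended after the positional arguments is only an assumption chosen to
keep the example small.

```c
#include <stddef.h>

/* Hypothetical stand-in for _Py_NO_INLINE; not the real CPython macro. */
#if defined(__GNUC__) || defined(__clang__)
#  define NO_INLINE __attribute__((noinline))
#else
#  define NO_INLINE
#endif

#define MAX_MERGED 64

/* Hypothetical "callable": receives a C array of positional arguments. */
typedef long (*fast_func)(const long *args, size_t nargs);

/* Hypothetical keyword argument: a name and a value. */
typedef struct {
    const char *name;
    long value;
} kwarg;

/* Example callee: returns the sum of its arguments. */
static long sum_args(const long *args, size_t nargs)
{
    long total = 0;
    for (size_t i = 0; i < nargs; i++) {
        total += args[i];
    }
    return total;
}

/* Variant without keyword arguments: no keyword checks are needed and
   there is one less parameter to pass. */
static long call_fast(fast_func func, const long *args, size_t nargs)
{
    return func(args, nargs);
}

/* Slow path, kept out of line: the large temporary buffer lives in this
   function's stack frame, not in the frame of call_keywords(). */
static NO_INLINE long call_keywords_slow(fast_func func,
                                         const long *args, size_t nargs,
                                         const kwarg *kwargs, size_t nkwargs)
{
    long merged[MAX_MERGED];
    size_t n = 0;

    /* Copy positional arguments, then keyword values. Anything beyond
       MAX_MERGED entries is silently dropped in this toy example. */
    for (size_t i = 0; i < nargs && n < MAX_MERGED; i++) {
        merged[n++] = args[i];
    }
    for (size_t i = 0; i < nkwargs && n < MAX_MERGED; i++) {
        merged[n++] = kwargs[i].value;
    }
    return func(merged, n);
}

/* Variant with keyword arguments: the common case (no keywords) is
   handled inline, written out by hand instead of calling call_fast(). */
static long call_keywords(fast_func func,
                          const long *args, size_t nargs,
                          const kwarg *kwargs, size_t nkwargs)
{
    if (nkwargs == 0) {
        return func(args, nargs);   /* fast path */
    }
    return call_keywords_slow(func, args, nargs, kwargs, nkwargs);
}
```

With this layout, a call such as call_keywords(sum_args, args, 3, NULL, 0)
never enters call_keywords_slow(), so the 64-element buffer is only allocated
on the stack when keyword arguments are really passed. How much stack this
saves in practice depends on the compiler and the optimization level.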

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue29465>
_______________________________________