Hello, recently I have been playing a bit with cpyext, to see if there is any low-hanging fruit to be picked to improve the performance.
I didn't get any real results, but I think it's interesting to share my findings. The benchmark I'm using is here:

https://github.com/antocuni/cpyext-benchmarks

It contains a simple C extension defining three methods, one for each of the METH_NOARGS, METH_O and METH_VARARGS flags. So first, the results with CPython and PyPy 5.8:

    $ python bench.py
    noargs : 0.78 secs
    onearg : 0.89 secs
    varargs: 1.05 secs

    $ pypy bench.py
    noargs : 1.67 secs
    onearg : 2.13 secs
    varargs: 4.89 secs

Then, I tried my cpyext-jit branch; this branch does two things:

1) it makes cpyext visible to the JIT, and adds enough @jit.dont_look_inside so that it actually compiles

2) it merges part of the cpyext-callopt branch, up to rev 9cbc8bd76297 (more on this later): this adds fast paths for METH_NOARGS and METH_O to avoid going through the slow __args__.unpack()

    $ pypy-cpyext-jit bench.py
    noargs : 0.30 secs
    onearg : 0.31 secs
    varargs: 4.90 secs

So, apparently this is enough to greatly speed up the calls, and even be faster than CPython. Note that "onearg" calls "simple.onearg(None)". However, things become more complicated as soon as I start passing various kinds of objects to onearg():

    $ pypy bench_oneargs.py # pypy 5.8
    onearg(None): 2.09 secs
    onearg(1)   : 2.07 secs
    onearg(i)   : 4.98 secs
    onearg(i%2) : 4.92 secs
    onearg(X)   : 2.13 secs
    onearg((1,)): 2.30 secs
    onearg((i,)): 9.80 secs

    $ pypy-cpyext-jit bench_oneargs.py
    onearg(None): 0.30 secs
    onearg(1)   : 0.30 secs
    onearg(i)   : 2.52 secs
    onearg(i%2) : 2.56 secs
    onearg(X)   : 0.30 secs
    onearg((1,)): 0.30 secs
    onearg((i,)): 7.45 secs

So, the call optimization still helps, but as soon as we need to convert an object from PyPy to CPython we are horribly slow.
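
For anyone who wants to see the shape of the onearg() cases without building the extension, here is a rough sketch of such a timing loop. It is not the actual bench_oneargs.py from the repository: the `onearg` function below is a pure-Python stand-in for `simple.onearg`, and the helper names (`call_none`, `call_global`, `call_loop_var`, `call_fresh_tuple`, `run`) and the iteration count are made up for illustration. With the stand-in it measures nothing about cpyext; replace it with the real C function to get meaningful numbers.

```python
import time

def onearg(obj):
    # Hypothetical stand-in for simple.onearg (a METH_O function in the
    # C extension); replace with "from simple import onearg" if it is built.
    return obj

X = 100  # a global, as in the onearg(X) case

def call_none(n):
    # constant argument
    for i in range(n):
        onearg(None)

def call_global(n):
    # argument read from a module-level global
    for i in range(n):
        onearg(X)

def call_loop_var(n):
    # argument is the loop variable, a different int on every iteration
    for i in range(n):
        onearg(i)

def call_fresh_tuple(n):
    # argument is a tuple built on the fly on every iteration
    for i in range(n):
        onearg((i,))

CASES = [
    ("onearg(None)", call_none),
    ("onearg(X)", call_global),
    ("onearg(i)", call_loop_var),
    ("onearg((i,))", call_fresh_tuple),
]

def run(n=100000):
    # Time each case once and return {label: elapsed seconds}.
    results = {}
    for label, fn in CASES:
        start = time.time()
        fn(n)
        results[label] = time.time() - start
    return results

if __name__ == "__main__":
    for label, fn in CASES:
        secs = run()[label] if False else None  # placeholder, never used
    for label, secs in run().items():
        print("%-13s: %.2f secs" % (label, secs))
```

Each case lives in its own function so that each loop is traced separately, which makes it easier to compare the cases one by one.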

However, it is interesting to note that:

1) if we pass a constant object, we are fast: None, 1, (1,)

2) if we pass X (which is a global X=100), we are still fast

3) any other object which is created on the fly is slow

Looking at the traces, they look more or less the same in the three cases, so I don't really understand what the difference is.

Finally, about the cpyext-callopt branch, which was started in Leysin by Richard, Armin and me: I am not sure I fully understand the purpose of dbba78b270fd: apparently, the optimization done in 9cbc8bd76297 seems to work well, so what am I missing?

ciao,
Anto
_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
https://mail.python.org/mailman/listinfo/pypy-dev