STINNER Victor added the comment:

I spent the last 3 months making the CPython benchmark suite more stable and 
improving my procedure for running benchmarks, to ensure that benchmark results 
are more stable.

See my articles:

I forked and enhanced the benchmark suite to use my perf module to run 
benchmarks in multiple processes:
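
To illustrate the idea of running a benchmark in multiple processes, here is a minimal sketch using only the standard library. It is not how the perf module is implemented: the worker snippet, the function name run_in_processes() and the number of processes are assumptions chosen for the example. Each worker is a fresh Python process, so process-level variation (hash randomization, memory layout) shows up in the standard deviation instead of being hidden inside a single run.

```python
import statistics
import subprocess
import sys

# Hypothetical worker: time a small statement and print seconds per call.
WORKER = (
    "import timeit; "
    "print(timeit.timeit('sorted(range(1000))', number=200) / 200)"
)

def run_in_processes(code=WORKER, processes=5):
    """Run the worker in several fresh interpreters, one value per process."""
    values = []
    for _ in range(processes):
        out = subprocess.check_output(
            [sys.executable, "-c", code], universal_newlines=True
        )
        values.append(float(out.strip()))
    return values

values = run_in_processes()
mean_us = statistics.mean(values) * 1e6
stdev_us = statistics.stdev(values) * 1e6
print("%.1f us +- %.1f us" % (mean_us, stdev_us))
```

The printed "mean +- standard deviation" has the same shape as the values in the results below, but the numbers depend on the machine.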

I ran this better benchmark suite on fastcall-2.patch on my laptop. The result 
is quite good: 
$ python3 -m perf compare_to ref.json fastcall.json -G  --min-speed=5
Slower (4):
- fastpickle/pickle_dict: 326 us +- 15 us -> 350 us +- 29 us: 1.07x slower
- regex_effbot: 49.4 ms +- 1.3 ms -> 53.0 ms +- 1.2 ms: 1.07x slower
- fastpickle/pickle: 432 us +- 8 us -> 457 us +- 10 us: 1.06x slower
- pybench.ComplexPythonFunctionCalls: 838 ns +- 11 ns -> 884 ns +- 8 ns: 1.05x slower

Faster (13):
- spectral_norm: 289 ms +- 6 ms -> 250 ms +- 5 ms: 1.16x faster
- pybench.SimpleIntFloatArithmetic: 622 ns +- 9 ns -> 559 ns +- 10 ns: 1.11x faster
- pybench.SimpleIntegerArithmetic: 621 ns +- 10 ns -> 560 ns +- 9 ns: 1.11x faster
- pybench.SimpleLongArithmetic: 891 ns +- 12 ns -> 816 ns +- 10 ns: 1.09x faster
- pybench.DictCreation: 852 ns +- 13 ns -> 788 ns +- 16 ns: 1.08x faster
- pybench.ForLoops: 10.8 ns +- 0.3 ns -> 9.99 ns +- 0.23 ns: 1.08x faster
- pybench.NormalClassAttribute: 1.85 us +- 0.02 us -> 1.72 us +- 0.04 us: 1.08x faster
- pybench.SpecialClassAttribute: 1.86 us +- 0.02 us -> 1.73 us +- 0.03 us: 1.07x faster
- pybench.NestedForLoops: 21.9 ns +- 0.3 ns -> 20.7 ns +- 0.3 ns: 1.05x faster
- pybench.SimpleListManipulation: 501 ns +- 4 ns -> 476 ns +- 5 ns: 1.05x faster
- elementtree/process: 192 ms +- 3 ms -> 183 ms +- 2 ms: 1.05x faster
- elementtree/generate: 225 ms +- 5 ms -> 214 ms +- 4 ms: 1.05x faster
- hexiom2/level_25: 21.3 ms +- 0.3 ms -> 20.3 ms +- 0.1 ms: 1.05x faster

Benchmark hidden because not significant (84): (...)

Most benchmarks show no significant change, which is expected: fastcall-2.patch 
is really the simplest patch to start the work on "FASTCALL". It doesn't really 
implement any optimization; it only adds new infrastructure on which to 
implement new optimizations.

A few benchmarks are faster (only benchmarks at least 5% faster are shown, using 
--min-speed=5).
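
As a rough sketch of what a 5% threshold means for the numbers above: the function below computes the ratio between two mean timings and hides changes under the threshold. The name classify() and the exact definition of the threshold are assumptions made for illustration, not perf's actual code, and perf also applies a separate significance test that is not reproduced here.

```python
def classify(ref, new, min_speed=5.0):
    """Return a label like '1.16x faster', or None if the change is
    smaller than min_speed percent (hypothetical threshold definition)."""
    if new > ref:
        ratio, label = new / ref, "slower"
    else:
        ratio, label = ref / new, "faster"
    if (ratio - 1.0) * 100.0 < min_speed:
        return None  # hidden: change below the threshold
    return "%.2fx %s" % (ratio, label)

print(classify(289, 250))   # spectral_norm means, in ms
print(classify(326, 350))   # fastpickle/pickle_dict means, in us
print(classify(100, 102))   # made-up 2% change, hidden
```

The means printed in the report are rounded, so recomputing a ratio from them can differ slightly from the ratio that perf computes from the raw values.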

4 benchmarks are slower, but the slowdown should be temporary: new 
optimizations should make these benchmarks faster. See issue #26814 for a more 
concrete implementation and a lot of benchmark results if you don't trust me :-)

I consider that the benchmarks show that there is no major slowdown, so 
fastcall-2.patch can be merged to be able to start working on real 
optimizations.

