Serhiy Storchaka <[email protected]> added the comment:
Results of microbenchmarks:
$ ./python -m perf timeit -s 'a = list(range(1000))' -- 'for i in a: pass'
Mean +- std dev: 6.31 us +- 0.09 us
$ ./python -m perf timeit -s 'a = list(range(1000))' -- '
for i in a:
try: pass
finally: pass
'
Unpatched: Mean +- std dev: 16.3 us +- 0.2 us
PR 2827: Mean +- std dev: 16.2 us +- 0.2 us
PR 4682: Mean +- std dev: 16.2 us +- 0.2 us
PR 5006: Mean +- std dev: 14.5 us +- 0.4 us
$ ./python -m perf timeit -s 'a = list(range(1000))' -- '
for i in a:
try: continue
finally: pass
'
Unpatched: Mean +- std dev: 24.0 us +- 0.5 us
PR 2827: Mean +- std dev: 11.9 us +- 0.1 us
PR 4682: Mean +- std dev: 12.0 us +- 0.1 us
PR 5006: Mean +- std dev: 19.0 us +- 0.3 us
$ ./python -m perf timeit -s 'a = list(range(1000))' -- '
for i in a:
while True:
try: break
finally: pass
'
Unpatched: Mean +- std dev: 25.9 us +- 0.5 us
PR 2827: Mean +- std dev: 11.9 us +- 0.1 us
PR 4682: Mean +- std dev: 12.0 us +- 0.1 us
PR 5006: Mean +- std dev: 18.9 us +- 0.1 us
PR 2827 and PR 4682 have the same performance. The overhead of the finally
block itself is smaller in PR 5006, perhaps because BEGIN_FINALLY pushes 1 NULL
instead of 6 NULLs. CALL_FINALLY adds about 4.5 ns per iteration in the latter
two examples (for PR 5006, 19.0 us vs 14.5 us over 1000 iterations). This
overhead could be reduced by using a special cache for the Python integers that
represent return addresses, or by using a separate stack for return addresses,
but that looks like overkill to me for now. 4.5 ns is a pretty small overhead;
a simple `i = i` statement has about the same timing.
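
For reference, a minimal stdlib-only sketch (not part of the benchmarks above;
it uses the timeit module rather than perf, so absolute numbers will differ)
that compares the bare loop, the try/finally loop, and the `i = i` baseline:

    # a minimal sketch, assuming only the standard library timeit module
    import timeit

    setup = "a = list(range(1000))"
    stmts = {
        "bare loop":        "for i in a: pass",
        "try/finally pass": "for i in a:\n    try: pass\n    finally: pass",
        "i = i":            "for i in a: i = i",
    }
    for name, stmt in stmts.items():
        # best of 5 runs, 1000 executions each; report per-execution time in us
        best = min(timeit.repeat(stmt, setup=setup, number=1000, repeat=5))
        print(f"{name:18} {best / 1000 * 1e6:.2f} us")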
----------
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue17611>
_______________________________________