New submission from neonene <nicesal...@gmail.com>:
pyperformance on Windows shows a gap between 3.10a7 and 3.10b1.
The tables below give ratios relative to 3.10a7 (higher is slower).

-------------------------------------------------
Windows x64     |  PGO   release  official-binary
----------------+--------------------------------
20210405        |
    3.10a7      | 1.00   1.24     1.00 (PGO?)
20210408-07:58  |
    b98eba5     | 0.98
20210408-10:22  |
  * PR25244     | 1.04
20210503        |
    3.10b1      | 1.07   1.21     1.07
-------------------------------------------------
Windows x86     |  PGO   release  official-binary
----------------+--------------------------------
20210405        |
    3.10a7      | 1.00   1.25     1.27 (release?)
20210408-07:58  |
    b98eba5bc   | 1.00
20210408-10:22  |
  * PR25244     | 1.11
20210503        |
    3.10b1      | 1.14   1.28     1.29

Since PR25244 (28d28e053db6b69d91c2dfd579207cd8ccbc39e7),
_PyEval_EvalFrameDefault() in ceval.c no longer seems to be optimized
by PGO (msvc14.29.16.10). At least the functions below are no longer
inlined there at all:

   (1) _Py_DECREF()                   (from Py_DECREF,Py_CLEAR,Py_SETREF)
   (2) _Py_XDECREF()                  (from Py_XDECREF,SETLOCAL)
   (3) _Py_IS_TYPE()                  (from PyXXX_CheckExact)
   (4) _Py_atomic_load_32bit_impl()   (from CHECK_EVAL_BREAKER)

I tried other linker options such as thread-safe-profiling,
aggressive-code-generation, and /OPT:NOREF, in vain.
3.10a7 can inline them in the eval-loop even when profiling only
test_array.py.

To measure the overhead of (1)~(4), I made my own build whose eval-loop
uses macros instead of these functions.
-----------------------------------------------------------------
Windows x64     |  PGO   patched  overhead in eval-loop
----------------+------------------------------------------------
    3.10a7      | 1.00
20210802        |
    3.10rc1     | 1.09   1.05     4% (slow 43, fast  5, same 10)
20210831-20:42  |
    863154c     | 0.95   0.90     5% (slow 48, fast  3, same  7)
    (3.11a0+)   |
-----------------------------------------------------------------
Windows x86     |  PGO   patched  overhead in eval-loop
----------------+------------------------------------------------
    3.10a7      | 1.00
20210802        |
    3.10rc1     | 1.15   1.13     2% (slow 29, fast 14, same 15)
20210831-20:42  |
    863154c     | 1.05   1.02     3% (slow 44, fast  7, same  7)
    (3.11a0+)   |

----------
components: C API, Interpreter Core, Windows
files: 310rc1_confirm_overhead.patch
keywords: patch
messages: 401143
nosy: Mark.Shannon, neonene, pablogsal, paul.moore, steve.dower, tim.golden, vstinner, zach.ware
priority: normal
severity: normal
status: open
title: Performance regression 3.10b1 and later on Windows
type: performance
versions: Python 3.10, Python 3.11
Added file: https://bugs.python.org/file50263/310rc1_confirm_overhead.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue45116>
_______________________________________