[issue45116] Performance regression 3.10b1: inlining issue in the big _PyEval_EvalFrameDefault() function with Visual Studio (MSC)
neonene added the comment:

>What exactly does "pgo hard reject" mean?

As I understand it, "pgo hard reject" comes from the PGOptimizer's heuristic, and "reject" is related to the probe count (hot/cold).
https://developercommunity.visualstudio.com/t/1531987#T-N1535774

The MSVC team also replied and closed the issue. MSVC won't be fixed in the near future.
https://developercommunity.visualstudio.com/t/1595341#T-N1695626

From the reply and my investigation, 3.11 would need the following:

1. Some call sites, such as tp_* pointers, should not have their fast paths inlined in the eval switch-case. They often conflict. Each pointer needs to be wrapped in a function, or maybe _PyEval_EvalFrameDefault needs to be enclosed in an "inline_depth(0)" pragma.

2. __assume(0) should be replaced with another function, inside the eval switch-case or in the inlined paths of callees. This is critical with PGO.

3. For inlining, use __forceinline / macro / const function pointer.
   MSVC getting stuck can be avoided in many ways when force-inlining a ton of Py_DECREF()s in the eval loop, as long as tp_dealloc does not create an inlined call site:

    void
    _Py_Dealloc(PyObject *op)
    {
        ...
    #pragma inline_depth(0)  // takes effect from here; PGO accepts only 0.
        (*dealloc)(op);      // conflicts when inlined.
    }
    #pragma inline_depth()   // can be reset only outside the function.

* Virtual Call Speculation:
  https://docs.microsoft.com/en-us/cpp/build/profile-guided-optimizations?view=msvc-170#optimizations-performed-by-pgo

* The profiler runs under the /GENPROFILE:PATH option, but at the big ceval function the optimizer merges the profiles into one, as in /GENPROFILE:NOPATH mode.
  https://docs.microsoft.com/en-us/cpp/build/reference/genprofile-fastgenprofile-generate-profiling-instrumented-build?view=msvc-170#arguments

* __assume(0) (Py_UNREACHABLE):
  https://devblogs.microsoft.com/cppblog/visual-studio-2017-throughput-improvements-and-advice/#remove-usages-of-__assume

--
___ Python tracker <https://bugs.python.org/issue45116> ___
___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue47182] _PyUnicode_Fini should invalidate ucnhash_capi capsule pointer
Change by neonene : -- nosy: +neonene nosy_count: 5.0 -> 6.0 pull_requests: +30375 pull_request: https://github.com/python/cpython/pull/32313 ___ Python tracker <https://bugs.python.org/issue47182> ___
[issue47103] Copy pgort140.dll when building for PGO
Change by neonene : -- nosy: +neonene nosy_count: 4.0 -> 5.0 pull_requests: +30226 pull_request: https://github.com/python/cpython/pull/32146 ___ Python tracker <https://bugs.python.org/issue47103> ___
[issue43166] Unused letters in Windows-specific pragma optimize
Change by neonene : -- nosy: +neonene nosy_count: 7.0 -> 8.0 pull_requests: +30111 pull_request: https://github.com/python/cpython/pull/32023 ___ Python tracker <https://bugs.python.org/issue43166> ___
[issue43271] AMD64 Windows10 3.x crash with Windows fatal exception: stack overflow
Change by neonene : -- nosy: +neonene, pablogsal nosy_count: 8.0 -> 10.0 pull_requests: +30112 pull_request: https://github.com/python/cpython/pull/32023 ___ Python tracker <https://bugs.python.org/issue43271> ___
[issue46841] Inline bytecode caches
neonene added the comment:

Has UNPACK_SEQUENCE's slowdown already been filed?
https://speed.python.org/timeline/#/?exe=12=unpack_sequence=4=50=off=on=on

I hit the gap at 424ecab on Windows.

-- nosy: +neonene
___ Python tracker <https://bugs.python.org/issue46841> ___
[issue45116] Performance regression 3.10b1: inlining issue in the big _PyEval_EvalFrameDefault() function with Visual Studio (MSC)
Change by neonene : -- pull_requests: +29588 pull_request: https://github.com/python/cpython/pull/31459 ___ Python tracker <https://bugs.python.org/issue45116> ___
[issue45116] Performance regression 3.10b1: inlining issue in the big _PyEval_EvalFrameDefault() function with Visual Studio (MSC)
Change by neonene : -- pull_requests: +29570 pull_request: https://github.com/python/cpython/pull/31436 ___ Python tracker <https://bugs.python.org/issue45116> ___
[issue46427] Correct MSBuild's configuration for _freeze_module.exe
neonene added the comment:

> +

This is bad if an ARM64 machine takes the blank value to mean "ARM" rather than "ARM64", since the "ARM" tools do not have to be installed. In that case, I agree with the OP's proposal (PR28491) below:

> Would it be acceptable if a new host platform property is added to
> the project file and keep x64 or x86 as default (depends on target)
> but allow users to configure a different host platform to allow
> native arm64 compilation?

--
___ Python tracker <https://bugs.python.org/issue46427> ___
[issue46427] Correct MSBuild's configuration for _freeze_module.exe
neonene added the comment:

> When cross-compiling, tools that are executed as part of the build need to be
> built for the tool platform, not the target platform.

My PR does not go against that at this point, as the proposed code is based on your PR28322 (09b4ad11f323f8702cde795e345b75e0fbb1a9a5).

If we now need to prepare for a future MSVC *on* ARM, then the current _freeze_module configurations in "pcbuild.sln" also need to be reconsidered:

    {19C0C13F-47CA-4432-AFF3-799A296A4DDC}.Debug|ARM.ActiveCfg = Debug|Win32
    {19C0C13F-47CA-4432-AFF3-799A296A4DDC}.Debug|ARM.Build.0 = Debug|Win32
    {19C0C13F-47CA-4432-AFF3-799A296A4DDC}.Debug|ARM64.ActiveCfg = Debug|x64
    {19C0C13F-47CA-4432-AFF3-799A296A4DDC}.Debug|ARM64.Build.0 = Debug|x64
    {19C0C13F-47CA-4432-AFF3-799A296A4DDC}.PGInstrument|ARM.ActiveCfg = Release|Win32
    {19C0C13F-47CA-4432-AFF3-799A296A4DDC}.PGInstrument|ARM.Build.0 = Release|Win32
    {19C0C13F-47CA-4432-AFF3-799A296A4DDC}.PGInstrument|ARM64.ActiveCfg = Release|x64
    {19C0C13F-47CA-4432-AFF3-799A296A4DDC}.PGInstrument|ARM64.Build.0 = Release|x64
    {19C0C13F-47CA-4432-AFF3-799A296A4DDC}.PGUpdate|ARM.ActiveCfg = Release|Win32
    {19C0C13F-47CA-4432-AFF3-799A296A4DDC}.PGUpdate|ARM64.ActiveCfg = Release|x64
    {19C0C13F-47CA-4432-AFF3-799A296A4DDC}.Release|ARM.ActiveCfg = Release|Win32
    {19C0C13F-47CA-4432-AFF3-799A296A4DDC}.Release|ARM.Build.0 = Release|Win32
    {19C0C13F-47CA-4432-AFF3-799A296A4DDC}.Release|ARM64.ActiveCfg = Release|x64
    {19C0C13F-47CA-4432-AFF3-799A296A4DDC}.Release|ARM64.Build.0 = Release|x64

Anyway, what I care about is how the "PreferredToolArchitecture" property is used in the current configuration. The property has nothing to do with whether the host is ARM* or not. Another property will do in the future.

When building x86 python with the 64bit compiler (set PreferredToolArchitecture=x64), _freeze_module becomes an x64 executable. Is the following change acceptable?

    - $(PreferredToolArchitecture)
    +

They are the same with no envvar. _freeze_module is always 32bit, though.

--
___ Python tracker <https://bugs.python.org/issue46427> ___
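The solution entries listed above can be read mechanically. The following is only a sketch for illustration: it parses a few of the quoted lines and prints which project configuration each solution configuration maps to. The regular expression, the helper name parse_sln_entries and the reading that a missing "Build.0" entry means "not built in that configuration" are assumptions made for this example, not something taken from the CPython build scripts.

```python
import re
from collections import defaultdict

# A few entries copied from the message above (_freeze_module's project GUID).
SLN_LINES = """\
{19C0C13F-47CA-4432-AFF3-799A296A4DDC}.Debug|ARM64.ActiveCfg = Debug|x64
{19C0C13F-47CA-4432-AFF3-799A296A4DDC}.Debug|ARM64.Build.0 = Debug|x64
{19C0C13F-47CA-4432-AFF3-799A296A4DDC}.PGUpdate|ARM64.ActiveCfg = Release|x64
{19C0C13F-47CA-4432-AFF3-799A296A4DDC}.Release|ARM.ActiveCfg = Release|Win32
{19C0C13F-47CA-4432-AFF3-799A296A4DDC}.Release|ARM.Build.0 = Release|Win32
"""

# Hypothetical pattern for "{GUID}.<cfg>|<platform>.<kind> = <cfg>|<platform>".
ENTRY = re.compile(
    r"^\{(?P<guid>[0-9A-Fa-f-]+)\}\."
    r"(?P<sln_cfg>[^|]+)\|(?P<sln_plat>[^.]+)\."
    r"(?P<kind>ActiveCfg|Build\.0)\s*=\s*"
    r"(?P<proj_cfg>[^|]+)\|(?P<proj_plat>\S+)$"
)


def parse_sln_entries(text):
    """Map (solution config, solution platform) -> {kind: (project config, platform)}."""
    table = defaultdict(dict)
    for line in text.splitlines():
        m = ENTRY.match(line.strip())
        if m:
            key = (m["sln_cfg"], m["sln_plat"])
            table[key][m["kind"]] = (m["proj_cfg"], m["proj_plat"])
    return dict(table)


mapping = parse_sln_entries(SLN_LINES)
for (cfg, plat), kinds in sorted(mapping.items()):
    proj_cfg, proj_plat = kinds["ActiveCfg"]
    # Assumption: no "Build.0" entry means the project is not built there.
    built = "built" if "Build.0" in kinds else "not built"
    print(f"{cfg}|{plat} -> {proj_cfg}|{proj_plat} ({built})")
```

Read this way, the ARM64 solution configurations select the x64 project configuration and the ARM ones select Win32, which matches the point of the comment: the tool is built for a non-ARM host.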
[issue46427] Correct MSBuild's configuration for _freeze_module.exe
neonene added the comment:

Definition in general_advanced.xml

The options above correspond to the following folders in my case:

    Microsoft Visual Studio\.\VC\Tools\MSVC\\bin\Hostx86
    Microsoft Visual Studio\.\VC\Tools\MSVC\\bin\Hostx64

Each has the 4 child folders below, which contain cl.exe, link.exe, etc.:

    arm
    arm64
    x64
    x86

--
___ Python tracker <https://bugs.python.org/issue46427> ___
[issue46427] Correct MSBuild's configuration for _freeze_module.exe
neonene added the comment:

>This also rolls _freeze_module.exe's architecture back to x64

Correcting: from x86 back to x64

As I understand it, only the Win32 _freeze_module.exe is currently built, and it runs on non-ARM machines to generate the code for Win32/x64/ARM/ARM64 targets.

--
___ Python tracker <https://bugs.python.org/issue46427> ___
[issue46427] Correct MSBuild's configuration for _freeze_module.exe
Change by neonene : -- keywords: +patch pull_requests: +28873 stage: -> patch review pull_request: https://github.com/python/cpython/pull/30673 ___ Python tracker <https://bugs.python.org/issue46427> ___
[issue46427] Correct MSBuild's configuration for _freeze_module.exe
New submission from neonene :

In pcbuild.proj, the "PreferredToolArchitecture" property looks misused. I think the property is useful because it gives us a choice of two compilers (32bit or 64bit) for any target architecture (Win32/x64/ARM/ARM64).

I think the property can be left unused there. This means a partial revert of PR28491, whose description I cannot reproduce.

This also rolls _freeze_module.exe's architecture back to x64 when the target platform is x64 or ARM64.

-- components: Build messages: 410891 nosy: neonene priority: normal severity: normal status: open title: Correct MSBuild's configuration for _freeze_module.exe type: behavior versions: Python 3.11
___ Python tracker <https://bugs.python.org/issue46427> ___
[issue46362] os.path.abspath() needs more normalization on Windows
Change by neonene : -- pull_requests: +28793 pull_request: https://github.com/python/cpython/pull/30595 ___ Python tracker <https://bugs.python.org/issue46362> ___
[issue46362] os.path.abspath() needs more normalization on Windows
neonene added the comment:

Basically, PR30571 aims for compatibility with 3.10 and earlier. Using the Windows API is the easiest way and is what those versions do:

    import os.path

    paths = [
        r'C:\CON',
        r'C:\PRN',
        r'C:\AUX',
        r'C:\NUL',
        r'C:\COM1',
        r'C:\COM2',
        r'C:\COM3',
        r'C:\COM9',
        r'C:\LPT1',
        r'C:\LPT2',
        r'C:\LPT3',
        r'C:\LPT9',
        r'C:\foo. . .',
    ]
    for path in paths:
        print(os.path.abspath(path))

    """
    3.11 before
    C:\CON
    C:\PRN
    C:\AUX
    C:\NUL
    C:\COM1
    C:\COM2
    C:\COM3
    C:\COM9
    C:\LPT1
    C:\LPT2
    C:\LPT3
    C:\LPT9
    C:\foo. . .

    3.11 after
    \\.\CON
    \\.\PRN
    \\.\AUX
    \\.\NUL
    \\.\COM1
    \\.\COM2
    \\.\COM3
    \\.\COM9
    \\.\LPT1
    \\.\LPT2
    \\.\LPT3
    \\.\LPT9
    C:\foo

    3.10.1
    \\.\CON
    \\.\PRN
    \\.\AUX
    \\.\NUL
    \\.\COM1
    \\.\COM2
    \\.\COM3
    \\.\COM9
    \\.\LPT1
    \\.\LPT2
    \\.\LPT3
    \\.\LPT9
    C:\foo
    """

--
___ Python tracker <https://bugs.python.org/issue46362> ___
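For readers without a Windows machine, the two effects visible in the listing above (reserved device names mapped to the \\.\ namespace, trailing dots and spaces stripped) can be imitated in pure Python. This is only a sketch of the observed results, assuming the simple cases shown in the listing; the helper name sketch_normalize and the RESERVED set are made up for this example, and the real Windows API applies more rules than this.

```python
import ntpath

# Legacy DOS device names, limited to the families that appear in the listing.
RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def sketch_normalize(path):
    """Imitate the 3.10.1 results above for simple absolute paths (illustration only)."""
    head, tail = ntpath.split(path)
    # Trailing dots and spaces of the last component are dropped.
    stripped = tail.rstrip(". ")
    if stripped.upper() in RESERVED:
        # Reserved device names are redirected to the device namespace.
        return "\\\\.\\" + stripped
    return ntpath.join(head, stripped)


for sample in (r"C:\CON", r"C:\LPT9", r"C:\foo. . ."):
    print(sketch_normalize(sample))
```

Because it uses ntpath directly, the sketch runs on any platform, which makes it usable as a quick cross-check of expected values in a test, not as a replacement for the API call.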
[issue46287] UNC path normalisation issues on Windows
neonene added the comment:

> PathCchSkipRoot() doesn't recognize forward slash as a path separator,

I opened issue46362 and PR30571 about the mentioned abspath() behaviors.

--
___ Python tracker <https://bugs.python.org/issue46287> ___
[issue46362] os.path.abspath() needs more normalization on Windows
Change by neonene : -- keywords: +patch pull_requests: +28772 stage: -> patch review pull_request: https://github.com/python/cpython/pull/30571 ___ Python tracker <https://bugs.python.org/issue46362> ___
[issue46362] os.path.abspath() needs more normalization on Windows
New submission from neonene :

3.11a3+ introduced the C version of abspath(), which returns an incompletely normalized absolute path (see msg410068):

    >>> os.path.abspath(r'\\spam\\eggs. . .')
    'spameggs. . .'
    >>> os.path.abspath('C:\\spam. . .')
    'C:\\spam. . .'
    >>> os.path.abspath('C:\\nul')
    'C:\\nul'

The design is efficient on startup with getpath_abspath(), but ntpath.abspath()'s result after startup should be more normalized.

-- components: Windows messages: 410456 nosy: neonene, paul.moore, steve.dower, tim.golden, zach.ware priority: normal severity: normal status: open title: os.path.abspath() needs more normalization on Windows type: behavior versions: Python 3.11
___ Python tracker <https://bugs.python.org/issue46362> ___
[issue46287] UNC path normalisation issues on Windows
neonene added the comment:

Regarding https://github.com/python/cpython/pull/30362#issuecomment-1005496892

_Py_abspath/_getfullpathname does not always call GetFullPathNameW on 3.11.

Python 3.10.1

    >>> nt._getfullpathname('.\\C:spameggs. . .')
    '.\\C:\\spam\\eggs'

Python 3.11.0a3

    >>> nt._getfullpathname('.\\C:spameggs. . .')
    '.\\C:spameggs. . .'

------ nosy: +neonene
___ Python tracker <https://bugs.python.org/issue46287> ___
[issue46208] os.path.normpath change between 3.11.0a2 and 3.11.0a3+
neonene added the comment:

>Here's a branch with a passing ntpath.normpath test and a failing
>posixpath.normpath test:

I took the test cases for my PR, thanks.

On a Windows machine, ntpath fails and posixpath passes. It seems that the passing one is tested with pure Python code.

--
___ Python tracker <https://bugs.python.org/issue46208> ___
[issue46208] os.path.normpath change between 3.11.0a2 and 3.11.0a3+
Change by neonene : -- keywords: +patch nosy: +neonene nosy_count: 3.0 -> 4.0 pull_requests: +28576 stage: -> patch review pull_request: https://github.com/python/cpython/pull/30362 ___ Python tracker <https://bugs.python.org/issue46208> ___
[issue46217] 3.11 build failure on Win10: new _freeze_module changes?
neonene added the comment: The flag is not available on Win8.1; it is available starting in Win10 1703 with the v10.0.15021 SDK. -- nosy: +neonene ___ Python tracker <https://bugs.python.org/issue46217> ___
[issue46123] _freeze_module on Windows can be built faster with no optimization
Change by neonene : -- keywords: +patch pull_requests: +28400 stage: -> patch review pull_request: https://github.com/python/cpython/pull/30181 ___ Python tracker <https://bugs.python.org/issue46123> ___
[issue46123] _freeze_module on Windows can be built faster with no optimization
New submission from neonene : In Makefile.pre.in, LTO is disabled when building _freeze_module. MSVC can also cut the build time of _freeze_module.exe in half without the optimization. -- components: Build, Windows messages: 408841 nosy: neonene, paul.moore, steve.dower, tim.golden, zach.ware priority: normal severity: normal status: open title: _freeze_module on Windows can be built faster with no optimization type: performance versions: Python 3.11 ___ Python tracker <https://bugs.python.org/issue46123> ___
[issue40915] multiple problems with mmap.resize() in Windows
Change by neonene : -- nosy: +neonene nosy_count: 6.0 -> 7.0 pull_requests: +28391 pull_request: https://github.com/python/cpython/pull/30175 ___ Python tracker <https://bugs.python.org/issue40915> ___
[issue45582] Rewrite getpath.c in Python
Change by neonene : -- pull_requests: +28238 pull_request: https://github.com/python/cpython/pull/30014 ___ Python tracker <https://bugs.python.org/issue45582> ___
[issue45582] Rewrite getpath.c in Python
Change by neonene : -- pull_requests: +28166 pull_request: https://github.com/python/cpython/pull/29941 ___ Python tracker <https://bugs.python.org/issue45582> ___
[issue45582] Rewrite getpath.c in Python
Change by neonene : -- pull_requests: +28154 pull_request: https://github.com/python/cpython/pull/29930 ___ Python tracker <https://bugs.python.org/issue45582> ___
[issue45582] Rewrite getpath.c in Python
neonene added the comment:

The PGO-instrumented binary does not seem to specify the stdlib directory on PR29041. I can run it with PYTHONPATH set.

Python path configuration:
  PYTHONHOME = 'C:\Py311\'
  PYTHONPATH = (not set)
  program name = 'C:\Py311\PCbuild\amd64\instrumented\python.exe'
  isolated = 0
  environment = 1
  user site = 1
  import site = 1
  is in build tree = 1
  stdlib dir = 'C:\Py311\PCbuild\Lib'
  sys._base_executable = 'C:\\py311\\PCbuild\\amd64\\instrumented\\python.exe'
  sys.base_prefix = 'C:\\py311\\'
  sys.base_exec_prefix = 'C:\\py311\\'
  sys.platlibdir = 'DLLs'
  sys.executable = 'C:\\py311\\PCbuild\\amd64\\instrumented\\python.exe'
  sys.prefix = 'C:\\py311\\'
  sys.exec_prefix = 'C:\\py311\\'
  sys.path = [
    'C:\\py311\\PCbuild\\amd64\\instrumented\\python311.zip',
    'C:\\py311\\PCbuild\\Lib',
    'C:\\py311\\PCbuild\\amd64\\instrumented',
  ]
Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding
Python runtime state: core initialized
ModuleNotFoundError: No module named 'encodings'

--
___ Python tracker <https://bugs.python.org/issue45582> ___
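The final error says that none of the sys.path entries provides the encodings package. A quick way to check a list of candidate directories for that condition is sketched below. The helper name entries_with_encodings is hypothetical, and the demo uses throwaway temporary directories that merely stand in for a "PCbuild\Lib"-style path and a "Lib"-style path; nothing here is taken from getpath itself.

```python
import os
import tempfile


def entries_with_encodings(paths):
    """Return the entries of *paths* that contain an 'encodings' package directory."""
    return [p for p in paths if os.path.isdir(os.path.join(p, "encodings"))]


# Demo on temporary directories (stand-ins, not the real build tree).
with tempfile.TemporaryDirectory() as root:
    good = os.path.join(root, "Lib")             # has encodings/
    bad = os.path.join(root, "PCbuild", "Lib")   # exists but is empty
    os.makedirs(os.path.join(good, "encodings"))
    os.makedirs(bad)
    found = entries_with_encodings([bad, good])
    print(found == [good])
```

On an affected build the same function could be called with the sys.path list from the dump; an empty result would be consistent with the ModuleNotFoundError shown above.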
[issue45582] Rewrite getpath.c in Python
Change by neonene : -- nosy: +neonene nosy_count: 6.0 -> 7.0 pull_requests: +28130 pull_request: https://github.com/python/cpython/pull/29906 ___ Python tracker <https://bugs.python.org/issue45582> ___
[issue45116] Performance regression 3.10b1: inlining issue in the big _PyEval_EvalFrameDefault() function with Visual Studio (MSC)
neonene added the comment:

I requested the MSVC team to reconsider the inlining issues, including __forceinline.
https://developercommunity.visualstudio.com/t/1595341

Getting stuck at link time due to __forceinline can be avoided by completing the _Py_DECREF optimization outside _PyEval_EvalFrameDefault:

    static inline void  // no __forceinline
    _Py_DECREF_impl(...) {
        ...
    }

    static __forceinline void
    _Py_DECREF(...) {  // no conditional branch in the function
        _Py_DECREF_impl(...);
    }

In _PyEval_EvalFrameDefault, wrapping the callees as above seems better for performance than just specifying __forceinline under the current MSVC.

--
___ Python tracker <https://bugs.python.org/issue45116> ___
[issue45116] Performance regression 3.10b1: inlining issue in the big _PyEval_EvalFrameDefault() function with Visual Studio (MSC)
neonene added the comment:

In the eval loop of PR29565, inlining seems to be enabled within about 70 op-branches, trained with 44 tests.

log & source: ceval_PR29565_split_func.c (not for performance)

-- Added file: https://bugs.python.org/file50452/ceval_PR29565_split_func.c
___ Python tracker <https://bugs.python.org/issue45116> ___
[issue45116] Performance regression 3.10b1: inlining issue in the big _PyEval_EvalFrameDefault() function with Visual Studio (MSC)
neonene added the comment:

>This essentially disables PGO.

Thank you for the suggestion. I'll take another experimental approach to reduce the size of the 3.11 evalfunc for stronger validation.

>@neonene what's the importance of PR29565?

While we are talking about function size, I would like to use commits around PR29565 for consistent reporting. I think any commit is okay to reproduce the issue. And please ignore the patch to build.bat.

--
___ Python tracker <https://bugs.python.org/issue45116> ___
[issue45116] Performance regression 3.10b1: inlining issue in the big _PyEval_EvalFrameDefault() function with Visual Studio (MSC)
neonene added the comment:

Here are the 3 steps to reproduce with minimal PGO training (VS2019).

1. Download the source archive of PR29565 and extract it.
   https://github.com/python/cpython/archive/6a84d61c55f2e543cf5fa84522d8781a795bba33.zip

2. Apply the following patch.
==
--- PCbuild/build.bat
+++ PCbuild/build.bat
@@ -66 +66 @@
-set pgo_job=-m test --pgo
+set pgo_job=-c"pass"
--- PCbuild/pyproject.props
+++ PCbuild/pyproject.props
@@ -47,2 +47,3 @@
 /utf-8 %(AdditionalOptions)
+ /d2inlinelogfull:_PyEval_EvalFrameDefault %(AdditionalOptions)
==

3. Build [Rebuild]
   PCbuild\build --no-tkinter --pgo > build.log [-r]

According to the inlining section in the log, any function that has one or more conditional expressions was rejected by the inliner.

> Inlinee for function _PyEval_EvalFrameDefault
> -_Py_EnsureFuncTstateNotNULL (pgo hard reject)
> ...
> _Py_INCREF (pgu decision)
> _Py_INCREF (pgu decision)
> -_Py_XDECREF (pgo hard reject)
> -_Py_XDECREF (pgo hard reject)
> -_Py_DECREF (pgo hard reject)
> -_Py_DECREF (pgo hard reject)
> ...

Profiling scores can be shown in the VS2019 Command Prompt.

   pgomgr PCbuild\amd64\python311.pgd /summary [/detail] > largefile.txt

* pgomgr.exe (or the profile itself) has an issue.
  https://developercommunity.visualstudio.com/t/1560909

Unused opcodes in this training:
ROT_THREE, DUP_TOP_TWO, UNARY_POSITIVE, UNARY_NEGATIVE, BINARY_OP_ADD_FLOAT, UNARY_INVERT, BINARY_OP_MULTIPLY_INT, BINARY_OP_MULTIPLY_FLOAT, GET_LEN, MATCH_MAPPING, MATCH_SEQUENCE, MATCH_KEYS, LOAD_ATTR_SLOT, LOAD_METHOD_CLASS, GET_AITER, GET_ANEXT, BEFORE_ASYNC_WITH, END_ASYNC_FOR, STORE_ATTR_SLOT, STORE_ATTR_WITH_HINT, GET_YIELD_FROM_ITER, PRINT_EXPR, YIELD_FROM, GET_AWAITABLE, LOAD_ASSERTION_ERROR, SETUP_ANNOTATIONS, UNPACK_EX, DELETE_ATTR, DELETE_GLOBAL, ROT_N, COPY, DELETE_DEREF, LOAD_CLASSDEREF, MATCH_CLASS, SET_UPDATE, DO_TRACING

I managed to activate the inliner experimentally by removing the 36 op-cases from the switch and merging/removing many macros.

Static instruction counts of _PyEval_EvalFrameDefault()

   PR29565   : 6882 (down to 4400 with the above change)
   PR29482   : 7035
   PR29482~1 : 7742
   3.10.0+   : 3980 (well inlined, sharing the DISPATCH macro)
   3.10.0    : 5559
   3.10b1    : 5680
   3.10a7    : 4117 (well inlined)

--
___ Python tracker <https://bugs.python.org/issue45116> ___
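The inliner decisions quoted from build.log can be tallied with a few lines of Python. This is a sketch under stated assumptions: the sample below contains only the lines quoted in the message, the regular expression is written for exactly that shape, and reading a leading "-" as "rejected" is an interpretation of the quoted excerpt, not a documented log format. The names LOG_SAMPLE, LINE and summarize are made up for this example.

```python
import re
from collections import Counter

# Only the lines quoted in the message; a real build.log is far longer.
LOG_SAMPLE = """\
Inlinee for function _PyEval_EvalFrameDefault
-_Py_EnsureFuncTstateNotNULL (pgo hard reject)
_Py_INCREF (pgu decision)
_Py_INCREF (pgu decision)
-_Py_XDECREF (pgo hard reject)
-_Py_XDECREF (pgo hard reject)
-_Py_DECREF (pgo hard reject)
-_Py_DECREF (pgo hard reject)
"""

# Assumed shape: optional "-", callee name, reason in parentheses.
LINE = re.compile(r"^(?P<rejected>-?)(?P<callee>\w+) \((?P<reason>[^)]*)\)$")


def summarize(log_text):
    """Count (callee, reason) pairs, split into accepted and rejected call sites."""
    accepted, rejected = Counter(), Counter()
    for raw in log_text.splitlines():
        m = LINE.match(raw.strip())
        if not m:
            continue  # header lines and anything else that does not fit
        target = rejected if m["rejected"] else accepted
        target[(m["callee"], m["reason"])] += 1
    return accepted, rejected


accepted, rejected = summarize(LOG_SAMPLE)
print("accepted:", dict(accepted))
print("rejected:", dict(rejected))
```

Comparing such tallies between two builds (for example before and after trimming the switch) gives a rough view of how many call sites moved from rejected to accepted.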
[issue45116] Performance regression 3.10b1: inlining issue in the big _PyEval_EvalFrameDefault() function with Visual Studio (MSC)
neonene added the comment: I still have the issue in current main and PR29565 with msvc2022 (v142 or v143 toolset). -- ___ Python tracker <https://bugs.python.org/issue45116> ___
[issue45116] Performance regression 3.10b1: inlining issue in the big _PyEval_EvalFrameDefault() function with Visual Studio (MSC)
neonene added the comment:

msg402954
>https://github.com/faster-cpython/tools

Following the suggested stats and pgomgr.exe, I experimentally moved the LOAD_FAST and LOAD_CONST cases out of the switch, as below.

    if (opcode == LOAD_FAST) {
        ...
        DISPATCH();
    }
    if (opcode == LOAD_CONST) {
        ...
        DISPATCH();
    }
    switch (opcode) {

x64 performance results after patching (msvc2019)

Good inliner ver.
   3.10.0+      1.03x faster than before
   28d28e0~1    1.04x faster
   3.8.12       1.03x faster

Bad inliner ver. (evalfunc is too big. Has msvc2022 increased the capacity?)
   3.10.0/rc2   1.00x faster
   3.11a1+      1.02x faster

It seems to me that, for quite a while now, the optimizer has been stopping at some point after successful inlining. So the performance may be sensitive to code changes, and it could be possible to detect where the optimization is aborted.

(Benchmarks: switch-case_unarranged_bench.txt)

-- Added file: https://bugs.python.org/file50363/switch-case_unarranged_bench.txt
___ Python tracker <https://bugs.python.org/issue45116> ___
[issue45116] Performance regression 3.10b1 and later on Windows: Py_DECREF() not inlined in PGO build
neonene added the comment: The official 3.10.0 binary is as slow as rc2. Many files are not updated in the source archive or in b494f5935c92951e75597bfe1c8b1f3112fec270, so I'm not sure whether the delay is intentional. We have no choice but to wait for 3.10.1. -- ___ Python tracker <https://bugs.python.org/issue45116> ___
[issue45116] Performance regression 3.10b1 and later on Windows: Py_DECREF() not inlined in PGO build
neonene added the comment: PR28475 is not in the official source archive. https://www.python.org/ftp/python/3.10.0/Python-3.10.0.tar.xz I'll check later whether the official binary has the fix. -- ___ Python tracker <https://bugs.python.org/issue45116> ___
[issue45116] Performance regression 3.10b1 and later on Windows: Py_DECREF() not inlined in PGO build
neonene added the comment: _PyEval_EvalFrameDefault() may also need to be divided. -- ___ Python tracker <https://bugs.python.org/issue45116> ___
[issue45116] Performance regression 3.10b1 and later on Windows: Py_DECREF() not inlined in PGO build
neonene added the comment: @pablogsal I'm OK with more effective fixes in 3.10.1 and later. Thanks all, and thanks to kj and malin for all the help. -- ___ Python tracker <https://bugs.python.org/issue45116> ___
[issue45116] Performance regression 3.10b1 and later on Windows: Py_DECREF() not inlined in PGO build
neonene added the comment: I submitted 2 drafts in a hurry. Sorry for the short explanations. I'll add more reports. -- ___ Python tracker <https://bugs.python.org/issue45116> ___
[issue45116] Performance regression 3.10b1 and later on Windows: Py_DECREF() not inlined in PGO build
Change by neonene : -- pull_requests: +27001 pull_request: https://github.com/python/cpython/pull/28631 ___ Python tracker <https://bugs.python.org/issue45116> ___
[issue45116] Performance regression 3.10b1 and later on Windows: Py_DECREF() not inlined in PGO build
Change by neonene : -- pull_requests: +27000 pull_request: https://github.com/python/cpython/pull/28630 ___ Python tracker <https://bugs.python.org/issue45116> ___
[issue45116] Performance regression 3.10b1 and later on Windows: Py_DECREF() not inlined in PGO build
neonene added the comment: I have another fix. -- ___ Python tracker <https://bugs.python.org/issue45116> ___
[issue45116] Performance regression 3.10b1 and later on Windows: Py_DECREF() not inlined in PGO build
neonene added the comment:

3.10rc2 Python/ceval.c, from line 1306:

    #define DISPATCH() \
        { \
            if (trace_info.cframe.use_tracing OR_DTRACE_LINE OR_LLTRACE) { \
                goto tracing_dispatch; \

Among the 44 pgo-tests, only test_patma.TestTracing hits the condition above. On Windows, it seems that skipping it tightens the profile of PR28475 a bit. Additional tests such as test_threading(.ThreadTests.test_frame_tstate_tracing) might also cause some amount of variation or vice versa.

3.10rc2 x64 PGO      : 1.00
+ PR28475
   with TestTracing  : 1.05x faster (slow 3, fast 46, same 9)
   without           : 1.06x faster (slow 5, fast 52, same 1)

   with TestTracing  : 1.00
   without           : 1.01x faster (slow 19, fast 27, same 12)

(Details: PR28475_skip1test_bench.txt)

Does test_patma.TestTracing need training for match-case performance?

-- Added file: https://bugs.python.org/file50296/PR28475_skip1test_bench.txt
___ Python tracker <https://bugs.python.org/issue45116> ___
[issue45116] Performance regression 3.10b1 and later on Windows: Py_DECREF() not inlined in PGO build
neonene added the comment:

PR28475 PGO is 2% slower than the patch I pasted in msg401743. The function sizes are almost the same (+1:goto, +1:label), and there is no performance gap between release builds.

I suspect the following.

1. PGO is too sensitive to function size near the limit.
2. PR28475 is not fully covered by the 44 tests. (msg401346)

--
___ Python tracker <https://bugs.python.org/issue45116> ___
[issue45116] Performance regression 3.10b1 and later on Windows: Py_DECREF() not inlined in PGO build
neonene added the comment: To be fair, the slowdowns between PR25244 and b1 seem to be an accumulation of "1.00x slower" from every commit. I don't know about after b1. -- ___ Python tracker <https://bugs.python.org/issue45116> ___
[issue45116] Performance regression 3.10b1 and later on Windows: Py_DECREF() not inlined in PGO build
neonene added the comment:

I built 3.10rc2 PGO with PR28475 applied, and posted the inliner's log. In the log, the 4 callees mentioned above, which were "hard reject"ed before, are now inlined.

As for performance, measurements from a few more reporters would help, but they do not need to worry about noise in the apparent gap.

    310rc2 x64 PGO      : 1.00
    + PR28475
        build1 bench1   : 1.05x faster (slower 7, faster 43, nochange 8)
               bench2   : 1.05x faster (slower 2, faster 43, nochange 13)
        build2          : 1.05x faster (slower 4, faster 45, nochange 9)

    310rc2 x64 release  : 1.00
    + PR28475           : 1.01x faster (slower 14, faster 25, nochange 19)

Is Windows involved in the faster-cpython project? If so, the project should be provided with Windows machines for validation.

--
Added file: https://bugs.python.org/file50291/PR28475_inline.log
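As a quick sanity check on the summaries above, the three counts in each result line should add up to the same number of benchmarks. The following is only a minimal sketch: the helper name `parse_summary` and the regular expression are assumptions made for illustration (they are not part of pyperf), and the input strings are copied from the results in this message.

```python
import re

# Hypothetical helper, not part of pyperf: matches summary text such as
# "1.05x faster (slower 7, faster 43, nochange 8)".
SUMMARY_RE = re.compile(
    r"(?P<ratio>\d+\.\d+)x (?P<direction>faster|slower) "
    r"\(slower (?P<slower>\d+), faster (?P<faster>\d+), nochange (?P<same>\d+)\)"
)


def parse_summary(line):
    """Return (ratio, direction, counts) for one summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognized summary line: {line!r}")
    counts = {key: int(match[key]) for key in ("slower", "faster", "same")}
    return float(match["ratio"]), match["direction"], counts


# Result lines copied from the message above.
lines = [
    "1.05x faster (slower 7, faster 43, nochange 8)",
    "1.05x faster (slower 2, faster 43, nochange 13)",
    "1.05x faster (slower 4, faster 45, nochange 9)",
    "1.01x faster (slower 14, faster 25, nochange 19)",
]

# Each total is the number of benchmarks behind that summary line.
totals = [sum(parse_summary(line)[2].values()) for line in lines]
print(totals)
```

If the totals differ between runs, the runs probably did not cover the same benchmark set, and their summary ratios are not directly comparable.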
[issue45116] Performance regression 3.10b1 and later on Windows: Py_DECREF() not inlined in PGO build
neonene added the comment:

> release with the performance regression

I'm OK with that option. The PGO limitation seems a bit odd to me, and it might be unexpected for the MSVC team as well.
[issue45116] Performance regression 3.10b1 and later on Windows: Py_DECREF() not inlined in PGO build
neonene added the comment:

> (32-bit: "1.07", 64-bit: "1.14": "higher the slower" wrote neonene)

32-bit and 64-bit are swapped here. I compared b1 and a7 because anyone can confirm the result with the official binaries. If the 7% from my patch has little to do with the gap, then I will be happy that 3.10 can be far faster.

> How can I build Python with PGO on Windows?

Try the following:

    PCbuild\build.bat -p x64 --no-tkinter --pgo

Before building, edit your object.h and replace

    static inline int Py_ALWAYS_INLINE

with

    static Py_ALWAYS_INLINE int

In my case, the PGO build got stuck at link time with that object.h. I'm waiting for a reply from Developer Community.
[issue45116] Performance regression 3.10b1 and later on Windows: Py_DECREF() not inlined in PGO build
neonene added the comment:

I reported this issue to Microsoft's Developer Community:
https://developercommunity.visualstudio.com/t/1531987
[issue45116] Performance regression 3.10b1 and later on Windows: Py_DECREF() not inlined in PGO build
neonene added the comment:

With MSVC 16.10.3 and 16.11.2 (latest), PR25244 showed me that the amount of code in _PyEval_EvalFrameDefault() is over the limit of PGO.

In the old version of _PyEval_EvalFrameDefault (b98eba5), the same issue can be triggered by adding any code, anywhere, with more than 20 expressions/statements. For example, at the top/middle/end of the function, repeating "if (0) {}" 10 times, or adding "if (0) {19 statements}". For Python 3.9.7, it takes more than 800 expressions/statements.

Here is just a workaround for 3.10rc2 on Windows.

==
--- Python/ceval.c
+++ Python/ceval.c
@@ -1306,9 +1306 @@
-#define DISPATCH() \
-    { \
-        if (trace_info.cframe.use_tracing OR_DTRACE_LINE OR_LLTRACE) { \
-            goto tracing_dispatch; \
-        } \
-        f->f_lasti = INSTR_OFFSET(); \
-        NEXTOPARG(); \
-        DISPATCH_GOTO(); \
-    }
+#define DISPATCH() goto tracing_dispatch
@@ -1782,4 +1774,9 @@
 tracing_dispatch:
     {
+        if (!(trace_info.cframe.use_tracing OR_DTRACE_LINE OR_LLTRACE)) {
+            f->f_lasti = INSTR_OFFSET();
+            NEXTOPARG();
+            DISPATCH_GOTO();
+        }
         int instr_prev = f->f_lasti;
         f->f_lasti = INSTR_OFFSET();
==

This patch becomes ineffective as soon as one expression is added to the DISPATCH macro, as below:

    #define DISPATCH() {if (1) goto tracing_dispatch;}

This approach is also not sufficient for 3.11, which has a bigger eval function. I don't know of a cl/link option that lifts this restriction on function size.

    3.10rc2 x86 pgo : 1.00
    patched         : 1.09x faster (slower 5, faster 48, not significant 5)

    3.10rc2 x64 pgo : 1.00 (roughly the same speed as official bin)
    patched         : 1.07x faster (slower 5, faster 47, not significant 6)
    patched(/Ob3)   : 1.07x faster (slower 7, faster 45, not significant 6)

The x64 results are posted. Fixing the inlining rejection also made a __forceinline build possible with normal processing time and memory usage.

--
Added file: https://bugs.python.org/file50280/310rc2_benchmarks.txt
[issue45116] Performance regression 3.10b1 and later on Windows
neonene added the comment:

According to
https://docs.microsoft.com/en-us/cpp/build/profile-guided-optimizations?view=msvc-160
PGO seems to override /Ob3. I posted related benchmarks on issue44381.

When building Python, /Ob3 takes effect only for DLLs that are not built with PGO, such as:

    _ctypes_test _freeze_importlib _msi _testbuffer _testcapi
    _testconsole _testembed _testimportmultiple _testinternalcapi
    _testmultiphase _uuid liblzma pylauncher pyshellext pywlauncher
    sqlite3 venvlauncher venvwlauncher winsound

I use this option in _msvccompiler.py for my own pyd files. I will try it and report back if PGO with /Ob3 makes a difference in the log.
[issue45116] Performance regression 3.10b1 and later on Windows
Change by neonene : Added file: https://bugs.python.org/file50276/x64_b98e.log
[issue45116] Performance regression 3.10b1 and later on Windows
Change by neonene : Added file: https://bugs.python.org/file50275/x64_28d2.log
[issue45116] Performance regression 3.10b1 and later on Windows
Change by neonene : Added file: https://bugs.python.org/file50274/pyproject_inlinestat.patch
[issue45116] Performance regression 3.10b1 and later on Windows
Change by neonene : Added file: https://bugs.python.org/file50273/b98e-no-inline-in-the-others.diff
[issue45116] Performance regression 3.10b1 and later on Windows
neonene added the comment:

Thanks for all the suggestions. I focused on my bisected commit and the one before it. I ran pyperformance with the following 4 functions never inlined in the sections listed below.

    _Py_DECREF()
    _Py_XDECREF()
    _Py_IS_TYPE()
    _Py_atomic_load_32bit_impl()

are

    (1) never inlined in _PyEval_EvalFrameDefault().
    (2) never inlined in the other functions.
    (3) never inlined in all functions.

Slowdowns [section where the 4 funcs are never inlined]

    Windows x64 PGO (44job)              (*)    (1)    (2)     (3)
    rebuild                              none   eval   others  all
    ---------------------------------------------------------------
    b98eba5 (4 funcs inlined in eval)    1.00   1.05   1.09    1.14
    PR25244 (not inlined in eval)        1.06   1.07   1.18    1.17

    pyperf compare_to upper lower:
    (*) 1.06x slower (slower 45, faster 4, not significant 9)
    (1) 1.02x slower (slower 33, faster 13, not significant 12)
    (2) 1.08x slower (slower 48, faster 6, not significant 4)
    (3) 1.03x slower (slower 39, faster 6, not significant 13)

    Windows x86 PGO (44job)              (*)    (1)    (2)     (3)
    rebuild                              none   eval   others  all
    ---------------------------------------------------------------
    b98eba5 (4 funcs inlined in eval)    1.00   1.03   1.06    1.15
    PR25244 (not inlined in eval)        1.13   1.13   1.22    1.24

    pyperf compare_to upper lower:
    (*) 1.13x slower (slower 54, faster 2, not significant 2)
    (1) 1.10x slower (slower 47, faster 3, not significant 8)
    (2) 1.14x slower (slower 54, faster 1, not significant 3)
    (3) 1.08x slower (slower 43, faster 3, not significant 12)

In both x64 and x86, columns (2) and (*) appear to have similar gaps, so I would like to focus simply on the eval loop.

I built PGO with "/d2inlinestats" and "/d2inlinelogfull:_PyEval_EvalFrameDefault" according to the blog, and posted the logs. For PR25244, the log is 3x smaller than for the previous commit, and PGO rejects the 4 funcs above. I will look into it later.

Correction:
> Before the PR, it took 10x~ longer to link than without __forceinline
> function.

The current build takes 10x~ less time to link than before. Before the PR, __forceinline had no impact for me.

--
Added file: https://bugs.python.org/file50271/b98e-no-inline-in-all.diff
[issue45116] Performance regression 3.10b1 and later on Windows
Change by neonene : Added file: https://bugs.python.org/file50272/b98e-no-inline-in-eval.diff
[issue45116] Performance regression 3.10b1 and later on Windows
neonene added the comment:

@vstinner: __forceinline suggestion

Since PR25244 (mentioned above), link.exe seems to get stuck on python310.dll. Before the PR, linking took 10x~ longer with a __forceinline function than without one. I can confirm this with _Py_DECREF() and _Py_XDECREF() and a single training job (the more functions are forced and the more jobs are used, the slower the link). Have you tried __forceinline with PGO?

> I don't understand how to read the table.

The overhead field is the output of the pyperf command, not a subtraction (the answers are the same only by luck).

ex) 3.10rc1 x86 PGO:

    PGO      : pyperf compare_to 3.10a7 left
    patched  : pyperf compare_to 3.10a7 right
    overhead : pyperf compare_to right left

are

    1.15x slower (slower 52, faster 4, not significant 2)
    1.13x slower (slower 50, faster 4, not significant 4)
    1.02x slower (slower 29, faster 14, not significant 15)

> I'm not sure if PGO builds are reproducible, MSVC does not produce the same code.

Inlining (all or nothing) might be quite a special case in the hottest section. I suspect the profiler does not work well for _PyEval_EvalFrameDefault() specifically, including branch/alignment optimization. So the macro or inlining change I posted is just for measuring, not a solution.
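To illustrate why a directly measured "overhead" can land close to the difference of the two other figures: assuming each summary ratio is a geometric mean of per-benchmark ratios over the same set of benchmarks (an assumption about how the figures are aggregated, not a statement about pyperf internals), the quotient of two such means equals the geometric mean of the direct per-benchmark ratios. The timings below are made-up numbers used purely for illustration.

```python
import math
from statistics import geometric_mean

# Hypothetical per-benchmark timings in seconds (made-up numbers).
base = [1.00, 2.00, 0.50, 4.00]   # stands in for the 3.10a7 reference
left = [1.20, 2.10, 0.60, 4.40]   # stands in for the plain PGO build
right = [1.15, 2.08, 0.58, 4.36]  # stands in for the patched build


def mean_ratio(new, ref):
    """Geometric mean of per-benchmark ratios new/ref (>1 means slower)."""
    return geometric_mean([n / r for n, r in zip(new, ref)])


left_vs_base = mean_ratio(left, base)
right_vs_base = mean_ratio(right, base)
overhead = mean_ratio(left, right)  # direct comparison, like "right left"

print(f"left vs base : {left_vs_base:.3f}x")
print(f"right vs base: {right_vs_base:.3f}x")
print(f"overhead     : {overhead:.3f}x")

# The quotient of the two means matches the direct comparison; the plain
# difference is only an approximation that happens to be close for ratios
# near 1.0.
print(math.isclose(left_vs_base / right_vs_base, overhead))
print(f"difference   : {left_vs_base - right_vs_base:.3f}")
```

The identity holds only when both comparisons cover exactly the same benchmarks, which is one more reason to run the direct comparison instead of subtracting.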
[issue45116] Performance regression 3.10b1 and later on Windows
Change by neonene : Added file: https://bugs.python.org/file50264/ceval_310rc1_patched.c
[issue45116] Performance regression 3.10b1 and later on Windows
New submission from neonene :

pyperformance on Windows shows a gap between 3.10a7 and 3.10b1. The following are the ratios compared with 3.10a7 (the higher, the slower).

    Windows x64                  |  PGO   release  official-binary
    -----------------------------+----------------------------------
    20210405       | 3.10a7      |  1.00   1.24    1.00 (PGO?)
    20210408-07:58 | b98eba5     |  0.98
    20210408-10:22 | * PR25244   |  1.04
    20210503       | 3.10b1      |  1.07   1.21    1.07

    Windows x86                  |  PGO   release  official-binary
    -----------------------------+----------------------------------
    20210405       | 3.10a7      |  1.00   1.25    1.27 (release?)
    20210408-07:58 | b98eba5bc   |  1.00
    20210408-10:22 | * PR25244   |  1.11
    20210503       | 3.10b1      |  1.14   1.28    1.29

Since PR25244 (28d28e053db6b69d91c2dfd579207cd8ccbc39e7), _PyEval_EvalFrameDefault() in ceval.c seems to be left unoptimized by PGO (msvc14.29.16.10). At least the functions below are no longer inlined there at all.

    (1) _Py_DECREF()                 (from Py_DECREF, Py_CLEAR, Py_SETREF)
    (2) _Py_XDECREF()                (from Py_XDECREF, SETLOCAL)
    (3) _Py_IS_TYPE()                (from PyXXX_CheckExact)
    (4) _Py_atomic_load_32bit_impl() (from CHECK_EVAL_BREAKER)

I tried other linker options in vain, such as thread-safe-profiling, aggressive-code-generation, and /OPT:NOREF. 3.10a7 can inline these functions in the eval loop even when profiling only test_array.py.

I measured the overhead of (1)~(4) on my own build, whose eval loop uses macros instead of these functions.

    Windows x64                  |  PGO   patched  overhead in eval-loop
    -----------------------------+----------------------------------------
                   | 3.10a7      |  1.00
    20210802       | 3.10rc1     |  1.09   1.05    4% (slow 43, fast 5, same 10)
    20210831-20:42 | 863154c     |  0.95   0.90    5% (slow 48, fast 3, same 7)
                   | (3.11a0+)   |

    Windows x86                  |  PGO   patched  overhead in eval-loop
    -----------------------------+----------------------------------------
                   | 3.10a7      |  1.00
    20210802       | 3.10rc1     |  1.15   1.13    2% (slow 29, fast 14, same 15)
    20210831-20:42 | 863154c     |  1.05   1.02    3% (slow 44, fast 7, same 7)
                   | (3.11a0+)   |

--
components: C API, Interpreter Core, Windows
files: 310rc1_confirm_overhead.patch
keywords: patch
messages: 401143
nosy: Mark.Shannon, neonene, pablogsal, paul.moore, steve.dower, tim.golden, vstinner, zach.ware
priority: normal
severity: normal
status: open
title: Performance regression 3.10b1 and later on Windows
type: performance
versions: Python 3.10, Python 3.11
Added file: https://bugs.python.org/file50263/310rc1_confirm_overhead.patch
[issue44381] Allow enabling control flow guard in Windows build
neonene added the comment:

I'd like to leave my pyperformance (x64) results here.

cpython: ae5259171b8ef62165e061b9dea7ad645a5131a2 (2021-8-23)

    1) release + CFG      : 1.00x
    2) release + CFG,/Ob3 : 1.05x faster | 41 faster | 9 slower | 8 not significant
    3) release (default)  : 1.07x faster | 52 faster | 4 slower (regex_v8, regex_effbot, nbody, hexiom) | 2 not significant
    4) release + /Ob3     : 1.11x faster | 56 faster | 1 slower (regex_v8) | 1 not significant (regex_dna)
    5) PGO + CFG          : 1.15x faster | 53 faster | 2 slower (regex_dna, pidigits) | 3 not significant
    6) PGO + CFG,/Ob3     : 1.15x faster | 54 faster | 1 slower (regex_dna) | 3 not significant
    7) PGO (default)      : 1.21x faster | 56 faster | 1 slower (regex_dna) | 1 not significant (regex_effbot)
    8) PGO + /Ob3         : 1.21x faster | 57 faster | 1 slower (regex_dna) | 0 not significant

--
nosy: +neonene
[issue44878] Clumsy dispatching on interpreter entry.
neonene added the comment:

FYI, PR27727 ("Remove loop...") seems to be a bit slower than the previous commit (f08e6d1bb3c5655f184af88c6793e90908bb6338) on my Windows build (msvc14.29.16.10). pyperformance shows:

    Windows x64 PGO: 34 slower, 11 faster, 13 not significant, Geometric mean: 1.02x slower
    Windows x86 PGO: 28 slower, 17 faster, 13 not significant, Geometric mean: 1.02x slower

Reverting PR27727 on the current cpython main branch also gives a speed-up of 1-2% on average.

--
nosy: +neonene
[issue44479] Windows build doesn't regenerate some files
neonene added the comment:

When building, some pull requests trigger regeneration of test_frozenmain.h. In PGO mode, MSVC tries to call the instrumented python and stops with a "pgort140.dll not found" error. Would it be OK to run the python in the externals folder instead?
[issue44479] Windows build doesn't regenerate some files
Change by neonene : -- nosy: +neonene nosy_count: 7.0 -> 8.0 pull_requests: +25688 pull_request: https://github.com/python/cpython/pull/27146
[issue44575] Windows installer prohibits different patches for the same version
neonene added the comment:

To debug pure Python code, use embeddable Pythons in different folders and copy the Lib folder from the source archive instead of using python3.9.zip.

When using MSVC (python3*.lib), I think it's enough to install Python as follows, or to build from source.

1. Copy the installed Python to any folder.
2. Uninstall Python.
3. Install Python with a different minor version.

--
nosy: +neonene
[issue43105] [Windows] Can't import extension modules resolved via relative paths in sys.path
neonene added the comment:

After this contribution, when using a module at the root dir (maybe bad manners), are the following behaviors expected?

(1) relative drive in sys.path
    -> bytecode is not put in a __pycache__ folder.

    >>> import sys
    >>> sys.path.append('F:')  # flash device, etc...
    >>> import foo
    >>> foo.__file__
    'F:foo.py'
    >>> foo.__cached__
    'F:foo.cpython-311.pyc'

(2) absolute drive in sys.path
    -> __pycache__ is under the current dir, not absolute.

    >>> import sys
    >>> sys.path.append('F:\\')
    >>> import foo
    >>> foo.__file__
    'F:\\foo.py'
    >>> foo.__cached__
    'F:__pycache__\\foo.cpython-311.pyc'

--
nosy: +neonene
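The difference between a drive-relative path such as 'F:foo.py' and an absolute one such as 'F:\\foo.py' can be checked with the standard `ntpath` module, which applies Windows path rules on any platform. This is only a sketch of the path semantics; it does not show how importlib actually builds `__file__` or `__cached__`.

```python
import ntpath  # Windows path rules; importable on any platform

# "F:" alone means "the current directory on drive F:", so joining a file
# name onto it yields a drive-relative path without a backslash.
drive_relative = ntpath.join("F:", "foo.py")
print(drive_relative)                 # F:foo.py
print(ntpath.isabs(drive_relative))   # False

# "F:\" is the root of drive F:, so the joined result is absolute.
absolute = ntpath.join("F:\\", "foo.py")
print(absolute)                       # F:\foo.py
print(ntpath.isabs(absolute))         # True

# splitdrive() separates the drive from the rest in both cases.
print(ntpath.splitdrive(drive_relative))  # ('F:', 'foo.py')
print(ntpath.splitdrive(absolute))        # ('F:', '\\foo.py')

# If the root separator is dropped before joining, the result is again
# relative to the current directory of F:, which has the same shape as the
# __cached__ value reported in case (2).
cached_like = ntpath.join("F:", "__pycache__", "foo.cpython-311.pyc")
print(cached_like)                    # F:__pycache__\foo.cpython-311.pyc
```

Both reported `__cached__` values are drive-relative in this sense, so where the .pyc file ends up depends on the process's current directory on that drive.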
[issue42986] pegen parser: Crash on SyntaxError with f-string on Windows
neonene added the comment:

I confirmed that there is no crash with PR 24279. Thanks for the quick fix.
[issue42986] pegen parser: Crash on SyntaxError with f-string on Windows
New submission from neonene :

On Windows, Python master crashes when an f-string that has an invalid character inside the braces appears on line 3 or later. The issue seems to come from commit e5fe509054183bed9aef42c92da8407d339e8af8.

I tried the commands

    1) exec("f'{.}'")
    2) exec("\nf'{.}'")
    3) exec("\n\nf'{.}'")

and the results are

1) expected

    >>> exec("f'{.}'")
    Traceback (most recent call last):
      File "", line 1, in
      File "", line 1
        (.)
         ^
    SyntaxError: f-string: invalid syntax

2) unexpected (caret indicates nothing)

    >>> exec("\nf'{.}'")
    Traceback (most recent call last):
      File "", line 1, in
      File "", line 2

        ^
    SyntaxError: f-string: invalid syntax

3) python crashes

    >>> exec("\n\nf'{.}'")

--
components: Interpreter Core
messages: 385377
nosy: lys.nikolaou, neonene, pablogsal
priority: normal
severity: normal
status: open
title: pegen parser: Crash on SyntaxError with f-string on Windows
type: crash
versions: Python 3.10
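Because case 3 takes the whole interpreter down, it is safer to run each case in a child process and inspect the exit code. The following is a minimal sketch, assuming that a clean SyntaxError is the expected outcome for all three inputs; the helper name `run_case` and the exit-code convention are hypothetical choices for this illustration.

```python
import subprocess
import sys

# The three inputs from the report: the f-string sits on line 1, 2, and 3.
CASES = ["f'{.}'", "\nf'{.}'", "\n\nf'{.}'"]


def run_case(source):
    """Run exec(source) in a child interpreter and return its exit code.

    Convention used in this sketch (hypothetical):
      0 -> SyntaxError was raised cleanly (expected)
      2 -> exec() unexpectedly succeeded
      anything else -> another failure, for example a hard crash
    """
    child_code = (
        "try:\n"
        f"    exec({source!r})\n"
        "except SyntaxError:\n"
        "    raise SystemExit(0)\n"
        "raise SystemExit(2)\n"
    )
    proc = subprocess.run(
        [sys.executable, "-c", child_code],
        capture_output=True,
        text=True,
    )
    return proc.returncode


results = [run_case(source) for source in CASES]
for source, code in zip(CASES, results):
    status = "SyntaxError (ok)" if code == 0 else f"unexpected exit code {code}"
    print(f"{source!r} -> {status}")
```

On an affected build, the third case would be expected to report a non-zero exit code instead of a clean SyntaxError.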
[issue42846] Using _multibytecodec module on Windows, test_threading/embed get failure
New submission from neonene :

After https://github.com/python/cpython/commit/0b858cdd5d114f0890b11b6c4d6559d0ceb468ab (bpo-1635741: Convert _multibytecodec to multi-phase init), on Windows x64/x86 with a Chinese/Japanese/Korean system locale, MultibyteCodec_Check() in multibytecodec.c returns false and PyExc_TypeError follows. This affects some tests and PGO training.

1) python -m test --verbose test_threading

==
FAIL: test_daemon_threads_fatal_error (test.test_threading.SubinterpThreadingTests)
--
Traceback (most recent call last):
  File "C:\cpython-0b858\lib\test\test_threading.py", line 1124, in test_daemon_threads_fatal_error
    self.assertIn("Fatal Python error: Py_EndInterpreter: "
AssertionError: 'Fatal Python error: Py_EndInterpreter: not the last thread' not found in 'TypeError: codec is unexpected type\nFatal Python error: _PyThreadState_Delete: tstate 003FF980 is still current\nPython runtime state: initialized\n\nThread 0x0710 (most recent call first):\n\n'

2) python -m test --verbose test_embed

==
FAIL: test_audit_subinterpreter (test.test_embed.AuditingTests)
--
Traceback (most recent call last):
  File "C:\cpython-0b858\lib\test\test_embed.py", line 1433, in test_audit_subinterpreter
    self.run_embedded_interpreter("test_audit_subinterpreter")
  File "C:\cpython-0b858\lib\test\test_embed.py", line 104, in run_embedded_interpreter
    self.assertEqual(p.returncode, returncode,
AssertionError: 3221225477 != 0 : bad returncode 3221225477, stderr is 'TypeError: codec is unexpected type\nFatal Python error: _PyThreadState_Delete: tstate 0050CAF0 is still current\nPython runtime state: initialized\n\nThread 0x09d8 (most recent call first):\n\n'

==
FAIL: test_subinterps_different_ids (test.test_embed.EmbeddingTests)
--
Traceback (most recent call last):
  File "C:\cpython-0b858\lib\test\test_embed.py", line 169, in test_subinterps_different_ids
    for run in self.run_repeated_init_and_subinterpreters():
  File "C:\cpython-0b858\lib\test\test_embed.py", line 110, in run_repeated_init_and_subinterpreters
    out, err = self.run_embedded_interpreter("test_repeated_init_and_subinterpreters")
  File "C:\cpython-0b858\lib\test\test_embed.py", line 104, in run_embedded_interpreter
    self.assertEqual(p.returncode, returncode,
AssertionError: 3221225477 != 0 : bad returncode 3221225477, stderr is 'TypeError: codec is unexpected type\nFatal Python error: _PyThreadState_Delete: tstate 0041C960 is still current\nPython runtime state: initialized\n\nThread 0x0a40 (most recent call first):\n\n'

==
FAIL: test_subinterps_distinct_state (test.test_embed.EmbeddingTests)
--
Traceback (most recent call last):
  File "C:\cpython-0b858\lib\test\test_embed.py", line 177, in test_subinterps_distinct_state
    for run in self.run_repeated_init_and_subinterpreters():
  File "C:\cpython-0b858\lib\test\test_embed.py", line 110, in run_repeated_init_and_subinterpreters
    out, err = self.run_embedded_interpreter("test_repeated_init_and_subinterpreters")
  File "C:\cpython-0b858\lib\test\test_embed.py", line 104, in run_embedded_interpreter
    self.assertEqual(p.returncode, returncode,
AssertionError: 3221225477 != 0 : bad returncode 3221225477, stderr is 'TypeError: codec is unexpected type\nFatal Python error: _PyThreadState_Delete: tstate 0047C960 is still current\nPython runtime state: initialized\n\nThread 0x0b34 (most recent call first):\n\n'

==
FAIL: test_subinterps_main (test.test_embed.EmbeddingTests)
--
Traceback (most recent call last):
  File "C:\cpython-0b858\lib\test\test_embed.py", line 163, in test_subinterps_main
    for run in self.run_repeated_init_and_subinterpreters():
  File "C:\cpython-0b858\lib\test\test_embed.py", line 110, in run_repeated_init_and_subinterpreters
    out, err = self.run_embedded_interpreter("test_repeated_init_and_subinterpreters")
  File "C:\cpython-0b858\lib\test\test_embed.py", line 104, in run_embedded_interpreter
    self.assertEqual(p.returncode, returncode,
AssertionError: 3221225477 != 0 : bad returncode 3221225477, stderr is 'TypeError: cod
[issue41766] Python3.10 (x64) crashes after flake8/pyflakes on Windows
neonene added the comment:

I applied PR 21961 to master and confirmed that there is no crash. I'll close this issue. Thanks for your quick reply.

--
stage: -> resolved
status: open -> closed
[issue41766] Python3.10 (x64) crashes after flake8/pyflakes on Windows
Change by neonene : -- nosy: +pablogsal, vstinner -paul.moore, steve.dower, tim.golden, zach.ware
[issue41766] Python3.10 (x64) crashes after flake8/pyflakes on Windows
New submission from neonene :

On Python 3.10 for Windows (64-bit only), flake8 (with a .pyc cache) frequently crashes after printing its output, e.g.:

    python -m flake8 c:/Python3/Lib/turtle.py
    python -m pyflakes c:/Python3/Lib/turtle.py

I think I first encountered the crash after PR21293 was applied (bpo-41194: "Convert _ast extension to PEP 489").

--
components: Interpreter Core, Windows
messages: 376747
nosy: neonene, paul.moore, steve.dower, tim.golden, zach.ware
priority: normal
severity: normal
status: open
title: Python3.10 (x64) crashes after flake8/pyflakes on Windows
type: crash
versions: Python 3.10
[issue40958] ASAN/UBSAN: heap-buffer-overflow in pegen.c
Change by neonene : -- nosy: +christian.heimes -neonene
[issue40958] ASAN/UBSAN: heap-buffer-overflow in pegen.c
neonene added the comment:

FYI, since PR 20875/20919, MSVC (x64) has been emitting warning C4244 (conversion from 'Py_ssize_t' to 'int', possible loss of data). parse.c in particular gets more than 700 of them.

--
nosy: +neonene -christian.heimes, miss-islington
[issue40082] Assertion failure in trip_signal
neonene added the comment:

On Windows, PyGILState_GetThisThreadState() returns NULL when a ^C interrupt occurs. The value comes from the TlsGetValue() Windows API, and I don't think the OS's behavior is wrong.

In trip_signal(), the crash can be avoided by skipping PyEval_SignalReceived() if tstate is invalid, but I'm not sure whether skipping it is OK.

--
nosy: +neonene
[issue37702] memory leak in ssl certification
neonene added the comment:

I raised another PR (15632), which keeps the changes to a minimum. I hope either PR makes it into the official 3.7.5 / 3.8.0 releases.
[issue37702] memory leak in ssl certification
Change by neonene : -- pull_requests: +15300 pull_request: https://github.com/python/cpython/pull/15632
[issue37702] memory leak in ssl certification
New submission from neonene :

Windows 10/7 (x86/x64)
After issue35941 (any PR merged)

With HTTPS access, memory usage increases by about 200KB per urlopen() call and easily reaches gigabytes. I found a leak of certificate store handles in _ssl.c and made a patch, which works fine on my PC.

I guess some users are having trouble with this leak. I'm about to raise a PR, so please review it. Thanks!

--
assignee: christian.heimes
components: SSL, Windows
messages: 348600
nosy: christian.heimes, neonene, paul.moore, steve.dower, tim.golden, zach.ware
priority: normal
severity: normal
status: open
title: memory leak in ssl certification
type: resource usage
versions: Python 3.7, Python 3.8, Python 3.9
[issue35941] ssl.enum_certificates() regression
neonene added the comment:

I meant 12609. (x86, x64; Py374rc1, Py380a4 and later)

Also, although I tried merging 12610 into Py374, memory usage still increases. Sorry, I can't find the cause.
[issue37498] request.urlopen(), memory leak?
neonene added the comment: issue35941 -- resolution: -> duplicate stage: -> resolved status: open -> closed
[issue35941] ssl.enum_certificates() regression
neonene added the comment:

After this patch was applied, memory usage increases on every HTTPS access and is not released on my Win7 x64 SP1 machine. I hope this will be fixed or reverted.

(case sample)

    from urllib import request
    from time import sleep
    import gc

    while True:
        request.urlopen(request.Request('https://...'))
        gc.collect()
        sleep(2)

--
nosy: +neonene
[issue37498] request.urlopen(), memory leak?
New submission from neonene :

Python 3.8.0a4, b1, b2 (x64, x32)
Python 3.7.4rc1, rc2 (x64, x32)

On Windows 7 SP1 (x64), memory usage keeps increasing with the following code.

---
from urllib import request
from time import sleep

while True:
    req = request.Request('https://www.python.org/')
    request.urlopen(req)
    sleep(1)
---

Sorry, I'm not sure why.

--
components: Windows
messages: 347293
nosy: neonene, paul.moore, steve.dower, tim.golden, zach.ware
priority: normal
severity: normal
status: open
title: request.urlopen(), memory leak?
type: resource usage
versions: Python 3.7, Python 3.8