[issue23573] Avoid redundant allocations in str.find and like
Roundup Robot added the comment:
New changeset 311a4d28631b by Serhiy Storchaka in branch '3.5': Issue #23573: Restored optimization of bytes.rfind() and bytearray.rfind() https://hg.python.org/cpython/rev/311a4d28631b
New changeset c06410c68217 by Serhiy Storchaka in branch 'default': Issue #23573: Restored optimization of bytes.rfind() and bytearray.rfind() https://hg.python.org/cpython/rev/c06410c68217
-- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue23573 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue23573] Avoid redundant allocations in str.find and like
Changes by Serhiy Storchaka storch...@gmail.com:
-- resolution: -> fixed; stage: patch review -> resolved; status: open -> closed; versions: +Python 3.6
[issue23573] Avoid redundant allocations in str.find and like
Serhiy Storchaka added the comment: Here is a patch that restores the optimization of bytes.rfind() and bytearray.rfind() with a 1-byte argument on Linux (it also reverts bc1a178b3bc8).
-- nosy: +christian.heimes; resolution: fixed -> ; stage: resolved -> patch review
Added file: http://bugs.python.org/file39944/issue23573_bytes_rfind_memrchr.patch
[issue23573] Avoid redundant allocations in str.find and like
Serhiy Storchaka added the comment: Many thanks, Victor, for fixing the crashes. Unfortunately I couldn't reproduce a crash on my computers; perhaps it was 64-bit only. Yes, I'll look at how the code can be optimized.
[issue23573] Avoid redundant allocations in str.find and like
Serhiy Storchaka added the comment: Looks as this patch makes buildbots crash.
-- status: closed -> open
[issue23573] Avoid redundant allocations in str.find and like
STINNER Victor added the comment:
> Looks as this patch makes buildbots crash.
Yep. It took me some minutes to find that the crash was caused by this issue :-p
http://buildbot.python.org/all/builders/AMD64%20Windows7%20SP1%203.x/builds/5930/steps/test/logs/stdio
...
[117/393/1] test_bigmem
Assertion failed: 0, file c:\buildbot.python.org\3.x.kloth-win64\build\objects\stringlib/fastsearch.h, line 76
Fatal Python error: Aborted
Current thread 0x10ec (most recent call first):
  File C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\test_bigmem.py, line 294 in test_rfind
  File C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\support\__init__.py, line 1641 in wrapper
  File C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\case.py, line 577 in run
  File C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\case.py, line 625 in __call__
  File C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\suite.py, line 122 in run
  File C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\suite.py, line 84 in __call__
  File C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\suite.py, line 122 in run
  File C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\suite.py, line 84 in __call__
  File C:\buildbot.python.org\3.x.kloth-win64\build\lib\unittest\runner.py, line 176 in run
  File C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\support\__init__.py, line 1773 in _run_suite
  File C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\support\__init__.py, line 1807 in run_unittest
  File C:\buildbot.python.org\3.x.kloth-win64\build\lib\test\test_bigmem.py, line 1252 in test_main
...
-- nosy: +haypo
[issue23573] Avoid redundant allocations in str.find and like
Roundup Robot added the comment: New changeset 3ac58de829ef by Victor Stinner in branch 'default': Issue #23573: Fix bytes.rfind() and bytearray.rfind() on Windows https://hg.python.org/cpython/rev/3ac58de829ef
[issue23573] Avoid redundant allocations in str.find and like
STINNER Victor added the comment: The problem is that Windows has no memrchr() function, and so fastsearch_memchr_1char() only supports FAST_SEARCH on Windows.
[issue23573] Avoid redundant allocations in str.find and like
STINNER Victor added the comment: It looks like fastsearch_memchr_1char() manipulates pointers for memory alignment. That is not necessary when looking for ASCII or Latin1 characters, or for bytes. I propose adding a new fastsearch_memchr_1byte() function which would be used by bytes and bytearray, but also by str for ASCII and Latin1 strings. Are you interested in implementing this idea, Serhiy? For Windows, which has no memrchr(), the code can be a simple loop.
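To illustrate the "simple loop" fallback mentioned above, here is a minimal Python sketch of a reverse single-byte search. It is only a model of the idea, not the C code from any changeset on this issue; the name rfind_byte and its signature are hypothetical, and it assumes 0 <= start <= end <= len(data).

```python
def rfind_byte(data: bytes, byte: int, start: int, end: int) -> int:
    """Return the highest index i with start <= i < end and data[i] == byte.

    Hypothetical model of a memrchr()-style fallback: scan backwards one
    byte at a time and stop at the first match. Returns -1 if not found.
    Assumes 0 <= start <= end <= len(data).
    """
    i = end - 1
    while i >= start:
        if data[i] == byte:
            return i
        i -= 1
    return -1


if __name__ == "__main__":
    data = b"abcabc"
    # Compare the sketch with the built-in for a 1-byte argument.
    print(rfind_byte(data, ord("b"), 0, len(data)), data.rfind(b"b"))
```

Because the argument is a single byte, no alignment handling is needed; the loop only compares one byte per step.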
[issue23573] Avoid redundant allocations in str.find and like
Changes by Serhiy Storchaka storch...@gmail.com:
-- assignee: -> serhiy.storchaka; resolution: -> fixed; stage: patch review -> resolved; status: open -> closed
[issue23573] Avoid redundant allocations in str.find and like
Roundup Robot added the comment: New changeset 6db9d7c1be29 by Serhiy Storchaka in branch 'default': Issue #23573: Increased performance of string search operations (str.find, https://hg.python.org/cpython/rev/6db9d7c1be29
-- nosy: +python-dev
[issue23573] Avoid redundant allocations in str.find and like
New submission from Serhiy Storchaka:

Currently, str.find() and similar methods can make a copy of self or of the searched-for string if the two have different kinds. In some cases this is redundant, because the result can be known before trying to search: a longer string can't be found in a shorter string, and a wider string can't be found in a narrower string. The proposed patch avoids creating temporary widened copies in such corner cases. It also adds special cases for searching for 1-character strings.

Some sample microbenchmark results:

$ ./python -m timeit -s "a = 'x'; b = 'x\U00012345'" -- "b.find(a)"
Unpatched: 100 loops, best of 3: 1.92 usec per loop
Patched: 100 loops, best of 3: 1.03 usec per loop

$ ./python -m timeit -s "a = 'x'; b = 'x\U00012345'" -- "a in b"
Unpatched: 100 loops, best of 3: 0.543 usec per loop
Patched: 100 loops, best of 3: 0.25 usec per loop

$ ./python -m timeit -s "a = '\U00012345'; b = 'x'*1000" -- "b.find(a)"
Unpatched: 10 loops, best of 3: 4.58 usec per loop
Patched: 100 loops, best of 3: 0.969 usec per loop

$ ./python -m timeit -s "a = 'x'*1000; b = '\U00012345'" -- "b.find(a)"
Unpatched: 10 loops, best of 3: 3.77 usec per loop
Patched: 100 loops, best of 3: 0.97 usec per loop

$ ./python -m timeit -s "a = 'x'*1000; b = '\U00012345'" -- "a in b"
Unpatched: 10 loops, best of 3: 2.4 usec per loop
Patched: 100 loops, best of 3: 0.225 usec per loop

--
components: Interpreter Core
files: str_find_faster.patch
keywords: patch
messages: 237137
nosy: serhiy.storchaka
priority: normal
severity: normal
stage: patch review
status: open
title: Avoid redundant allocations in str.find and like
type: performance
versions: Python 3.5
Added file: http://bugs.python.org/file38315/str_find_faster.patch
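The two early exits described in the submission can be modelled in pure Python. This is only a sketch of the reasoning, not the patch itself (which is C code inside CPython); the helper names kind and find_with_shortcuts are hypothetical, and kind approximates the per-character storage width (1, 2 or 4 bytes) from the largest code point in the string.

```python
def kind(s: str) -> int:
    """Approximate storage width in bytes per character: 1, 2 or 4.

    Hypothetical helper: derived from the largest code point in s.
    An empty string is treated as the narrowest kind.
    """
    if not s:
        return 1
    top = max(map(ord, s))
    if top < 0x100:
        return 1
    if top < 0x10000:
        return 2
    return 4


def find_with_shortcuts(haystack: str, needle: str) -> int:
    """Model of a find() that answers the corner cases without searching."""
    # A longer string can't be found in a shorter string.
    if len(needle) > len(haystack):
        return -1
    # A wider string can't be found in a narrower string: the needle
    # contains a character that the haystack cannot contain.
    if kind(needle) > kind(haystack):
        return -1
    # Otherwise fall back to the normal search.
    return haystack.find(needle)


if __name__ == "__main__":
    print(find_with_shortcuts("x" * 1000, "\U00012345"))
    print(find_with_shortcuts("\U00012345", "x" * 1000))
    print(find_with_shortcuts("x\U00012345", "x"))
```

In both corner cases the result is known from the lengths and kinds alone, so no widened temporary copy would be needed.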