[issue24821] The optimization of string search can cause pessimization

2017-03-29 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: resolution: -> fixed; stage: patch review -> resolved; status: open -> closed

[issue24821] The optimization of string search can cause pessimization

2017-03-29 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: New changeset 0a58f72762353768c7d26412e627ff196aac6c4e by Serhiy Storchaka in branch 'master': bpo-24821: Fixed the slowing down to 25 times in the searching of some (#505) https://github.com/python/cpython/commit/0a58f72762353768c7d26412e627ff196aac6c4e

[issue24821] The optimization of string search can cause pessimization

2017-03-29 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Thank you for testing, Louie. Thank you for your review and testing, Xiang. > Would it completely kill performance to remove the optimization? Yes. The optimization is not used if the lowest byte is 0. You can try to search "\0" for getting the result without

[issue24821] The optimization of string search can cause pessimization

2017-03-29 Thread STINNER Victor
STINNER Victor added the comment: Would it completely kill performance to remove the optimization?

[issue24821] The optimization of string search can cause pessimization

2017-03-29 Thread Ma Lin
Changes by Ma Lin: nosy: +Ma Lin

[issue24821] The optimization of string search can cause pessimization

2017-03-29 Thread Xiang Zhang
Xiang Zhang added the comment: I can't give a "realistic" example. A more meaningful example may be: '。', '\u3002', the Chinese period, which is used in almost every paragraph, and '地', '\u5730', which is a commonly used word. ./python3 -m perf timeit -s 's = "你好,我叫李雷。"*1000' 's.find("地")' Mean +- std d

[issue24821] The optimization of string search can cause pessimization

2017-03-29 Thread Louie Lu
Louie Lu added the comment: For now I can only test on Python 3.6, using a more meaningful sentence; I am still trying to use perf on the CPython master branch. --- $ python -m perf timeit -s 's="一件乒乓事事亏, 不乏串連产業, 万丈一争今为举, 其乎哀哉"*1000' -- 's.find("乎")' Median

[issue24821] The optimization of string search can cause pessimization

2017-03-29 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Ukrainian "Є" is not the only character that suffers from this issue. I suppose many CJK characters are affected too. The problem occurs when the lower byte of the searched character matches the upper byte of most characters in the text. For example, searching
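The byte-level collision described in the comment above can be illustrated with a rough model. This is only a sketch, not CPython's implementation: it assumes every character is in the Basic Multilingual Plane and uses a little-endian UTF-16 encoding as a stand-in for the internal 2-byte string representation. The helper name `memchr_false_hits` is hypothetical.

```python
def memchr_false_hits(needle: str, text: str) -> int:
    """Count byte positions where a scan for the needle's low byte
    would stop without finding a real character match.

    Assumption: all characters are in the BMP, so UTF-16-LE gives
    exactly two bytes per character (low byte first).
    """
    data = text.encode("utf-16-le")
    needle_bytes = needle.encode("utf-16-le")
    target = ord(needle) & 0xFF  # low byte of the searched character

    # Every byte equal to the target is a candidate a byte scan would report.
    hits = [i for i, b in enumerate(data) if b == target]
    # A real match must start on a character boundary and match both bytes.
    real = [i for i in hits if i % 2 == 0 and data[i:i + 2] == needle_bytes]
    return len(hits) - len(real)


if __name__ == "__main__":
    text = "АБВГД" * 10
    for needle in ("є", "Є"):
        print(f"U+{ord(needle):04X}: {memchr_false_hits(needle, text)} false hits")
```

In this model, "Є" (U+0404) has low byte 0x04, which equals the high byte of every character in the Cyrillic sample, so each character produces a false candidate. "є" (U+0454) has low byte 0x54 and produces none.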

[issue24821] The optimization of string search can cause pessimization

2017-03-06 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: pull_requests: +413

[issue24821] The optimization of string search can cause pessimization

2017-03-06 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: versions: +Python 3.7 -Python 3.6

[issue24821] The optimization of string search can cause pessimization

2016-09-12 Thread Ned Deily
Ned Deily added the comment: If it has waited this long and is truly low priority, I think it can wait a little longer until 3.7.

[issue24821] The optimization of string search can cause pessimization

2016-09-12 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: nosy: +ned.deily

[issue24821] The optimization of string search can cause pessimization

2016-09-12 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I would commit the patch before beta 1, but: 1) The issue can be considered not a new feature but a fix for a performance bug (we just won't backport the fix to 3.5). I think the patch can be committed at the beta stage. 2) I would want to tune

[issue24821] The optimization of string search can cause pessimization

2016-09-06 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Could you please review this, Victor?

[issue24821] The optimization of string search can cause pessimization

2016-06-21 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Ping.

[issue24821] The optimization of string search can cause pessimization

2015-11-14 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: > Could you please push this part of the change? It looks good to me. Done. The patch that makes the optimization now looks much simpler. > The C library provides a wmemchr() function which can be used to search for a > wchar_t character inside a wchar_t* stri

[issue24821] The optimization of string search can cause pessimization

2015-11-14 Thread Roundup Robot
Roundup Robot added the comment: New changeset 1412be96faf0 by Serhiy Storchaka in branch 'default': Issue #24821: Refactor STRINGLIB(fastsearch_memchr_1char) and split it on https://hg.python.org/cpython/rev/1412be96faf0 -- nosy: +python-dev

[issue24821] The optimization of string search can cause pessimization

2015-11-14 Thread STINNER Victor
STINNER Victor added the comment: > The patch also makes a little refactoring. STRINGLIB(fastsearch_memchr_1char) > now is renamed and split on two functions STRINGLIB(find_char) and > STRINGLIB(rfind_char) with simpler interface. Could you please push this part of the change? It looks good to

[issue24821] The optimization of string search can cause pessimization

2015-11-13 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: The proposed patch makes the degenerate case less costly while preserving the optimization for the common case. $ ./python -m timeit -s 's = "АБВГД"*10**5' -- 's.find("є")' 1000 loops, best of 3: 330 usec per loop $ ./python -m timeit -s 's = "АБВГД"*10**5' -- 's.rfind(

[issue24821] The optimization of string search can cause pessimization

2015-10-06 Thread Adam
Changes by Adam: nosy: +azsorkin

[issue24821] The optimization of string search can cause pessimization

2015-09-10 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I was going to provide another optimization (I had an idea), but it is not as easy as it looked to me at first glance. This issue exists rather as a reminder to me. I should make a second attempt. -- resolution: -> remind

[issue24821] The optimization of string search can cause pessimization

2015-09-10 Thread STINNER Victor
STINNER Victor added the comment: > I think we should use more robust optimization. What do you propose? I don't understand the purpose of the issue. Do you want to remove the optimization?

[issue24821] The optimization of string search can cause pessimization

2015-08-06 Thread Serhiy Storchaka
New submission from Serhiy Storchaka: Search in strings is highly optimized for the common case. However, for some input data the search in a non-ASCII string becomes unexpectedly slow. Compare: $ ./python -m timeit -s 's = "АБВГД"*10**4' -- '"є" in s' 10 loops, best of 3: 11.7 usec per loop $ ./p
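A comparison of this kind can also be scripted with the standard `timeit` module instead of the command line. The sketch below is only an illustration: the helper name `time_search` is hypothetical, and the second needle "Є" is an assumption taken from a later comment in this thread, since the second command in the submission is cut off. Results depend on the Python version and on whether the fix is present, so no expected timings are given.

```python
import timeit


def time_search(needle: str, text: str, number: int = 200) -> float:
    """Return the best observed time in seconds for one `needle in text` test."""
    timer = timeit.Timer(lambda: needle in text)
    # Take the minimum over several repeats to reduce scheduling noise.
    return min(timer.repeat(repeat=3, number=number)) / number


if __name__ == "__main__":
    s = "АБВГД" * 10**4  # same haystack as in the submission
    for needle in ("є", "Є"):  # "Є" is an assumed second test case
        usec = time_search(needle, s) * 1e6
        print(f"{needle!r} (U+{ord(needle):04X}): {usec:.1f} usec per search")
```

Neither needle occurs in the haystack, so both searches scan the whole string and any difference comes from the search strategy rather than from an early match.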