[issue37723] important performance regression on regular expression parsing

2019-08-04 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Thank you for your contribution yannvgn. -- resolution: -> fixed stage: patch review -> resolved status: open -> closed ___ Python tracker ___

[issue37723] important performance regression on regular expression parsing

2019-07-31 Thread miss-islington
miss-islington added the comment: New changeset 77fcccb5321137456549b7f55b819f2c8a4c78a4 by Miss Islington (bot) in branch '3.8': bpo-37723: Fix performance regression on regular expression parsing. (GH-15030) https://github.com/python/cpython/commit/77fcccb5321137456549b7f55b819f2c8a4c78a4

[issue37723] important performance regression on regular expression parsing

2019-07-31 Thread miss-islington
miss-islington added the comment: New changeset 33b700ba8cbb128519442eeed8c8747ff73f4524 by Miss Islington (bot) in branch '3.7': bpo-37723: Fix performance regression on regular expression parsing. (GH-15030) https://github.com/python/cpython/commit/33b700ba8cbb128519442eeed8c8747ff73f4524

[issue37723] important performance regression on regular expression parsing

2019-07-31 Thread miss-islington
Change by miss-islington : -- pull_requests: +14809 pull_request: https://github.com/python/cpython/pull/15060 ___ Python tracker ___ __

[issue37723] important performance regression on regular expression parsing

2019-07-31 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: New changeset 9f1f3df238e58315e724e50cb0d574d75b94 by Serhiy Storchaka (yannvgn) in branch 'master': bpo-37723: Fix performance regression on regular expression parsing. (GH-15030) https://github.com/python/cpython/commit/9f1f3df238e58315e724e50cb0d

[issue37723] important performance regression on regular expression parsing

2019-07-31 Thread miss-islington
Change by miss-islington : -- pull_requests: +14808 pull_request: https://github.com/python/cpython/pull/15059 ___ Python tracker ___ __

[issue37723] important performance regression on regular expression parsing

2019-07-31 Thread yannvgn
yannvgn added the comment: Hey Matthew, we decided to go for this, which is simpler and straightforward: def _uniq(items): return list(dict.fromkeys(items)) (see https://github.com/python/cpython/pull/15030) -- ___ Python tracker

[issue37723] important performance regression on regular expression parsing

2019-07-31 Thread Matthew Barnett
Matthew Barnett added the comment: I've just had a look at _uniq, and the code surprises me. The obvious way to detect duplicates is with a set, but that requires the items to be hashable. Are they? Well, the first line of the function uses 'set', so they are. Why, then, isn't it using a se

[issue37723] important performance regression on regular expression parsing

2019-07-31 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Oh, this is convincing. -- ___ Python tracker ___ ___ Python-bugs-list mailing list Unsubscribe

[issue37723] important performance regression on regular expression parsing

2019-07-31 Thread yannvgn
yannvgn added the comment: > Indeed, it was not expected that the character set contains hundreds of > thousands items. What is its size in your real code? > Could you please show benchmarking results for different implementations and > different sizes? I can't precisely answer that, but sa

[issue37723] important performance regression on regular expression parsing

2019-07-31 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Indeed, it was not expected that the character set contains hundreds of thousands items. What is its size in your real code? Could you please show benchmarking results for different implementations and different sizes? --

[issue37723] important performance regression on regular expression parsing

2019-07-30 Thread Serhiy Storchaka
Change by Serhiy Storchaka : -- nosy: +serhiy.storchaka ___ Python tracker ___ ___ Python-bugs-list mailing list Unsubscribe: https

[issue37723] important performance regression on regular expression parsing

2019-07-30 Thread yannvgn
Change by yannvgn : -- keywords: +patch pull_requests: +14788 stage: -> patch review pull_request: https://github.com/python/cpython/pull/15030 ___ Python tracker ___

[issue37723] important performance regression on regular expression parsing

2019-07-30 Thread yannvgn
New submission from yannvgn : On complex cases, parsing regular expressions takes much, much longer on Python >= 3.7 Example (ipython): In [1]: import re In [2]: char_list = ''.join([chr(i) for i in range(0x)]) In [3]: long_char_list = char_list * 10 In [4]: pattern = f'[{re.escape(long_c