03.10.17 06:29, INADA Naoki wrote:
Before deferring re.compile, can we make it faster?
I profiled `import string`, and a small optimization can make it 2x faster!
(but it's not backward compatible)
Please open an issue for this.
I found:
* RegexFlag.__and__ and __new__ are called very often.
* _optimize_charset is slow because of re.UNICODE | re.IGNORECASE
diff --git a/Lib/sre_compile.py b/Lib/sre_compile.py
index 144620c6d1..7c662247d4 100644
--- a/Lib/sre_compile.py
+++ b/Lib/sre_compile.py
@@ -582,7 +582,7 @@ def isstring(obj):
 def _code(p, flags):
-    flags = p.pattern.flags | flags
+    flags = int(p.pattern.flags) | int(flags)
     code = []
     # compile info block
Maybe cast flags to int earlier, in sre_compile.compile()?
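
For illustration only, a minimal sketch of why the int() cast matters: combining two re flags goes through the enum machinery and produces a RegexFlag instance, while combining their int values is plain integer arithmetic with the same numeric result. The variable names are made up for this example and are not part of sre_compile.

```python
import re

# Combining flag members directly calls RegexFlag's operator methods
# and returns another RegexFlag instance.
combined = re.IGNORECASE | re.UNICODE

# Casting to int first keeps the whole computation in plain integers.
plain = int(re.IGNORECASE) | int(re.UNICODE)

print(type(combined).__name__, type(plain).__name__)
print(combined == plain)  # same numeric value either way
```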
diff --git a/Lib/string.py b/Lib/string.py
index b46e60c38f..fedd92246d 100644
--- a/Lib/string.py
+++ b/Lib/string.py
@@ -81,7 +81,7 @@ class Template(metaclass=_TemplateMetaclass):
     delimiter = '$'
     idpattern = r'[_a-z][_a-z0-9]*'
     braceidpattern = None
-    flags = _re.IGNORECASE
+    flags = _re.IGNORECASE | _re.ASCII
     def __init__(self, template):
         self.template = template
patched:
import time: 1191 | 8479 | string
Of course, this patch is not backward compatible: [a-z] no longer matches
'ı' or 'ſ'.
But who cares?
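
The incompatibility can be reproduced with the re module alone. This is a sketch that uses the idpattern from the diff; the helper name `matches` is invented for the example.

```python
import re

IDPATTERN = r'[_a-z][_a-z0-9]*'  # idpattern taken from the diff above


def matches(text, flags):
    """Return True if the whole text matches IDPATTERN under the given flags."""
    return re.fullmatch(IDPATTERN, text, flags) is not None


unicode_flags = re.IGNORECASE           # current Template.flags
ascii_flags = re.IGNORECASE | re.ASCII  # flags proposed in the patch

# U+0131 (dotless i) and U+017F (long s) case-fold to ASCII letters,
# so they match [a-z] only when Unicode case-insensitive matching is on.
for ch in ('\u0131', '\u017f'):
    print(repr(ch), matches(ch, unicode_flags), matches(ch, ascii_flags))
```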
This looks like a bug fix. I'm wondering whether it is worth backporting
to 3.6. But the change itself can break user code that changes
idpattern without touching flags. There is another way, but it should be
discussed on the bug tracker.
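
To make the concern concrete, here is a hedged sketch of how a changed idpattern could be affected. The Cyrillic pattern is a hypothetical user override, not something taken from real code. With re.ASCII added, case-insensitive matching applies only to ASCII letters, so uppercase non-ASCII names stop matching.

```python
import re

# Hypothetical idpattern a user might set on a Template subclass.
custom_idpattern = r'[а-я]+'


def accepts(name, flags):
    """Return True if name fully matches the custom idpattern."""
    return re.fullmatch(custom_idpattern, name, flags) is not None


old_flags = re.IGNORECASE             # flags before the patch
new_flags = re.IGNORECASE | re.ASCII  # flags after the patch

for name in ('имя', 'ИМЯ'):
    print(name, accepts(name, old_flags), accepts(name, new_flags))
```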