New submission from Serhiy Storchaka:
The regular expression parser parses a pattern into a tree, marking nodes with
string identifiers. The regular expression compiler converts this tree into a
plain list of integers, transforming the node identifiers into sequential
integers. The resulting list is not human readable. The proposed patch converts
the string constants in the sre_constants module to named integer constants.
These constants don't need to be converted to integers, because they already
are integers, and when printed they look human-friendly. Now the intermediate
result of the regular expression compiler is much more readable.
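The idea of a named integer constant can be illustrated with a minimal sketch.
This is not the code from the attached patch: the class name NamedIntConstant,
the helper make_codes, and the opcode numbering below are all hypothetical and
chosen only for illustration. The point is that an int subclass can carry a
name and return it from repr(), so a list of such values prints symbolically
while still behaving as ordinary integers.

```python
class NamedIntConstant(int):
    """An int that remembers a symbolic name and shows it when printed.

    Hypothetical illustration only; not taken from the attached patch.
    """

    def __new__(cls, value, name):
        self = super().__new__(cls, value)
        self.name = name
        return self

    def __repr__(self):
        return self.name

    __str__ = __repr__


def make_codes(names):
    """Number the given names sequentially, starting from 0."""
    return [NamedIntConstant(i, name) for i, name in enumerate(names)]


# Illustrative subset; these values are NOT the real sre opcode numbers.
FAILURE, SUCCESS, ANY, LITERAL = make_codes(
    ["FAILURE", "SUCCESS", "ANY", "LITERAL"])

code = [LITERAL, 95, FAILURE, SUCCESS]
print(code)             # [LITERAL, 95, FAILURE, SUCCESS]
print(LITERAL == 3)     # True: still compares as a plain integer
print(list(map(int, code)))  # [3, 95, 0, 1]
```

Because the constants are real integers, code that consumes the compiled list
needs no conversion step; only the printed form changes.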
Example.
>>> import re, sre_compile, sre_parse
>>> sre_compile._code(sre_parse.parse('[a-z_][a-z_0-9]+', re.I), re.I)
Before patch:
[17, 4, 0, 2, 2147483647, 16, 7, 27, 97, 122, 19, 95, 0, 29, 16, 1, 2147483647,
16, 11, 10, 0, 67043328, 2147483648, 134217726, 0, 0, 0, 0, 0, 1, 1]
After patch:
[INFO, 4, 0, 2, MAXREPEAT, IN_IGNORE, 7, RANGE, 97, 122, LITERAL, 95, FAILURE,
REPEAT_ONE, 16, 1, MAXREPEAT, IN_IGNORE, 11, CHARSET, 0, 67043328, 2147483648,
134217726, 0, 0, 0, 0, FAILURE, SUCCESS, SUCCESS]
This patch also affects the debugging output produced when a regular expression
is compiled with re.DEBUG (identifiers are uppercased and MAXREPEAT is
displayed instead of 2147483647 in repeat statements).
Apart from the debugging output, these changes are invisible to ordinary users.
They are needed only for developing and debugging the re module itself. The
patch doesn't affect performance and has almost no effect on memory
consumption.
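To inspect the re.DEBUG output mentioned above, one possible approach is to
capture it into a string, as sketched below. The variable names are
illustrative, and the exact text printed depends on the Python version, so no
expected output is shown here.

```python
import contextlib
import io
import re

# re.DEBUG makes re.compile() print a dump of the parsed pattern to stdout;
# redirect stdout so the dump can be examined as a string.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    re.compile('[a-z_][a-z_0-9]+', re.I | re.DEBUG)

debug_output = buffer.getvalue()
print(debug_output)
```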
----------
components: Regular Expressions
files: re_named_consts.patch
keywords: patch
messages: 227008
nosy: ezio.melotti, mrabarnett, pitrou, serhiy.storchaka
priority: normal
severity: normal
stage: patch review
status: open
title: Use named constants internally in the re module
type: enhancement
versions: Python 3.5
Added file: http://bugs.python.org/file36642/re_named_consts.patch
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue22434>
_______________________________________