Hi Mike,

On 26 August 2014 04:36, Mike Kaplinskiy <mike.kaplins...@gmail.com> wrote:
> The regex library I meant is that very one. Named lists are a feature there
> but not in cpython's or pypy's re.

The regular expression library is a bit special inside PyPy: its core engine has to be written as RPython code in order to benefit from a regular-expression-aware JIT. (If we wrote it in pure Python, it would be significantly slower.) This core is a bytecode interpreter (a different one from Python's, obviously) in a module called "_sre" --- same name as the corresponding C module in CPython. When Python code does "import re", on either PyPy or CPython, it is also importing some pure Python code for the re.compile() part; only the execution of the compiled regular expressions is done by "_sre".

The best approach would likely be to add new bytecodes to the same core engine, for example to support the named lists. These new bytecodes would never be produced by the pure Python parts of the "re" module, so they wouldn't have any impact on it.

Then you can write or adapt a pure Python "regex" module. It would compile regex-compatible extended regular expressions down to a format that can be used by the same core engine --- using the extra bytecodes as well.

If you end up supporting the complete "regex" syntax this way, then we'd be happy to distribute it included inside PyPy, as a pre-installed module (or, depending on how it turns out, as a separate module that needs to be pip-installed --- but it looks saner to include it with PyPy anyway, given that it depends on changes to PyPy's own built-in "_sre" module).

A bientôt,

Armin.
_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
https://mail.python.org/mailman/listinfo/pypy-dev
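
As a rough illustration of the split described in the message above (a pure Python "compiler" feeding a small bytecode engine, plus one extra opcode for named lists), here is a toy sketch. It is not PyPy's "_sre" and does not use its real opcode numbers or bytecode layout; every name in it (LITERAL, ANY, NAMED_LIST, SUCCESS, compile_toy, match_at) is hypothetical and chosen only for this example.

```python
# Toy opcodes. These are assumptions for illustration only and do not
# correspond to the opcodes used by CPython's or PyPy's "_sre".
LITERAL = 0      # match one specific character (argument: the character)
ANY = 1          # match any single character
NAMED_LIST = 2   # the "new" opcode: match any string from a named list
SUCCESS = 3      # end of program, the match succeeded


def compile_toy(items):
    """Hypothetical pure Python 'compiler': items -> flat bytecode list."""
    code = []
    for kind, arg in items:
        if kind == "lit":
            for ch in arg:
                code += [LITERAL, ch]
        elif kind == "any":
            code.append(ANY)
        elif kind == "list":
            code += [NAMED_LIST, arg]
        else:
            raise ValueError("unknown item kind: %r" % (kind,))
    code.append(SUCCESS)
    return code


def match_at(code, pc, text, pos, named_lists):
    """Toy 'core engine': return the end position of a match, or None."""
    op = code[pc]
    if op == SUCCESS:
        return pos
    if op == LITERAL:
        if pos < len(text) and text[pos] == code[pc + 1]:
            return match_at(code, pc + 2, text, pos + 1, named_lists)
        return None
    if op == ANY:
        if pos < len(text):
            return match_at(code, pc + 1, text, pos + 1, named_lists)
        return None
    if op == NAMED_LIST:
        # Try longer alternatives first; backtrack if the rest fails.
        words = sorted(named_lists[code[pc + 1]], key=len, reverse=True)
        for word in words:
            if text.startswith(word, pos):
                end = match_at(code, pc + 2, text, pos + len(word),
                               named_lists)
                if end is not None:
                    return end
        return None
    raise ValueError("unknown opcode: %r" % (op,))


lists = {"fruit": ["apple", "apples", "pear"]}
code = compile_toy([("lit", "I like "), ("list", "fruit"), ("any", None)])
print(match_at(code, 0, "I like apples!", 0, lists))
print(match_at(code, 0, "I like plums!", 0, lists))
```

A compiler that never emits NAMED_LIST is unaffected by the extra branch in match_at, which is the property the message relies on for leaving the existing "re" module untouched.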