On Jul 27, 11:34 am, MRAB <pyt...@mrabarnett.plus.com> wrote:
> I've been working on a new implementation of the re module.

Fabulous! If you're extending or changing the interface, there are a few sore points in the current implementation I'd love to see addressed:

- findall/finditer don't find overlapping matches. Sometimes you really *do* want to know all possible matches, even if they overlap. This comes up in bioinformatics, for example.

- split won't split on patterns that match the empty string, e.g. empty lookahead patterns. This means that it can't be used for a whole class of interesting cases. This has been discussed previously:

  http://bugs.python.org/issue3262
  http://bugs.python.org/issue852532
  http://bugs.python.org/issue988761

- It'd be nice to have a version of split that generates the parts (one by one) rather than returning the whole list.

- Repeated subgroup match information is not available. That is, for a match like this

    re.match('(.){3}', 'xyz')

  there's no way to discover that the subgroup first matched 'x', then matched 'y', and finally matched 'z'. Here is one past proposal (mine), perhaps over-complex, to address this problem:

  http://mail.python.org/pipermail/python-dev/2004-August/047238.html

Mike
-- 
http://mail.python.org/mailman/listinfo/python-list
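
To make the points above concrete, here is a rough sketch of how the first three can be approximated with the existing re module, plus a one-line illustration of the fourth. The helper names find_overlapping and iter_split are hypothetical and are not part of re. The lookahead trick reports at most one match per starting position, so it is a partial workaround rather than a true "all possible matches" search, and iter_split ignores capturing groups, unlike re.split.

```python
import re


def find_overlapping(pattern, text):
    """Yield matches of `pattern` in `text`, including overlapping ones.

    Hypothetical helper: the pattern is wrapped in a capturing lookahead,
    so each match consumes no characters and the scan can find another
    match starting at the very next position. Numbered backreferences
    inside `pattern` would be shifted by one because of the extra group.
    """
    for m in re.finditer(f'(?=({pattern}))', text):
        yield m.group(1)


def iter_split(pattern, text):
    """Lazily yield the pieces of `text` between matches of `pattern`.

    Hypothetical helper: built on finditer, so zero-width matches
    (such as a bare lookahead) act as split points too.
    """
    start = 0
    for m in re.finditer(pattern, text):
        yield text[start:m.start()]
        start = m.end()
    yield text[start:]


if __name__ == '__main__':
    # Overlapping occurrences of a short motif in a sequence.
    print(list(find_overlapping('ATA', 'ATATATA')))

    # Splitting before each capital letter, a zero-width split point.
    print(list(iter_split(r'(?=[A-Z])', 'oneTwoThree')))

    # Only the final repetition of a repeated group is retained.
    print(re.match('(.){3}', 'xyz').group(1))
```

Because iter_split is a generator, a caller can stop after the first few pieces without scanning the rest of the string.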