New submission from Matt Chaput <m...@whoosh.ca>:

Several times in the recent past I've wished for the following methods on the regular expression object. These would allow me to speed up search and parsing code, by limiting the number of regex matches I need to try.

literal_prefix(): Returns any literal string at the start of the pattern (before any "special" parts). E.g., for the pattern "ab(c|d)ef" the method would return "ab". For the pattern "abc|def" the method would return "". When matching a regex against keys in a btree, this would let me limit the search to just the range of keys with the prefix.

first_chars(): Returns a string/list/set/whatever of the possible first characters that could appear at the start of a matching string. E.g., for the pattern "ab(c|d)ef" the method would return "a". For the pattern "[a-d]ef" the method would return "abcd". When parsing a string with regexes, this would let me only have to test the regexes that could match at the current character.

As long as you're making a new regex package, I thought I'd put in a request for these :)

----------
components: Regular Expressions
messages: 143266
nosy: mattchaput
priority: normal
severity: normal
status: open
title: Regex object should have introspection methods
type: feature request
versions: Python 3.3

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue12870>
_______________________________________
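
For illustration, here is a minimal sketch of the two use cases described above. Since literal_prefix() and first_chars() do not exist on compiled patterns, the prefix and the first-character set are written out by hand and passed in. The helper names (keys_with_prefix, search_keys, build_dispatch, match_at) are hypothetical, and a sorted list searched with bisect stands in for the btree.

```python
import bisect
import re


def keys_with_prefix(sorted_keys, prefix):
    """Return the contiguous run of sorted_keys that start with prefix."""
    lo = bisect.bisect_left(sorted_keys, prefix)
    hi = lo
    # Keys sharing a prefix sit next to each other in sorted order.
    while hi < len(sorted_keys) and sorted_keys[hi].startswith(prefix):
        hi += 1
    return sorted_keys[lo:hi]


def search_keys(pattern, literal_prefix, sorted_keys):
    """Match pattern only against keys in the prefix range.

    literal_prefix is supplied by the caller here; the requested
    literal_prefix() method would compute it from the pattern.
    """
    regex = re.compile(pattern)
    candidates = keys_with_prefix(sorted_keys, literal_prefix)
    return [key for key in candidates if regex.match(key)]


def build_dispatch(rules):
    """Map each possible first character to the regexes that can start there.

    rules is a list of (name, pattern, first_chars) tuples. first_chars is
    hand-written here; the requested first_chars() method would compute it.
    """
    table = {}
    for name, pattern, first_chars in rules:
        compiled = re.compile(pattern)
        for ch in first_chars:
            table.setdefault(ch, []).append((name, compiled))
    return table


def match_at(table, text, pos):
    """Try only the regexes whose first-character set contains text[pos]."""
    if pos >= len(text):
        return None
    for name, regex in table.get(text[pos], ()):
        m = regex.match(text, pos)
        if m:
            return name, m.group()
    return None


keys = sorted(["aacef", "abcef", "abdef", "abxef", "zzz"])
found = search_keys("ab(c|d)ef", "ab", keys)

table = build_dispatch([
    ("first", "ab(c|d)ef", "a"),
    ("second", "[a-d]ef", "abcd"),
])
token = match_at(table, "xxdef", 2)
```

In this sketch search_keys runs the regex against the three keys that begin with "ab" rather than all five, and match_at skips every pattern outright when the current character is not in any first-character set.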