New submission from Matt Chaput <m...@whoosh.ca>:

Several times in the recent past I've wished for the following methods on the
compiled regular expression object. They would let me speed up search and
parsing code by limiting the number of regex matches I need to try.

literal_prefix(): Returns any literal string at the start of the pattern 
(before any "special" parts). E.g., for the pattern "ab(c|d)ef" the method 
would return "ab". For the pattern "abc|def" the method would return "". When 
matching a regex against keys in a btree, this would let me limit the search to 
just the range of keys with the prefix.
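
To make the btree use case concrete, here is a minimal sketch, not an existing API. It assumes CPython's undocumented internal pattern parser (`re._parser`, named `sre_parse` before Python 3.11), which may change between versions. The names `literal_prefix` and `prefix_range` are hypothetical, a sorted list stands in for the btree, and flags such as re.IGNORECASE are ignored.

```python
import bisect
import re

try:                       # Python 3.11 and later
    import re._parser as sre_parse
except ImportError:        # earlier versions
    import sre_parse


def literal_prefix(pattern):
    """Hypothetical helper: leading literal text of a pattern (flags ignored)."""
    chars = []
    # The parsed pattern is a sequence of (opcode, argument) pairs.
    for op, arg in sre_parse.parse(pattern):
        if op != sre_parse.LITERAL:
            break          # first "special" part ends the prefix
        chars.append(chr(arg))
    return "".join(chars)


def prefix_range(sorted_keys, prefix):
    """Return (lo, hi) so that sorted_keys[lo:hi] holds the keys with prefix."""
    if not prefix:
        return 0, len(sorted_keys)
    lo = bisect.bisect_left(sorted_keys, prefix)
    # Smallest string greater than every key with this prefix
    # (assumes the last character is not U+10FFFF).
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    hi = bisect.bisect_left(sorted_keys, upper)
    return lo, hi


pattern = "ab(c|d)ef"
keys = sorted(["aa", "abcef", "abdef", "abz", "ac", "b"])
lo, hi = prefix_range(keys, literal_prefix(pattern))
# Only the keys in the prefix range are tested against the regex.
matches = [k for k in keys[lo:hi] if re.fullmatch(pattern, k)]
print(matches)  # ['abcef', 'abdef']
```

A prefix that is shorter than the true one only widens the range, so a conservative answer is still safe for this use.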

first_chars(): Returns a string, list, or set (the exact type doesn't matter)
of the characters that could appear at the start of a matching string. E.g.,
for the pattern "ab(c|d)ef" the method would return "a". For the pattern
"[a-d]ef" the method would return "abcd". When parsing a string with regexes,
this would let me test only the regexes that could match at the current
character.
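
The parsing use case can be sketched the same way. This is an illustration under assumptions, not an existing method: it relies on CPython's undocumented internal parser (`re._parser`, or `sre_parse` before Python 3.11), ignores flags, and returns None whenever it cannot tell, meaning "any character is possible". The names `first_chars`, `build_dispatch`, and `candidates` are hypothetical.

```python
import re

try:                       # Python 3.11 and later
    import re._parser as sre_parse
except ImportError:        # earlier versions
    import sre_parse


def _first(items):
    """First-character set for a parsed (sub)pattern, or None if unknown."""
    items = list(items)
    if not items:
        return None                      # may match the empty string
    op, arg = items[0]
    if op == sre_parse.LITERAL:
        return {chr(arg)}
    if op == sre_parse.IN:               # character class such as [a-d]
        chars = set()
        for kind, value in arg:
            if kind == sre_parse.LITERAL:
                chars.add(chr(value))
            elif kind == sre_parse.RANGE:
                chars.update(chr(c) for c in range(value[0], value[1] + 1))
            else:
                return None              # negation or a category such as \d
        return chars
    if op == sre_parse.SUBPATTERN:       # group: the subpattern is the last field
        return _first(arg[-1])
    if op == sre_parse.BRANCH:           # alternation: union of all branches
        chars = set()
        for alternative in arg[1]:
            sub = _first(alternative)
            if sub is None:
                return None
            chars |= sub
        return chars
    return None                          # repeats, anchors, ".", and so on


def first_chars(pattern):
    """Hypothetical helper: possible first characters of a match, or None."""
    return _first(sre_parse.parse(pattern))


def build_dispatch(patterns):
    """Index compiled regexes by the characters they can start with."""
    by_char, always = {}, []
    for p in patterns:
        chars = first_chars(p)
        rx = re.compile(p)
        if chars is None:
            always.append(rx)            # must be tried at every position
        else:
            for c in chars:
                by_char.setdefault(c, []).append(rx)
    return by_char, always


def candidates(by_char, always, text, pos):
    """Regexes worth trying at text[pos]."""
    return by_char.get(text[pos:pos + 1], []) + always
```

With this table, a tokenizer only runs the regexes listed for the current character plus the ones whose first characters are unknown.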

As long as you're making a new regex package, I thought I'd put in a request 
for these :)

----------
components: Regular Expressions
messages: 143266
nosy: mattchaput
priority: normal
severity: normal
status: open
title: Regex object should have introspection methods
type: feature request
versions: Python 3.3

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue12870>
_______________________________________