I'm looking for help relating to repeated groups in regular
expressions. When I run the program below:
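#!/usr/bin/env python
import re
# Group 1 is the outer group; group 2 is the repeated inner group.
ptn = re.compile(r"^((AB+)+)$")
# Named "text" so the built-in str is not shadowed.
text = "ABABBABBBABBBBABBBBBABBBBBB"
mo = ptn.search(text)
print(mo.groups())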
I get the results:
('ABABBABBBABBBBABBBBBABBBBBB', 'ABBBBBB')
I was hoping for something like:
('ABABBABBBABBBBABBBBBABBBBBB', 'AB', 'ABB', 'ABBB', 'ABBBB', 'ABBBBB',
'ABBBBBB')
What I'm trying to show is that building a complex regular expression
from simple regular expressions isn't too hard, but getting at all the
subgroups within the outer group isn't as straightforward. I've found a
reference to "counting the opening parenthesis" which explains why there
are two groups returned, but it seems the groups dynamically generated by
'+' and '*' don't accumulate. Is that true, or is there hope for more?
Can anyone offer a hint?
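For reference, here is a rough sketch of one possible workaround, not necessarily the best one. It assumes the inner pattern can be written out a second time on its own: the full pattern validates the string, then re.findall re-scans the outer match with just the inner pattern. The names inner, pieces and result are made up for this illustration.

```python
import re

text = "ABABBABBBABBBBABBBBBABBBBBB"

# Step 1: validate the whole string with the original pattern.
# Group 2 keeps only the last repetition of the inner group.
outer = re.compile(r"^((AB+)+)$")
mo = outer.search(text)

# Step 2: re-scan the outer match with the inner pattern alone.
# With no groups in the pattern, findall returns each full match in order.
inner = re.compile(r"AB+")
pieces = inner.findall(mo.group(1)) if mo else []

# Combine the outer match and the individual repetitions into one tuple.
result = (mo.group(1),) + tuple(pieces) if mo else ()
print(result)
```

This only works when the repetitions can be recognized independently of their context, which happens to hold for AB+ here; a more involved inner expression might need a different approach.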
--
Randolph Bentson