On 29/07/2011 16:45, Thomas Jollans wrote:
On 29/07/11 16:53, rusi wrote:
Can someone throw some light on this anomalous behavior?

>>> import re
>>> r = re.search('a(b+)', 'ababbaaabbbbb')
>>> r.group(1)
'b'
>>> r.group(0)
'ab'
>>> r.group(2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: no such group

>>> re.findall('a(b+)', 'ababbaaabbbbb')
['b', 'bb', 'bbbbb']

So evidently group counts by number of '()'s and not by number of
matches (and this is the case whether one uses match or search). So
then what's the point of search-ing vs match-ing?

Or equivalently, how do I move to the groups of the next match?

[Side note: The docstrings for this really suck:

>>> help(r.group)
Help on built-in function group:

group(...)


Pretty standard regex behaviour: Group 1 is the first pair of brackets.
Group 2 is the second, etc. pp. Group 0 is the whole match.
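The difference between matching and searching is that match only tries
the regex at the very start of the string, while search scans forward
for the first position where it matches (and this is documented in the
library docs IIRC). re.match(exp, s) is roughly equivalent to
re.search(r'\A(?:' + exp + ')', s). Simply prepending '^' is close, but
it behaves differently under re.MULTILINE or when exp contains a
top-level '|'.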
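
To make the two points above concrete, here is a small sketch using the string from the original question. The two-group pattern '(a)(b+)' is my own addition for illustration and is not from the thread; it only shows that group numbers follow the parentheses, and that match() is tied to position 0 while search() is not.

```python
import re

s = 'ababbaaabbbbb'

# Groups are numbered by opening parenthesis, not by how many
# times the pattern could match in the string.
m = re.search('(a)(b+)', s)
whole = m.group(0)     # the whole match
first = m.group(1)     # first pair of brackets
second = m.group(2)    # second pair of brackets
print(whole, first, second, m.groups())

# match() only tries the pattern at the start of the string ...
at_start = re.match('b+', s)       # s starts with 'a', so no match
# ... while search() scans forward to the first place it fits.
anywhere = re.search('b+', s)
print(at_start, anywhere.span())
```

Both calls still return a single match object (or None), so neither one walks through later matches on its own.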

Apparently, findall() returns the content of the first group if there is
one. I didn't check this, but I assume it is documented.

findall returns a list of tuples (what the groups captured) if there is
more than 1 group, or a list of strings (what the group captured) if
there is 1 group, or a list of strings (what the regex matched) if
there are no groups.
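
A quick sketch of the three findall() cases described above, again on the string from the question. The no-group and two-group patterns are illustrative variations of the original one. The last part uses re.finditer(), which is one way (not mentioned in the thread so far) to get at the groups of each successive match.

```python
import re

s = 'ababbaaabbbbb'

# No groups: a list of strings, each the whole match.
no_groups = re.findall('ab+', s)

# One group: a list of strings, each what that group captured.
one_group = re.findall('a(b+)', s)

# More than one group: a list of tuples, one tuple per match.
two_groups = re.findall('(a)(b+)', s)

print(no_groups)
print(one_group)
print(two_groups)

# finditer() yields one match object per match, so group(0),
# group(1), start() and so on are available for every match.
per_match = [(m.start(), m.group(0), m.group(1))
             for m in re.finditer('a(b+)', s)]
print(per_match)
```

If only the captured text is needed, findall() is shorter; finditer() is the better fit when positions or the whole match are needed alongside the groups.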
--
http://mail.python.org/mailman/listinfo/python-list
