On 29/07/11 16:53, rusi wrote:
> Can someone throw some light on this anomalous behavior?
>
> >>> import re
> >>> r = re.search('a(b+)', 'ababbaaabbbbb')
> >>> r.group(1)
> 'b'
> >>> r.group(0)
> 'ab'
> >>> r.group(2)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> IndexError: no such group
>
> >>> re.findall('a(b+)', 'ababbaaabbbbb')
> ['b', 'bb', 'bbbbb']
>
> So evidently group counts by number of '()'s and not by number of
> matches (and this is the case whether one uses match or search). So
> then whats the point of search-ing vs match-ing?
>
> Or equivalently how to move to the groups of the next match in?
>
> [Side note: The docstrings for this really suck:
>
> >>> help(r.group)
> Help on built-in function group:
>
> group(...)
>
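
Pretty standard regex behaviour: group 1 is the first pair of parentheses, group 2 is the second, and so on. Group 0 is the whole match. Your pattern has only one pair of parentheses, hence the IndexError for group(2).

The difference between matching and searching is that match() only tries the pattern at the start of the string, while search() scans forward for the first position where the pattern matches (this is documented in the library docs). Roughly, re.match(exp, s) behaves like re.search(r'\A(?:' + exp + ')', s). Prepending a plain '^' is close, but it differs under re.MULTILINE (where '^' also matches after every newline) and when exp contains a top-level '|'.

findall() returns the contents of the group if the pattern has exactly one, a list of tuples if it has several, and the whole matches if it has none; this is documented as well. To get at the groups of each successive match, use re.finditer(), which yields one match object per match.

- Thomas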
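
PS: a minimal sketch of the points above, reusing the pattern and string from the quoted session. The extra test string 'xab' and the two-group pattern '(a)(b+)' are my own additions for illustration; everything else is plain stdlib re.

```python
import re

s = 'ababbaaabbbbb'
pattern = re.compile(r'a(b+)')

# search() scans forward and returns only the FIRST match.
first = pattern.search(s)
print(first.group(0), first.group(1))

# match() tries the pattern at position 0 only.
no_match = pattern.match('xab')   # pattern does not start at index 0
found = pattern.search('xab')     # search() finds it at index 1
print(no_match, found.group(0))

# finditer() yields one match object per non-overlapping match,
# so the groups of every match are available, not just the first.
per_match = [(m.start(), m.group(0), m.group(1)) for m in pattern.finditer(s)]
print(per_match)

# What findall() returns depends on how many groups the pattern has.
no_groups = re.findall(r'ab+', s)        # whole matches
one_group = re.findall(r'a(b+)', s)      # contents of the single group
two_groups = re.findall(r'(a)(b+)', s)   # one tuple per match
print(no_groups, one_group, two_groups)
```

So group numbers index the parentheses within one match, and finditer() is what moves you from one match to the next.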