I want to match one or two instances of a pattern in a string. According to the docs for the 're' module ( http://python.org/doc/current/lib/re-syntax.html ) the '?' qualifier is greedy by default, and adding a '?' after a qualifier makes it non-greedy.
> The "*", "+", and "?" qualifiers are all greedy...
> Adding "?" after the qualifier makes it perform the match in
> non-greedy or minimal fashion...

In the following example, my re is intended to allow for 1 or 2 instances of 'foo', and there are 2 in the string I'm matching. So I would expect group(1) and group(3) to both be populated. (When I remove the conditional match on the 2nd foo, the grouping is as I expect.)

$ python2.4
Python 2.4.1 (#2, Mar 31 2005, 00:05:10)
[GCC 3.3 20030304 (Apple Computer, Inc. build 1666)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> foofoo = re.compile(r'^(foo)(.*?)(foo)?(.*?)$')
>>> foofoo.match(s).group(0)
'foobarbazfoobar'
>>> foofoo.match(s).group(1)
'foo'
>>> foofoo.match(s).group(2)
''
>>> foofoo.match(s).group(3)
>>> foofoo.match(s).group(4)
'barbazfoobar'
>>> foofoo = re.compile(r'^(foo)(.*?)(foo)(.*?)$')
>>> foofoo.match(s).group(0)
'foobarbazfoobar'
>>> foofoo.match(s).group(1)
'foo'
>>> foofoo.match(s).group(2)
'barbaz'
>>> foofoo.match(s).group(3)
'foo'
>>> foofoo.match(s).group(4)
'bar'
>>>

So, is this a bug, or just a problem with my understanding? If it's my brain that's broken, what's the proper way to do this with regexps? And, if the above is expected behavior, should I submit a doc bug? It's clear that the "?" qualifier (applied to the second foo group) is _not_ greedy in this situation.

-John
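P.S. For anyone who wants to reproduce this, here is the session above condensed into a self-contained script. Treat it as a sketch: the assignment to s does not appear in the pasted session, so its value is an assumption taken from what group(0) prints, and the names optional and required are just illustrative labels for the two patterns.

```python
import re

# Assumption: s is the string that group(0) returns in the session above.
s = 'foobarbazfoobar'

# First pattern: the second 'foo' is optional and follows a non-greedy (.*?)
optional = re.compile(r'^(foo)(.*?)(foo)?(.*?)$')

# Second pattern: identical, except the second 'foo' is required
required = re.compile(r'^(foo)(.*?)(foo)(.*?)$')

# groups() returns groups 1..4 as a tuple; an unmatched group is None
optional_groups = optional.match(s).groups()
required_groups = required.match(s).groups()

print(optional_groups)  # ('foo', '', None, 'barbazfoobar')
print(required_groups)  # ('foo', 'barbaz', 'foo', 'bar')
```

The two tuples are the same values the interactive session shows for group(1) through group(4), with None standing in for the empty result of group(3) in the first pattern.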