Odd-R. wrote:
> Input is a string of four digit sequences, possibly
> separated by a -, for instance like this
>
> "1234,2222-8888,4567,"
>
> My regular expression is like this:
>
> rx1=re.compile(r"""\A(\b\d\d\d\d,|\b\d\d\d\d-\d\d\d\d,)*\Z""")
>
> When running rx1.findall("1234,2222-8888,4567,")
>
> I only get the last match as the result. Isn't
> findall supposed to return all the matches?

For a start, an expression that starts with \A and ends with \Z will
match the whole string (or not match at all). You have only one match.

Secondly, as you have a group in your expression, findall returns what
the group matches. Your expression matches zero or more of what your
group matches, provided there is nothing else at the start/end of the
string. The "zero or more" makes the re engine waltz about a bit; when
the music stopped, the group was matching "4567,".

Thirdly, findall should be thought of as merely a wrapper around a loop
using the search method -- it finds all non-overlapping matches of a
pattern. So the clue to get from this is that you need a really simple
pattern, like the following. You *don't* have to write an expression
that does the looping.

So here's the mean lean no-flab version -- you don't even need the
parentheses (sorry, Thomas).

>>> import re
>>> rx1=re.compile(r"""\b\d\d\d\d,|\b\d\d\d\d-\d\d\d\d,""")
>>> rx1.findall("1234,2222-8888,4567,")
['1234,', '2222-8888,', '4567,']

HTH,
John
-- 
http://mail.python.org/mailman/listinfo/python-list
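
P.S. A minimal sketch of the three points above, for anyone who wants to poke at it. The helper name findall_by_hand is made up for illustration, and its loop assumes the pattern never matches an empty string. It is only a rough model of what findall does, not how the re module implements it.

```python
import re

text = "1234,2222-8888,4567,"

# Pattern from the question: anchored at both ends, one repeated group.
anchored = re.compile(r"\A(\b\d\d\d\d,|\b\d\d\d\d-\d\d\d\d,)*\Z")
# Pattern from the reply: no anchors, no group.
simple = re.compile(r"\b\d\d\d\d,|\b\d\d\d\d-\d\d\d\d,")

# Points 1 and 2: there is a single match spanning the whole string,
# and the repeated group remembers only its final iteration.
whole = anchored.search(text)
last_capture = whole.group(1)
print(whole.group(0) == text)  # True
print(last_capture)            # 4567,


def findall_by_hand(pattern, string):
    """Hypothetical helper: collect non-overlapping matches via search().

    Assumes the pattern cannot match an empty string; otherwise this
    loop would never advance.
    """
    results = []
    pos = 0
    while True:
        m = pattern.search(string, pos)
        if m is None:
            break
        results.append(m.group(0))
        pos = m.end()  # resume scanning right after the previous match
    return results


# Point 3: with the simple pattern, the loop finds every item.
print(findall_by_hand(simple, text))  # ['1234,', '2222-8888,', '4567,']
```

With no group in the pattern, each result is the full text of a match, which is why the parentheses can go.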