bob gailer wrote:
Emmanuel Ruellan wrote:
Hi tutors!
While trying to write a regular expression that would split a string
the way I want, I noticed a behaviour I didn't expect.
>>> import re
>>> re.findall('.?', 'some text')
['s', 'o', 'm', 'e', ' ', 't', 'e', 'x', 't', '']
Where does the last string, the empty one, come from?
I find this behaviour rather annoying: I'm getting one group too many.
The ? means 0 or 1 occurrence. I think re is matching the null string at
the end.
Drop the ? and you'll get what you want.
Of course you can get the same thing using list('some text') at lower cost.
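A minimal sketch of the two suggestions above, written for Python 3 (the variable names are only illustrative). One caveat worth keeping in mind: by default '.' does not match a newline, so the two approaches give the same result only for text without newlines.

```python
import re

text = 'some text'

# Without the '?', every match must consume exactly one character,
# so there can be no empty match at the end of the string.
chars_via_regex = re.findall('.', text)
chars_via_list = list(text)

print(chars_via_regex)
print(chars_via_regex == chars_via_list)

# Caveat: by default '.' does not match '\n', while list() keeps it.
multiline = 'a\nb'
print(re.findall('.', multiline))   # the newline is skipped
print(list(multiline))              # the newline is kept
```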
I find this fully consistent, for your regex means matching
* either any char
* or no char at all
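Logically, you first get n chars, then one 'nothing': once every char has
been consumed, '.?' can still match zero chars at the end of the string.
Only after that does scanning stop, because the end of the string has been
reached. Maybe clearer: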
print(re.findall('.?', ''))
==> ['']
print(re.findall('.', ''))
==> []
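To see where that final 'nothing' comes from, one can look at the position of each match. This is just a sketch using re.finditer in Python 3; the names text and spans are illustrative.

```python
import re

text = 'some text'

# Record (start, end, matched text) for every match of '.?'.
spans = [(m.start(), m.end(), m.group()) for m in re.finditer('.?', text)]

for start, end, matched in spans:
    print(start, end, repr(matched))
```

The last entry starts and ends at len(text): it is a zero-width match at the very end of the string, which is the extra '' that findall reports.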
denis
_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor