Re: [Tutor] Regular expression oddity

spir Sun, 23 Nov 2008 05:01:16 -0800

bob gailer wrote:

Emmanuel Ruellan wrote:

Hi tutors!


While trying to write a regular expression that would split a string
the way I want, I noticed a behaviour I didn't expect.

re.findall('.?', 'some text')

['s', 'o', 'm', 'e', ' ', 't', 'e', 'x', 't', '']

Where does the last string, the empty one, come from?
I find this behaviour rather annoying: I'm getting one group too many.

The ? means 0 or 1 occurrence. I think re is matching the empty string at the end.


Drop the ? and you'll get what you want.

Of course you can get the same thing using list('some text') at lower cost.
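To make the comparison concrete, here is a small sketch of the three variants side by side. The variable names are only illustrative. One caveat worth keeping in mind: '.' does not match a newline unless re.DOTALL is passed, so list() and findall('.') only agree on text without newlines.

```python
import re

text = 'some text'

# '.?' matches one char where it can, then the empty string at the end
with_optional = re.findall('.?', text)

# '.' requires exactly one char, so there is no empty match at the end
without_optional = re.findall('.', text)

# list() splits the string into characters without any regex machinery
as_list = list(text)

print(with_optional)
print(without_optional)
print(as_list)

# Caveat: '.' skips newlines by default, list() keeps them
multiline = 'a\nb'
dot_default = re.findall('.', multiline)
dot_all = re.findall('.', multiline, re.DOTALL)
```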

I find this fully consistent, for your regex means matching
* either any char
* or no char at all

Logically, you first get n chars, then one 'nothing'. Only after that will parsing stop, because the end of the string has been reached. Maybe clearer:

import re
print(re.findall('.?', ''))   # ==> ['']
print(re.findall('.', ''))    # ==> []
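Another way to see where the extra item comes from is to look at the position of each match. The following is just a sketch using re.finditer on the original example string; the name spans is made up for illustration.

```python
import re

text = 'some text'

# Collect (start, end, matched text) for every match of '.?'
spans = [(m.start(), m.end(), m.group()) for m in re.finditer('.?', text)]

for start, end, matched in spans:
    print(start, end, repr(matched))

# The last entry is a zero-width match sitting at the end of the string
last_start, last_end, last_text = spans[-1]
```

The final match starts and ends at index 9, which is len(text): nothing is left to consume, but '.?' is still allowed to match "no char at all" there.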
denis

_______________________________________________
Tutor maillist  -  [email protected]
http://mail.python.org/mailman/listinfo/tutor
