Mark Tolonen wrote:
>
> "Sean Brown" <sbrown.h...@[spammy] gmail.com> wrote in message
> news:glflaj$qr...@nntp.motzarella.org...
>> Using python 2.4.4 on OpenSolaris 2008.11
>>
>> I have the following string created by opening a url that has the
>> following string in it:
>>
>> td[ct] = [[ ... ]];\r\n
>>
>> The ... above is what I'm interested in extracting which is really a
>> whole bunch of text. So I think the regex \[\[(.*)\]\]; should do it.
>> The problem is it appears that python is escaping the \ in the regex
>> because I see this:
>> >>> reg = '\[\[(.*)\]\];'
>> >>> reg
>> '\\[\\[(.*)\\]\\];'
>>
>> Now to me that looks like it would match the string - \[\[ ... \]\];
>
> You are viewing the repr of the string
>
> >>> reg='\[\[(.*)\]\];'
> >>> reg
> '\\[\\[(.*)\\]\\];'
> >>> print reg
> \[\[(.*)\]\];     <== these are the chars passed to regex
>
> The slashes are telling regex that the [ are literal.
>
>>
>> Which obviously doesn't match anything because there are no literal \ in
>> the above string. Leaving the \ out of the \[\[ above has re.compile
>> throw an error because [ is a special regex character. Which is why it
>> needs to be escaped in the first place.
>>
>> I am either doing something really wrong, which is very possible, or I've
>> missed something obvious. Either way, I thought I'd ask why this isn't
>> working and why it seems to be changing my regex to something else.
>
> Did you try it?
>
> >>> s='td[ct] = [[blah blah]];\r\n'
> >>> re.search(reg,s).group(1)
> 'blah blah'
>
Beware, though, that by default regex matches are greedy, so if there's
a chance that two [[ ... ]] [[ ... ]] can appear on the same line then
the above pattern will match

    ... ]] [[ ...

regards
 Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC              http://www.holdenweb.com/
-- 
http://mail.python.org/mailman/listinfo/python-list
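
For anyone who wants to check both points from this thread, here is a minimal sketch. It assumes Python 3 rather than the 2.4.4 used above, writes the pattern as a raw string (so the backslashes are unambiguous), and uses a made-up sample line with two [[ ... ]] groups; the non-greedy `(.*?)` variant is one possible way around the greedy match, not necessarily the only one.

```python
import re

# Raw string: the backslashes reach the regex engine exactly as typed.
reg = r'\[\[(.*)\]\];'

# repr() shows each backslash doubled; print() shows the real characters.
print(repr(reg))   # what the interactive prompt echoes
print(reg)         # what the regex engine actually receives

# Hypothetical input with two [[ ... ]] groups on the same line.
s = 'td[ct] = [[first]]; td[ct] = [[second]];\r\n'

# Greedy: (.*) runs to the LAST "]];" on the line.
greedy = re.search(reg, s).group(1)

# Non-greedy: (.*?) stops at the FIRST "]];" after each "[[".
lazy_reg = r'\[\[(.*?)\]\];'
lazy = re.findall(lazy_reg, s)

print(greedy)
print(lazy)
```

With the sample line above, the greedy pattern captures everything between the first [[ and the last ]]; while the non-greedy one returns each group separately.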