Re: [Tutor] again... regular expression

Kent Johnson Mon, 21 Nov 2005 08:21:06 -0800

lmac wrote:
> Hello.
> I want to parse a website for links of this type:
> 
> http://www.example.com/good.php?test=anything&egal=total&nochmal=nummer&so=site&seite=22";>
> 
> ---------------------------------------------------------------------
> re_site = re.compile(r'http://\w+.\w+.\w+./good.php?.+";>')


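. and ? have special meanings in regular expressions (. matches any character, ? makes the preceding item optional), so to match them literally they must be escaped with \
You should also use a non-greedy match at the end so that you stop at the *first* "> instead of the last one on the line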

So I would try
re_site = re.compile(r'http://\w+\.\w+\.\w+/good\.php\?.+?";>')
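To see what escaping and the non-greedy quantifier change, here is a minimal sketch in Python 3 syntax. The sample line is made up for illustration (two links on one line, each ending in ";> as in your example), and the pattern assumes the host is followed directly by /good.php with no extra dot in between, which is how your sample URL looks.

```python
import re

# Hypothetical input: two links on the same line, each closed by ";>
line = ('<a href="http://www.example.com/good.php?test=anything&seite=22";>one</a> '
        '<a href="http://www.example.com/good.php?test=other&seite=23";>two</a>')

# Greedy: .+ grabs as much as possible, so the match runs to the LAST ";>
greedy = re.compile(r'http://\w+\.\w+\.\w+/good\.php\?.+";>')
# Non-greedy: .+? grabs as little as possible, so the match stops at the FIRST ";>
lazy = re.compile(r'http://\w+\.\w+\.\w+/good\.php\?.+?";>')

greedy_match = greedy.search(line).group(0)
lazy_match = lazy.search(line).group(0)

# An unescaped dot matches any character; an escaped dot matches only "."
unescaped_hit = re.search(r'good.php', 'goodXphp')
escaped_hit = re.search(r'good\.php', 'goodXphp')

print(lazy_match)
print(greedy_match)
```

The non-greedy version returns only the first link, while the greedy one swallows everything up to the end of the second link.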

> for a in file:
>     z = re_site.search(a)
>     if z != None:
>         print z.group(0)
> 
> ---------------------------------------------------------------------
> 
> I still don't understand regular expressions. I tried some other
> expressions but couldn't get them to work.
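As for the loop: search() reports at most one match per line, so a line holding several links would yield only the first. A rough sketch of one way around that, in Python 3 syntax, uses finditer(). The function name find_links and the sample lines are invented for this example, and the pattern again assumes the host is followed directly by /good.php.

```python
import re

# Assumed pattern: literal dots and question mark escaped, non-greedy tail
RE_SITE = re.compile(r'http://\w+\.\w+\.\w+/good\.php\?.+?";>')

def find_links(lines):
    """Return every matching link found in an iterable of text lines."""
    found = []
    for line in lines:
        # finditer yields all non-overlapping matches in the line, left to right
        for match in RE_SITE.finditer(line):
            found.append(match.group(0))
    return found

# Hypothetical page content; a real script would pass an open file instead
sample = [
    '<p>no link here</p>',
    '<a href="http://www.example.com/good.php?test=a&seite=1";>x</a>',
    '<a href="http://www.example.com/good.php?so=site&seite=22";>y</a> '
    '<a href="http://www.example.com/good.php?so=site&seite=23";>z</a>',
]

links = find_links(sample)
for link in links:
    print(link)
```

Passing an open file object to find_links works the same way, since iterating over a file yields its lines.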

There is a handy regular expression tester that comes with Python. Look for
C:\Python24\Tools\Scripts\redemo.py

Kent
-- 
http://www.kentsjohnson.com

_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
