Re: regexp question

Chris Devers Mon, 02 Jul 2001 10:45:30 -0700
At 10:45 AM 2001.07.02 +0100, Kristofer Wolff wrote:
>hi folks i do a simple thing: parsing out the site title of an html...
>
>        $subject =~ s/^(.*)\<title\>(.*)\<\/title\>(.*)$/$2/i;
>
>but he returns the complete HTML file, why ?

I don't know, but I also don't know why you're trying to match everything. You don't 
seem to be interested in text outside the title tags, so skip it. A somewhat better 
match would be:

     my ($title) = $subject =~ /<title>(.*?)<\/title>/si;

This works across line breaks and only grabs what you want. It will probably still 
break, though. As Lee said, the better strategy is to use a parser so that you 
know you're getting the right page element[s]. His code is the way to solve this. 
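
A likely reason the original substitution hands back the whole file: without the /s modifier, '.' does not match a newline, so on a multi-line document the pattern never matches and the string is left untouched. Here is a minimal sketch of that behaviour next to a capturing match. The sample page and the variable names ($html, $unchanged, $count, $title) are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample page; the title spans a line break on purpose.
my $html = "<html>\n<head><TITLE>My\nPage</TITLE></head>\n<body>text</body>\n</html>\n";

# Original pattern: without /s, '.' cannot cross a newline, so the pattern
# fails on a multi-line string and the substitution changes nothing.
my $unchanged = $html;
my $count = ($unchanged =~ s/^(.*)<title>(.*)<\/title>(.*)$/$2/i);

# Capture instead of substitute: /s lets '.' match newlines, /i ignores case,
# and the non-greedy .*? stops at the first closing tag.
my ($title) = $html =~ /<title[^>]*>(.*?)<\/title>/si;

print defined $title ? "Title: $title\n" : "No title found\n";
```

Capturing in list context leaves the original string alone and gives undef when there is no title, which is easier to check than a substitution that silently did nothing.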

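For the parser route, here is a rough sketch using HTML::TokeParser from CPAN (part of the HTML-Parser distribution, not core Perl). This is not Lee's code, which isn't quoted here, just one way a parser-based version might look. The sample page is again made up.

```perl
use strict;
use warnings;
use HTML::TokeParser;   # CPAN module, assumed to be installed

# Hypothetical sample page for illustration.
my $html = "<html>\n<head><TITLE>My\nPage</TITLE></head>\n<body>text</body>\n</html>\n";

# Passing a reference to a scalar parses the string instead of opening a file.
my $p = HTML::TokeParser->new(\$html) or die "Cannot create parser";

my $title;
if ($p->get_tag("title")) {          # skip ahead to the first <title> start tag
    $title = $p->get_trimmed_text;   # text up to the next tag, whitespace trimmed
}

print defined $title ? "Title: $title\n" : "No title found\n";
```

The parser copes with attributes, odd casing and line breaks inside the tag, which is where a hand-rolled regex tends to fall over.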

--
Chris Devers                     [EMAIL PROTECTED]

_______________________________________________
Perl-Win32-Web mailing list
[EMAIL PROTECTED]
http://listserv.ActiveState.com/mailman/listinfo/perl-win32-web