At 10:45 AM 2001.07.02 +0100, Kristofer Wolff wrote:
>hi folks i do a simple thing: parsing out the site title of an html...
>
> $subject =~ s/^(.*)\<title\>(.*)\<\/title\>(.*)$/$2/i;
>
>but he returns the complete HTML file, why ?
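Probably because, without the /s modifier, "." doesn't match newlines, so on a
multi-line file the anchored pattern never matches and the substitution leaves
$subject untouched. I also don't know why you're trying to match everything. You
don't seem to be interested in text outside the title tags, so skip it. A somewhat
better match would be: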
($title) = $subject =~ /<title>(.*?)<\/title>/si;
This works across line breaks and only grabs what you want. However, it'll
probably still break. As Lee said, the better strategy is to use a parser so that you
know you're getting the right page element[s]. His code is the way to solve this.
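
For what it's worth, here is a small self-contained sketch of the regex points above, in case you want to see the difference for yourself. The sample page and the $copy and $title variables are made up for illustration, and this is only the quick-and-dirty route, not a substitute for a real parser:

```perl
use strict;
use warnings;

# Hypothetical sample page; any multi-line HTML would behave the same way.
my $subject = <<'HTML';
<html>
<head>
<TITLE>My
Home Page</TITLE>
</head>
<body><p>Hello</p></body>
</html>
HTML

# The original substitution, run on a copy. Without /s, "." stops at
# newlines, so the anchored pattern cannot span the file and nothing
# gets replaced.
(my $copy = $subject) =~ s/^(.*)<title>(.*)<\/title>(.*)$/$2/i;
print "original s/// left the text unchanged\n" if $copy eq $subject;

# Match instead of substitute: in list context the match returns the
# captured group. /s lets "." cross newlines, /i ignores tag case, and
# the non-greedy .*? stops at the first closing tag.
my ($title) = $subject =~ m{<title>(.*?)</title>}si;

if (defined $title) {
    $title =~ s/\s+/ /g;    # collapse the line break inside the title
    print "$title\n";
}
```

This will still trip over things like a title tag with attributes or a commented-out title, which is why a parser is the safer bet.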
--
Chris Devers [EMAIL PROTECTED]
_______________________________________________
Perl-Win32-Web mailing list
[EMAIL PROTECTED]
http://listserv.ActiveState.com/mailman/listinfo/perl-win32-web