HTML::TokeParser is, as they say, your friend.
perldoc HTML::TokeParser
This example extracts the <TITLE> from the document:
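use strict;
use warnings;
use HTML::TokeParser;
# Parse the file named on the command line, or index.html by default
my $p = HTML::TokeParser->new(shift || "index.html")
    or die "Can't open: $!";
if ($p->get_tag("title")) {
    # Text up to the next tag, with whitespace trimmed and collapsed
    my $title = $p->get_trimmed_text;
    print "Title: $title\n";
}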
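
As for why the substitution hands back the whole file: my guess, assuming the HTML in $subject spans several lines, is that "." does not match a newline unless you add the /s modifier, so the pattern never matches and s/// leaves the string untouched. A minimal sketch of that, using a made-up sample document in $html (the variable names and sample markup are mine, not from the original post):

```perl
use strict;
use warnings;

# Hypothetical sample document; note the newlines between the tags.
my $html = "<html>\n<head>\n<TITLE> My Page </TITLE>\n</head>\n<body>text</body>\n</html>\n";

# The original pattern, applied to a copy. Without /s, "." stops at the
# first newline, so the match fails and the copy is left unchanged.
(my $unchanged = $html) =~ s/^(.*)<title>(.*)<\/title>(.*)$/$2/i;

# Capture rather than substitute: /s lets "." cross newlines, /i ignores
# case, and the non-greedy .*? stops at the first closing tag.
my ($title) = $html =~ m{<title[^>]*>\s*(.*?)\s*</title>}is;

print "Unchanged: ", ($unchanged eq $html ? "yes" : "no"), "\n";
print "Title: ", (defined $title ? $title : "(none)"), "\n";
```

A regex like this is still fragile against real-world HTML (comments, odd attributes and so on), which is why I would stay with the parser. If the HTML is already in a scalar, as in your case, HTML::TokeParser->new also accepts a reference to it, e.g. \$subject.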
Lee
---
Obligatory perl schmutter .sig:
perl -e "print chr(rand>.5?92:47) while 1"
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]]On Behalf Of
> Kristofer Wolff
> Sent: 02 July 2001 10:46
> To: perllist
> Subject: regexp question
>
>
> hi folks i do a simple thing: parsing out the site title of an html...
>
> $subject =~ s/^(.*)\<title\>(.*)\<\/title\>(.*)$/$2/i;
>
>
> but he returns the complete HTML file, why ?
>
>
> any helps ? what didi i wrong ?
>
>
> thanx kris
> _______________________________________________
> Perl-Win32-Web mailing list
> [EMAIL PROTECTED]
> http://listserv.ActiveState.com/mailman/listinfo/perl-win32-web