HTML::TokeParser is, as they say, your friend.

perldoc HTML::TokeParser

    This example extracts the <TITLE> from the document:

      use HTML::TokeParser;
      # Parse the file named on the command line, defaulting to index.html.
      my $p = HTML::TokeParser->new(shift||"index.html")
          || die "Can't open: $!";
      if ($p->get_tag("title")) {
          my $title = $p->get_trimmed_text;
          print "Title: $title\n";
      }
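
As for why the substitution in the quoted question hands back the whole file: my best guess is that the HTML spans several lines. Without the /s modifier "." does not match a newline, so the pattern never matches and s/// leaves $subject untouched. Below is a minimal sketch of that behaviour and of a capture-based alternative. The sample document and the variable names ($html, $unchanged, $title) are made up for illustration, and a regex like this is only a rough stand-in for a real parser.

```perl
use strict;
use warnings;

# Hypothetical sample input; any multi-line HTML string behaves the same way.
my $html = "<html>\n<head>\n<TITLE>My Site</TITLE>\n</head>\n<body>text</body>\n</html>\n";

# The pattern from the question: without /s, "." cannot cross a newline,
# so the match fails and s/// leaves the string as it was.
(my $unchanged = $html) =~ s/^(.*)<title>(.*)<\/title>(.*)$/$2/i;

# Capture instead of substituting. /s lets "." match newlines, and the
# non-greedy .*? stops at the first closing tag.
my $title;
if ($html =~ m{<title[^>]*>(.*?)</title>}is) {
    $title = $1;
    $title =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
}

print "Unchanged: ", ($unchanged eq $html ? "yes" : "no"), "\n";
print "Title: ", (defined $title ? $title : "(none)"), "\n";
```

With the sample above this prints "Unchanged: yes" and "Title: My Site". If the whole document happened to sit on one line, the original pattern would match, which is why it can appear to work on small test strings.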

Lee
---
Obligatory perl schmutter .sig:
perl -e "print chr(rand>.5?92:47) while 1"

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]]On Behalf Of
> Kristofer Wolff
> Sent: 02 July 2001 10:46
> To: perllist
> Subject: regexp question
> 
> 
> hi folks i do a simple thing: parsing out the site title of an html...
> 
>       $subject =~ s/^(.*)\<title\>(.*)\<\/title\>(.*)$/$2/i;
> 
> 
> but it returns the complete HTML file, why?
> 
> 
> any help? what did i do wrong?
> 
> 
> thanx kris
> _______________________________________________
> Perl-Win32-Web mailing list
> [EMAIL PROTECTED]
> http://listserv.ActiveState.com/mailman/listinfo/perl-win32-web
