Re: HTML::TokeParser question

Rob Dixon Wed, 20 Dec 2006 04:32:04 -0800

Mathew Snyder wrote:
>
> I have a script which runs WWW::Mechanize to obtain a page so it can be parsed
> for email addresses.  However, I can't recall how I'm supposed to use
> HTML::TokeParser to get what I need.  This is the pertinent part of the
> script:
>
> ...
> my $data     = $agent->content();
> my $parse    = new HTML::TokeParser($data);


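When you pass the HTML as a string, the 'new' method expects a reference to
the scalar. A plain scalar is taken to be a filename, so the constructor tried
to open a file named after your page content, failed, and returned undef. That
undefined $parse is what the "Can't call method" error is complaining about,
and checking the return value from the constructor would have shown the problem.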

my $parse = HTML::TokeParser->new(\$data) or die $!;
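
To see the difference between the two calling conventions, here is a minimal
sketch. The sample markup and the variable names $as_file and $as_text are
made up for illustration; in your script $data would come from
$agent->content().

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical sample markup standing in for $agent->content()
my $data = '<p><small>user@example.com</small></p>';

# A plain scalar is treated as a filename. No file with that name exists,
# so the constructor returns undef.
my $as_file = HTML::TokeParser->new($data);
print defined $as_file ? "parser created\n" : "no parser: $!\n";

# A scalar reference is parsed as the document text itself.
my $as_text = HTML::TokeParser->new(\$data)
    or die "Cannot create parser: $!";
print "got a ", ref($as_text), " object\n";
```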

> my @emails;
> my $token;
>
> while ($data) {

$data is never modified inside the loop, so the condition is always true and
the loop never ends. Instead, fetch each <small> tag in the HTML in turn and
exit the loop when there are no more; get_tag returns undef at that point.

while ($parse->get_tag('small')) {

>        $token = $parse->get_trimmed_text("/small");
>        push @emails, $token;
> }
>
> foreach my $email (@emails){
>         print $email;
> };
>
> This gives me the error Can't call method "get_trimmed_text" on an undefined
> value at ./check_delete_users.pl line 40.
>
> I had this working at one point but lost the file.  What am I missing?

The rest should work.
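
Putting both fixes together, a self-contained sketch of the parsing part might
look like this. The HTML here is invented sample data, on the assumption that
each address sits inside its own <small> element; substitute the content you
get from WWW::Mechanize.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical page content; in the real script this is $agent->content()
my $data = <<'HTML';
<table>
<tr><td><small> alice@example.com </small></td></tr>
<tr><td><small>bob@example.com</small></td></tr>
</table>
HTML

# Pass a reference so the string is parsed as HTML, not opened as a file
my $parse = HTML::TokeParser->new(\$data)
    or die "Cannot create parser: $!";

my @emails;

# get_tag returns undef once there are no more <small> start tags
while ($parse->get_tag('small')) {
    # Collect the text up to the closing </small>, with whitespace trimmed
    push @emails, $parse->get_trimmed_text('/small');
}

print "$_\n" for @emails;
```

I print a newline after each address here; your original loop would run them
all together on one line.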

HTH,

Rob

