RE: What would take care of this?...

2002-02-24 Thread Peter Scott

I cut too much out of my post:

At 11:11 AM 2/25/02 +1030, Daniel Falkenberg wrote:
>   $inputSite = "URL OF CHOICE";
>   $tree = HTML::TreeBuilder->new;
>   $address = "http://" . $inputSite;
>   $request = HTTP::Request->new('GET', $address);
>   $response = $ua->request($request);
>   my $found = 0;
>
>   my $content = $address;
>
>   $p = HTML::TokeParser->new(shift||$content);

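All you need to do is change the penultimate line to

my $content = $response->content;

and pass the constructor a reference to it, i.e. new(\$content), because a
plain scalar argument is taken as a file name rather than as the document.

But still check $!.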
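
A minimal sketch of the corrected flow, assuming the fetched page body is handed to HTML::TokeParser by reference (per its documentation, a plain scalar is treated as a file name). The names title_from_html and title_from_url are hypothetical helpers, not part of the original script, and title_from_url needs LWP and network access:

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Return the trimmed text of the first <title> element in an HTML string,
# or undef if the document has no title.
sub title_from_html {
    my ($html) = @_;

    # Pass a reference: a plain scalar would be taken as a file name.
    my $p = HTML::TokeParser->new(\$html)
        or die "Can't create parser\n";

    return $p->get_tag("title") ? $p->get_trimmed_text("/title") : undef;
}

# Fetch a page and return its title (hypothetical helper).
sub title_from_url {
    my ($url) = @_;
    require LWP::UserAgent;
    my $ua       = LWP::UserAgent->new;
    my $response = $ua->get($url);

    # Check the HTTP result before trying to parse anything.
    die "GET $url failed: " . $response->status_line . "\n"
        unless $response->is_success;

    return title_from_html($response->content);
}

print "Title: ",
    title_from_html("<html><head><title> Hello </title></head></html>"),
    "\n";
```

Keeping the parsing in its own sub means it can be exercised on a literal string without touching the network.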
--
Peter Scott
Pacific Systems Design Technologies
http://www.perldebugged.com


-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




RE: What would take care of this?...

2002-02-24 Thread Peter Scott

Sorry, that was not meant to escape as HTML mail... here we are in ASCII, I 
hope:

At 11:11 AM 2/25/02 +1030, Daniel Falkenberg wrote:
>I went ahead and started using HTML::TokeParser and I have read the
>HTML::TokeParser manpage.  I am sure my coding is correct but for some
>reason the script always returns this error...
>
>Can't call method "get_tag" on an undefined value at
>/var/www/cgi-bin/ line 321.
>
>Here is how the code looks...
>   $p = HTML::TokeParser->new(shift||$content);
>   if ($p->get_tag("title")) {
>   my $title = $p->get_trimmed_text;
>   print "Title: $title\n";
>   }
>
>Can any one see anything wrong with this?

From the HTML::TokeParser documentation:

$p = HTML::TokeParser->new( $file_or_doc );
The object constructor argument is either a file name, a file handle 
object, or the complete document to be parsed.
If the argument is a plain scalar, then it is taken as the name of a file 
to be opened and parsed. If the file can't be opened for reading, then the 
constructor will return an undefined value and $! will tell you why it failed.

You didn't pass a file or document to the constructor; you passed a
URL, which it tried to open as a file name.  And you didn't check $! to
see why it failed.
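
As a rough sketch of that check, assuming only the constructor behaviour quoted above (an undefined return value, with $! set, when the file can't be opened). The wrapper name open_parser is hypothetical:

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Build a parser, returning (parser, undef) on success or
# (undef, error message) when the constructor fails.
sub open_parser {
    my ($file_or_doc) = @_;
    my $p = HTML::TokeParser->new($file_or_doc);

    # Capture $! straight away, before anything else can overwrite it.
    return (undef, "$!") unless defined $p;
    return ($p, undef);
}

# A URL passed as a plain scalar is treated as a file name, so this fails.
my ($p, $err) = open_parser("http://www.example.com/");
print "Constructor failed: $err\n" unless $p;
```

In a short script the same check is usually written inline as `HTML::TokeParser->new($file) or die "Can't open $file: $!";`.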

--
Peter Scott
Pacific Systems Design Technologies
http://www.perldebugged.com






RE: What would take care of this?...

2002-02-24 Thread Daniel Falkenberg

Hi Chris and list,

I went ahead and started using HTML::TokeParser and I have read the
HTML::TokeParser manpage.  I am sure my coding is correct but for some
reason the script always returns this error...

Can't call method "get_tag" on an undefined value at
/var/www/cgi-bin/ line 321.

Here is how the code looks...

  $inputSite = "URL OF CHOICE";
  $tree = HTML::TreeBuilder->new;
  $address = "http://" . $inputSite;
  $request = HTTP::Request->new('GET', $address);
  $response = $ua->request($request);
  my $found = 0;

  my $content = $address;

  $p = HTML::TokeParser->new(shift||$content);
  if ($p->get_tag("title")) {
  my $title = $p->get_trimmed_text;
  print "Title: $title\n";
  }

Can any one see anything wrong with this?

Thx,

Dan

-----Original Message-----
From: Chris Ball [mailto:[EMAIL PROTECTED]]
Sent: Friday, 22 February 2002 9:49 PM
To: Daniel Falkenberg
Cc: [EMAIL PROTECTED]
Subject: Re: What would take care of this?...


>>>>> "Daniel" == Daniel Falkenberg <[EMAIL PROTECTED]> writes:

Daniel> Would I now have to go ahead and use HTML::Parser or
Daniel> something of similar nature to extract headings?

Yeah, go with HTML::TokeParser.

Daniel> 
Daniel> Get all data from H1  BGCOLOR="FF">I want all if this data extracted from
Daniel> heading 1 (h1) 

while ($stream->get_tag("h1")) { $data = $stream->get_trimmed_text("/h1"); }

(Also see perldoc HTML::TokeParser, once it's installed.)

- Chris.
-- 
$a="printf.net"; Chris Ball | chris@void.$a | www.$a | finger: chris@$a
 "In the beginning there was nothing, which exploded."






Re: What would take care of this?...

2002-02-22 Thread Chris Ball

> "Jonathan" == Jonathan E Paton <[EMAIL PROTECTED]> writes:

Jonathan> /<h1>([^<]*?)<\/h1>/

Please don't ever try and parse HTML with regexps - I've had to work
with way too much code that did.  There are many situations where your
regex would break, and the TokeParser code wasn't much longer. It's
effectively impossible to parse HTML accurately with regexps.
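
To illustrate the kind of breakage meant here, a small comparison sketch. It assumes the regex under discussion was meant to read /<h1>([^<]*?)<\/h1>/; the sample inputs and the helper names h1_by_regex and h1_by_parser are made up for this example:

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Regex approach: assumes a bare lowercase <h1> tag and no markup
# inside the heading.
sub h1_by_regex {
    my ($html) = @_;
    return $html =~ /<h1>([^<]*?)<\/h1>/ ? $1 : undef;
}

# Parser approach: copes with attributes, upper-case tags and nested tags.
sub h1_by_parser {
    my ($html) = @_;
    my $p = HTML::TokeParser->new(\$html);
    return $p->get_tag("h1") ? $p->get_trimmed_text("/h1") : undef;
}

for my $html ('<h1>Plain</h1>',
              '<H1 BGCOLOR="FF">With attribute</H1>',
              '<h1>With <em>nested</em> tag</h1>') {
    my $re = h1_by_regex($html);
    my $tp = h1_by_parser($html);
    printf "%-40s regex: %-8s parser: %s\n",
        $html,
        defined $re ? $re : '(none)',
        defined $tp ? $tp : '(none)';
}
```

The regex only finds the first, plain heading; the parser returns text for all three.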

- Chris.  
-- 
$a="printf.net"; Chris Ball | chris@void.$a | www.$a | finger: chris@$a
 "In the beginning there was nothing, which exploded."






Re: What would take care of this?...

2002-02-22 Thread Jonathan E. Paton

 --- Daniel Falkenberg <[EMAIL PROTECTED]> wrote:

> Hey All,
> 
> Just wondering how I would go about extracting all
> the data from heading 1 (h1) in the following HTML
> code.  I figured I could have used HTML::TableExtract
> but then I realized ( :) ) there are no tables in
> the following HTML.  Would I now have to go ahead and
> use HTML::Parser or something of similar nature to
> extract headings?

Yes, unless it's a "one off"... you are looking for:

I want all if this data extracted from heading 1
(h1)

which can be matched by the regular expression:

/<h1>([^<]*?)<\/h1>/

Jonathan Paton





Re: What would take care of this?...

2002-02-22 Thread Chris Ball

> "Daniel" == Daniel Falkenberg <[EMAIL PROTECTED]> writes:

Daniel> Would I now have to go ahead and use HTML::Parser or
Daniel> something of similar nature to extract headings?

Yeah, go with HTML::TokeParser.

Daniel> 
Daniel> Get all data from H1  BGCOLOR="FF">I want all if this data extracted from
Daniel> heading 1 (h1) 

while ($stream->get_tag("h1")) { $data = $stream->get_trimmed_text("/h1"); }

(Also see perldoc HTML::TokeParser, once it's installed.)
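
A slightly fuller sketch of that loop, assuming the HTML is already in a string and that every heading is wanted rather than just the last one. The sub name h1_texts and the sample markup are hypothetical, loosely modelled on the snippet in the question:

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Collect the trimmed text of every <h1> element in an HTML string.
sub h1_texts {
    my ($html) = @_;

    # A scalar reference is parsed as the document itself.
    my $stream = HTML::TokeParser->new(\$html);

    my @headings;
    while ($stream->get_tag("h1")) {
        # Read text up to the closing tag, passing over inline tags.
        push @headings, $stream->get_trimmed_text("/h1");
    }
    return @headings;
}

my $html = '<html><title>Get all data from H1</title>'
         . '<h1 BGCOLOR="FF">Sample heading text</h1></html>';
print "$_\n" for h1_texts($html);
```

Calling get_trimmed_text as a method on $stream matters: called as a bare function it is not defined in the calling package.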

- Chris.
-- 
$a="printf.net"; Chris Ball | chris@void.$a | www.$a | finger: chris@$a
 "In the beginning there was nothing, which exploded."






What would take care of this?...

2002-02-21 Thread Daniel Falkenberg

Hey All,

Just wondering how I would go about extracting all the data from heading
1 (h1) in the following HTML code.  I figured I could have used
HTML::TableExtract but then I realized ( :) ) there are no tables in
the following HTML.  Would I now have to go ahead and use HTML::Parser
or something of similar nature to extract headings?


Get all data from H1
I want all if this data extracted from
heading 1 (h1)


Thx,

Dan
