my $ua = LWP::UserAgent->new( timeout => 20, )
    || die "Cannot create new LWP UserAgent: $! \n";
my $req = HTTP::Request->new('GET', $url )
    || die "Cannot create LWP Request: $! \n";
$ua->request($req, \&callback, $block )
    || die "Cannot request data: $! \n";
First of all, these lines are better written using "or" instead of "||". A line like
$var = foo() || die "...";
is interpreted as
$var = (foo() || die "...");
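due to operator precedence: "||" binds more tightly than "=", while "or" binds more loosely, giving ($var = foo()) or die "...". I don't think it matters here, since the die fires either way when the call returns false, but it's a bad habit to get into. With a list operator called without parentheses, such as open FH, $file || die "...", the "||" attaches to $file rather than to the result of open, so a failed open goes unnoticed.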
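To make the grouping visible, here is a small sketch that swaps the die for a plain fallback value so both forms can run side by side. The sub foo() is a hypothetical stand-in for any call that signals failure by returning a false value.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a call that fails by returning a false value.
sub foo { return 0 }

# With "||" the right-hand side becomes part of the assigned expression:
#   $x = (foo() || 'fallback')
my $x = foo() || 'fallback';

# With "or" the assignment happens first and its result is tested after:
#   ($y = foo()) or ($or_branch_ran = 1)
my $or_branch_ran = 0;
my $y = foo() or $or_branch_ran = 1;

print "x = $x\n";                          # $x received the fallback
print "y = $y\n";                          # $y kept foo()'s own return value
print "or branch ran: $or_branch_ran\n";
```

When the right-hand side is a die, both forms abort on failure, which is why the difference is easy to miss until the right-hand side is something that returns a value.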
sub callback {
    my ($html, $response, $protocol ) = @_;
    $html =~ s/^<html>.*Synopsis//;
    print LOG "\n\t html: \n $html \n";
    my @tr = $html =~ /<tr>.*?<\/tr>/gis;
    print LOG "\t Table rows: \n", join("\n", @tr), "\n";
}
The code gets delivered in chunks, so the print statements are repeated. Setting $block to 50k (bigger than the page) did not help.
I've never really worked with LWP, but according to the manual you can just do:
my $response = $ua->request($req) or die ...;
print $response->content;
$response is an HTTP::Response object, with a method called "content" that returns the body of the response.
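One caveat worth sketching: request() hands back a response object even when the request fails, so the "or die" above is unlikely ever to fire, and the status has to be checked on the object itself. The helper name content_or_die is hypothetical, and the responses are built by hand here so the sketch runs without network access.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: return the body of a successful response, or die
# with the status line (for example "404 Not Found") otherwise.
sub content_or_die {
    my ($response) = @_;
    die 'Request failed: ' . $response->status_line . "\n"
        unless $response->is_success;
    return $response->content;
}

# Hand-built responses standing in for what $ua->request($req) returns.
my $ok  = HTTP::Response->new(200, 'OK',
    [ 'Content-Type' => 'text/html' ], '<html>hi</html>');
my $bad = HTTP::Response->new(404, 'Not Found');

print content_or_die($ok), "\n";

# A failed response raises an error instead of returning content.
my $body = eval { content_or_die($bad) };
print "error: $@" unless defined $body;
```

With a live request the call would look like content_or_die($ua->request($req)).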
Alternatively, using the callback method, something like this would collect the content in a local variable:
my $html = '';
sub callback { $html .= $_[0] }
$ua->request($req, \&callback, $block ) or die...;
print $html; # Contains the response content
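The same idea can be tried without a network connection. In the sketch below, deliver_in_chunks is a hypothetical stand-in for LWP invoking the content callback once per chunk, and the sample page is made up. The point is that the callback only accumulates, and the row-matching regex from the original code runs once, after everything has arrived.

```perl
use strict;
use warnings;

# Hypothetical stand-in for LWP: hand $text to $callback in pieces of at
# most $size characters, the way a content callback sees a response body.
sub deliver_in_chunks {
    my ($text, $size, $callback) = @_;
    for (my $pos = 0; $pos < length($text); $pos += $size) {
        $callback->(substr($text, $pos, $size));
    }
    return;
}

# Made-up sample page with two table rows.
my $page = '<table><tr><td>a</td></tr><tr><td>b</td></tr></table>';

# The callback only accumulates; it never parses a partial chunk.
my $html  = '';
my $calls = 0;
deliver_in_chunks($page, 16, sub { $html .= $_[0]; $calls++ });

# Parse once, after the whole body has been collected.
my @tr = $html =~ /<tr>.*?<\/tr>/gis;
print scalar(@tr), " rows from $calls chunks\n";
```

With LWP itself, the same anonymous sub can be passed as the second argument to $ua->request. The third argument is documented as a read size hint rather than a guarantee, which would explain why raising $block to 50k did not stop the callback from being called more than once.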
-- Kenneth Herron