Thanks for the tip on WWW::Mechanize; I'm going to play with it later.
Meanwhile I grabbed this code from the lwpcook manpage (large documents section).
The interesting thing is that when I print each chunk to my outfile, it prints the
entire document. I was thinking it might not print the last line or so, but it does,
so I guess the page does not send an EOF.
I'm not sure what Perl considers the EOF to be when getting an HTML file. Do you
know if it's </html>?
I ask because the printed-out chunk does have a </html>.
Code from the lwpcook manpage (large documents section):
my $expected_length;
my $bytes_received = 0;
my $res =
    $ua->request(HTTP::Request->new(GET => $url),
        sub {
            my($chunk, $res) = @_;
            $bytes_received += length($chunk);
            unless (defined $expected_length) {
                $expected_length = $res->content_length || 0;
            }
            if ($expected_length) {
                printf OUT "%d%% - ",
                    100 * $bytes_received / $expected_length;
            }
            print OUT "$bytes_received bytes received\n";
            # XXX Should really do something with the chunk itself
            print OUT2 $chunk;
        });
print $res->status_line, "\n";
print OUT $res->status_line, "\n";
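
On the EOF question: as far as HTTP itself goes, the end of the body is signalled by the Content-Length header, by the terminator of a chunked transfer, or by the server closing the connection, not by anything inside the HTML such as </html>. A minimal sketch of how the byte counter above could be compared against Content-Length once the request returns. The helper name transfer_complete is hypothetical, and the HTTP::Response objects are built by hand here only so that the sketch runs without a network.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper (not part of LWP): compares the bytes counted in the
# callback with the Content-Length header of the response.
# Returns 1 if they match, 0 if they differ, and undef if the server sent
# no Content-Length (in which case the end of the body was signalled some
# other way, e.g. by closing the connection).
sub transfer_complete {
    my ($res, $bytes_received) = @_;
    my $expected = $res->content_length;
    return unless defined $expected;
    return $bytes_received == $expected ? 1 : 0;
}

# Hand-built responses standing in for what $ua->request() would return.
my $with_length    = HTTP::Response->new(200, 'OK', ['Content-Length' => 10]);
my $without_length = HTTP::Response->new(200, 'OK');

for my $case ([$with_length, 10], [$with_length, 7], [$without_length, 7]) {
    my ($res, $bytes) = @$case;
    my $state = transfer_complete($res, $bytes);
    print "$bytes bytes: ",
        !defined $state ? "no Content-Length, cannot tell"
      : $state          ? "complete"
      :                   "incomplete",
        "\n";
}
```

In the program itself this would be called as transfer_complete($res, $bytes_received) right after $ua->request(...) returns.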
***************************************************************
Just in case anyone cares, here is the whole test program:
#!/usr/bin/perl -w
use strict;
use URI::URL;
use LWP;
use LWP::Debug qw(+ -conns);
use HTTP::Cookies;
my $errors_page = "proxy_errors.txt";
my $url = url ('http://earthquake.usgs.gov/recenteqsUS/Quakes/quakes_all.html');
#my $url = url('http://www.nanpa.com/number_resource_info/co_code_assignments1.html/');
my $outdir = "C:\\a_perl\\proxy_tests\\" ;
my $src = $outdir .'quakes_all.html';
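# Parenthesize the open() arguments so that "or die" applies to open itself.
open(OUT, ">$errors_page") or die "Can't create '$errors_page': $!\n";
open(OUT2, "> $src") or die "Can't write to file '$src': $!\n";
binmode OUT2;   # write the bytes exactly as received (no CRLF translation on Windows)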
my $PROXY_URL = 'http://proxy-web.dri.edu/';   ### Proxy URL or address + port
my $PROXY_FTP = 'http://proxy-ftp.dri.edu/';
my $ua = LWP::UserAgent->new(env_proxy => 1,
                             timeout   => 120,
                            );
$ua->proxy(http => $PROXY_URL);
$ua->proxy(ftp  => $PROXY_FTP);
# A bare $ua->cookie_jar() only returns the current jar, so it is dropped here;
# the call below installs a jar that is saved between sessions.
$ua->cookie_jar(HTTP::Cookies->new(file     => 'lwpcookies.txt',
                                   autosave => 1));
#my $req = new HTTP::Request 'GET', $url;
my $expected_length;
my $bytes_received = 0;
my $res =
    $ua->request(HTTP::Request->new(GET => $url),
        sub {
            my($chunk, $res) = @_;
            $bytes_received += length($chunk);
            unless (defined $expected_length) {
                $expected_length = $res->content_length || 0;
            }
            if ($expected_length) {
                printf OUT "%d%% - ",
                    100 * $bytes_received / $expected_length;
            }
            print OUT "$bytes_received bytes received\n";
            # XXX Should really do something with the chunk itself
            print OUT2 $chunk;
        });
print $res->status_line, "\n";
print OUT $res->status_line, "\n";
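
If the progress report is not needed, LWP can also write the body straight to a file through the ':content_file' option of $ua->get. A minimal sketch, assuming nothing beyond saving the document is wanted. The helper name save_url_to_file is hypothetical, and the demonstration fetches a local file: URL from a temporary directory only so that it runs without a network or a proxy.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use File::Temp qw(tempdir);
use URI::file;

# Hypothetical helper: fetch $url and let LWP write the body to $path.
# The response object is returned so the caller can check the status.
sub save_url_to_file {
    my ($ua, $url, $path) = @_;
    return $ua->get($url, ':content_file' => $path);
}

# Demonstration on a local file: URL (an assumption made for this sketch;
# the real program would pass its http:// URL and proxy-configured $ua).
my $dir    = tempdir(CLEANUP => 1);
my $source = "$dir/source.html";
my $target = "$dir/copy.html";

open(my $fh, '>', $source) or die "Can't write '$source': $!\n";
print {$fh} "<html><body>test</body></html>\n";
close($fh) or die "Can't close '$source': $!\n";

my $ua  = LWP::UserAgent->new(timeout => 120);
my $res = save_url_to_file($ua, URI::file->new($source), $target);

print $res->status_line, "\n";
print "saved ", (-s $target), " bytes to $target\n" if $res->is_success;
```

With this form the body is not kept in $res->content, so the parsing step would read the saved file instead.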
**************************************************************
"$Bill Luebkert" wrote:
> lorid wrote:
> > Thanks Bill!
> > Your code worked great. I tested it on a different URL and it worked! Yeah, but
> > it timed out on the page I'm trying to get (I'm working from home on a 56k
> > modem), though it seems to get other pages just fine. With the debug code I can
> > see that on the ... quakes.all page it zips along and then it hangs. I played
> > with increasing the timeout and it still times out... must be something in the
> > source code.
> > But the main thing is that now I can use LWP to get files.
> > The purpose in reading this file was to learn how to get a file through LWP
> > (even through a proxy), parse it, and create new files.
> > I've got the rest working fine but was stuck on the proxy part... thanks again.
>
> Some sites may need to send cookies.
>
> Try using one of these (the third one will allow you to save them
> between sessions):
>
> $ua->cookie_jar();
> $ua->cookie_jar(HTTP::Cookies::Netscape->new);
> $ua->cookie_jar(HTTP::Cookies->new(file => 'lwpcookies.txt',
> autosave => 1));
>
> You can also check out WWW::Mechanize for this sort of thing.
>
> --
> ,-/- __ _ _ $Bill Luebkert Mailto:[EMAIL PROTECTED]
> (_/ / ) // // DBE Collectibles Mailto:[EMAIL PROTECTED]
> / ) /--< o // // Castle of Medieval Myth & Magic http://www.todbe.com/
> -/-' /___/_<_</_</_ http://dbecoll.tripod.com/ (My Perl/Lakers stuff)
_______________________________________________
Perl-Win32-Users mailing list