Thanks for the tip on WWW::Mechanize, I'm going to play with it later.
Meanwhile I grabbed this code from the lwpcook manpage (large documents section).
The interesting thing is that when I print each chunk to my outfile, it prints the
entire document. I was thinking it might not print the last line or so, but it does,
so I guess the page does not send an EOF.
I'm not sure what Perl considers the EOF to be when getting an HTML file. Do you
know if it's </html>?
I ask because the printed-out chunk does have a </html>.
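
On the EOF question: HTTP itself has no end-of-file marker inside the HTML. A response body ends when Content-Length bytes have arrived, when a chunked transfer sends its terminating chunk, or when the server closes the connection, and as far as I understand it the callback is simply not called again after that. So </html> is just the last thing in the data, not something Perl looks for. A minimal sketch of one way to check whether everything arrived, assuming the two counters kept by the lwpcook callback; the helper name download_status is hypothetical and not part of LWP:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LWP): classify a finished download by
# comparing the bytes counted in the callback with the Content-Length header.
sub download_status {
    my ($bytes_received, $expected_length) = @_;

    # No (or zero) Content-Length: the end of the body was signalled some
    # other way, e.g. by the server closing the connection, so there is
    # nothing to compare against.
    return 'unknown'   unless $expected_length;
    return 'complete'  if $bytes_received == $expected_length;
    return 'truncated' if $bytes_received <  $expected_length;
    return 'oversized';
}

# Made-up byte counts, for illustration only.
print download_status(51_200, 51_200), "\n";
print download_status(30_000, 51_200), "\n";
print download_status(30_000, 0),      "\n";
```

In the test program this would be called once after $ua->request returns, as download_status($bytes_received, $expected_length).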

Code from the lwpcook manpage (large documents section):
my $expected_length;
my $bytes_received = 0;
my $res =
    $ua->request(HTTP::Request->new(GET => $url),
        sub {
            my ($chunk, $res) = @_;
            $bytes_received += length($chunk);
            unless (defined $expected_length) {
                $expected_length = $res->content_length || 0;
            }
            if ($expected_length) {
                printf OUT "%d%% - ",
                    100 * $bytes_received / $expected_length;
            }
            print OUT "$bytes_received bytes received\n";
            # XXX Should really do something with the chunk itself
            print OUT2 $chunk;
        });
print $res->status_line, "\n";
print OUT $res->status_line, "\n";
***************************************************************
Just in case anyone cares, here is the whole test program:

#!/usr/bin/perl -w


use strict;
use URI::URL;
use LWP;
use LWP::Debug qw(+ -conns);
use HTTP::Cookies;

my $errors_page = "proxy_errors.txt";
my $url = url ('http://earthquake.usgs.gov/recenteqsUS/Quakes/quakes_all.html');
#my $url = url ('http://www.nanpa.com/number_resource_info/co_code_assignments1.html/');

my $outdir = "C:\\a_perl\\proxy_tests\\" ;
my $src = $outdir .'quakes_all.html';

open OUT, ">$errors_page" or die "Create $errors_page: $!";
open (OUT2, "> $src") or die "Can't write to file '$src': $!\n";
binmode OUT2;   # write the chunks exactly as received (no CRLF translation on Windows)


my $PROXY_URL = 'http://proxy-web.dri.edu/'; ### Proxy URL or Address + Port
my $PROXY_FTP = 'http://proxy-ftp.dri.edu/';


my $ua = LWP::UserAgent->new(env_proxy => 1,
                              timeout => 120,
                             );

$ua->proxy(http => $PROXY_URL);
$ua->proxy(ftp => $PROXY_FTP);

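# Called without arguments, cookie_jar() only returns the current jar,
# so a single call that installs a jar is enough.
$ua->cookie_jar(HTTP::Cookies->new(file => 'lwpcookies.txt',
  autosave => 1));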

#my $req = new HTTP::Request 'GET', $url;


my $expected_length;
my $bytes_received = 0;
my $res =
    $ua->request(HTTP::Request->new(GET => $url),
        sub {
            my ($chunk, $res) = @_;
            $bytes_received += length($chunk);
            unless (defined $expected_length) {
                $expected_length = $res->content_length || 0;
            }
            if ($expected_length) {
                printf OUT "%d%% - ",
                    100 * $bytes_received / $expected_length;
            }
            print OUT "$bytes_received bytes received\n";
            # XXX Should really do something with the chunk itself
            print OUT2 $chunk;
        });
print $res->status_line, "\n";
print OUT $res->status_line, "\n";
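
If the progress lines are not needed, the callback can be skipped and LWP can write the body to a file itself through the ':content_file' option of get(). A rough sketch under a few assumptions: fetch_to_file is a hypothetical wrapper name, the temp file stands in for quakes_all.html, and a data: URL stands in for the real page so the sketch runs without a network or proxy:

```perl
use strict;
use warnings;
use LWP::UserAgent;
use File::Temp qw(tempfile);

# Hypothetical wrapper: ask LWP to store the response body in $path
# instead of handing it to a callback chunk by chunk.
sub fetch_to_file {
    my ($ua, $url, $path) = @_;
    return $ua->get($url, ':content_file' => $path);
}

my $ua = LWP::UserAgent->new(timeout => 120);

# Assumption for illustration: a temp file and a data: URL replace the
# real output file and the real page.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

my $res = fetch_to_file($ua, 'data:text/plain,hello', $path);
print $res->status_line, "\n";
print((-s $path), " bytes written to $path\n");
```

With this form the body goes to the file rather than into $res->content, so the status line and headers are what is left to inspect on the response object.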
**************************************************************
"$Bill Luebkert" wrote:

> lorid wrote:
> > Thanks Bill!
> > Your code worked great, I tested it on a different url and it worked! Yeah, but
> > it timed out on the page I'm trying to get (I'm working from home on a 56k modem),
> > but it seems to get other pages just fine. With the debug code I can see that on
> > the ... quakes.all page it zips along and then it hangs. I played with increasing
> > the timeout and it still times out... must be something in the source code..
> > but the main thing is now I can use LWP to get files.
> > The purpose in reading this file was to learn how to get a file through LWP (even
> > through a proxy), to parse it and create new files.
> > I've got the rest working fine but was stuck on the proxy part... thanks again
>
> Some sites may need to send cookies.
>
> Try using one of these (the third one will allow you to save them
> between sessions):
>
> $ua->cookie_jar();
> $ua->cookie_jar(HTTP::Cookies::Netscape->new);
> $ua->cookie_jar(HTTP::Cookies->new(file => 'lwpcookies.txt',
>   autosave => 1));
>
> You can also check out WWW::Mechanize for this sort of thing.
>
> --
>   ,-/-  __      _  _         $Bill Luebkert    Mailto:[EMAIL PROTECTED]
>  (_/   /  )    // //       DBE Collectibles    Mailto:[EMAIL PROTECTED]
>   / ) /--<  o // //      Castle of Medieval Myth & Magic http://www.todbe.com/
> -/-' /___/_<_</_</_    http://dbecoll.tripod.com/ (My Perl/Lakers stuff)

_______________________________________________
Perl-Win32-Users mailing list
[EMAIL PROTECTED]
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
