On Nov 21, 2:33 am, [EMAIL PROTECTED] (Rob Dixon) wrote:
> Francois wrote:
> > I tried to get data from a site which uses cookies and redirects the
> > user. I spent a lot of time with the same result: connection timed out,
> > until I realised that all was fine if I didn't send the header...
>
> > Thanks for any explanations !!!
> > Francois
>
> > here is my code:
>
> >     use strict;
> >     use warnings;
>
> >     use LWP;
> >     use HTML::Parser;
> >     use HTML::FormatText;
> >     use HTML::Tree;
> >     # use DateTime::Duration;
> >     use HTTP::Headers;
> >     use HTTP::Cookies;
> >     use HTTP::Cookies::Netscape;
> >     use CGI qw(header -no_debug);
>
> >     my $h = HTTP::Headers->new(
> >        Accept => "text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5",
> >        Host => "www.unifr.ch",
> >    );
>
> >     $h->server("Apache/2.0.46 (Red Hat)");
> >     $h->user_agent("Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.8.1.9) Gecko/20071025 Firefox/2.0.0.9");
>
> >     my $reflink = "http://linkinghub.elsevier.com/retrieve/pii/S0020138307000095";
>
> >     my $c = HTTP::Cookies::Netscape->new(file=>'cookies.txt', autosave=>"1");
> >     my $ua_short = LWP::UserAgent->new(cookie_jar => $c, timeout=> 20);
> >    $ua_short->agent("Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.8.1.9) Gecko/20071025 Firefox/2.0.0.9");
> >     # with this line the header is sent with my request and it does not work
> >    # my $req = HTTP::Request->new(GET=>$reflink, $h);
>
> >   # with this line it's ok ....
> >     my $req = HTTP::Request->new(GET=>$reflink);
>
> >      my $response =$ua_short->request($req);
> >     print header;
> >     print $response->status_line,"\n";
> >     my $formatter = HTML::FormatText->new();
>
> >            if ($response->is_success) {
> >                    my $tree = HTML::TreeBuilder->new->parse($response->content);
> >                    my $ascii = $formatter->format($tree);
> >                    $tree->delete();
> >                    print $ascii;
> >            }
>
> Hi Francois.
>
> As a general rule it's polite to reduce code as much as possible before
> posting it here to ask for help: there's a lot of junk in here that
> isn't relevant to the problem and just needs to be waded through before
> we can give you an answer.
>
> What's going wrong is that you have a Host header value of www.unifr.ch
> but you are sending the request to linkinghub.elsevier.com, which
> doesn't have a host of that name and so doesn't reply.
>
> But that's a huge amount of code just to fetch a web page! You may need
> some of that stuff but I can't see how you would want all of it. How
> about just
>
>    my $ua = LWP::UserAgent->new;
>    my $resp = $ua->get('http://linkinghub.elsevier.com/retrieve/pii/S0020138307000095');
>
> which seems to me to do the same thing.
>
> HTH,
>
> Rob

Hi Rob

Many thanks for educating me and for the answer. I had posted to the
libwww forum but have not had an answer yet. The wrong host in my
header also explains the trouble I had with cookies (which was the
topic of my post there).
Thanks again !
Francois
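
For anyone finding this thread later, here is a minimal sketch of the
check Rob describes: compare an explicit Host header against the host
in the request URL before sending. It only builds the request objects
and does not touch the network. The helper name host_mismatch is made
up for illustration and is not part of LWP, and the shortened Accept
value is a placeholder. As far as I know, LWP fills in Host from the
URL by itself when the header is left unset, so the simplest fix is
usually not to set it at all.

```perl
use strict;
use warnings;
use HTTP::Headers;
use HTTP::Request;

my $url = 'http://linkinghub.elsevier.com/retrieve/pii/S0020138307000095';

# Hypothetical helper (not part of LWP): returns 1 when an explicit Host
# header names a different host than the request URL, 0 otherwise.
sub host_mismatch {
    my ($req) = @_;
    my $host = $req->header('Host');
    return 0 unless defined $host;      # no explicit Host: nothing to clash
    $host =~ s/:\d+\z//;                # ignore an optional :port suffix
    return lc($host) eq lc($req->uri->host) ? 0 : 1;
}

# Headers as in the original post, Host line included.
my $bad = HTTP::Request->new(GET => $url,
    HTTP::Headers->new(Accept => 'text/html', Host => 'www.unifr.ch'));

# Same request with the Host line dropped.
my $good = HTTP::Request->new(GET => $url,
    HTTP::Headers->new(Accept => 'text/html'));

printf "%-24s mismatch=%d\n", 'with Host www.unifr.ch', host_mismatch($bad);
printf "%-24s mismatch=%d\n", 'without Host header',    host_mismatch($good);
```

If a custom Host really is needed, deriving it from the URL (for
example with $req->uri->host) rather than typing it by hand should
avoid this kind of mismatch.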

