Dan Anderson wrote:
>
> I am trying to create a spider to grab my books off of Safari
> for a batch printing job so I don't need to go through each chapter
> myself and hit the Print button. So I used this script to try and log
> myself in to the safari site:
>
> # BEGIN CODE
> #! /usr/bin/perl
>
> use strict;
> use warnings;
> use LWP;
> use LWP::UserAgent;

Use one or the other, but not both. LWP is a module that just 'require's
LWP::UserAgent.

> # variables
> my $cookie_jar_file = "./cookies.txt";
> my @headers = (
>   'User-Agent' => 'Mozilla/4.76 [en] (Win98; U)',
>   'Accept' => 'image/gif, image/x-bitmap, image/jpeg,
>         image/pjpeg, image/png, */*',
>   'Accept-Charset' => 'iso-8859-1,*',
>   'Accept-Language' => 'en-US',
>   "catid" => "",
>   "s" => "1",
>   "o" => "1",
>   "b" => "1",
>   "t" => "1",
>   "f" => "1",
>   "c" => "1",
>   "u" => "1",
>   "r" => "",
>   "l" => "1",
>   "g" => "",
>   "usr" => "myemail",
>   "pwd" => "mypassword",
>   "savepwd" => "1",
> );
> # end variables
>
> my $user_agent = LWP::UserAgent->new;
> $user_agent->cookie_jar({file => $cookie_jar_file});
> my $response = $user_agent->post(
>   'http://safari.oreilly.com/JVXSL.asp',
>   @headers,
> );
> # END CODE
>
> Now I know that this is the form I should post to because
> I stripped the following forms out of the web page (and there is
> no Javascript to modify the forms):
>
> <form action="JVXSL.asp" method="post">
>   <input type="hidden" name="catid" value="">
>   <input type="hidden" name="s" value="1">
>   <input type="hidden" name="o" value="1">
>   <input type="hidden" name="b" value="1">
>   <input type="hidden" name="t" value="1">
>   <input type="hidden" name="f" value="1">
>   <input type="hidden" name="c" value="1">
>   <input type="hidden" name="u" value="1">
>   <input type="hidden" name="r" value="">
>   <input type="hidden" name="l" value="1">
>   <input type="hidden" name="g" value="">
>   <input name="usr" type="text" value="" size="12">
>   <input name="pwd" type="password" value="" size="12">
>   <input type="checkbox" name="savepwd" value="1">
>   <input type="image" name="Login" src="images/btn_login.gif" width="40" height="16"
>     border="0" align="absmiddle">
> </form>
>
> When I pull up this web page there's nothing in
> $response->content.
> I know that safari.oreilly.com will return a
> blank page if it doesn't like the user agent, and upon signing in
> it'll return to the safari.oreilly.com page with a very large number
> of get variables. Does anyone know what I might be doing wrong?

You can't put form input into header fields! Use LWP to fetch the Safari
home page and HTML::Form to parse the form and enter field values. None of
the 'Accept' headers are necessary. Take a look at this:

  use strict;
  use warnings;

  use LWP;
  use HTML::Form;

  my $ua = LWP::UserAgent->new(agent => 'Mozilla/4.76 [en] (Win98; U)');
  $ua->cookie_jar({});   # in-memory cookie jar so the session cookie is kept

  my $resp = $ua->get('http://safari.oreilly.com/');
  die $resp->status_line unless $resp->is_success;

  # There are two forms on the page. Find the one with an input named 'Login'.
  #
  my $login;

  foreach (HTML::Form->parse($resp)) {
    if ($_->find_input('Login')) {
      $login = $_;
      last;
    }
  }

  die "Login form not found\n" unless $login;

  $login->param('usr', '[EMAIL PROTECTED]');
  $login->param('pwd', 'secret');

  # click() builds the POST request, hidden fields included
  $resp = $ua->request($login->click);
  die $resp->status_line unless $resp->is_success;

HTH,

Rob
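
To make the headers-versus-form-input point concrete, here is a minimal sketch that builds a POST request offline with HTTP::Request::Common, which uses the same calling convention as LWP::UserAgent's post(). Form fields go in an array reference as the second argument and end up in the request body; only genuine HTTP headers follow as key/value pairs. The credentials are the placeholders from the quoted script and the hidden fields are left out for brevity. Whether the site would accept a hand-built request like this is an assumption, so treat it as an illustration rather than a replacement for the HTML::Form approach.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

# Second argument: form fields (sent in the request body).
# Remaining arguments: real HTTP headers as key/value pairs.
my $req = POST(
    'http://safari.oreilly.com/JVXSL.asp',
    [
        usr     => 'myemail',       # placeholder credentials
        pwd     => 'mypassword',
        savepwd => 1,
    ],
    'Accept-Language' => 'en-US',
);

# Inspect the request without sending anything over the network.
print $req->method, ' ', $req->uri, "\n";
print 'Content-Type: ', $req->header('Content-Type'), "\n";
print $req->content, "\n";    # the URL-encoded form body
```

Passing the same arguments to $ua->post(...) would send the request. In the quoted script the form fields were passed in the header position, so they went out as headers and the body stayed empty.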