Jens, thank you for your response! Here's my command line:

"c:\program files\wget\wget" -r -N -t2 -l2 -E -e robots=off -awGet.log -T
200 -H -Priserless http://www.google.com/search?q=riserless

I have tried the URL in single quotes, double quotes, and with no quotes, all
with the same result: a 403 Forbidden error. The log file is given below.
Thank you for your help!

--12:41:25--  http://www.google.com/search?q=riserless
           => `riserless/www.google.com/[EMAIL PROTECTED]'
Resolving www.google.com... 64.233.167.104, 64.233.167.99
Connecting to www.google.com[64.233.167.104]:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
12:41:26 ERROR 403: Forbidden.


FINISHED --12:41:26--
Downloaded: 0 bytes in 0 files

-----Original Message-----
From: Jens Rösner [mailto:[EMAIL PROTECTED] 
Sent: Saturday, June 12, 2004 11:30 AM
To: Phil Lewis
Cc: [EMAIL PROTECTED]
Subject: Re: Cannot WGet Google Search Page?


Hi Phil!

Without more info (wget's verbose or even debug output, the full command
line, ...) I find it hard to tell what is happening.
However, I have had very good success with wget and Google.
So, some hints:
1. Protect the Google URL by enclosing it in double quotes.
2. Remember to span hosts (and allow only certain ones); otherwise, wget will
only download Google pages.
And lastly - but you obviously did so - think about restricting the recursion
depth.

Hope that helps a bit
Jens

> I have been trying to wget several levels deep from a Google search page
> (e.g., http://www.google.com/search?=deepwater+oil). But on the very 
> first page, wget returns a 403 Forbidden error and stops. Anyone know 
> how I can get around this?
> 
> Regards, Phil 
> Philip E. Lewis, P.E.
> [EMAIL PROTECTED]
> 
> 


