On Tue, Mar 12, 2002 at 09:59:42AM -0600, Claus wrote:
>First of all I don't use w-get so I don't know much of it, however it does 
>behave in a non-polite manner.

You don't use it, so you're not going to be able to give us any information
about how it was invoked or on what platform, or anything else that can help
us, but you'll be happy to complain about *it* not being polite. Hmmph.

However, I *do* recognize this looping behavior. Read on.

>A small sample from the log files:
>
>2002-03-12 09:53:23 2 80.128.229.62 aldireview.niesens.com 80 "GET 
>/logo.gif HTTP/1.0" 404 0 202 "http://aldireview.niesens.com/" "Wget/1.7" 
>2002-03-12 09:53:24 2 80.128.229.62 aldireview.niesens.com 80 "GET 
>/index.en.html HTTP/1.0" 404 0 207 "http://aldireview.niesens.com/" 
>"Wget/1.7" 2002-03-12 09:53:24 2 80.128.229.62 aldireview.niesens.com 80 

This won't give you much comfort, but ...

Wget 1.7 had a bug in which the internal data structure that holds
the list of links found in a page became corrupted. I worked on it
far enough to identify the problem, but I couldn't trace it back to
its origin.

(In Wget 1.7) links are maintained in a list, in the order they are found
in the page. Somehow, this list ended up out of order, which caused the
looping that you are seeing here. As an added bonus, the Wget process would
consume ever more memory until (on Unix) it drove the system out of memory
and got killed.

This bug is no longer present, so the solution is to upgrade to Wget
1.8.1.  Since you're the webmaster being inconvenienced by it, not the
one running it, this doesn't help you much, but it's the best that I
can do.
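
On the webmaster's side, about all you can do is spot the runaway client
in your logs. What follows is only a rough sketch in Python, not anything
that ships with Wget. It assumes one log entry per line (unwrapped, unlike
the mail-wrapped sample above) in the field layout that sample suggests.
The names LINE_RE and find_loops and the threshold of 20 are made up for
illustration.

```python
import re
from collections import Counter

# Assumed layout, inferred from the sample log lines quoted above:
# date time <n> client-ip host port "request" status <n> <n> "referer" "user-agent"
LINE_RE = re.compile(
    r'^\S+ \S+ \S+ (?P<ip>\S+) \S+ \S+ '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

def find_loops(lines, threshold=20):
    """Return {(ip, agent, path): count} for requests seen at least `threshold` times."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:  # lines that do not fit the assumed layout are skipped
            counts[(m["ip"], m["agent"], m["path"])] += 1
    return {key: n for key, n in counts.items() if n >= threshold}
```

Feeding it an open log file, e.g. find_loops(open("access.log")) (file
name hypothetical), lists the clients that keep requesting the same URL.
What you do with that list is up to you.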

Cc: changed to [EMAIL PROTECTED]

-- 
Alan Eldridge
"Dave's not here, man."
