Hello.

I'm using wget to download sites for a Java project. Wget is run via the
Runtime class in Java. Everything has been working fine until I tried to
download really large sites. My example was www.startsiden.no (a Norwegian
web portal with a large amount of external linking) with a depth of 2. The
command that was run:

wget --tries=3 --timeout=300 -N -erobots=off --random-wait \
  --directory-prefix=2004-11-05T11:42:12_www.startsiden.no \
  --html-extension --convert-links --page-requisites \
  -o /var/www/sites/logs/2004-11-05T11:42:12_www.startsiden.no.log \
  -r --level=2 -H http://www.startsiden.no
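
For reference, the launch from Java looks roughly like this. The class name
WgetLauncher and the stream-draining helper are a simplified sketch for the
list, not the exact project code:

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;

public class WgetLauncher {

    // Read a stream on its own thread so wget can never block on a full
    // pipe buffer while the Java side is waiting for it to exit.
    private static Thread drain(final InputStream in) {
        Thread t = new Thread(new Runnable() {
            public void run() {
                try {
                    BufferedReader r = new BufferedReader(new InputStreamReader(in));
                    while (r.readLine() != null) {
                        // discard; wget's real log already goes to the -o file
                    }
                } catch (Exception ignored) {
                }
            }
        });
        t.start();
        return t;
    }

    public static void main(String[] args) throws Exception {
        String[] cmd = {
            "wget", "--tries=3", "--timeout=300", "-N", "-erobots=off", "--random-wait",
            "--directory-prefix=2004-11-05T11:42:12_www.startsiden.no",
            "--html-extension", "--convert-links", "--page-requisites",
            "-o", "/var/www/sites/logs/2004-11-05T11:42:12_www.startsiden.no.log",
            "-r", "--level=2", "-H", "http://www.startsiden.no"
        };

        Process p = Runtime.getRuntime().exec(cmd);
        Thread out = drain(p.getInputStream());
        Thread err = drain(p.getErrorStream());

        int exitCode = p.waitFor();
        out.join();
        err.join();
        System.out.println("wget exited with code " + exitCode);
    }
}

In this sketch both output streams are read on their own threads only so the
child cannot stall on an unread pipe; with -o the actual log goes to a file
anyway.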

The download runs normally for a little over five hours. After that it just
locks up. The last entries in the log file are:

HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://secure1.netpower.no/den-norske-kreftforening/julegave/skjema.html [following]
--17:02:50--  https://secure1.netpower.no/den-norske-kreftforening/julegave/skjema.html
           => `2004-11-05T11:42:12_www.startsiden.no/secure1.netpower.no/den-norske-kreftforening/julegave/skjema.html'
Resolving secure1.netpower.no... 212.33.133.203
Connecting to secure1.netpower.no[212.33.133.203]:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 10,486 [text/html]

    0K ..........

The time is now Nov 6 13:01:01 CET 2004, so the process has been idle for
about 20 hours.

Why does this happen? And can I do anything to prevent it?

Please reply to [EMAIL PROTECTED], since I don't subscribe to this list.

Thank you.

Kind regards
Christian Larsen

