Re: Using --spider to check for dead links?
Hi,

First of all, thanks for the quick answer! :)

On 18.07.2006 17:34, Mauro Tortonesi wrote:
> Stefan Melbinger wrote:
>> I need to check whole websites for dead links, with output that is easy
>> to parse for lists of dead links, statistics, etc. Does anybody have
>> experience with that problem, or has maybe used the --spider mode for
>> this before (as suggested by some pages)?
>
> historically, wget never really supported recursive --spider mode.
> fortunately, this has been fixed in 1.11-alpha-1:

How will wget react when started in recursive --spider mode? It will have to download, parse and then delete/forget HTML pages in order to know where to go next, but what happens with images and large files such as videos, for example? Will wget only check whether they exist?

Thanks a lot,
Stefan

PS: The background for my question is that my company wants to check large websites for dead links (without using any commercial software). Hours of Google searching left me with wget, which seems to have the best foundation for this...
Re: Using --spider to check for dead links?
Stefan Melbinger wrote:
> Hello, I need to check whole websites for dead links, with output that
> is easy to parse for lists of dead links, statistics, etc. Does anybody
> have experience with that problem, or has maybe used the --spider mode
> for this before (as suggested by some pages)? If this should work, all
> HTML pages would have to be parsed completely, while pictures and other
> files should only be HEAD-checked for existence (in order to save
> bandwidth)... Using --spider and --spider -r was not the right way to
> do this, I fear. Any help is appreciated, thanks in advance!

hi stefan,

historically, wget never really supported recursive --spider mode. fortunately, this has been fixed in 1.11-alpha-1:

http://www.mail-archive.com/wget@sunsite.dk/msg09071.html

so, it will be included in the upcoming 1.11 release.

-- 
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi                           http://www.tortonesi.com
University of Ferrara - Dept. of Eng.     http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool   http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux             http://www.deepspace6.net
Ferrara Linux User Group                  http://www.ferrara.linux.it
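[Editor's note: one plausible workflow with the recursive spider mode described above is to log the crawl and reduce the log to a dead-link list afterwards. The sketch below assumes wget's classic log shape ("--<time>--  <url>" followed by an "awaiting response... <status>" line) and the `--spider -r -o` flags; both vary between versions, and the canned SAMPLE_LOG merely stands in for a real crawl.]

```python
# Sketch: turn a wget spider log into a dead-link list.
# Assumed invocation (wget 1.11+; flags are an assumption):
#   wget --spider -r -o spider.log http://example.com/
import re

# Canned stand-in for a real crawl log; the format is an assumption.
SAMPLE_LOG = """\
--17:34:12--  http://example.com/index.html
HTTP request sent, awaiting response... 200 OK
--17:34:13--  http://example.com/img/missing.png
HTTP request sent, awaiting response... 404 Not Found
--17:34:14--  http://example.com/about.html
HTTP request sent, awaiting response... 200 OK
"""

def dead_links(log_text):
    """Return (url, status) pairs whose HTTP status is 4xx/5xx."""
    dead, url = [], None
    for line in log_text.splitlines():
        m = re.match(r"--.*?--\s+(\S+)$", line)
        if m:
            url = m.group(1)          # remember the last requested URL
        elif "awaiting response..." in line:
            status = int(line.rsplit("...", 1)[1].split()[0])
            if status >= 400 and url is not None:
                dead.append((url, status))
    return dead

print(dead_links(SAMPLE_LOG))  # [('http://example.com/img/missing.png', 404)]
```

From there, statistics (dead links per directory, per status code, etc.) are a matter of ordinary list processing.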
Using --spider to check for dead links?
Hello,

I need to check whole websites for dead links, with output that is easy to parse for lists of dead links, statistics, etc. Does anybody have experience with that problem, or has maybe used the --spider mode for this before (as suggested by some pages)?

For this to work, all HTML pages would have to be parsed completely, while pictures and other files should only be HEAD-checked for existence (in order to save bandwidth)... Using --spider and --spider -r was not the right way to do this, I fear.

Any help is appreciated, thanks in advance!

Greets,
Stefan Melbinger
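[Editor's note: the HEAD-only existence check asked about above can be sketched in a few lines. This is a minimal illustration, not wget's actual implementation; the `link_is_alive` helper, the throwaway local server, and all paths are made up for the example.]

```python
# Sketch of a HEAD-only existence check: only headers cross the wire,
# so verifying a large video costs almost no bandwidth. The local
# server below simply simulates a site with one live and one dead URL.
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class FakeSite(BaseHTTPRequestHandler):
    def do_HEAD(self):
        # Pretend /logo.png exists and everything else is a dead link.
        self.send_response(200 if self.path == "/logo.png" else 404)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the example's output clean

server = HTTPServer(("127.0.0.1", 0), FakeSite)
threading.Thread(target=server.serve_forever, daemon=True).start()

def link_is_alive(host, port, path):
    """HEAD-check one URL; True if the server answers below 400."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status < 400
    finally:
        conn.close()

alive = link_is_alive("127.0.0.1", server.server_port, "/logo.png")
dead = link_is_alive("127.0.0.1", server.server_port, "/big-video.avi")
print(alive, dead)  # True False
```

A checker built this way would still GET and parse HTML pages (to find further links), while issuing only HEAD requests for everything else.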