Thanks for the comments. Here's the solution. http://groups.google.com/group/comp.os.linux.help/msg/5b086b3500985efe
On Dec 7, 2007 10:19 PM, Micah Cowan <[EMAIL PROTECTED]> wrote:
> Josh Williams wrote:
> > On 12/7/07, Brian <[EMAIL PROTECTED]> wrote:
> >> For the life of me, I cannot convince wget to download an old copy of a
> >> website from the Internet Archive. I think the url within a url is somehow
> >> messing it up..
> >>
> >> wget -e robots=off --base=
> >> http://web.archive.org/web/19990125085924/http://gnu.org/
> >> -r -Gbase
> >> http://web.archive.org/web/19990125085924/http://gnu.org/
> >>
> >> How can I get this to work?
> >
> > We've seen this issue a lot. IIRC, the --base option does no good in
> > this instance because the problem is actually a parsing error.
>
> No parsing error. Archive uses JavaScript to reset the URLs to refer to
> archive pages in a browser, but without JavaScript they're pointing at
> the original links.
>
> Note that the archive terms of service prohibit the use of automated
> crawlers, and in particular making personal copies.
>
> --base doesn't work for this because it's not intended to override
> _real_ bases, but to specify the base for relative links that wget reads
> from input files.
>
> --
> Micah J. Cowan
> Programmer, musician, typesetting enthusiast, gamer...
> http://micah.cowan.name/
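
To illustrate Micah's two points (a base only affects relative links, and the saved pages still carry the original absolute links), here is a rough Python sketch. It is not the fix from the linked post and makes no network requests; it only manipulates strings. The `to_archived` helper is a hypothetical name of my own, and the idea that an archived address is simply the snapshot prefix followed by the original URL is an assumption drawn from the URL in the wget command above.

```python
from urllib.parse import urljoin

# Snapshot prefix copied from the wget command quoted above.
ARCHIVE_PREFIX = "http://web.archive.org/web/19990125085924/"


def to_archived(link: str, page_url: str) -> str:
    """Hypothetical helper: map a link found on the original page to the
    address it would presumably have inside the same snapshot.

    Assumption: archived URLs are the snapshot prefix followed by the
    original absolute URL, as in the command quoted above.
    """
    # Resolve relative links against the ORIGINAL page address first.
    absolute = urljoin(page_url, link)
    return ARCHIVE_PREFIX + absolute


# A base URL is only consulted for relative links. An absolute link
# comes back unchanged, so supplying a different base cannot redirect it.
base = ARCHIVE_PREFIX + "http://gnu.org/"
unchanged = urljoin(base, "http://gnu.org/software/")

# Rewriting the link explicitly is what points it back into the snapshot.
rewritten = to_archived("/software/", "http://gnu.org/")

print(unchanged)
print(rewritten)
```

The first value shows why passing a base makes no difference for the absolute links in the page; the second shows the kind of rewrite that the JavaScript performs in a browser.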