Zhou, Jingchen <jingchen <at> SLAC.Stanford.EDU> writes:
>
> Hi,
>
> I am trying to use wget to mirror a web site
>
> Here is the command I use:
>
> $ wget -r -l2 --no-parent -nH -k --cut-dirs=4 http://www.slac.stanford.edu/grp/cd/soft/unix/dev/
>
> For some reason, not all the files are mirrored (or synced over)...
</snip>
OK,
It's not for "some reason"; it's for a very good reason: that's the way
wget works!
It would:
    Download index.html (or the default page)
    Parse that page for any links
    For each link:
        Download that page (and parse its links in turn)
If the documents aren't linked, they're never found.
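The crawl described above can be sketched as a toy model (this is an
illustration of the behaviour, not wget's actual code; the dict of pages
and the crawl function are stand-ins):

```python
# Toy model of a recursive crawl: pages maps URL -> links found on that
# page. A file that exists on the server but is linked from nowhere is
# simply never visited.
def crawl(pages, start, max_depth=2):
    seen = set()

    def visit(url, depth):
        if url in seen or depth > max_depth or url not in pages:
            return
        seen.add(url)               # "download" the page
        for link in pages[url]:     # parse it for links
            visit(link, depth + 1)  # recurse into each link

    visit(start, 0)
    return seen

# A site where orphan.html exists but nothing links to it:
site = {
    "index.html": ["a.html", "b.html"],
    "a.html": [],
    "b.html": ["index.html"],
    "orphan.html": [],              # on the server, but never linked
}
fetched = crawl(site, "index.html")
# fetched contains index.html, a.html, b.html -- but never orphan.html
```

Starting from index.html, only the pages reachable through links are
ever fetched; orphan.html stays on the server untouched.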
A better solution would be to mirror the site over FTP, e.g.:
wget -r ftp://username:[email protected]
or
wget -r ftp://username:[email protected]/users/MyName
Hope this helps,
Vappy