Anders Rosendal asked:
> Could you make an option to only fetch from other hosts what is directly
> referenced from the orig page?

Have you tried the "--page-requisites" (a.k.a. "-p") command line option?

The info documentation says this:

     Actually, to download a single page and all its requisites (even
     if they exist on separate websites), and make sure the lot
     displays properly locally, this author likes to use a few options
     in addition to `-p':

          wget -E -H -k -K -nh -p http://SITE/DOCUMENT

     In one case you'll need to add a couple more options.  If DOCUMENT
     is a `<FRAMESET>' page, the "one more hop" that `-p' gives you
     won't be enough--you'll get the `<FRAME>' pages that are
     referenced, but you won't get _their_ requisites.  Therefore, in
     this case you'll need to add `-r -l1' to the commandline.  The `-r
     -l1' will recurse from the `<FRAMESET>' page to the `<FRAME>'
     pages, and the `-p' will get their requisites.  If you're already
     using a recursion level of 1 or more, you'll need to up it by one.
     In the future, `-p' may be made smarter so that it'll do "two
     more hops" in the case of a `<FRAMESET>' page.

     To finish off this topic, it's worth knowing that Wget's idea of an
     external document link is any URL specified in an `<A>' tag, an
     `<AREA>' tag, or a `<LINK>' tag other than `<LINK
     REL="stylesheet">'.
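
To make the two cases from the documentation concrete, here is a small
sketch that only assembles and prints the command line, so you can inspect
it before running it. The helper name fetch_page_cmd and the "frameset"
keyword are made up for illustration and are not part of Wget; the flags
are copied as-is from the documentation quoted above, so check them against
the Wget version you have installed.

```shell
#!/bin/sh
# Sketch only: fetch_page_cmd is a hypothetical helper, not part of Wget.
# It prints the command suggested by the quoted documentation instead of
# running it, so nothing is downloaded.
# Usage: fetch_page_cmd URL [frameset]
fetch_page_cmd() {
    url=$1
    extra=""
    # For a <FRAMESET> page the documentation says to add `-r -l1'
    # so that the requisites of the <FRAME> pages are fetched too.
    if [ "${2:-}" = "frameset" ]; then
        extra=" -r -l1"
    fi
    # Flags copied verbatim from the quoted documentation.
    echo "wget -E -H -k -K -nh -p${extra} ${url}"
}

# Ordinary page:
fetch_page_cmd http://SITE/DOCUMENT
# Page built from frames:
fetch_page_cmd http://SITE/DOCUMENT frameset
```

Once the printed command looks right, paste it into your shell to do the
actual download.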
