Hello everyone. I've been experimenting with combinations of --recursive, --span-hosts, --page-requisites, and --domains='X,Y,Z' for downloading pages from blogs and forums, and I can't figure out how to do exactly what I want.
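For reference, the combination I've been trying looks roughly like this (example.com and example.net are placeholders for my real domains, not actual targets):

```shell
# Recurse only within the listed domains, and also fetch each page's
# requisites (images, CSS) -- which, as described below, end up restricted
# to those same domains. Domains and URL here are placeholders.
wget --recursive --span-hosts --page-requisites \
     --domains=example.com,example.net \
     https://example.com/forum/
```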
I want to follow pages recursively, but only within certain domains, so I set --recursive, --span-hosts, and --domains='X,Y,Z'. For each page fetched I also want to grab all of its page requisites, especially images and CSS files, so I set --page-requisites. But it appears that --page-requisites is subject to --span-hosts and the --domains= flag, so it won't grab requisites hosted outside the domains I specify.

What I'd like is for --page-requisites to visit any domain it needs, without restriction, while recursion stays limited to the domains I list. Of course, if I just set --span-hosts and don't set --domains=, I get a runaway recursive download.

(Currently I'm working around this by fetching the pages once, grepping for img tags, and then adding those hosts to my --domains= flag. But this backfires if a page I'm fetching also links to the image-hosting site: that link now falls within my allowed domains, and I get runaway recursion again.)

Is there any way to do what I want? Thanks in advance.

Chris
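P.S. In case it helps, the grep step of my workaround looks roughly like this. The page content, filename, and hostnames below are made up for illustration; in practice the input is a page wget already fetched:

```shell
# Hypothetical sample page standing in for one already fetched by wget.
cat > page.html <<'EOF'
<p>Some post text with an image:
<img src="https://img.example-cdn.com/a.png">
and another: <img src="http://static.example-img.net/b.jpg"></p>
EOF

# Pull out src attributes, reduce each URL to its hostname, and emit a
# comma-separated list suitable for pasting into --domains=.
grep -o 'src="[^"]*"' page.html \
  | sed 's/src="//; s/"$//' \
  | sed -E 's|^https?://([^/]+).*|\1|' \
  | sort -u | paste -sd, -
# → img.example-cdn.com,static.example-img.net
```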
