First, let me say that Plucker is an excellent set of tools and applications!

I'm having trouble getting good results with meta news sites. Say I want to pluck Slashdot. If I set the max depth to 2 and make it stay on host, I don't get the article itself, only Slashdot's article talking about the article, which is about worthless. If I remove the stayonhost restriction, it quickly spiders far too much, most of it away from the meta site, which makes the pdb grow far too large and wastes time spidering. That's not what I want either.

It would be nice to have something like:
--maxnonhomedepth 1 --maxdepth 5

This would allow a max depth of 5 for anything at or below the home URL (the meta news site, in this case), while anything not below the home URL would get a max depth of 1. That way, for meta news sites, I could get both the referred article and the content provided by the meta news site itself.

Is there any way to do what I want without modifying Plucker?

Thanks,
Greg
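
P.S. To make the rule I'm after concrete, here is a rough sketch in Python. This is not Plucker's actual spider code; the function names, the off_home_depth counter, and the example.com URL are all made up for illustration. I'm also assuming that "at or below the home URL" means the same host with a path under the home page's directory, and that off-home depth is counted from the last page inside the home tree.

```python
from urllib.parse import urlsplit


def is_under_home(url, home_url):
    """Return True if url is at or below home_url (assumed meaning:
    same host, and path under the home page's directory)."""
    u, h = urlsplit(url), urlsplit(home_url)
    if u.netloc.lower() != h.netloc.lower():
        return False
    # Directory part of the home path, e.g. "/news/index.html" -> "/news/"
    home_dir = h.path.rsplit("/", 1)[0] + "/"
    return (u.path or "/").startswith(home_dir)


def should_fetch(url, home_url, depth, off_home_depth,
                 max_depth=5, max_non_home_depth=1):
    """Hypothetical version of the proposed rule.

    depth          -- overall link depth of url (home page is 1)
    off_home_depth -- hops since leaving the home tree
                      (1 = linked directly from a home-tree page)
    """
    if depth > max_depth:
        return False
    if is_under_home(url, home_url):
        return True
    return off_home_depth <= max_non_home_depth


if __name__ == "__main__":
    home = "http://slashdot.org/"
    # Story page on the meta site: fetched under the normal depth limit.
    print(should_fetch("http://slashdot.org/articles/1.html", home, 2, 0))
    # The referred article, one hop off site: fetched.
    print(should_fetch("http://example.com/story.html", home, 3, 1))
    # A link from that article, two hops off site: skipped.
    print(should_fetch("http://example.com/more.html", home, 4, 2))
```

With those settings the spider would pick up the meta site's own pages and each referred article, but would not wander any further from the article's site.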

