Let's say I want to pluck Slashdot. If I set maxdepth to 2 and make it stay on host, I don't get the linked article itself, only the Slashdot story that talks about it. That's about worthless.
Have you tried using http://slashdot.org/palm/ ?
If I remove the stayonhost restriction, it quickly spiders far too much, most of it away from the meta site, which makes the pdb grow far too large and wastes time spidering.
Try using staybelow="http://slashdot.org/palm/", but realize you'll miss the top header image on the main page of that site. If you want the main image, try using stayonhost with a lower maxdepth. If that doesn't work, try stayondomain.
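To make the difference between the three restrictions concrete, here is a rough Python sketch of how each one could decide whether a link gets followed. It only borrows the option names from above; the matching rules (plain prefix match, exact host match, last two host labels) are my assumptions, not Plucker's actual code, and the image URLs are made up for illustration.

```python
from urllib.parse import urlsplit


def in_scope(url, home, mode):
    """Return True if `url` may be followed under the given restriction.

    `mode` is "staybelow", "stayonhost" or "stayondomain". The rules below
    are illustrative assumptions, not Plucker's real implementation.
    """
    url_host = (urlsplit(url).hostname or "").lower()
    home_host = (urlsplit(home).hostname or "").lower()

    if mode == "staybelow":
        # Follow only URLs that start with the home URL prefix.
        return url.startswith(home)
    if mode == "stayonhost":
        # Follow only URLs on exactly the same host.
        return url_host == home_host
    if mode == "stayondomain":
        # Naive domain check: compare the last two host labels.
        return url_host.split(".")[-2:] == home_host.split(".")[-2:]
    raise ValueError("unknown mode: %r" % mode)


home = "http://slashdot.org/palm/"
candidates = [
    "http://slashdot.org/palm/1/index.html",   # below the home URL
    "http://slashdot.org/images/header.gif",   # hypothetical image, same host
    "http://images.slashdot.org/header.gif",   # hypothetical image, same domain
    "http://example.com/the-real-article",     # off-site article
]
for mode in ("staybelow", "stayonhost", "stayondomain"):
    kept = [u for u in candidates if in_scope(u, home, mode)]
    print(mode, len(kept))
```

Under these assumed rules, staybelow drops anything outside /palm/ (which is why the header image goes missing), stayonhost picks up the rest of the host, and stayondomain also allows sibling hosts in the same domain.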
--maxdepth 5
A depth of 5 is extremely excessive. 3 is the most I've ever seen anyone require for a site like Slashdot.
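As a back-of-the-envelope illustration of why each extra level hurts, the sketch below computes a worst-case page count for a given depth. It assumes the home page counts as depth 1 and that every page links to 20 new pages; that figure is invented for the example, and a real crawl fetches fewer pages because of duplicate links and the stay* restrictions.

```python
def max_pages(links_per_page, maxdepth):
    """Upper bound on pages fetched, counting the home page as depth 1.

    Assumes every page has `links_per_page` links to pages not seen before,
    which overstates a real crawl.
    """
    return sum(links_per_page ** level for level in range(maxdepth))


for depth in (2, 3, 5):
    print(depth, max_pages(20, depth))
# prints:
# 2 21
# 3 421
# 5 168421
```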
Is there any way to do what I'm wanting to do without modifying plucker?
Sure, dozens of ways. Each site needs its own treatment. Just be careful with a site like Slashdot. If you spider it too much, or too often, they'll ban your IP from reaching the site.
David A. Desrosiers [EMAIL PROTECTED] http://gnu-designs.com

