... when using Droids to crawl a site, to read and parse pages that have got through the filters that have been set up, but stop them being passed to the handler?
I.e., some sort of post-parsing filter like the AlreadyVisitedFilter, but one that is applied after the page has been parsed for new links and before the handler is triggered? Or do I have to wait until the page hits the handler, then check my cache to see if I've already got that page? (I'm just trying to separate out my processing stages.)

Tony Dietrich
