I want to do some specific crawling: crawl one site with one set of
URLs to accept/reject, then reset and crawl another site with another
set of URLs to accept/reject, and so on.  I'm writing my own wrapper
that puts the accept/reject URLs into the Configuration, plus a
URLFilter that reads that configuration item to do the
accepting/rejecting.  What I don't see is how to make it start at a
given URL other than by making a dir/url file containing that URL.  In
this case that's inefficient; I'd rather parse one file holding a list
of URLs and the accept/reject list for each URL, say "inject this
URL", run my own generate/fetch/updatedb cycle, then inject the next
and repeat.
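To make the "parse one file" part concrete, here is a minimal sketch of what I have in mind, assuming a hypothetical blank-line-separated format (one "url" line per site plus any number of "accept"/"reject" regex lines -- the format and field names are my own invention, not anything Nutch defines):

```python
import re

def parse_seed_file(text):
    """Parse per-site crawl specs from a plain-text file.

    Hypothetical format: blocks separated by blank lines, each block
    holding one 'url' line plus any number of 'accept'/'reject' lines
    carrying regex patterns for the custom URLFilter.
    """
    sites = []
    for block in re.split(r"\n\s*\n", text.strip()):
        site = {"url": None, "accept": [], "reject": []}
        for line in block.splitlines():
            key, _, value = line.strip().partition(" ")
            if key == "url":
                site["url"] = value
            elif key in ("accept", "reject"):
                site[key].append(value)
        sites.append(site)
    return sites

sample = """\
url http://example.com/
accept ^http://example\\.com/
reject \\.jpg$

url http://example.org/
accept ^http://example\\.org/
"""

for site in parse_seed_file(sample):
    # Here the wrapper would push site["accept"]/site["reject"] into
    # the Configuration read by the custom URLFilter, inject
    # site["url"], and run one generate/fetch/updatedb cycle before
    # moving on to the next site (sketch only).
    print(site["url"])
```

The loop body is where the per-site inject and generate/fetch/updatedb calls would go; the sketch only shows the parsing and iteration.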

-- 
http://www.linkedin.com/in/paultomblin
