On 12/01/2004 01:42 PM, Luke Baker wrote:
Hey,

Here's a patch that lets users configure how many threads may access the same host at the same time. Right now Nutch allows only one thread at a time to access any given host. The default will still be 1 thread per host.

The somewhat fuzzy part is that the fetcher.server.delay is applied only when the last thread accessing a host is popped off. With 1 thread per host, the behavior is identical to the current one.
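
To make the delay rule concrete, here is a minimal Java sketch of that bookkeeping. It is not the code from the patch: the class HostThrottle, its methods, and the explicit nowMs argument are hypothetical names chosen for illustration, and serverDelayMs simply stands in for fetcher.server.delay. Time is passed in explicitly so the logic can be checked without sleeping.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-host throttle; an illustration, not the patch itself. */
class HostThrottle {
    private final int maxThreadsPerHost;  // assumed setting; 1 reproduces the old behavior
    private final long serverDelayMs;     // stands in for fetcher.server.delay
    private final Map<String, Integer> activeThreads = new HashMap<>();
    private final Map<String, Long> blockedUntil = new HashMap<>();

    HostThrottle(int maxThreadsPerHost, long serverDelayMs) {
        this.maxThreadsPerHost = maxThreadsPerHost;
        this.serverDelayMs = serverDelayMs;
    }

    /** Returns true if the caller may start fetching from host at time nowMs. */
    synchronized boolean tryAcquire(String host, long nowMs) {
        Long until = blockedUntil.get(host);
        if (until != null) {
            if (nowMs < until) {
                return false;             // still inside the delay window
            }
            blockedUntil.remove(host);    // delay has expired
        }
        int active = activeThreads.getOrDefault(host, 0);
        if (active >= maxThreadsPerHost) {
            return false;                 // host already has its full share of threads
        }
        activeThreads.put(host, active + 1);
        return true;
    }

    /** Called when a thread finishes with host at time nowMs. */
    synchronized void release(String host, long nowMs) {
        int active = activeThreads.getOrDefault(host, 0);
        if (active == 0) {
            throw new IllegalStateException("release without acquire: " + host);
        }
        if (active == 1) {
            // Last thread leaving the host: only now does the delay start.
            activeThreads.remove(host);
            blockedUntil.put(host, nowMs + serverDelayMs);
        } else {
            activeThreads.put(host, active - 1);
        }
    }
}
```

With maxThreadsPerHost set to 1, every release is the last one, so the delay follows every fetch, which matches the one-thread-per-host behavior described above. With a higher limit, threads that overlap on a host never trigger the delay until the host goes idle.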

Let me know what you think. I've tested it a little, and it seems to work as it is supposed to.

Thanks,

Luke Baker

Hey,

Does anybody have thoughts on committing this to CVS? I've used it for a several-million-document crawl, and it worked great. This will be a great benefit for those doing intranet crawls whose infrastructure can afford a fast-crawling Nutch.

Thoughts?

Thanks,

Luke Baker


_______________________________________________
Nutch-developers mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-developers
