You can either use robots.txt or modify the Fetcher. The Fetcher keeps one 
FetchItemQueue per queue ID, and each FetchItemQueue records the crawl delay 
for its queue. A FetchItemQueue is created by 
FetchItemQueues.getFetchItemQueue(), which is also where the crawl delay for 
the queue is set. You could add a lookup table there that returns the crawl 
delay for a given queue ID (host, domain or IP).
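
A minimal sketch of such a lookup table in Java, in case it helps. 
CrawlDelayTable and its methods are hypothetical names, not part of Nutch, 
and the use of milliseconds is an assumption of the sketch. It only shows the 
lookup with a fallback to a default delay; how you call it from 
FetchItemQueues.getFetchItemQueue() depends on your Nutch version.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Nutch: maps a queue ID to a crawl delay.
class CrawlDelayTable {
    // Per-queue delays in milliseconds (unit is an assumption of this sketch).
    private final Map<String, Long> delaysMs = new HashMap<>();
    private final long defaultDelayMs;

    CrawlDelayTable(long defaultDelayMs) {
        this.defaultDelayMs = defaultDelayMs;
    }

    // Register a delay for one queue ID (host, domain or IP).
    void put(String queueId, long delayMs) {
        delaysMs.put(queueId, delayMs);
    }

    // Return the configured delay, or the default when the ID is unknown.
    long delayFor(String queueId) {
        return delaysMs.getOrDefault(queueId, defaultDelayMs);
    }

    public static void main(String[] args) {
        CrawlDelayTable table = new CrawlDelayTable(5000L);
        table.put("slow.example.com", 10000L);
        table.put("fast.example.com", 1000L);
        System.out.println(table.delayFor("slow.example.com"));
        System.out.println(table.delayFor("other.example.com"));
    }
}
```

Any queue ID that is not in the table falls back to the default, so hosts you 
have not listed keep behaving as they do with a single global delay.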

 
 
-----Original message-----
> From:vivekvl <vive...@yahoo.com>
> Sent: Tue 28-May-2013 16:01
> To: user@nutch.apache.org
> Subject: How to achieve different fetcher.server.delay configuration for 
> different hosts/sub domains?
> 
> I have a problem configuring fetcher.server.delay for my crawl. Some of
> the subdomains need fetcher.server.delay to be high and some need it to
> be lower. Is there a straightforward way to achieve this? If yes,
> what configuration do I need to make?
> 
> If this is not going to be simple, is there any workaround to achieve this?
> 
> Thanks,
> Vivek
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/How-to-achieve-different-fetcher-server-delay-configuration-for-different-hosts-sub-domains-tp4066505.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
> 
