You can either set a Crawl-delay in robots.txt or modify the Fetcher. The Fetcher keeps a FetchItemQueue for each queue, and that object also records the CrawlDelay for the queue. A FetchItemQueue is created in FetchItemQueues.getFetchItemQueue(), which is where the CrawlDelay for the queue is set. You could add a lookup table there that returns the CrawlDelay for a given queue id (host, domain or IP).
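
A rough sketch of such a lookup table, in case it helps. This is not Nutch code: the class CrawlDelayLookup and its methods are hypothetical names, and I am assuming the delay is handled in milliseconds and that the queue id is a plain host, domain or IP string. Check how your Nutch version builds the queue id and stores the delay before wiring this in.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/**
 * Hypothetical helper, not part of Nutch: maps a fetch queue id
 * (host, domain or IP) to a crawl delay, with a global fallback.
 */
class CrawlDelayLookup {
    // Per-queue overrides; values assumed to be milliseconds.
    private final Map<String, Long> delaysMs = new HashMap<>();
    // Used for every queue id that has no override.
    private final long defaultDelayMs;

    CrawlDelayLookup(long defaultDelayMs) {
        this.defaultDelayMs = defaultDelayMs;
    }

    /** Registers an override for one queue id. */
    void put(String queueId, long delayMs) {
        delaysMs.put(normalize(queueId), delayMs);
    }

    /** Returns the override for the queue id, or the default if none is set. */
    long delayFor(String queueId) {
        return delaysMs.getOrDefault(normalize(queueId), defaultDelayMs);
    }

    // Host names are case-insensitive, so compare them in lower case.
    private static String normalize(String queueId) {
        return queueId.trim().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        CrawlDelayLookup lookup = new CrawlDelayLookup(5000L);
        lookup.put("slow.example.com", 10000L);
        lookup.put("fast.example.com", 500L);

        System.out.println(lookup.delayFor("slow.example.com"));
        System.out.println(lookup.delayFor("FAST.example.com"));
        System.out.println(lookup.delayFor("other.example.com"));
    }
}
```

At the point where the new FetchItemQueue is created, you would then pass the value from delayFor(id) instead of the global delay. The table itself could be filled from a config file when the Fetcher starts.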

-----Original message-----
> From:vivekvl <vive...@yahoo.com>
> Sent: Tue 28-May-2013 16:01
> To: user@nutch.apache.org
> Subject: How to achieve different fetcher.server.delay configuration for
> different hosts/sub domains?
>
> I have a problem in configuring fetcher.server.delay for my crawl. Some of
> the sub domains needs fetcher.server.delay to be high and some needs this to
> be less. Whether there is a straight forward way to achieve this? If yes
> what are the configurations I need to make.
>
> If this is not going to be simple, is there any workaround to achieve this?
>
> Thanks,
> Vivek
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/How-to-achieve-different-fetcher-server-delay-configuration-for-different-hosts-sub-domains-tp4066505.html
> Sent from the Nutch - User mailing list archive at Nabble.com.