I want to set up scheduled crawling in Nutch. For example: I have crawled a site that has 1 million pages, and I want to re-crawl the same site for updates once per week, automatically (scheduled and incremental crawling). It should crawl only modified or newly added content.
Is this possible with Nutch? If so, how can I achieve it?
--
View this message in context: http://www.nabble.com/scheduled-crawling-in-nutch-tp19087524p19087524.html
Sent from the Nutch - User mailing list archive at Nabble.com.
