Hello, I have a problem...
I'm trying to index a small domain using org.apache.nutch.crawl.Crawler. The problem is that after the crawler has indexed all the pages of the domain, running the crawler again re-fetches every page, even though the fetch interval has not expired. This is wrong because it generates a lot of connections.

I'm using the default configuration, and this is the command I execute:

org.apache.nutch.crawl.Crawler -depth 1 -threads 1 -topN 5

Can you help me, please? Thanks

--
View this message in context: http://lucene.472066.n3.nabble.com/re-Crawl-re-fetch-all-pages-each-time-tp4020464.html
Sent from the Nutch - User mailing list archive at Nabble.com.
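P.S. For reference, the fetch interval I mean is the one governed by the db.fetch.interval.default property (2592000 seconds, i.e. 30 days, by default in nutch-default.xml). I have not overridden it; if I had, the override would go in conf/nutch-site.xml, something like:

```xml
<!-- conf/nutch-site.xml — local overrides of nutch-default.xml -->
<property>
  <name>db.fetch.interval.default</name>
  <!-- seconds a page must wait before it is due for re-fetch; 2592000 = 30 days -->
  <value>2592000</value>
</property>
```

Since I'm on the defaults, none of the pages should be due for re-fetch yet when I run the crawl a second time.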