Jerome and Jeff, thanks for the help :)

I found the answers in the wiki FAQ entry on recovering an aborted fetch, which is insightful. It also mentions that you can index what was already crawled:

"You should be able to index the part of the segment for crawling which is already fetched."

I tried the command I put in my last email:

bin/nutch index indexes crawled/linkdb crawled/segments/*

But it failed.

How can I recover an aborted fetch process?

Well, you cannot. However, you have two choices to proceed:

1) Recover the pages already fetched and then restart the fetcher. You'll need to create a file fetcher.done in the segment directory and then run updatedb, generate and fetch. Assuming your index is at /index:

% touch /index/segments/2005somesegment/fetcher.done
% bin/nutch updatedb /index/db/ /index/segments/2005somesegment/
% bin/nutch generate /index/db/ /index/segments/2005somesegment/
% bin/nutch fetch /index/segments/2005somesegment

All the pages that were not crawled will be re-generated for fetch. If you fetched lots of pages and don't want to re-fetch them, this is the best way.

2) Discard the aborted output. Delete all folders from the segment folder except the fetchlist folder and restart the fetcher.

Richard Braman
mailto:[EMAIL PROTECTED]
561.748.4002 (voice)
http://www.taxcodesoftware.org
Free Open Source Tax Software

-----Original Message-----
From: Richard Braman [mailto:[EMAIL PROTECTED]
Sent: Tuesday, February 28, 2006 5:02 PM
To: nutch-dev@lucene.apache.org
Subject: FW: Index aborted crawl.

I had to abort a crawl mid-crawl (after 2 days of crawling, because I realized I had an error in my filter). I know at least 6 segments were fetched. I tried the command

bin/nutch index indexes crawled/linkdb crawled/segments/*

but it failed. I would like to review the results of the crawl, but if it's impossible, it's impossible.
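
For option 2 of the FAQ answer quoted above (discard the aborted output), here is a minimal shell sketch of the cleanup step. The function name discard_aborted_output is hypothetical, not part of Nutch. It assumes only what the FAQ states: that the segment directory contains a fetchlist folder and that every other folder in it can be deleted. It does not call Nutch itself.

```shell
#!/bin/sh
# Hypothetical helper (not a Nutch command): remove every sub-folder of a
# segment directory except "fetchlist", as described in option 2 of the FAQ.
discard_aborted_output() {
    segment="$1"

    # Refuse to touch anything if the directory does not look like a segment.
    if [ ! -d "$segment/fetchlist" ]; then
        echo "no fetchlist folder in $segment; nothing removed" >&2
        return 1
    fi

    for entry in "$segment"/*; do
        # Only folders are deleted; plain files are left alone.
        [ -d "$entry" ] || continue
        # Keep the fetchlist folder so the fetcher can be restarted.
        [ "$(basename "$entry")" = "fetchlist" ] && continue
        rm -rf -- "$entry"
    done
    return 0
}
```

With the FAQ's example path this would be called as discard_aborted_output /index/segments/2005somesegment, followed by restarting the fetcher on that segment. Because the deletion is irreversible, it may be worth copying the segment somewhere first.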