org.apache.nutch.tools.CrawlTool throws error while doing deleteduplicates
--------------------------------------------------------------------------

         Key: NUTCH-148
         URL: http://issues.apache.org/jira/browse/NUTCH-148
     Project: Nutch
        Type: Bug
  Components: indexer  
    Versions: 0.8-dev    
 Environment: Windows XP Home
    Reporter: raghavendra prabhu


I get the following error while running org.apache.nutch.tools.CrawlTool.

The error actually occurs in the deleteduplicates step:

051223 001121 Reading url hashes...
051223 001121 Sorting url hashes...
051223 001121 Deleting url duplicates...
051223 001121 Error moving bad file G:\apache-tomcat-5.5.12\webapps\crux\WEB-INF\classes\ddup-workingdir\ddup-20051223001121: java.io.IOException: CreateProcess: df -k G:\apache-tomcat-5.5.12\webapps\crux\WEB-INF\classes\ddup-workingdir\ddup-20051223001121 error=2

The error is thrown in NFSDataInputStream.java.
The exception is:

org.apache.nutch.fs.ChecksumException: Checksum error: G:\apache-tomcat-5.5.12\webapps\crux\WEB-INF\classes\ddup-workingdir\ddup-20051223001121 at 0
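
A note on the "CreateProcess: df -k ... error=2" part of the log: error=2 from CreateProcess is the Windows "file not found" code, which is consistent with no df program being available on this Windows XP machine. The sketch below is not Nutch code. It only illustrates, under that assumption, how a failed program launch surfaces as a java.io.IOException, and how free space could be queried without an external command via java.io.File.getUsableSpace() (available since Java 6, so newer than this report). The class name DiskSpaceProbe and the methods tryLaunch and usableKilobytes are hypothetical names chosen for the example.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration only; not part of Nutch.
class DiskSpaceProbe {

    /**
     * Tries to run "<program> -k <dir>" as an external process.
     * Returns null if the process could be started, otherwise the
     * message of the IOException raised when the launch failed.
     */
    public static String tryLaunch(File dir, String program) {
        try {
            Process p = new ProcessBuilder(program, "-k", dir.getAbsolutePath())
                    .redirectErrorStream(true)
                    .start();
            // Drain the output so the child process cannot block on a full pipe.
            try (InputStream in = p.getInputStream()) {
                while (in.read() != -1) {
                    // discard
                }
            }
            p.waitFor();
            return null;
        } catch (IOException e) {
            // A program that cannot be found ends up here.
            return e.getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    /** Usable space on the partition holding dir, in kilobytes, without any external command. */
    public static long usableKilobytes(File dir) {
        return dir.getUsableSpace() / 1024;
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        String failure = tryLaunch(dir, "df");
        if (failure != null) {
            System.out.println("Could not launch df: " + failure);
        }
        System.out.println("Usable KB (pure Java): " + usableKilobytes(dir));
    }
}
```

Running the class with the ddup working directory as its argument should show whether launching df fails on the affected machine, while the pure-Java query works regardless of whether df is installed.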
