[ 
http://issues.apache.org/jira/browse/NUTCH-266?page=comments#action_12422929 ] 
            
Sami Siren commented on NUTCH-266:
----------------------------------

I finally found the time to set up an environment with Cygwin and try this out.
I can confirm that the hadoop.jar version provided with Nutch gives these
errors.

I then tested Nutch with the Hadoop nightly jar and everything worked just
fine.

Can someone else try the Hadoop nightly jar with Nutch and see if it works for
you? Nightly builds for Hadoop are available from
http://people.apache.org/dist/lucene/hadoop/nightly/

Just extract the archive, grab hadoop-nightly.jar from it, and use it to
replace the Hadoop jar in your Nutch installation.
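
If you would rather script the swap than do it by hand, here is a rough Python
sketch of the replacement step. It is only an illustration: the function name,
the "hadoop-*.jar" file pattern, and the idea that the bundled jar sits in a
single library directory are my assumptions, so check them against your own
installation before running it. The old jar is renamed to a .bak file instead
of being deleted, so you can roll back.

```python
import shutil
from pathlib import Path


def replace_hadoop_jar(nightly_jar, nutch_lib_dir, pattern="hadoop-*.jar"):
    """Back up the bundled Hadoop jar(s) and copy the nightly jar in.

    nutch_lib_dir and pattern are assumptions about the installation
    layout, not something documented in this issue.
    Returns the names of the jars that were backed up.
    """
    nightly_jar = Path(nightly_jar)
    lib_dir = Path(nutch_lib_dir)

    if not nightly_jar.is_file():
        raise FileNotFoundError(f"nightly jar not found: {nightly_jar}")
    if not lib_dir.is_dir():
        raise NotADirectoryError(f"not a directory: {lib_dir}")
    if nightly_jar.resolve().parent == lib_dir.resolve():
        raise ValueError("nightly jar is already inside the target directory")

    backed_up = []
    for old_jar in sorted(lib_dir.glob(pattern)):
        # Rename instead of deleting so the original jar can be restored.
        old_jar.replace(old_jar.with_name(old_jar.name + ".bak"))
        backed_up.append(old_jar.name)

    # Copy the nightly jar in under its own file name.
    shutil.copy2(nightly_jar, lib_dir / nightly_jar.name)
    return backed_up
```

Call it with the path to the extracted hadoop-nightly.jar and the directory
that holds the Hadoop jar in your Nutch installation, then rerun the updatedb
step.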

Thanks.

> hadoop bug when doing updatedb
> ------------------------------
>
>                 Key: NUTCH-266
>                 URL: http://issues.apache.org/jira/browse/NUTCH-266
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 0.8-dev
>         Environment: windows xp, JDK 1.4.2_04
>            Reporter: Eugen Kochuev
>
> I constantly get the following error message:
> 060508 230637 Running job: job_pbhn3t
> 060508 230637 c:/nutch/crawl-20060508230625/crawldb/current/part-00000/data:0+245
> 060508 230637 c:/nutch/crawl-20060508230625/segments/20060508230628/crawl_fetch/part-00000/data:0+296
> 060508 230637 c:/nutch/crawl-20060508230625/segments/20060508230628/crawl_parse/part-00000:0+5258
> 060508 230637 job_pbhn3t
> java.io.IOException: Target /tmp/hadoop/mapred/local/reduce_qnd5sx/map_qjp7tf.out already exists
>         at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:162)
>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:62)
>         at org.apache.hadoop.fs.LocalFileSystem.renameRaw(LocalFileSystem.java:191)
>         at org.apache.hadoop.fs.FileSystem.rename(FileSystem.java:306)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:101)
> Exception in thread "main" java.io.IOException: Job failed!
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:341)
>         at org.apache.nutch.crawl.CrawlDb.update(CrawlDb.java:54)
>         at org.apache.nutch.crawl.Crawl.main(Crawl.java:114)

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
