I tried setting hadoop.tmp.dir to /cygdrive/d/tmp and it created D:\cygdrive\d\tmp\mapred\temp\inject-temp-1365510909\_reduce_n7v9vq.
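The setting itself was the usual hadoop.tmp.dir override in conf/hadoop-site.xml, along these lines (an illustrative sketch of the entry, following the same property format quoted later in this thread, not copied verbatim from my file):

```xml
<!-- conf/hadoop-site.xml: override the base temp dir used by local jobs -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/cygdrive/d/tmp</value>
  <description></description>
</property>
```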
The same error occurred:

2008-02-15 10:19:22,833 WARN mapred.LocalJobRunner - job_local_1
java.io.IOException: Target file:/D:/cygdrive/d/tmp/mapred/temp/inject-temp-1365510909/_reduce_n7v9vq/part-00000 already exists
        at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:246)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:125)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:116)
        at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:180)
        at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:394)
        at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:452)
        at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:469)
        at org.apache.hadoop.mapred.Task.saveTaskOutput(Task.java:426)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:165)

Regards,
Susam Pal

On Thu, Feb 14, 2008 at 10:07 PM, Susam Pal <[EMAIL PROTECTED]> wrote:
> What I did try was setting hadoop.tmp.dir to /opt/tmp. I found the
> behavior strange. I had an /opt/tmp directory in my Cygwin
> installation (absolute Windows path: D:\Cygwin\opt\tmp) and I was
> expecting Hadoop to use it. However, it created a new D:\opt\tmp and
> wrote the temp files there. Of course, this failed with the same error.
>
> Right now I don't have a Windows system with me. I will try setting it
> to /cygdrive/d/tmp/ tomorrow when I again have access to a Windows
> system, and then I'll update the mailing list with my observations.
> Thanks for the suggestion.
>
> Regards,
> Susam Pal
>
> On Thu, Feb 14, 2008 at 9:41 PM, Dennis Kubes <[EMAIL PROTECTED]> wrote:
> > I think what might be occurring is a file path issue with Hadoop. I
> > have seen it in the past. Can you try on Windows using the cygdrive
> > path and see if that works? For below it would be /cygdrive/D/tmp/ ...
> >
> > Dennis
> >
> > Susam Pal wrote:
> > > I can confirm this error, as I just tried running the last revision of
> > > Nutch, rev-620818, on Debian as well as on Cygwin on Windows.
> > >
> > > It works fine on Debian but fails on Cygwin with this error:
> > >
> > > 2008-02-14 19:49:47,756 WARN regex.RegexURLNormalizer - can't find
> > > rules for scope 'inject', using default
> > > 2008-02-14 19:49:48,381 WARN mapred.LocalJobRunner - job_local_1
> > > java.io.IOException: Target
> > > file:/D:/tmp/hadoop-guest/mapred/temp/inject-temp-322737506/_reduce_bjm6rw/part-00000
> > > already exists
> > >         at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:246)
> > >         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:125)
> > >         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:116)
> > >         at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:196)
> > >         at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:394)
> > >         at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:452)
> > >         at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:469)
> > >         at org.apache.hadoop.mapred.Task.saveTaskOutput(Task.java:426)
> > >         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:165)
> > > 2008-02-14 19:49:49,225 FATAL crawl.Injector - Injector:
> > > java.io.IOException: Job failed!
> > >         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:831)
> > >         at org.apache.nutch.crawl.Injector.inject(Injector.java:162)
> > >         at org.apache.nutch.crawl.Injector.run(Injector.java:192)
> > >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > >         at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:54)
> > >         at org.apache.nutch.crawl.Injector.main(Injector.java:182)
> > >
> > > Indeed, the 'inject-temp-322737506' directory is present in the specified
> > > folder of the D drive and doesn't get deleted.
> > >
> > > Is this because multiple map/reduce tasks are running and one of them
> > > finds the directory already present and therefore fails?
> > >
> > > So, I also tried setting this in 'conf/hadoop-site.xml':
> > >
> > > <property>
> > >   <name>mapred.speculative.execution</name>
> > >   <value>false</value>
> > >   <description></description>
> > > </property>
> > >
> > > I wonder why the same issue doesn't occur on Linux. I am not well
> > > acquainted with the Hadoop code yet. Could someone throw light on what
> > > might be going wrong?
> > >
> > > Regards,
> > > Susam Pal
> > >
> > > On 2/7/08, DS jha <[EMAIL PROTECTED]> wrote:
> > >> Hi -
> > >>
> > >> Looks like the latest trunk version of Nutch is failing with the following
> > >> exception when trying to perform the inject operation:
> > >>
> > >> java.io.IOException: Target
> > >> file:/tmp/hadoop-user/mapred/temp/inject-temp-1280136828/_reduce_dv90x0/part-00000
> > >> already exists
> > >>         at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:246)
> > >>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:125)
> > >>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:116)
> > >>         at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:196)
> > >>         at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:394)
> > >>         at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:452)
> > >>         at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:469)
> > >>         at org.apache.hadoop.mapred.Task.saveTaskOutput(Task.java:426)
> > >>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:165)
> > >>
> > >> Any thoughts?
> > >>
> > >> Thanks,
> > >> Jha
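Judging from the stack traces in this thread, the immediate failure is just an existence check before a rename: FileUtil.checkDest refuses to proceed when the rename destination already exists, so a retried reduce whose earlier attempt left its output behind dies with "Target ... already exists". A minimal standalone sketch of that behavior (my simplification for illustration, not the actual Hadoop code):

```java
import java.io.File;
import java.io.IOException;

public class CheckDestSketch {

    // Simplified stand-in for org.apache.hadoop.fs.FileUtil.checkDest:
    // the destination of a rename/copy must not already exist.
    static void checkDest(File dst) throws IOException {
        if (dst.exists()) {
            throw new IOException("Target " + dst + " already exists");
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulate a leftover part file from an earlier task attempt.
        File dst = File.createTempFile("part-00000", null);
        try {
            checkDest(dst); // fails, because the file is already there
        } catch (IOException e) {
            System.out.println(e.getMessage());
        } finally {
            dst.delete();
        }
    }
}
```

This would explain why the error only shows up once stale inject-temp directories accumulate: the check itself is the same on Linux and Cygwin, but on Cygwin the temp output apparently isn't cleaned up between attempts.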