[
http://issues.apache.org/jira/browse/NUTCH-266?page=comments#action_12417387 ]
KuroSaka TeruHiko commented on NUTCH-266:
-----------------------------------------
Both Eugen's case and my case fail in the call chain that starts at line
101 of LocalJobRunner.java, which reads:
    if (!localFs.rename(mapOut, reduceIn))                  // Line 101
      throw new IOException("Couldn't rename " + mapOut);   // Line 102
This eventually calls LocalFileSystem.renameRaw(Path, Path) whose
implementation is:
    public boolean renameRaw(Path src, Path dst) throws IOException {
      if (useCopyForRename) {
        return FileUtil.copy(this, src, this, dst, true, getConf());
      } else return pathToFile(src).renameTo(pathToFile(dst));
    }
The difference between the error message in Eugen's case and the one in mine
comes down to whether useCopyForRename was true or false.
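
To illustrate why the two branches report a failure differently, here is a
minimal sketch. It is not the Hadoop code: RenameModes, its renameRaw and
describeFailure are hypothetical stand-ins, java.nio.file.Files.copy is used
in place of FileUtil.copy, and the failure is provoked simply by renaming a
file that does not exist.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

public class RenameModes {
    // Hypothetical stand-in for renameRaw, NOT the Hadoop implementation.
    static boolean renameRaw(File src, File dst, boolean useCopyForRename)
            throws IOException {
        if (useCopyForRename) {
            if (dst.exists()) {
                throw new IOException("Target " + dst + " already exists");
            }
            Files.copy(src.toPath(), dst.toPath()); // throws if src is missing
            return src.delete();
        }
        return src.renameTo(dst); // reports failure only by returning false
    }

    // Tries to rename a file that was never created and describes
    // how the failure surfaces in the chosen mode.
    static String describeFailure(boolean useCopyForRename) {
        File dir;
        try {
            dir = Files.createTempDirectory("rename-modes").toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        File missing = new File(dir, "part-0.out"); // never created
        File dst = new File(dir, "map_1.out");
        try {
            return renameRaw(missing, dst, useCopyForRename)
                    ? "renamed" : "returned false";
        } catch (IOException e) {
            return "threw IOException";
        }
    }

    public static void main(String[] args) {
        System.out.println("copy mode:     " + describeFailure(true));
        System.out.println("renameTo mode: " + describeFailure(false));
    }
}
```

In the copy branch the problem surfaces as an exception thrown from inside the
copy, while in the renameTo branch the caller only sees false and then raises
its own "Couldn't rename" exception.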
I inserted a LOG.debug() call at the entrance of FileSystem.rename() to see
what rename
is asked to do. Below is the output:
2006-06-22 15:45:11,996 DEBUG dfs.DistributedFileSystem (FileSystem.java:rename(308)) - Renaming "C:/tmp/hadoop/mapred/local/map_iwp4ih/part-0.out" to "C:/tmp/hadoop/mapred/local/reduce_ilpajy/map_1.out"...
2006-06-22 15:45:12,012 DEBUG dfs.DistributedFileSystem (FileSystem.java:rename(308)) - Renaming "C:/tmp/hadoop/mapred/local/map_iwp4ih/part-0.out" to "C:/tmp/hadoop/mapred/local/reduce_ilpajy/map_2.out"...
2006-06-22 15:45:12,028 WARN mapred.LocalJobRunner (LocalJobRunner.java:run(119)) - job_i2gl4i
java.io.IOException: Couldn't rename C:/tmp/hadoop/mapred/local/map_iwp4ih/part-0.out
As the log shows, the same source file (part-0.out) is renamed twice, first to
map_1.out and then to map_2.out. The first rename succeeded while the second
one failed.
Is this how rename is supposed to be called?
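
One possible explanation, which I have not confirmed against the Hadoop
source, is that the second call fails simply because the source file is gone
after the first rename. The sketch below reproduces that pattern with plain
java.io.File. RenameTwice and renameSameSourceTwice are hypothetical names,
and both destinations sit in one temporary directory for simplicity.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

public class RenameTwice {
    // Renames one source file to two different destinations in turn and
    // returns the result of each File.renameTo call.
    static boolean[] renameSameSourceTwice() {
        try {
            File dir = Files.createTempDirectory("rename-twice").toFile();
            File mapOut = new File(dir, "part-0.out");
            Files.write(mapOut.toPath(), "data".getBytes());

            // First rename moves the file, so the source path no longer exists.
            boolean first = mapOut.renameTo(new File(dir, "map_1.out"));
            // Second rename uses the same, now missing, source path.
            boolean second = mapOut.renameTo(new File(dir, "map_2.out"));
            return new boolean[] { first, second };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] results = renameSameSourceTwice();
        System.out.println("first rename:  " + results[0]);
        System.out.println("second rename: " + results[1]);
    }
}
```

If this is what happens in LocalJobRunner, the code at line 101 would need a
distinct map output file for each rename, or a copy instead of a rename.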
Another thing I noticed, by comparing the source code of the version that works
and the version that doesn't, is that "File" (java.io.File?) has been replaced
by "Path" (org.apache.hadoop.fs.Path?) recently. This may relate to the
problem we are having.
> hadoop bug when doing updatedb
> ------------------------------
>
> Key: NUTCH-266
> URL: http://issues.apache.org/jira/browse/NUTCH-266
> Project: Nutch
> Type: Bug
> Versions: 0.8-dev
> Environment: windows xp, JDK 1.4.2_04
> Reporter: Eugen Kochuev
>
> I constantly get the following error message
> 060508 230637 Running job: job_pbhn3t
> 060508 230637
> c:/nutch/crawl-20060508230625/crawldb/current/part-00000/data:0+245
> 060508 230637
> c:/nutch/crawl-20060508230625/segments/20060508230628/crawl_fetch/part-00000/data:0+296
> 060508 230637
> c:/nutch/crawl-20060508230625/segments/20060508230628/crawl_parse/part-00000:0+5258
> 060508 230637 job_pbhn3t
> java.io.IOException: Target
> /tmp/hadoop/mapred/local/reduce_qnd5sx/map_qjp7tf.out already exists
> at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:162)
> at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:62)
> at
> org.apache.hadoop.fs.LocalFileSystem.renameRaw(LocalFileSystem.java:191)
> at org.apache.hadoop.fs.FileSystem.rename(FileSystem.java:306)
> at
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:101)
> Exception in thread "main" java.io.IOException: Job failed!
> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:341)
> at org.apache.nutch.crawl.CrawlDb.update(CrawlDb.java:54)
> at org.apache.nutch.crawl.Crawl.main(Crawl.java:114)