Removal of default port# in NameNode.getUri() causes a map/reduce job to fail to
promote temporary output
-----------------------------------------------------------------------------------------------------
Key: HADOOP-4717
URL: https://issues.apache.org/jira/browse/HADOOP-4717
Project: Hadoop Core
Issue Type: Bug
Components: dfs
Affects Versions: 0.18.0
Reporter: Hairong Kuang
Fix For: 0.18.3
The problem reported here is that when the default port number (8020) is
specified in the output path, the job succeeds but no output is created. The
cause is that the "listStatus" call drops the port number, because
NameNode.getUri removes the default port#.
Assume that a map/reduce output directory is set to
"hdfs://localhost:8020/out". A "listStatus" call on any of its subdirectories,
for example "hdfs://localhost:8020/out/tempXX", returns results like the one
below:
    hdfs://localhost/out/tempXX/part-00005
Because of this, the following code in Task.java
    private Path getFinalPath(Path jobOutputDir, Path taskOutput) {
      URI relativePath =
        taskOutputPath.toUri().relativize(taskOutput.toUri());
does not get the correct relativePath, because taskOutputPath contains the port
but taskOutput doesn't.
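
The mismatch can be reproduced with plain java.net.URI, outside Hadoop. The
sketch below is only an illustration: the class and method names are made up,
and the URIs are the example paths from this report. URI.relativize returns the
given URI unchanged when the two authorities differ, which is what happens when
one side carries ":8020" and the other does not.

```java
import java.net.URI;

public class RelativizeDemo {
    // Relativize child against base and return the result as a string.
    static String relativize(String base, String child) {
        return URI.create(base).relativize(URI.create(child)).toString();
    }

    public static void main(String[] args) {
        String base = "hdfs://localhost:8020/out/tempXX";

        // Authorities match ("localhost:8020" on both sides):
        // prints "part-00005"
        System.out.println(
            relativize(base, "hdfs://localhost:8020/out/tempXX/part-00005"));

        // Authorities differ ("localhost:8020" vs "localhost"), so the
        // child URI comes back unchanged:
        // prints "hdfs://localhost/out/tempXX/part-00005"
        System.out.println(
            relativize(base, "hdfs://localhost/out/tempXX/part-00005"));
    }
}
```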
It seems to me that the problem could be fixed if we make Path.makeQualified()
return the same path whether or not the input path contains the default port.
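
One way to picture that normalization, again with plain java.net.URI: fill in
the default port whenever a URI omits it, so that both spellings compare equal
before relativize is called. This is a rough sketch and not the actual Hadoop
change; withExplicitPort is a hypothetical helper, and the port 8020 is taken
from this report.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class DefaultPortDemo {
    static final int DEFAULT_PORT = 8020; // default port named in this report

    // Hypothetical helper: return uri with defaultPort filled in when the
    // URI has a host but no explicit port; otherwise return it unchanged.
    static URI withExplicitPort(URI uri, int defaultPort) {
        if (uri.getHost() == null || uri.getPort() != -1) {
            return uri;
        }
        try {
            return new URI(uri.getScheme(), uri.getUserInfo(), uri.getHost(),
                           defaultPort, uri.getPath(), uri.getQuery(),
                           uri.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI base = withExplicitPort(
            URI.create("hdfs://localhost:8020/out/tempXX"), DEFAULT_PORT);
        URI child = withExplicitPort(
            URI.create("hdfs://localhost/out/tempXX/part-00005"), DEFAULT_PORT);

        // Both URIs now carry ":8020", so relativize yields the relative path.
        System.out.println(base.relativize(child)); // prints "part-00005"
    }
}
```

Normalizing in the other direction (stripping the default port from both
sides) would work equally well; what matters is that both paths go through
the same normalization.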