[ https://issues.apache.org/jira/browse/MAPREDUCE-3638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196427#comment-13196427 ]
Philip Su commented on MAPREDUCE-3638:
--------------------------------------

I did some more follow-up testing on this, and I think I know more precisely where the problem is.

1) The failure occurs when running a streaming job with the -cacheFile option on a local file system using file:///<path>.
2) I ran hdfs dfs -ls file:///<path> to make sure the file exists.
3) I ran the same streaming job using the same value from 1), but instead of the deprecated -cacheFile option, I used -files. The job ran and passed.

So it seems that when running a streaming job with the deprecated -cacheFile option on a local file system, the file is not getting the correct file permission.

> Yarn trying to download cacheFile to container but Path is a local file
> -----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3638
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3638
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.0
>            Reporter: Thomas Graves
>            Assignee: Mahadev konar
>
> It looks like the AM, which is running on host1.com, is trying to access a
> local file but the file is on host2.com (where the command was run).
> ran:
> hadoop --config conf/hadoop/ jar hadoop-streaming.jar
>   -Dmapreduce.job.acl-view-job=*
>   -input Streaming/streaming-610/input.txt
>   -mapper 'xargs cat' -reducer cat
>   -output Streaming/streaming-610/Output
>   -cacheFile file://Streaming/data/streaming-610//InputFile#testlink
>   -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=1
>   -jobconf mapred.job.name=streamingTest-610
>   -jobconf mapreduce.job.acl-view-job=*
>
> failure:
> 11/11/10 07:48:06 INFO mapreduce.Job: Job job_1320887371559_0215 failed with
> state FAILED due to: Application application_1320887371559_0215 failed 1 times
> due to AM Container for appattempt_1320887371559_0215_000001 exited with
> exitCode: -1000 due to: java.io.FileNotFoundException: File
> file:/Streaming/data/streaming-610/InputFile does not exist
>         at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:431)
>         at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:315)
>         at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:85)
>         at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:152)
>         at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:50)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)

--
This message is automatically generated by JIRA.
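
For reference, the workaround from step 3) of the comment could look roughly like the sketch below. It is not the exact command that was run: the helper name build_streaming_cmd is hypothetical, the #testlink fragment is assumed to carry over from the original -cacheFile value, and <path> is left as the placeholder used in the comment. The helper only prints the command so it can be reviewed before running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): prints a streaming command that
# ships a local file with the generic -files option instead of the
# deprecated -cacheFile option. Nothing is executed here.
build_streaming_cmd() {
    cache_uri=$1   # assumed form: file:///<path>#testlink
    input=$2       # job input path
    output=$3      # job output path
    # Generic options such as -files go before the streaming options.
    printf '%s\n' "hadoop jar hadoop-streaming.jar -files $cache_uri -input $input -mapper 'xargs cat' -reducer cat -output $output"
}

# Example with the paths from the issue description; <path> is a placeholder.
build_streaming_cmd 'file:///<path>#testlink' \
    Streaming/streaming-610/input.txt \
    Streaming/streaming-610/Output
```

The printed line can be copied into a shell once <path> is replaced with the real location of the local file.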