[jira] [Commented] (MAPREDUCE-3638) Yarn trying to download cacheFile to container but Path is a local file

2012-02-29 Thread Ramya Sunil (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219605#comment-13219605
 ] 

Ramya Sunil commented on MAPREDUCE-3638:


-cacheFile has never supported the local FS; it downloads files from HDFS
only. It is also a deprecated option, and the -files option has to be used for
downloading files from the local FS. This is not an issue.
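
The distinction above can be checked before a job is submitted. The following
is only a minimal sketch, not part of Hadoop: is_local_cache_uri is a
hypothetical helper, and it assumes that only a URI with the "file" scheme
refers to the local FS (a path with no scheme is left alone, on the assumption
that it resolves against the cluster's default file system).

```python
from urllib.parse import urlparse


def is_local_cache_uri(uri: str) -> bool:
    """Return True if the URI uses the file scheme (hypothetical helper)."""
    return urlparse(uri).scheme == "file"


# The value passed to -cacheFile in the reported command.
uri = "file://Streaming/data/streaming-610//InputFile#testlink"
if is_local_cache_uri(uri):
    print("local file: pass it with -files rather than -cacheFile")
```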

 Yarn trying to download cacheFile to container but Path is a local file
 ---

 Key: MAPREDUCE-3638
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3638
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Thomas Graves
Assignee: Mahadev konar

 It looks like the AM, which is running on host1.com, is trying to access a
 local file but the file is on host2.com (where the command was run).
 ran:
 hadoop --config conf/hadoop/ jar hadoop-streaming.jar \
   -Dmapreduce.job.acl-view-job=* \
   -input Streaming/streaming-610/input.txt \
   -mapper 'xargs cat' \
   -reducer cat \
   -output Streaming/streaming-610/Output \
   -cacheFile file://Streaming/data/streaming-610//InputFile#testlink \
   -jobconf mapred.map.tasks=1 \
   -jobconf mapred.reduce.tasks=1 \
   -jobconf mapred.job.name=streamingTest-610 \
   -jobconf mapreduce.job.acl-view-job=*
 failure:
 11/11/10 07:48:06 INFO mapreduce.Job: Job job_1320887371559_0215 failed with state FAILED due to: Application application_1320887371559_0215 failed 1 times due to AM Container for appattempt_1320887371559_0215_01 exited with exitCode: -1000 due to: java.io.FileNotFoundException: File file:/Streaming/data/streaming-610/InputFile does not exist
     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:431)
     at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:315)
     at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:85)
     at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:152)
     at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:50)
     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     at java.lang.Thread.run(Thread.java:619)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (MAPREDUCE-3638) Yarn trying to download cacheFile to container but Path is a local file

2012-01-30 Thread Philip Su (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196427#comment-13196427
 ] 

Philip Su commented on MAPREDUCE-3638:
--

I did some more follow-up testing on this and I think I know more precisely 
where the problem is. 

1) The failure occurs when running a streaming job with the -cacheFile option 
on a local file system using file:///path. 
2) I ran hdfs dfs -ls file:///path to make sure the file exists. 
3) I ran the same streaming job using the same value from 1). But instead of 
using the deprecated -cacheFile option, I used -files instead. The job ran and 
passed. 

So it seems that when a streaming job uses the deprecated -cacheFile option 
with a file on a local file system, it does not get the correct file 
permission on that file. 
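
The substitution made in step 3 can be sketched as a small argument rewrite.
This is only an illustration under stated assumptions, not code from Hadoop or
from the tests mentioned here: replace_cache_file is a hypothetical helper, and
it assumes that -files accepts a comma-separated list and, being a generic
option, should appear before the streaming-specific options.

```python
from typing import List


def replace_cache_file(args: List[str]) -> List[str]:
    """Turn each '-cacheFile VALUE' pair into a leading '-files' option."""
    files: List[str] = []
    rest: List[str] = []
    i = 0
    while i < len(args):
        if args[i] == "-cacheFile" and i + 1 < len(args):
            files.append(args[i + 1])  # keep the value, drop the old flag
            i += 2
        else:
            rest.append(args[i])
            i += 1
    if not files:
        return rest
    # Assumption: generic options such as -files go first, values comma-joined.
    return ["-files", ",".join(files)] + rest


# Arguments that would follow "jar hadoop-streaming.jar" (shortened example).
streaming_args = ["-input", "input.txt", "-mapper", "xargs cat",
                  "-cacheFile", "file:///path#testlink"]
print(replace_cache_file(streaming_args))
```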






[jira] [Commented] (MAPREDUCE-3638) Yarn trying to download cacheFile to container but Path is a local file

2012-01-30 Thread Philip Su (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196499#comment-13196499
 ] 

Philip Su commented on MAPREDUCE-3638:
--

It's not urgent. We do have 4 regression tests blocked by this, so it would be 
good to have this fixed at some point in the near future. Thanks!





[jira] [Commented] (MAPREDUCE-3638) Yarn trying to download cacheFile to container but Path is a local file

2012-01-29 Thread Arun C Murthy (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195877#comment-13195877
 ] 

Arun C Murthy commented on MAPREDUCE-3638:
--

This looks like a very long-standing bug; this code hasn't changed since 2009...

 Yarn trying to download cacheFile to container but Path is a local file
 ---

 Key: MAPREDUCE-3638
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3638
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 0.23.0
Reporter: Thomas Graves
Assignee: Mahadev konar
Priority: Blocker
