[ https://issues.apache.org/jira/browse/YARN-253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated YARN-253:
---------------------------

    Attachment: YARN-253-test.patch

This change to the DistributedShell (DS) unit test exposes the problem. It 
configures the mini YARN cluster with multiple NMs and multiple local 
directories, so the appcache directory created by the AM is not necessarily 
the same one used by the shell container.

Here's the stack trace showing the failure:

{noformat}
2012-11-30 18:01:02,729 WARN  [ContainersLauncher #0] launcher.ContainerLaunch (ContainerLaunch.java:call(247)) - Failed to launch container.
java.io.FileNotFoundException: File /Users/tom/workspace/hadoop-2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/target/org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell/org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell-localDir-nm-2_0/usercache/tom/appcache/application_1354298449311_0001 does not exist
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:485)
        at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:996)
        at org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:150)
        at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:187)
        at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:712)
        at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:708)
        at org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2361)
        at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:708)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.createDir(DefaultContainerExecutor.java:332)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:128)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:242)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:68)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:680)
{noformat}
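
The trace suggests that a non-recursive mkdir is being attempted beneath an application directory that was never created on that NM. The following is only a plain-JDK analogue of that failure mode, not the NodeManager code: the class name, method names, and the shortened application/container directory names are all hypothetical, and java.nio.file stands in for Hadoop's FileContext.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Hypothetical illustration only; this is not NodeManager code.
class AppCacheDirSketch {

    // Non-recursive mkdir: returns false when the parent directory
    // (here, the per-application directory) does not exist.
    static boolean createContainerDirOnly(Path containerDir) throws IOException {
        try {
            Files.createDirectory(containerDir);
            return true;
        } catch (NoSuchFileException e) {
            return false;
        }
    }

    // Recursive mkdir: creates any missing parents as well.
    static void createContainerDirWithParents(Path containerDir) throws IOException {
        Files.createDirectories(containerDir);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for an NM local directory in which nothing was localized.
        Path localDir = Files.createTempDirectory("localDir-nm");
        Path containerDir = localDir
            .resolve("usercache").resolve("tom").resolve("appcache")
            .resolve("application_0001")   // hypothetical application directory
            .resolve("container_0001");    // hypothetical container directory

        // The application directory is missing, so this prints false.
        System.out.println(createContainerDirOnly(containerDir));

        // Creating parents first makes the container directory available.
        createContainerDirWithParents(containerDir);
        System.out.println(Files.isDirectory(containerDir));
    }
}
```

Whether the right fix is to create the parent directories at launch time or to ensure the application directory always exists after localization is a separate question that this sketch does not settle.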
                
> Container launch may fail if no files were localized
> ----------------------------------------------------
>
>                 Key: YARN-253
>                 URL: https://issues.apache.org/jira/browse/YARN-253
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.2-alpha
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: YARN-253-test.patch
>
>
> This can be demonstrated with DistributedShell. The containers running the 
> shell do not have any files to localize (if there is no shell script to 
> copy), so if they run on a different NM from the AM (which does localize 
> files), they will fail because the appcache directory does not exist.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
