zhengchenyu created YARN-6091:
---------------------------------

             Summary: the AppMaster register failed when use Docker on 
LinuxContainer 
                 Key: YARN-6091
                 URL: https://issues.apache.org/jira/browse/YARN-6091
             Project: Hadoop YARN
          Issue Type: Bug
          Components: nodemanager, yarn
    Affects Versions: 2.8.0
         Environment: CentOS
            Reporter: zhengchenyu
            Priority: Critical
             Fix For: 2.8.0


On some servers, when I use Docker on LinuxContainer, the AppMaster fails to 
register with the ResourceManager. This does not happen on other servers. 
I found that pclose (in container-executor.c) returns different values on 
different servers, even though the process launched by popen is running 
normally. Some servers return 0, and others return 13. 
YARN treats the application as failed when pclose returns a nonzero value and 
removes its AMRMToken. The AppMaster's registration then fails because the 
ResourceManager has already removed the application's token. 
In container-executor.c, the check tests whether the return code is zero. But 
according to the pclose man page, it is a return value of -1 that indicates an 
error. So I changed the check accordingly, and this solved the problem. 



