zhengchenyu created YARN-6091:
---------------------------------

             Summary: the AppMaster register failed when use Docker on LinuxContainer
                 Key: YARN-6091
                 URL: https://issues.apache.org/jira/browse/YARN-6091
             Project: Hadoop YARN
          Issue Type: Bug
          Components: nodemanager, yarn
    Affects Versions: 2.8.0
         Environment: CentOS
            Reporter: zhengchenyu
            Priority: Critical
             Fix For: 2.8.0

On some servers, when I use Docker on LinuxContainer, the AppMaster fails to register with the ResourceManager. This does not happen on other servers.

I found that pclose (in container-executor.c) returns different values on different servers, even though the process launched by popen is running normally. Some servers return 0, and others return 13.

YARN treats the application as failed when pclose returns a nonzero value and removes the AMRMToken. The AppMaster registration then fails because the ResourceManager has already removed this application's token.

In container-executor.c, the check is whether the return code is zero. But according to the pclose man page, it is a return value of -1 that indicates an error. So I changed the check to test for -1, and that solved the problem.
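
To make the difference between the possible pclose results concrete, here is a minimal sketch of how a status can be decoded with the standard <sys/wait.h> macros. It is not the code in container-executor.c, and the helper name describe_pclose_status is hypothetical. On success pclose returns a wait status as described for waitpid, so a raw value of 13 is not exit code 13; on Linux it decodes as termination by signal 13 (SIGPIPE). Whether that is what actually happened on the affected servers is not confirmed by this report.

```c
#include <stdio.h>
#include <sys/wait.h>

/*
 * Hypothetical helper (not from container-executor.c): turn the value
 * returned by pclose() into a human-readable description.
 *
 *   -1          pclose itself failed (see errno)
 *   WIFEXITED   the command exited; WEXITSTATUS gives its exit code
 *   WIFSIGNALED the command was killed; WTERMSIG gives the signal number
 */
static const char *describe_pclose_status(int status, char *buf, size_t len) {
  if (status == -1) {
    snprintf(buf, len, "pclose failed");
  } else if (WIFEXITED(status)) {
    snprintf(buf, len, "exited with code %d", WEXITSTATUS(status));
  } else if (WIFSIGNALED(status)) {
    snprintf(buf, len, "killed by signal %d", WTERMSIG(status));
  } else {
    snprintf(buf, len, "unrecognized status %d", status);
  }
  return buf;
}
```

A caller could log describe_pclose_status(pclose(fp), buf, sizeof buf) before deciding whether to treat the launch as failed. Which of these cases should count as a failure is a policy choice; the sketch only separates them.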