[ https://issues.apache.org/jira/browse/YARN-3775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rohith resolved YARN-3775.
--------------------------
    Resolution: Not A Problem

Closing as Not A Problem. Please reopen if you disagree.

> Job does not exit after all node become unhealthy
> -------------------------------------------------
>
>                 Key: YARN-3775
>                 URL: https://issues.apache.org/jira/browse/YARN-3775
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.7.1
>         Environment: Version: 2.7.0
> OS: RHEL7
> NameNodes: xiachsh11 xiachsh12 (HA enabled)
> DataNodes: 5 xiachsh13-17
> ResourceManager: xiachsh11
> NodeManagers: 5 xiachsh13-17
> All nodes are OpenStack provisioned:
> MEM: 1.5G
> Disk: 16G
>            Reporter: Chengshun Xia
>         Attachments: logs.tar.gz
>
>
> Running Terasort with a data size of 10G, all the containers exit once the
> disk space threshold of 0.90 is reached. At this point, the job does not
> exit with an error:
> 15/06/05 13:13:28 INFO mapreduce.Job: map 9% reduce 0%
> 15/06/05 13:13:52 INFO mapreduce.Job: map 10% reduce 0%
> 15/06/05 13:14:30 INFO mapreduce.Job: map 11% reduce 0%
> 15/06/05 13:15:11 INFO mapreduce.Job: map 12% reduce 0%
> 15/06/05 13:15:43 INFO mapreduce.Job: map 13% reduce 0%
> 15/06/05 13:16:38 INFO mapreduce.Job: map 14% reduce 0%
> 15/06/05 13:16:41 INFO mapreduce.Job: map 15% reduce 0%
> 15/06/05 13:16:53 INFO mapreduce.Job: map 16% reduce 0%
> 15/06/05 13:17:24 INFO mapreduce.Job: map 17% reduce 0%
> 15/06/05 13:17:53 INFO mapreduce.Job: map 18% reduce 0%
> 15/06/05 13:18:36 INFO mapreduce.Job: map 19% reduce 0%
> 15/06/05 13:19:03 INFO mapreduce.Job: map 20% reduce 0%
> 15/06/05 13:19:09 INFO mapreduce.Job: map 15% reduce 0%
> 15/06/05 13:19:32 INFO mapreduce.Job: map 16% reduce 0%
> 15/06/05 13:20:00 INFO mapreduce.Job: map 17% reduce 0%
> 15/06/05 13:20:36 INFO mapreduce.Job: map 18% reduce 0%
> 15/06/05 13:20:57 INFO mapreduce.Job: map 19% reduce 0%
> 15/06/05 13:21:22 INFO mapreduce.Job: map 18% reduce 0%
> 15/06/05 13:21:24 INFO mapreduce.Job: map 14% reduce 0%
> 15/06/05 13:21:25 INFO mapreduce.Job: map 9% reduce 0%
> 15/06/05 13:21:28 INFO mapreduce.Job: map 10% reduce 0%
> 15/06/05 13:22:22 INFO mapreduce.Job: map 11% reduce 0%
> 15/06/05 13:23:06 INFO mapreduce.Job: map 12% reduce 0%
> 15/06/05 13:23:41 INFO mapreduce.Job: map 9% reduce 0%
> 15/06/05 13:23:42 INFO mapreduce.Job: map 5% reduce 0%
> 15/06/05 13:24:38 INFO mapreduce.Job: map 6% reduce 0%
> 15/06/05 13:25:16 INFO mapreduce.Job: map 7% reduce 0%
> 15/06/05 13:25:53 INFO mapreduce.Job: map 8% reduce 0%
> 15/06/05 13:26:35 INFO mapreduce.Job: map 9% reduce 0%
>
> The last response time is 15/06/05 13:26:35, and the current time is:
> [root@xiachsh11 logs]# date
> Fri Jun 5 19:19:59 EDT 2015
> [root@xiachsh11 logs]#
> [root@xiachsh11 logs]# yarn node -list
> 15/06/05 19:20:18 INFO client.RMProxy: Connecting to ResourceManager at xiachsh11.eng.platformlab.ibm.com/9.21.62.234:8032
> Total Nodes:0
> Node-Id Node-State Node-Http-Address Number-of-Running-Containers
> [root@xiachsh11 logs]#

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)