[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
[ https://issues.apache.org/jira/browse/YARN-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kuhu Shukla updated YARN-4311:
------------------------------
    Attachment: YARN-4311-v18.patch

Correcting minute checkstyle nit in RMNode.java (removing public modifier). {{TestContainerResourceUsage#testUsageAfterAMRestartWithMultipleContainers}} failure is tracked through YARN-5024, although {{testUsageAfterAMRestartKeepContainers}} also seems to be related. They fail irrespective of the patch. {{TestRMWebServicesNodes}} fails per YARN-4947 and is unrelated to this patch. Other failures are long-standing known test failures. Findbugs warning is from EventDispatcher.java, which is unrelated to this change AFAIK.

{code}
Bug type DM_EXIT (click for details)
In class org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor
In method org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run()
At EventDispatcher.java:[line 80]
{code}

> Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
> -------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4311
>                 URL: https://issues.apache.org/jira/browse/YARN-4311
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.6.1
>            Reporter: Kuhu Shukla
>            Assignee: Kuhu Shukla
>         Attachments: YARN-4311-branch-2.7.001.patch, YARN-4311-branch-2.7.002.patch, YARN-4311-branch-2.7.003.patch, YARN-4311-branch-2.7.004.patch, YARN-4311-v1.patch, YARN-4311-v10.patch, YARN-4311-v11.patch, YARN-4311-v11.patch, YARN-4311-v12.patch, YARN-4311-v13.patch, YARN-4311-v13.patch, YARN-4311-v14.patch, YARN-4311-v15.patch, YARN-4311-v16.patch, YARN-4311-v17.patch, YARN-4311-v18.patch, YARN-4311-v2.patch, YARN-4311-v3.patch, YARN-4311-v4.patch, YARN-4311-v5.patch, YARN-4311-v6.patch, YARN-4311-v7.patch, YARN-4311-v8.patch, YARN-4311-v9.patch
>
> In order to fully forget about a node, removing the node from include and exclude list is not sufficient. The RM lists it under Decomm-ed nodes. The tricky part that [~jlowe] pointed out was the case when include lists are not used, in that case we don't want the nodes to fall off if they are not active.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
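Editorial sketch of the condition described in the issue summary. This is one possible reading, not the code from any attached patch: it assumes an empty include list means "include lists are not used", and the class and method names (UntrackedNodeCheck, isUntracked) are hypothetical.

```java
import java.util.Set;

// Hypothetical helper; names are illustrative and not taken from the patch.
final class UntrackedNodeCheck {
    private UntrackedNodeCheck() {}

    /**
     * Returns true when the host should be considered "untracked":
     * an include list is in use and the host appears in neither the
     * include list nor the exclude list.
     *
     * Assumption: an empty include list means include lists are not
     * used, in which case no host is treated as untracked, so inactive
     * nodes do not fall off the RM's lists.
     */
    static boolean isUntracked(String host, Set<String> includes, Set<String> excludes) {
        if (includes.isEmpty()) {
            return false; // include lists not used: keep tracking the node
        }
        return !includes.contains(host) && !excludes.contains(host);
    }
}
```

Under this reading, a node only becomes a candidate for being forgotten once it has been dropped from both files while an include list is actually configured.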
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
Kuhu Shukla updated YARN-4311:
------------------------------
    Attachment: YARN-4311-v17.patch

Thanks Jason! Rebased patch and added changes per review comments.
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
Kuhu Shukla updated YARN-4311:
------------------------------
    Attachment: YARN-4311-v16.patch
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
Kuhu Shukla updated YARN-4311:
------------------------------
    Attachment:     (was: YARN-4311-v16.patch)
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
Kuhu Shukla updated YARN-4311:
------------------------------
    Attachment: YARN-4311-v16.patch

Fixing checkstyle and findbugs issues. The failing tests pass locally and are unrelated.
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
Kuhu Shukla updated YARN-4311:
------------------------------
    Attachment: YARN-4311-v15.patch

Updated the patch to address rebooted and lost NMs by extending the Node Removal Timer logic. Also added tests for both cases, plus an additional test to check that Unhealthy nodes also follow the removal protocol.
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
Jason Lowe updated YARN-4311:
-----------------------------
    Target Version/s: 2.8.0, 2.7.4
       Fix Version/s:     (was: 2.8.0)

I reverted this from trunk, branch-2, and branch-2.8 until we can work through the lost, rebooted nodes case and make sure the node metrics won't be adversely affected by this change.
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
Kuhu Shukla updated YARN-4311:
------------------------------
    Attachment: YARN-4311-branch-2.7.004.patch

TestResourceTrackerService#testNodeRemoval passes locally but is flaky: the wait period was set to the minimum wait time, which removal may exceed. I have added {{waitForNodeRemoval}}, which waits for the node to be removed before asserting. This fix should also be added to the test in trunk, although the failure was not seen in the precommits there. I will do so after some review comments on this.
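Editorial sketch of the general technique behind a helper like {{waitForNodeRemoval}}: poll a condition until it holds or a deadline passes, instead of sleeping for a fixed period and then asserting. The patch's actual helper is not shown in this thread; the class name, method name, and parameters below are assumptions for illustration only.

```java
import java.util.function.BooleanSupplier;

// Hypothetical polling helper; not the implementation from the patch.
final class PollingWait {
    private PollingWait() {}

    /**
     * Polls the condition every pollMs milliseconds until it returns true
     * or timeoutMs milliseconds have elapsed.
     * Returns true if the condition was met, false on timeout or interrupt.
     */
    static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out before the condition held
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return true;
    }
}
```

A test would then assert on the return value (for example, that the node count dropped) rather than assuming removal finished within a fixed sleep.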
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
Kuhu Shukla updated YARN-4311:
------------------------------
    Attachment: YARN-4311-branch-2.7.003.patch

More fixes and nits.
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
Kuhu Shukla updated YARN-4311:
------------------------------
    Attachment: YARN-4311-branch-2.7.002.patch

Addressed the failure of TestNMExpiry along with most of the checkstyle issues, except the suggestions about missing Javadoc comments. The ASF warnings, whitespace issues and findbugs warnings are unrelated.
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
Kuhu Shukla updated YARN-4311:
------------------------------
    Attachment: YARN-4311-branch-2.7.001.patch

Attaching branch-2.7 version of the patch.
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
Kuhu Shukla updated YARN-4311:
------------------------------
    Attachment: YARN-4311-v14.patch

Same patch, trying to get PreCommit to pick it up.
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
Kuhu Shukla updated YARN-4311:
------------------------------
    Attachment: YARN-4311-v13.patch
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
Kuhu Shukla updated YARN-4311:
------------------------------
    Attachment: YARN-4311-v13.patch

Thank you [~jlowe]! Updated patch with the one minor change in yarn-default.xml.
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
Kuhu Shukla updated YARN-4311:
------------------------------
    Attachment: YARN-4311-v12.patch

Thanks a lot for all the reviews! Updating patch as per Jason's comments. Additionally added a log line at info level when the untracked node is removed upon timeout.
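Editorial sketch of a timeout-based sweep that logs each removal at info level. This is not the patch's code: the map of untracked-since timestamps, the class name UntrackedNodeSweeper, and the use of java.util.logging (as a stand-in for Hadoop's own logging) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.logging.Logger;

// Hypothetical sweeper; data structure and names are illustrative only.
final class UntrackedNodeSweeper {
    private static final Logger LOG =
        Logger.getLogger(UntrackedNodeSweeper.class.getName());

    private UntrackedNodeSweeper() {}

    /**
     * Removes every host that has been untracked for at least timeoutMs
     * as of nowMs, logging each removal at INFO level.
     * Returns the hosts that were removed.
     */
    static List<String> sweep(Map<String, Long> untrackedSinceMs, long nowMs, long timeoutMs) {
        List<String> removed = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = untrackedSinceMs.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            String host = entry.getKey();   // read before removing the entry
            long since = entry.getValue();
            if (nowMs - since >= timeoutMs) {
                it.remove();
                removed.add(host);
                LOG.info("Removed untracked node " + host
                    + " after timeout of " + timeoutMs + " ms");
            }
        }
        return removed;
    }
}
```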
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
Kuhu Shukla updated YARN-4311:
------------------------------
    Attachment: YARN-4311-v11.patch

Re-attaching the v11 patch with no changes to trigger another pre-commit run, since the TestResourceTrackerService failures are not reproducible locally and, from investigation, seem related to the sleep-based wait. Need to see if this failure is consistent. Also checked that the patch applies cleanly to trunk.
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
Kuhu Shukla updated YARN-4311:
------------------------------
    Attachment: YARN-4311-v11.patch

Rebasing patch after YARN-3223. Requesting [~jlowe], [~templedf] for review/comments. Thanks a lot!
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
Kuhu Shukla updated YARN-4311:
------------------------------
    Attachment: YARN-4311-v10.patch

Rebasing patch after YARN-3102. Tests from ResourceManager pass locally.
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
Kuhu Shukla updated YARN-4311:
------------------------------
    Attachment: YARN-4311-v9.patch

Fixed the one checkstyle issue I introduced. The javac and compile failures are from the hadoop-common package, where I have not changed any code. Not sure if this is related to HADOOP-8887.
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
Kuhu Shukla updated YARN-4311:
------------------------------
    Attachment: YARN-4311-v8.patch

Thank you [~jlowe] for the review comments. I have updated the patch. The interval is 1/2 the value of the timeout config field (or should it be 1/3 like the NM expiry interval? That could be excessive in my opinion). The default timeout is 1 minute. The interval is capped at 10 minutes.
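Editorial sketch of the interval arithmetic described in the comment above: half the configured timeout, capped at 10 minutes. The class, method, and constant names are hypothetical and the real configuration key is not given in this thread.

```java
// Hypothetical names; only the arithmetic comes from the comment above.
final class RemovalInterval {
    /** Upper bound on the check interval: 10 minutes, in milliseconds. */
    static final long MAX_INTERVAL_MS = 10L * 60L * 1000L;

    private RemovalInterval() {}

    /** Check interval is half the removal timeout, never more than the cap. */
    static long checkIntervalMs(long timeoutMs) {
        return Math.min(timeoutMs / 2, MAX_INTERVAL_MS);
    }
}
```

With the default 1-minute timeout (60000 ms) this gives a 30-second interval; a 1-hour timeout would give 30 minutes uncapped and is therefore limited to 10 minutes.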
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
Kuhu Shukla updated YARN-4311:
------------------------------
    Attachment: YARN-4311-v7.patch

While updating the test for YARN-3102, I modified the {{writeToHostsFile}} to have the file object passed as an argument. I am updating this patch with the same change to be consistent. No changes to the non-test code.
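Editorial sketch of what a test helper with the file passed as an argument could look like. The real {{writeToHostsFile}} signature and body are not shown in this thread, so the parameter list and the HostsFileHelper class below are assumptions; only standard java.io and java.nio.file calls are used.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Arrays;

// Hypothetical test helper; the signature is an assumption.
final class HostsFileHelper {
    private HostsFileHelper() {}

    /**
     * Writes one host name per line to the given file, replacing any
     * previous contents. Taking the File as a parameter lets the same
     * helper write either the include file or the exclude file.
     */
    static void writeToHostsFile(File file, String... hosts) {
        try {
            Files.write(file.toPath(), Arrays.asList(hosts), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```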
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
[ https://issues.apache.org/jira/browse/YARN-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated YARN-4311: -- Attachment: YARN-4311-v6.patch Thank you Daniel for the comments. Attaching revised patch with corrected indents etc. > Removing nodes from include and exclude lists will not remove them from > decommissioned nodes list > - > > Key: YARN-4311 > URL: https://issues.apache.org/jira/browse/YARN-4311 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.1 >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: YARN-4311-v1.patch, YARN-4311-v2.patch, > YARN-4311-v3.patch, YARN-4311-v4.patch, YARN-4311-v5.patch, YARN-4311-v6.patch > > > In order to fully forget about a node, removing the node from include and > exclude list is not sufficient. The RM lists it under Decomm-ed nodes. The > tricky part that [~jlowe] pointed out was the case when include lists are not > used, in that case we don't want the nodes to fall off if they are not active. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
[ https://issues.apache.org/jira/browse/YARN-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated YARN-4311: -- Attachment: YARN-4311-v5.patch Added a check before setting the timestamp of an untracked node so that it is set the first time the node becomes untracked and never updated again. > Removing nodes from include and exclude lists will not remove them from > decommissioned nodes list > - > > Key: YARN-4311 > URL: https://issues.apache.org/jira/browse/YARN-4311 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.1 >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: YARN-4311-v1.patch, YARN-4311-v2.patch, > YARN-4311-v3.patch, YARN-4311-v4.patch, YARN-4311-v5.patch > > > In order to fully forget about a node, removing the node from include and > exclude list is not sufficient. The RM lists it under Decomm-ed nodes. The > tricky part that [~jlowe] pointed out was the case when include lists are not > used, in that case we don't want the nodes to fall off if they are not active. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
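The "set once, never again" guard mentioned in the v5 comment might look something like the sketch below. The class and field names are hypothetical and the use of 0 as a "not yet set" sentinel is an assumption; the patch itself may represent this differently.

```java
// Illustrative sketch only; names and the 0 sentinel are assumptions.
final class UntrackedStamp {
    // 0 means the node has not yet been marked as untracked (assumed sentinel).
    private long untrackedTimeMs = 0L;

    // Records the time only on the first transition to untracked;
    // later calls leave the original timestamp untouched.
    void markUntracked(long nowMs) {
        if (untrackedTimeMs == 0L) {
            untrackedTimeMs = nowMs;
        }
    }

    long getUntrackedTimeMs() {
        return untrackedTimeMs;
    }
}
```

The point of the guard is that repeated refreshes cannot keep pushing the timestamp forward, which would otherwise prevent the removal timeout from ever expiring.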
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
[ https://issues.apache.org/jira/browse/YARN-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated YARN-4311: -- Attachment: YARN-4311-v4.patch Thanks [~templedf]. Attaching an updated patch. I removed the null check on entry set after some thought on how the entries are updated for an iterator. Made the suggested changes as well. > Removing nodes from include and exclude lists will not remove them from > decommissioned nodes list > - > > Key: YARN-4311 > URL: https://issues.apache.org/jira/browse/YARN-4311 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.1 >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: YARN-4311-v1.patch, YARN-4311-v2.patch, > YARN-4311-v3.patch, YARN-4311-v4.patch > > > In order to fully forget about a node, removing the node from include and > exclude list is not sufficient. The RM lists it under Decomm-ed nodes. The > tricky part that [~jlowe] pointed out was the case when include lists are not > used, in that case we don't want the nodes to fall off if they are not active. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
[ https://issues.apache.org/jira/browse/YARN-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated YARN-4311: --- Attachment: (was: YARN-4421.001.patch) > Removing nodes from include and exclude lists will not remove them from > decommissioned nodes list > - > > Key: YARN-4311 > URL: https://issues.apache.org/jira/browse/YARN-4311 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.1 >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: YARN-4311-v1.patch, YARN-4311-v2.patch, > YARN-4311-v3.patch > > > In order to fully forget about a node, removing the node from include and > exclude list is not sufficient. The RM lists it under Decomm-ed nodes. The > tricky part that [~jlowe] pointed out was the case when include lists are not > used, in that case we don't want the nodes to fall off if they are not active. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
[ https://issues.apache.org/jira/browse/YARN-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated YARN-4311: --- Attachment: YARN-4421.001.patch > Removing nodes from include and exclude lists will not remove them from > decommissioned nodes list > - > > Key: YARN-4311 > URL: https://issues.apache.org/jira/browse/YARN-4311 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.1 >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: YARN-4311-v1.patch, YARN-4311-v2.patch, > YARN-4311-v3.patch > > > In order to fully forget about a node, removing the node from include and > exclude list is not sufficient. The RM lists it under Decomm-ed nodes. The > tricky part that [~jlowe] pointed out was the case when include lists are not > used, in that case we don't want the nodes to fall off if they are not active. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
[ https://issues.apache.org/jira/browse/YARN-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated YARN-4311: -- Attachment: YARN-4311-v3.patch Fixed one test failure in TestYarnConfigurationFields by adding the new configs to yarn-default. TestAMAuthorization and TestClientRMTokens are unrelated and fail as per YARN-4318 and YARN-4306. Corrected check-style issues. > Removing nodes from include and exclude lists will not remove them from > decommissioned nodes list > - > > Key: YARN-4311 > URL: https://issues.apache.org/jira/browse/YARN-4311 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.1 >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: YARN-4311-v1.patch, YARN-4311-v2.patch, > YARN-4311-v3.patch > > > In order to fully forget about a node, removing the node from include and > exclude list is not sufficient. The RM lists it under Decomm-ed nodes. The > tricky part that [~jlowe] pointed out was the case when include lists are not > used, in that case we don't want the nodes to fall off if they are not active. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
[ https://issues.apache.org/jira/browse/YARN-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated YARN-4311: -- Attachment: YARN-4311-v2.patch This patch addresses graceful and other versions of refreshNodes. It also adds a timestamp-based check, run every {{RM_NODE_REMOVAL_CHK_INTERVAL_MSEC}}, for nodes in the inactive list that should be untracked, and removes such nodes once {{RM_NODE_REMOVAL_TIMEOUT_MSEC}} has elapsed. A decommissioned node is not transitioned to shutdown, but the timer acts on it just as it would on a shutdown node. A decommissioning node will transition to shutdown if it is found to be 'untracked'. The unit test tries out several scenarios to check that the metrics and node lists are correct. I can break it into more tests if the idea behind it looks acceptable. > Removing nodes from include and exclude lists will not remove them from > decommissioned nodes list > - > > Key: YARN-4311 > URL: https://issues.apache.org/jira/browse/YARN-4311 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.1 >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: YARN-4311-v1.patch, YARN-4311-v2.patch > > > In order to fully forget about a node, removing the node from include and > exclude list is not sufficient. The RM lists it under Decomm-ed nodes. The > tricky part that [~jlowe] pointed out was the case when include lists are not > used, in that case we don't want the nodes to fall off if they are not active. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
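As a rough illustration of the timer-driven removal described in the v2 comment, the sketch below sweeps a map of inactive, untracked nodes and drops entries whose timestamp is older than the removal timeout. Everything here is hypothetical: the class, the map standing in for the RM's inactive node list, and the method names are assumptions made for the example, not the patch's actual data structures.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only; not the RM's real inactive-node bookkeeping.
final class InactiveNodeSweeper {
    // Hypothetical stand-in for the inactive list:
    // host name -> time (ms) at which the node was first seen as untracked.
    private final Map<String, Long> untrackedSince = new ConcurrentHashMap<>();
    private final long removalTimeoutMs;

    InactiveNodeSweeper(long removalTimeoutMs) {
        this.removalTimeoutMs = removalTimeoutMs;
    }

    // Remembers the first time a host was seen as untracked.
    void markUntracked(String host, long nowMs) {
        untrackedSince.putIfAbsent(host, nowMs);
    }

    boolean contains(String host) {
        return untrackedSince.containsKey(host);
    }

    // Intended to be called once per check interval; removes hosts that have
    // been untracked for longer than the timeout and returns how many were removed.
    int sweep(long nowMs) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = untrackedSince.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMs - entry.getValue() > removalTimeoutMs) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }
}
```

In a real ResourceManager the sweep would presumably be driven by a scheduled timer and would also have to update the cluster metrics; both are left out here to keep the sketch short.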
[jira] [Updated] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list
[ https://issues.apache.org/jira/browse/YARN-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuhu Shukla updated YARN-4311: -- Attachment: YARN-4311-v1.patch This is a preliminary proposal patch. With this change, every node is checked by the isInvalidAndAbsent() method to decide whether it should be removed from all lists. The outcome, however, depends on the initial state of the node. If the node is running (that is, it is part of the include list) and is taken out of the list, followed by -refreshNodes (and it is of course not in the exclude list), the node is shut down and the shutdown node count is incremented. The node is not considered a decommed node. If the node is in both the exclude and include lists and -refreshNodes has been done (meaning it is a decommissioned node), then removing it from both lists takes it out completely, so it does not show up in the shutdown, decommed, unhealthy, or active lists. There is one case where the shutdown counters are misleading and which this patch hasn't addressed. If the node was running and was taken out of the include list, and it comes back up after being added to the include list again, the shutdown counters still stick to the same value. This needs to be changed, since the current state transitions don't account for it. I need some input from the community on the semantics of such a fix. It may be a good idea to have them counted (but not listed) as shutdown nodes in the first case, since any mistake in configuring the include list would otherwise lose all information about the nodes (counters and node lists), which may be undesirable. I appreciate any comments/suggestions/caveats. I have not fixed the associated test failures in this patch. 
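The comment above does not show the body of isInvalidAndAbsent(), so the following is only a guess at the kind of predicate involved, written under two stated assumptions: a node counts as untracked when it appears in neither the include nor the exclude list, and an empty include list means include lists are not in use, in which case nodes are never treated as untracked (the case [~jlowe] raised in the issue description). The class and method names are hypothetical.

```java
import java.util.Set;

// Illustrative sketch only; the real isInvalidAndAbsent() logic is not shown in this thread.
final class UntrackedCheck {
    // Returns true when the host should be forgotten entirely.
    // Assumption: an empty include list means "include lists are not used",
    // so no host is ever considered untracked in that case.
    static boolean isUntracked(String host, Set<String> includes, Set<String> excludes) {
        if (includes.isEmpty()) {
            return false;
        }
        return !includes.contains(host) && !excludes.contains(host);
    }
}
```

Under these assumptions, a host that is still in the exclude list stays tracked (as decommissioned), and only a host dropped from both lists becomes a candidate for removal.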
> Removing nodes from include and exclude lists will not remove them from > decommissioned nodes list > - > > Key: YARN-4311 > URL: https://issues.apache.org/jira/browse/YARN-4311 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.1 >Reporter: Kuhu Shukla >Assignee: Kuhu Shukla > Attachments: YARN-4311-v1.patch > > > In order to fully forget about a node, removing the node from include and > exclude list is not sufficient. The RM lists it under Decomm-ed nodes. The > tricky part that [~jlowe] pointed out was the case when include lists are not > used, in that case we don't want the nodes to fall off if they are not active. -- This message was sent by Atlassian JIRA (v6.3.4#6332)