[jira] [Commented] (YARN-11360) Add number of decommissioning nodes to YARN cluster metrics.

2022-10-24 Thread Mikayla Konst (Jira)


[ https://issues.apache.org/jira/browse/YARN-11360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17623457#comment-17623457 ]

Mikayla Konst commented on YARN-11360:
--

It looks like the number of node managers in state SHUTDOWN is also missing 
from YarnClusterMetrics despite being present in ClusterMetrics - could we add 
this as well?
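
For reference, a minimal sketch of reading these counts through the client 
API. getNumDecommissionedNodeManagers() is the existing accessor; the two 
commented-out calls are the proposed additions, with names assumed from the 
existing getNum*NodeManagers pattern:

{noformat}
// Hedged sketch: reading NodeManager counts through the YARN client API.
// The commented-out accessors do not exist yet; their names are assumptions.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.YarnClusterMetrics;
import org.apache.hadoop.yarn.client.api.YarnClient;

public class ClusterMetricsProbe {
  public static void main(String[] args) throws Exception {
    YarnClient client = YarnClient.createYarnClient();
    client.init(new Configuration());
    client.start();
    try {
      YarnClusterMetrics metrics = client.getYarnClusterMetrics();
      System.out.println("total NMs:          " + metrics.getNumNodeManagers());
      System.out.println("decommissioned NMs: " + metrics.getNumDecommissionedNodeManagers());
      // Proposed additions:
      // metrics.getNumDecommissioningNodeManagers();  // this issue
      // metrics.getNumShutdownNodeManagers();         // this comment
    } finally {
      client.stop();
    }
  }
}
{noformat}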

> Add number of decommissioning nodes to YARN cluster metrics.
> 
>
> Key: YARN-11360
> URL: https://issues.apache.org/jira/browse/YARN-11360
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: client, resourcemanager
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
>Priority: Major
>  Labels: pull-request-available
>
> YARN cluster metrics expose counts of NodeManagers in various states 
> including active and decommissioned. However, these metrics don't expose 
> NodeManagers that are currently in the process of decommissioning. This can 
> look a little spooky to a consumer of these metrics. First, the node drops 
> out of the active count, so it seems like the node just vanished. Then, later 
> (possibly hours later in the case of graceful decommission), it comes back 
> into existence in the decommissioned count.
> This issue tracks adding the decommissioning count to the cluster metrics 
> returned by the ResourceManager RPC. That also enables exposing it in the 
> {{yarn top}} output. The metric is already visible through the REST API, so 
> no change is required there.






[jira] [Commented] (YARN-9011) Race condition during decommissioning

2020-02-03 Thread Mikayla Konst (Jira)


[ https://issues.apache.org/jira/browse/YARN-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17029396#comment-17029396 ]

Mikayla Konst commented on YARN-9011:
-

We experienced this exact race condition recently (the resource manager sent a 
SHUTDOWN signal to the node manager because it received a heartbeat from the 
node manager *after* the HostDetails reference was updated, but *before* the 
node was transitioned to state DECOMMISSIONING).

I think this patch is a huge improvement over the previous behavior, but I 
think there is still a narrow race that can happen when refresh nodes is called 
multiple times in quick succession with the same set of nodes in the exclude 
file:
 # the lazy-loaded HostDetails reference is updated
 # the nodes are added to the gracefullyDecommissionableNodes set
 # the current HostDetails reference is updated
 # an event to update the node's status to DECOMMISSIONING is added to the asynchronous event handler's event queue, but hasn't been processed yet
 # refresh nodes is called a second time
 # the lazy-loaded HostDetails reference is updated
 # the gracefullyDecommissionableNodes set is cleared
 # the node manager heartbeats to the resource manager. It is not in state DECOMMISSIONING and not in the gracefullyDecommissionableNodes set, but it is an excluded node in the HostDetails, so it is sent a SHUTDOWN signal
 # the node is added to the gracefullyDecommissionableNodes set
 # the event handler transitions the node to state DECOMMISSIONING at some point

This would be fixed if you used an AtomicReference for your set of 
"gracefullyDecommissionableNodes" and swapped out the reference, similar to how 
you handled the HostDetails.
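
A minimal sketch of that suggestion (illustrative names only, not the actual 
NodesListManager code):

{noformat}
// Illustrative sketch, not the real NodesListManager: publish each refresh's
// decommissionable set by swapping an AtomicReference, so the heartbeat path
// never observes the intermediate cleared set.
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

class GracefulDecommissionTracker {
  private final AtomicReference<Set<String>> gracefullyDecommissionableNodes =
      new AtomicReference<Set<String>>(Collections.<String>emptySet());

  // refreshNodes() path: build the complete new set first, then swap it in.
  void refresh(Set<String> excludedHosts) {
    gracefullyDecommissionableNodes.set(
        Collections.unmodifiableSet(new HashSet<>(excludedHosts)));
  }

  // Heartbeat path: always sees either the old set or the new set in full,
  // never a partially cleared one.
  boolean isGracefullyDecommissionable(String host) {
    return gracefullyDecommissionableNodes.get().contains(host);
  }
}
{noformat}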

Alternatively, instead of using an asynchronous event handler to update the 
state of the nodes to DECOMMISSIONING, you could update the state 
synchronously: grab a lock, update HostDetails, synchronously update the states 
of the nodes being gracefully decommissioned, then release the lock. When the 
resource tracker service receives a heartbeat and needs to check whether a node 
should be shut down (it is shut down if it is excluded but not yet in state 
DECOMMISSIONING), it would grab the same lock right before doing the check. 
Having the resource tracker service wait on a lock doesn't sound great, but the 
wait would likely be on the order of milliseconds, and only when refresh nodes 
is called.
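
Roughly like this (again just a sketch, with placeholder methods standing in 
for the real ResourceManager logic):

{noformat}
// Sketch of the synchronous alternative: one lock covers both the refresh
// (HostDetails swap + state transitions) and the heartbeat-side check, so a
// heartbeat can never land in the window between the two.
import java.util.concurrent.locks.ReentrantLock;

class SynchronousRefresh {
  private final ReentrantLock refreshLock = new ReentrantLock();

  void refreshNodes() {
    refreshLock.lock();
    try {
      updateHostDetails();                 // swap in the new include/exclude lists
      transitionNodesToDecommissioning();  // synchronous, not via the event queue
    } finally {
      refreshLock.unlock();
    }
  }

  // Called from the resource tracker service on each heartbeat.
  boolean shouldShutDown(String host) {
    refreshLock.lock();
    try {
      return isExcluded(host) && !isDecommissioning(host);
    } finally {
      refreshLock.unlock();
    }
  }

  // Placeholders for the real ResourceManager logic.
  private void updateHostDetails() {}
  private void transitionNodesToDecommissioning() {}
  private boolean isExcluded(String host) { return false; }
  private boolean isDecommissioning(String host) { return false; }
}
{noformat}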

> Race condition during decommissioning
> -
>
> Key: YARN-9011
> URL: https://issues.apache.org/jira/browse/YARN-9011
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.1.1
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Fix For: 3.3.0, 3.2.2, 3.1.4
>
> Attachments: YARN-9011-001.patch, YARN-9011-002.patch, 
> YARN-9011-003.patch, YARN-9011-004.patch, YARN-9011-005.patch, 
> YARN-9011-006.patch, YARN-9011-007.patch, YARN-9011-008.patch, 
> YARN-9011-009.patch, YARN-9011-branch-3.1.001.patch, 
> YARN-9011-branch-3.2.001.patch
>
>
> During internal testing, we found a nasty race condition which occurs during 
> decommissioning.
> Node manager, incorrect behaviour:
> {noformat}
> 2018-06-18 21:00:17,634 WARN 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Received 
> SHUTDOWN signal from Resourcemanager as part of heartbeat, hence shutting 
> down.
> 2018-06-18 21:00:17,634 WARN 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Message from 
> ResourceManager: Disallowed NodeManager nodeId: node-6.hostname.com:8041 
> hostname:node-6.hostname.com
> {noformat}
> Node manager, expected behaviour:
> {noformat}
> 2018-06-18 21:07:37,377 WARN 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Received 
> SHUTDOWN signal from Resourcemanager as part of heartbeat, hence shutting 
> down.
> 2018-06-18 21:07:37,377 WARN 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Message from 
> ResourceManager: DECOMMISSIONING  node-6.hostname.com:8041 is ready to be 
> decommissioned
> {noformat}
> Note the two different messages from the RM ("Disallowed NodeManager" vs 
> "DECOMMISSIONING"). The problem is that {{ResourceTrackerService}} can see an 
> inconsistent state of nodes while they're being updated:
> {noformat}
> 2018-06-18 21:00:17,575 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.NodesListManager: hostsReader 
> include:{172.26.12.198,node-7.hostname.com,node-2.hostname.com,node-5.hostname.com,172.26.8.205,node-8.hostname.com,172.26.23.76,172.26.22.223,node-6.hostname.com,172.26.9.218,node-4.hostname.com,node-3.hostname.com,172.26.13.167,node-9.hostname.com,172.26.21.221,172.26.10.219}
>  exclude:{node-6.hostname.com}
> 2018-06-18 21:00:17,575 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.NodesListManager: Gracefully 
> decommission node node-6.
> {noformat}

[jira] [Created] (YARN-9106) Add option to graceful decommission to not wait for applications

2018-12-10 Thread Mikayla Konst (JIRA)
Mikayla Konst created YARN-9106:
---

 Summary: Add option to graceful decommission to not wait for 
applications
 Key: YARN-9106
 URL: https://issues.apache.org/jira/browse/YARN-9106
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Reporter: Mikayla Konst


Add property 
yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications.

If true (the default), the resource manager waits for all containers, as well 
as all applications associated with those containers, to finish before 
gracefully decommissioning a node.

If false, the resource manager waits only for containers, not applications, to 
finish. For map-only jobs or other jobs whose mappers do not need to serve 
shuffle data, this allows nodes to be decommissioned as soon as their 
containers finish, as opposed to when the job is done.

Add property 
yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-app-masters.

If false, during graceful decommission, the resource manager will not wait for 
app master containers to finish, only for other containers. Defaults to true. 
This property should only be set to false if app master failure is recoverable.
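
For illustration, the proposed properties would be set in yarn-site.xml 
something like this (a sketch of the proposal only; neither property exists 
yet):

{noformat}
<!-- Sketch of the proposed configuration; neither property exists yet. -->
<property>
  <!-- Proposed: decommission a node once its containers finish, without
       waiting for the applications associated with those containers. -->
  <name>yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-applications</name>
  <value>false</value>
</property>
<property>
  <!-- Proposed: additionally skip waiting for app master containers; only
       safe when app master failure is recoverable. -->
  <name>yarn.resourcemanager.decommissioning-nodes-watcher.wait-for-app-masters</name>
  <value>false</value>
</property>
{noformat}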


