[ 
https://issues.apache.org/jira/browse/YARN-10475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17221774#comment-17221774
 ] 

Jim Brennan commented on YARN-10475:
------------------------------------

This adds the following {{yarn.resourcemanager.nodemanagers}} configuration 
properties:

{{heartbeat-interval-scaling-enable}}
 * Enables heartbeat interval scaling. Defaults to false.

{{heartbeat-interval-min-ms}}
 * If heartbeat interval scaling is enabled, this is the minimum heartbeat 
interval in milliseconds.

{{heartbeat-interval-max-ms}}
 * If heartbeat interval scaling is enabled, this is the maximum heartbeat 
interval in milliseconds.

{{heartbeat-interval-speedup-factor}}
 * This controls the degree of adjustment when speeding up heartbeat intervals. 
At 1.0, a node whose CPU utilization is 20% below the cluster average gets a 
20% shorter heartbeat interval.

{{heartbeat-interval-slowdown-factor}}
 * This controls the degree of adjustment when slowing down heartbeat intervals. 
At 1.0, a node whose CPU utilization is 20% above the cluster average gets a 
20% longer heartbeat interval.
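
To make the interaction of these properties concrete, here is a minimal sketch of how the next interval could be derived. It is not the code from the patch: the function name, the parameter names, and the base interval are hypothetical, and it assumes the adjustment is linear in the node's relative deviation from the cluster average, which is what the factor descriptions above suggest.

```python
def next_heartbeat_interval_ms(base_ms, node_cpu, cluster_cpu,
                               min_ms, max_ms,
                               speedup_factor=1.0, slowdown_factor=1.0,
                               scaling_enabled=True):
    """Illustrative only; names and formula are assumptions, not the patch.

    node_cpu and cluster_cpu are CPU utilization values in the same unit
    (for example percent). base_ms is the unscaled heartbeat interval.
    """
    # With scaling disabled, or no usable cluster average, keep the base.
    if not scaling_enabled or cluster_cpu <= 0:
        return base_ms

    # Relative deviation of this node from the cluster average:
    # -0.2 means 20% below average, +0.2 means 20% above.
    deviation = (node_cpu - cluster_cpu) / cluster_cpu

    # Under-utilized nodes speed up, over-utilized nodes slow down,
    # each scaled by its own factor.
    factor = speedup_factor if deviation < 0 else slowdown_factor
    scaled = base_ms * (1.0 + deviation * factor)

    # Clamp to the configured minimum and maximum interval.
    return int(round(max(min_ms, min(max_ms, scaled))))
```

Under these assumptions, with a 1000 ms base interval and both factors at 1.0, a node at 40% CPU in a cluster averaging 50% would heartbeat every 800 ms, and a node at 60% would heartbeat every 1200 ms, subject to the min and max bounds.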


> Scale RM-NM heartbeat interval based on node utilization
> --------------------------------------------------------
>
>                 Key: YARN-10475
>                 URL: https://issues.apache.org/jira/browse/YARN-10475
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: yarn
>    Affects Versions: 2.10.1, 3.4.1
>            Reporter: Jim Brennan
>            Assignee: Jim Brennan
>            Priority: Minor
>
> Add the ability to scale the RM-NM heartbeat interval based on node CPU 
> utilization compared to overall cluster CPU utilization.  If a node is 
> over-utilized compared to the rest of the cluster, its heartbeat interval 
> slows down.  If it is under-utilized compared to the rest of the cluster, 
> its heartbeat interval speeds up.
> This is a feature we have been running with internally in production for 
> several years.  It was developed by [~nroberts], based on the observation 
> that larger, faster nodes on our cluster were under-utilized compared to 
> smaller, slower nodes. 
> This feature is dependent on [YARN-10450], which added cluster-wide 
> utilization metrics.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
