[ 
https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15372905#comment-15372905
 ] 

Nathan Roberts edited comment on YARN-5356 at 7/12/16 2:07 PM:
---------------------------------------------------------------

bq. Nathan Roberts, I understand that your problem is that with the current 
approach you know that you have 6 cores available to the NM and 4 of them are 
used. However, the machine is not that utilized (~30%). Correct? In that case, 
we would only need to report the actual size of the machine at registration 
time as it would never change. Not sure that ResourceUtilization would be the 
right place for that as it would be reported in every heartbeat continuously.

[~elgoiri], Yep, that's exactly correct. I think reporting the physical 
capabilities of the machine during registration should be OK. At least on 
Linux it is technically possible for the machine to change (e.g. 
echo 0 > /sys/devices/system/cpu/cpu3/online, OR memory gets automatically 
removed because it's getting ECC errors, OR something reserves a bunch of 
memory for huge pages, OR a NIC re-negotiates from 10G to 1G), but I think 
these are unusual enough that we could ignore them. I originally suggested 
tweaking ResourceUtilization because of this small chance of a physical 
resource changing, but am happy to go either way. 
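
As a rough illustration of the CPU case only: on Linux the set of online CPUs 
can be read from /sys/devices/system/cpu/online, which uses a range-list 
format such as "0-2,4-5". The sketch below counts the CPUs in such a string. 
The function names are hypothetical and this is not how the NM currently 
determines its size; it only shows how a physical core count could be sampled 
at registration time (or re-sampled if we decided to track changes).

```python
def count_online_cpus(cpulist: str) -> int:
    """Count the CPUs named in a Linux cpulist string such as "0-2,4-5"."""
    count = 0
    for part in cpulist.strip().split(","):
        if not part:
            continue  # empty input lists no CPUs
        if "-" in part:
            # A range "a-b" is inclusive on both ends.
            first, last = part.split("-")
            count += int(last) - int(first) + 1
        else:
            count += 1  # a single CPU id
    return count


def read_online_cpus(path: str = "/sys/devices/system/cpu/online") -> int:
    """Read the online-CPU list from sysfs (Linux only) and count it."""
    with open(path) as f:
        return count_online_cpus(f.read())
```

On a 6-core machine where cpu3 has been taken offline as in the example 
above, the file would be expected to read "0-2,4-5", i.e. 5 online CPUs.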


was (Author: nroberts):
bq. Nathan Roberts, I understand that your problem is that with the current 
approach you know that you have 6 cores available to the NM and 4 of them are 
used. However, the machine is not that utilized (~30%). Correct? In that case, 
we would only need to report the actual size of the machine at registration 
time as it would never change. Not sure that ResourceUtilization would be the 
right place for that as it would be reported in every heartbeat continuously.

[~elgoiri], Yep, that's exactly correct. I think reporting the physical 
capabilities of the machine during registration should be ok. At least with 
linux it is technically possible for the machine to change (e.g. echo 0 > 
/sys/devices/system/cpu/cpu3, OR memory gets automatically removed because it's 
getting ECC errors, OR something reserves a bunch of memory for huge pages, OR 
NIC re-negotiates from 10G to 1G), but I think these might be unusual enough 
that we could ignore them. I originally suggested tweaking ResourceUtilization 
due to this small chance of a physical resource changing but am happy to go 
either way. 

> ResourceUtilization should also include resource availability
> -------------------------------------------------------------
>
>                 Key: YARN-5356
>                 URL: https://issues.apache.org/jira/browse/YARN-5356
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager, resourcemanager
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Nathan Roberts
>
> Currently ResourceUtilization contains absolute quantities of resource used 
> (e.g. 4096MB memory used). It would be good if it also included how much of 
> that resource is actually available on the node so that the RM can use this 
> data to schedule more effectively (overcommit, etc.).
> Currently the only available information is the Resource the node registered 
> with (or later updated using updateNodeResource). However, that isn't 
> really sufficient to get a good view of how utilized a resource is. For 
> example, if a node reports 400% CPU utilization, does that mean it's 
> completely full, or barely utilized? Today there is no reliable way to figure 
> this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you 
> have thoughts/opinions on this?
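
The ambiguity in the 400% CPU example from the description can be made 
concrete with a small sketch. It assumes that "400%" means four cores' worth 
of CPU time and that the physical core count is known from somewhere (for 
example, reported at registration); the function name is hypothetical and is 
not part of any existing YARN API.

```python
def cpu_utilization_fraction(used_percent: float, physical_cores: int) -> float:
    """Convert an absolute CPU figure (100% == one full core) into a
    fraction of the machine's physical capacity."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return used_percent / (100.0 * physical_cores)


# The same absolute reading means very different things on different hosts.
full = cpu_utilization_fraction(400, 4)    # 4-core host: completely full
light = cpu_utilization_fraction(400, 16)  # 16-core host: one quarter used
```

Without the denominator, the RM cannot tell these two situations apart, 
which is the gap the issue is asking to close.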


