[ 
https://issues.apache.org/jira/browse/YARN-10760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri reassigned YARN-10760:
----------------------------------

    Assignee: Andrew Chung

> Number of allocated OPPORTUNISTIC containers can dip below 0
> ------------------------------------------------------------
>
>                 Key: YARN-10760
>                 URL: https://issues.apache.org/jira/browse/YARN-10760
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 3.1.2
>            Reporter: Andrew Chung
>            Assignee: Andrew Chung
>            Priority: Minor
>
> {{AbstractYarnScheduler.completedContainers}} can be called from multiple 
> sources, and in some scenarios the caller does not appear to hold the 
> appropriate lock. This can cause the count of 
> {{OpportunisticSchedulerMetrics.AllocatedOContainers}} to fall below 0.
> To prevent double counting when releasing allocated OPPORTUNISTIC (O) 
> containers, a simple fix might be to check first whether the {{RMContainer}} 
> has already been removed, though that may not fix the underlying race 
> condition.
> The following JMX query output captures 
> {{OpportunisticSchedulerMetrics.AllocatedOContainers}} falling below 0:
> {noformat}
> {
>     "name" : "Hadoop:service=ResourceManager,name=OpportunisticSchedulerMetrics",
>     "modelerType" : "OpportunisticSchedulerMetrics",
>     "tag.OpportunisticSchedulerMetrics" : "ResourceManager",
>     "tag.Context" : "yarn",
>     "tag.Hostname" : "",
>     "AllocatedOContainers" : -2716,
>     "AggregateOContainersAllocated" : 306020,
>     "AggregateOContainersReleased" : 308736,
>     "AggregateNodeLocalOContainersAllocated" : 0,
>     "AggregateRackLocalOContainersAllocated" : 0,
>     "AggregateOffSwitchOContainersAllocated" : 306020,
>     "AllocateLatencyOQuantilesNumOps" : 0,
>     "AllocateLatencyOQuantiles50thPercentileTime" : 0,
>     "AllocateLatencyOQuantiles75thPercentileTime" : 0,
>     "AllocateLatencyOQuantiles90thPercentileTime" : 0,
>     "AllocateLatencyOQuantiles95thPercentileTime" : 0,
>     "AllocateLatencyOQuantiles99thPercentileTime" : 0
>   }
> {noformat}
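
In the JMX output above, the gauge equals the difference of the two aggregate counters (306020 allocated - 308736 released = -2716), which is consistent with some containers being released more than once. The guard suggested in the description can be illustrated with a minimal, self-contained sketch. This is not the actual YARN code: the class {{OContainerReleaseSketch}}, its map of live containers, and its counter are hypothetical stand-ins for the scheduler's container bookkeeping and for the {{AllocatedOContainers}} gauge. It only shows the idea of decrementing once per container, and, as the description notes, it may not address the underlying race.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration only; not taken from the Hadoop code base. */
public class OContainerReleaseSketch {
  // Stand-in for the scheduler's record of live RMContainers (assumption).
  private final ConcurrentMap<String, Boolean> liveContainers =
      new ConcurrentHashMap<>();
  // Stand-in for the AllocatedOContainers gauge (assumption).
  private final AtomicInteger allocatedOContainers = new AtomicInteger();

  /** Records an allocation; the gauge is incremented once per container ID. */
  public void allocate(String containerId) {
    if (liveContainers.putIfAbsent(containerId, Boolean.TRUE) == null) {
      allocatedOContainers.incrementAndGet();
    }
  }

  /**
   * Releases a container. The atomic remove() returns a non-null value for
   * exactly one caller, so the gauge is decremented at most once even if
   * several threads report completion of the same container.
   */
  public boolean release(String containerId) {
    if (liveContainers.remove(containerId) == null) {
      return false; // already removed: skip the second decrement
    }
    allocatedOContainers.decrementAndGet();
    return true;
  }

  public int allocated() {
    return allocatedOContainers.get();
  }

  public static void main(String[] args) throws InterruptedException {
    OContainerReleaseSketch sketch = new OContainerReleaseSketch();
    sketch.allocate("container_1");

    // Two callers race to complete the same container.
    Thread a = new Thread(() -> sketch.release("container_1"));
    Thread b = new Thread(() -> sketch.release("container_1"));
    a.start();
    b.start();
    a.join();
    b.join();

    System.out.println(sketch.allocated()); // prints 0
  }
}
```

Without the null check in {{release}}, both callers would decrement the counter and the gauge would end at -1, which is the kind of drift visible in the JMX capture.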



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

