[ 
https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550894#comment-14550894
 ] 

zhihai xu commented on YARN-3619:
---------------------------------

I uploaded a patch, YARN-3619.000.patch, for review. It adds a configuration, 
NM_CONTAINER_METRICS_UNREGISTER_DELAY_MS, that controls how long after a 
container finishes its metrics are unregistered. I did not schedule a thread 
to do the unregistration from getMetrics, because that could leak memory if 
getMetrics is never called.
It looks like getMetrics is called from two places: 
MetricsSystemImpl#sampleMetrics and MetricsSourceAdapter#getMBeanInfo. 
sampleMetrics won't be called if MetricsSystemImpl has no sinks. getMBeanInfo 
may not be called after registration if JMXJsonServlet#doGet is never called 
(no HTTP GET request from JMX clients). So it is possible that getMetrics is 
never called after registration.
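
To illustrate the idea of unregistering on a delay after the container 
finishes, rather than from inside getMetrics, here is a minimal sketch using 
only the JDK. It is not the code in the patch: the class name, the SOURCES 
map, and the register/finished/demo methods are hypothetical stand-ins, and 
the delayMs argument plays the role of the value that 
NM_CONTAINER_METRICS_UNREGISTER_DELAY_MS would supply.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

final class DelayedUnregisterSketch {
    // Hypothetical stand-in for the metrics system's registry of sources.
    static final Map<String, Object> SOURCES = new ConcurrentHashMap<>();

    // Single daemon thread that performs the delayed unregistrations.
    static final ScheduledExecutorService SCHEDULER =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "metrics-unregister");
            t.setDaemon(true);
            return t;
        });

    static void register(String name) {
        SOURCES.put(name, new Object());
    }

    // Called when the container finishes. The unregistration is scheduled
    // here, so it happens even if nobody ever calls getMetrics.
    static ScheduledFuture<?> finished(String name, long delayMs) {
        return SCHEDULER.schedule(() -> { SOURCES.remove(name); },
            delayMs, TimeUnit.MILLISECONDS);
    }

    // Returns {registered right after finish, registered after the delay}.
    static boolean[] demo(long delayMs) {
        register("container_1");
        ScheduledFuture<?> pending = finished("container_1", delayMs);
        boolean before = SOURCES.containsKey("container_1");
        try {
            pending.get(); // block until the delayed task has run
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
        boolean after = SOURCES.containsKey("container_1");
        return new boolean[] {before, after};
    }

    public static void main(String[] args) {
        boolean[] result = demo(200);
        System.out.println("registered right after finish: " + result[0]);
        System.out.println("registered after the delay: " + result[1]);
    }
}
```

Because the removal is driven by the finish event and a timer, the source 
is cleaned up whether or not a sink or a JMX client ever polls it.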


> ContainerMetrics unregisters during getMetrics and leads to 
> ConcurrentModificationException
> -------------------------------------------------------------------------------------------
>
>                 Key: YARN-3619
>                 URL: https://issues.apache.org/jira/browse/YARN-3619
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.7.0
>            Reporter: Jason Lowe
>            Assignee: zhihai xu
>         Attachments: YARN-3619.000.patch, test.patch
>
>
> ContainerMetrics is able to unregister itself during the getMetrics method, 
> but that method can be called by MetricsSystemImpl.sampleMetrics which is 
> trying to iterate the sources.  This leads to a 
> ConcurrentModificationException log like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN 
> impl.MetricsSystemImpl: java.util.ConcurrentModificationException
> {noformat}
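
The failure mode in the description can be reproduced outside Hadoop with 
plain JDK collections. The sketch below is an assumption-laden model, not 
Hadoop code: Source, the sources map, and sampleThrowsCme are hypothetical 
names, and a LinkedHashMap stands in for whatever collection the metrics 
system iterates. Each source removes itself from the map when asked for its 
metrics, while the caller is still iterating that same map.

```java
import java.util.ConcurrentModificationException;
import java.util.LinkedHashMap;
import java.util.Map;

final class SelfUnregisterSketch {
    // Hypothetical stand-in for a metrics source.
    interface Source {
        void getMetrics();
    }

    // Returns true if iterating the sources fails with a
    // ConcurrentModificationException.
    static boolean sampleThrowsCme() {
        Map<String, Source> sources = new LinkedHashMap<>();
        for (int i = 0; i < 3; i++) {
            final String name = "container_" + i;
            // Each source unregisters itself when its metrics are read.
            sources.put(name, () -> { sources.remove(name); });
        }
        try {
            // Stand-in for the sampling loop over all registered sources.
            for (Source s : sources.values()) {
                s.getMetrics();
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("CME thrown: " + sampleThrowsCme());
    }
}
```

The structural change made by the first getMetrics call invalidates the 
iterator, so the next step of the loop fails fast. Moving the removal out 
of getMetrics avoids modifying the collection during iteration.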



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
