[ https://issues.apache.org/jira/browse/YARN-10354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lee young gon updated YARN-10354:
---------------------------------
    Description: 
Could not get JMX information from the NodeManager, and I found a deadlock in the thread dump.

The deadlocked threads are shown below.
{code:java}
"Timer for 'NodeManager' metrics system" - Thread t@42
   java.lang.Thread.State: BLOCKED
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics.getMetrics(ContainerMetrics.java:235)
        - waiting to lock <7668d6f0> (a org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics) owned by "NM ContainerManager dispatcher" t@299
        at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:200)
        at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.snapshotMetrics(MetricsSystemImpl.java:419)
        at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.sampleMetrics(MetricsSystemImpl.java:406)
        - locked <3b956878> (a org.apache.hadoop.metrics2.impl.MetricsSystemImpl)
        at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.onTimerEvent(MetricsSystemImpl.java:381)
        - locked <3b956878> (a org.apache.hadoop.metrics2.impl.MetricsSystemImpl)
        at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$4.run(MetricsSystemImpl.java:368)
        at java.util.TimerThread.mainLoop(Timer.java:555)
        at java.util.TimerThread.run(Timer.java:505)

   Locked ownable synchronizers:
        - None

"NM ContainerManager dispatcher" - Thread t@299
   java.lang.Thread.State: BLOCKED
        at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.unregisterSource(MetricsSystemImpl.java:247)
        - waiting to lock <3b956878> (a org.apache.hadoop.metrics2.impl.MetricsSystemImpl) owned by "Timer for 'NodeManager' metrics system" t@42
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics.unregisterContainerMetrics(ContainerMetrics.java:228)
        - locked <4e31c3ec> (a java.lang.Class)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics.finished(ContainerMetrics.java:255)
        - locked <7668d6f0> (a org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl.updateContainerMetrics(ContainersMonitorImpl.java:813)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl.onStopMonitoringContainer(ContainersMonitorImpl.java:935)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl.handle(ContainersMonitorImpl.java:900)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl.handle(ContainersMonitorImpl.java:57)
        at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
        at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
        at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
        - None
{code}

  was:
Could not get JMX information from the NodeManager, and I found a deadlock in the thread dump.

The deadlocked threads are shown below.
{code:java}
"Timer for 'NodeManager' metrics system" - Thread t@42
   java.lang.Thread.State: BLOCKED
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics.getMetrics(ContainerMetrics.java:235)
        - waiting to lock <7668d6f0> (a org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics) owned by "NM ContainerManager dispatcher" t@299
        at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:200)
        at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.snapshotMetrics(MetricsSystemImpl.java:419)
        at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.sampleMetrics(MetricsSystemImpl.java:406)
        - locked <3b956878> (a org.apache.hadoop.metrics2.impl.MetricsSystemImpl)
        at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.onTimerEvent(MetricsSystemImpl.java:381)
        - locked <3b956878> (a org.apache.hadoop.metrics2.impl.MetricsSystemImpl)
        at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$4.run(MetricsSystemImpl.java:368)
        at java.util.TimerThread.mainLoop(Timer.java:555)
        at java.util.TimerThread.run(Timer.java:505)

   Locked ownable synchronizers:
        - None

"NM ContainerManager dispatcher" - Thread t@299
   java.lang.Thread.State: BLOCKED
        at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.unregisterSource(MetricsSystemImpl.java:247)
        - waiting to lock <3b956878> (a org.apache.hadoop.metrics2.impl.MetricsSystemImpl) owned by "Timer for 'NodeManager' metrics system" t@42
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics.unregisterContainerMetrics(ContainerMetrics.java:228)
        - locked <4e31c3ec> (a java.lang.Class)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics.finished(ContainerMetrics.java:255)
        - locked <7668d6f0> (a org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl.updateContainerMetrics(ContainersMonitorImpl.java:813)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl.onStopMonitoringContainer(ContainersMonitorImpl.java:935)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl.handle(ContainersMonitorImpl.java:900)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl.handle(ContainersMonitorImpl.java:57)
        at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
        at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
        at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
        - None
{code}


> deadlock in ContainerMetrics and MetricsSystemImpl
> --------------------------------------------------
>
>                 Key: YARN-10354
>                 URL: https://issues.apache.org/jira/browse/YARN-10354
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>         Environment: hadoop 3.1.2
>            Reporter: Lee young gon
>            Priority: Major
>         Attachments: full_thread_dump.txt
>
>
> Could not get JMX information from the NodeManager, and I found a deadlock in the thread dump.
> The deadlocked threads are shown below.
> {code:java}
> "Timer for 'NodeManager' metrics system" - Thread t@42
>    java.lang.Thread.State: BLOCKED
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics.getMetrics(ContainerMetrics.java:235)
>         - waiting to lock <7668d6f0> (a org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics) owned by "NM ContainerManager dispatcher" t@299
>         at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:200)
>         at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.snapshotMetrics(MetricsSystemImpl.java:419)
>         at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.sampleMetrics(MetricsSystemImpl.java:406)
>         - locked <3b956878> (a org.apache.hadoop.metrics2.impl.MetricsSystemImpl)
>         at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.onTimerEvent(MetricsSystemImpl.java:381)
>         - locked <3b956878> (a org.apache.hadoop.metrics2.impl.MetricsSystemImpl)
>         at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$4.run(MetricsSystemImpl.java:368)
>         at java.util.TimerThread.mainLoop(Timer.java:555)
>         at java.util.TimerThread.run(Timer.java:505)
>
>    Locked ownable synchronizers:
>         - None
>
> "NM ContainerManager dispatcher" - Thread t@299
>    java.lang.Thread.State: BLOCKED
>         at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.unregisterSource(MetricsSystemImpl.java:247)
>         - waiting to lock <3b956878> (a org.apache.hadoop.metrics2.impl.MetricsSystemImpl) owned by "Timer for 'NodeManager' metrics system" t@42
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics.unregisterContainerMetrics(ContainerMetrics.java:228)
>         - locked <4e31c3ec> (a java.lang.Class)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics.finished(ContainerMetrics.java:255)
>         - locked <7668d6f0> (a org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainerMetrics)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl.updateContainerMetrics(ContainersMonitorImpl.java:813)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl.onStopMonitoringContainer(ContainersMonitorImpl.java:935)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl.handle(ContainersMonitorImpl.java:900)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl.handle(ContainersMonitorImpl.java:57)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
>         at java.lang.Thread.run(Thread.java:745)
>
>    Locked ownable synchronizers:
>         - None
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
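
The thread dump in this report shows a lock-order inversion: the timer thread holds the MetricsSystemImpl monitor and waits for the ContainerMetrics monitor, while the dispatcher thread holds the ContainerMetrics monitor and waits for the MetricsSystemImpl monitor. The following is a minimal sketch of that pattern only, not the Hadoop code. The class `LockOrderSketch`, the two plain `Object` locks, and the thread names are hypothetical stand-ins, and the detection step uses the standard JDK `ThreadMXBean.findMonitorDeadlockedThreads()` call rather than anything specific to YARN.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch: two threads take the same two monitors in opposite order.
public class LockOrderSketch {

    // Starts two daemon threads that deadlock on purpose and returns how many of
    // them the JVM reports as monitor-deadlocked (0 if none is seen in ~5 seconds).
    static int countDeadlockedThreads() throws InterruptedException {
        Object systemLock = new Object();  // stand-in for the MetricsSystemImpl monitor
        Object sourceLock = new Object();  // stand-in for the ContainerMetrics monitor
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        // Like the timer thread: system lock first, then the source lock.
        Thread timer = new Thread(() -> {
            synchronized (systemLock) {
                awaitOther(bothHoldFirstLock);
                synchronized (sourceLock) { }
            }
        }, "sketch-timer");

        // Like the dispatcher thread: source lock first, then the system lock.
        Thread dispatcher = new Thread(() -> {
            synchronized (sourceLock) {
                awaitOther(bothHoldFirstLock);
                synchronized (systemLock) { }
            }
        }, "sketch-dispatcher");

        // Daemon threads so the deliberately stuck threads do not keep the JVM alive.
        timer.setDaemon(true);
        dispatcher.setDaemon(true);
        timer.start();
        dispatcher.start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int attempt = 0; attempt < 100; attempt++) {
            long[] ids = mx.findMonitorDeadlockedThreads(); // null when no deadlock
            if (ids != null) {
                int count = 0;
                for (long id : ids) {
                    // Count only the two threads created by this call.
                    if (id == timer.getId() || id == dispatcher.getId()) {
                        count++;
                    }
                }
                if (count > 0) {
                    return count;
                }
            }
            Thread.sleep(50);
        }
        return 0;
    }

    // Makes sure each thread already holds its first lock before either tries the second.
    private static void awaitOther(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("deadlocked threads: " + countDeadlockedThreads());
    }
}
```

In a sketch like this, the cycle disappears if both threads acquire the two monitors in the same order, or if one of them releases its first lock before requesting the second. Whether either approach suits ContainerMetrics and MetricsSystemImpl is not something this report establishes.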