[ https://issues.apache.org/jira/browse/IMPALA-6857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16575434#comment-16575434 ]
ASF subversion and git services commented on IMPALA-6857:
---------------------------------------------------------

Commit 4976ff3c07f465915ac31312ca67519a600212e6 in impala's branch refs/heads/master from Bharath Vissapragada
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=4976ff3 ]

IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics

Pause monitor:
=============
This commit adds a stripped-down version of Hadoop's JvmPauseMonitor class (https://bit.ly/2O6qSwm). The core implementation is borrowed from the hadoop-common project, with the Hadoop dependencies removed:
- Removed dependency on AbstractService.
- Not relying on Hadoop's Configuration object for reading confs.
- Switched to Guava's implementation of Stopwatch.

This utility class can detect both GC and non-GC pauses. For GC pauses, the GC metrics for the pause period are logged.

Sample Output:
=============
Detected pause in JVM or host machine (eg GC): pause of approximately 2356ms
GC pool 'PS MarkSweep' had collection(s): count=1 time=2241ms
GC pool 'PS Scavenge' had collection(s): count=3 time=352ms
Detected pause in JVM or host machine (eg GC): pause of approximately 1964ms
GC pool 'PS MarkSweep' had collection(s): count=1 time=2082ms
GC pool 'PS Scavenge' had collection(s): count=1 time=251ms
Detected pause in JVM or host machine (eg GC): pause of approximately 2120ms
GC pool 'PS MarkSweep' had collection(s): count=1 time=2454ms
Detected pause in JVM or host machine (eg GC): pause of approximately 2238ms
GC pool 'PS MarkSweep' had collection(s): count=5 time=13464ms
Detected pause in JVM or host machine (eg GC): pause of approximately 2233ms
GC pool 'PS MarkSweep' had collection(s): count=1 time=2733ms

JMX Metrics:
============
JMX metrics are now emitted for Impala and Catalog JVMs at the web endpoint /jmx.
- Impalad: http(s)://<impalad-host>:25000/jmx
- Catalogd: http(s)://<catalogd-host>:25020/jmx

Misc:
====
Renamed JvmMetric -> JvmMemoryMetric to make the intent clearer. The rename is unrelated to the functionality of this patch.

Testing:
=======
- Tested manually with kill -SIGSTOP/-SIGCONT <pid> and made sure that the non-GC JVM pauses are logged.
- This class's functionality is tested manually by invoking its main().
- Injected a memory leak into the Catalog server code and made sure the GC is detected.

Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341
Reviewed-on: http://gerrit.cloudera.org:8080/10998
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>

> Add JVM Pause Monitor to Impala Processes
> -----------------------------------------
>
>                 Key: IMPALA-6857
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6857
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog, Frontend
>            Reporter: Philip Zeyliger
>            Priority: Major
>              Labels: ramp-up, supportability
>             Fix For: Impala 3.1.0
>
>
> In IMPALA-3114, we added a pause monitor for Impala. In addition to that, we
> should port/borrow Hadoop's JvmPauseMonitor
> [https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/JvmPauseMonitor.java].
> I believe that when the JVM is aggressively GCing, the C++ threads will
> continue to get scheduled (and won't log), but the Java ones will log. (I've
> definitely seen JvmPauseMonitor be accurate many times.)
> [~bharathv], when you were testing this, were you able to reproduce it
> triggering when the JVM half was in "GC hell"?
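
For readers unfamiliar with the technique described in the commit message, the pause detection can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the commit: the class name PauseMonitorSketch, the 500 ms sleep interval, the 1000 ms logging threshold, and the bounded loop are all made up for the example. Only the log message format is taken from the sample output in the commit message. The idea is that a thread sleeps for a fixed interval and checks how much longer than requested it was actually away. A large overshoot means the JVM (or the host) was paused, and the difference in the JVM's GC counters over that window indicates whether GC was responsible.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch only; not the class added by the commit. */
public class PauseMonitorSketch {
    // Hypothetical values chosen for this example.
    static final long SLEEP_INTERVAL_MS = 500;
    static final long LOG_THRESHOLD_MS = 1000;

    /** Snapshot of the cumulative {collection count, collection time in ms} per GC pool. */
    public static Map<String, long[]> snapshotGc() {
        Map<String, long[]> snap = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            snap.put(gc.getName(), new long[] {gc.getCollectionCount(), gc.getCollectionTime()});
        }
        return snap;
    }

    /** How much longer than requested the monitoring thread was asleep. */
    public static long extraSleepMs(long elapsedMs, long requestedMs) {
        return elapsedMs - requestedMs;
    }

    /** Builds a log message listing only the GC pools that collected during the pause. */
    public static String formatPause(long extraMs, Map<String, long[]> before,
                                     Map<String, long[]> after) {
        StringBuilder sb = new StringBuilder(
            "Detected pause in JVM or host machine (eg GC): pause of approximately "
                + extraMs + "ms");
        for (Map.Entry<String, long[]> e : after.entrySet()) {
            long[] b = before.getOrDefault(e.getKey(), new long[] {0, 0});
            long count = e.getValue()[0] - b[0];
            long timeMs = e.getValue()[1] - b[1];
            if (count > 0) {
                sb.append("\nGC pool '").append(e.getKey())
                  .append("' had collection(s): count=").append(count)
                  .append(" time=").append(timeMs).append("ms");
            }
        }
        return sb.toString();
    }

    /** Runs a bounded number of sleep/measure cycles; a real monitor would loop in a daemon thread. */
    public static void run(int iterations) throws InterruptedException {
        Map<String, long[]> before = snapshotGc();
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            Thread.sleep(SLEEP_INTERVAL_MS);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            Map<String, long[]> after = snapshotGc();
            long extraMs = extraSleepMs(elapsedMs, SLEEP_INTERVAL_MS);
            if (extraMs > LOG_THRESHOLD_MS) {
                System.err.println(formatPause(extraMs, before, after));
            }
            before = after;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        run(4); // about two seconds of monitoring
    }
}
```

Because the check depends only on wall-clock overshoot, a sketch like this should also notice pauses that involve no GC at all, such as a process stopped with kill -SIGSTOP and later resumed with -SIGCONT, which matches the manual test described in the commit message. In that case no pool reports a positive collection count and only the first line of the message is printed.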