[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 14: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 14 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 09 Aug 2018 04:33:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics Pause monitor: = This commit adds a stripped down version of Hadoop's JvmPauseMonitor class (https://bit.ly/2O6qSwm) . The core implementaion is borrowed from hadoop-common project and the hadoop dependencies are removed. - Removed dependency on AbstractService. - Not relying on Hadoop's Configuration object for reading confs. - Switched to Guava's implementation of Stopwatch. This utility class can detect both GC/non-GC pauses. In case of GC pauses, the GC metrics during the pause period are logged. Sample Output: = Detected pause in JVM or host machine (eg GC): pause of approximately 2356ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2241ms GC pool 'PS Scavenge' had collection(s): count=3 time=352ms Detected pause in JVM or host machine (eg GC): pause of approximately 1964ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2082ms GC pool 'PS Scavenge' had collection(s): count=1 time=251ms Detected pause in JVM or host machine (eg GC): pause of approximately 2120ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2454ms Detected pause in JVM or host machine (eg GC): pause of approximately 2238ms GC pool 'PS MarkSweep' had collection(s): count=5 time=13464ms Detected pause in JVM or host machine (eg GC): pause of approximately 2233ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2733ms JMX Metrics: JMX metrics are now emmitted for Impala and Catalog JVMs at the web end point /jmx. - Impalad: http(s)://:25000/jmx - Catalogd: http(s)://:25020/jmx Misc: Renamed JvmMetric -> JvmMemoryMetric to make the intent more clear. It doesn't relate to the functionality of the patch in anyway. Testing: === - Tested it manually with kill -SIGSTOP/-SIGCONT . Made sure that the non-GC JVM pauses are logged. - This class' functionality is tested manually by invoking it's main() - Injected a memory leak into the Catalog server code and made sure the GC is detected. Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Reviewed-on: http://gerrit.cloudera.org:8080/10998 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/common/init.cc M be/src/service/impala-http-handler.cc M be/src/util/default-path-handlers.cc M be/src/util/jni-util.cc M be/src/util/jni-util.h M be/src/util/memory-metrics.cc M be/src/util/memory-metrics.h M be/src/util/metrics-test.cc M be/src/util/webserver-test.cc M be/src/util/webserver.cc M be/src/util/webserver.h M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/common/JniUtil.java A fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java A fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java A fe/src/test/java/org/apache/impala/util/JMXJsonUtilTest.java A fe/src/test/java/org/apache/impala/util/JniUtilTest.java A tests/custom_cluster/test_pause_monitor.py M tests/webserver/test_web_pages.py 19 files changed, 850 insertions(+), 90 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 15 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 14: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2966/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 14 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 09 Aug 2018 01:20:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 14: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 14 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 09 Aug 2018 01:20:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 13: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 13 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 08 Aug 2018 23:02:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 13: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/258/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 13 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 08 Aug 2018 23:00:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/10998/11/be/src/util/default-path-handlers.cc File be/src/util/default-path-handlers.cc: http://gerrit.cloudera.org:8080/#/c/10998/11/be/src/util/default-path-handlers.cc@261 PS11, Line 261: webserver->RegisterUrlCallback("/jmx", "raw_text.tmpl", JmxHandler); > It seems like we could change raw_callback() to populate a content_type and Done -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 11 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 08 Aug 2018 22:40:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Hello Tianyi Wang, Philip Zeyliger, Todd Lipcon, Impala Public Jenkins, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10998 to look at the new patch set (#13). Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics Pause monitor: = This commit adds a stripped down version of Hadoop's JvmPauseMonitor class (https://bit.ly/2O6qSwm) . The core implementaion is borrowed from hadoop-common project and the hadoop dependencies are removed. - Removed dependency on AbstractService. - Not relying on Hadoop's Configuration object for reading confs. - Switched to Guava's implementation of Stopwatch. This utility class can detect both GC/non-GC pauses. In case of GC pauses, the GC metrics during the pause period are logged. Sample Output: = Detected pause in JVM or host machine (eg GC): pause of approximately 2356ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2241ms GC pool 'PS Scavenge' had collection(s): count=3 time=352ms Detected pause in JVM or host machine (eg GC): pause of approximately 1964ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2082ms GC pool 'PS Scavenge' had collection(s): count=1 time=251ms Detected pause in JVM or host machine (eg GC): pause of approximately 2120ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2454ms Detected pause in JVM or host machine (eg GC): pause of approximately 2238ms GC pool 'PS MarkSweep' had collection(s): count=5 time=13464ms Detected pause in JVM or host machine (eg GC): pause of approximately 2233ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2733ms JMX Metrics: JMX metrics are now emmitted for Impala and Catalog JVMs at the web end point /jmx. - Impalad: http(s)://:25000/jmx - Catalogd: http(s)://:25020/jmx Misc: Renamed JvmMetric -> JvmMemoryMetric to make the intent more clear. It doesn't relate to the functionality of the patch in anyway. Testing: === - Tested it manually with kill -SIGSTOP/-SIGCONT . Made sure that the non-GC JVM pauses are logged. - This class' functionality is tested manually by invoking it's main() - Injected a memory leak into the Catalog server code and made sure the GC is detected. Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 --- M be/src/common/init.cc M be/src/service/impala-http-handler.cc M be/src/util/default-path-handlers.cc M be/src/util/jni-util.cc M be/src/util/jni-util.h M be/src/util/memory-metrics.cc M be/src/util/memory-metrics.h M be/src/util/metrics-test.cc M be/src/util/webserver-test.cc M be/src/util/webserver.cc M be/src/util/webserver.h M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/common/JniUtil.java A fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java A fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java A fe/src/test/java/org/apache/impala/util/JMXJsonUtilTest.java A fe/src/test/java/org/apache/impala/util/JniUtilTest.java A tests/custom_cluster/test_pause_monitor.py M tests/webserver/test_web_pages.py 19 files changed, 850 insertions(+), 90 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/10998/13 -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 13 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/10998/11/be/src/util/default-path-handlers.cc File be/src/util/default-path-handlers.cc: http://gerrit.cloudera.org:8080/#/c/10998/11/be/src/util/default-path-handlers.cc@261 PS11, Line 261: webserver->RegisterUrlCallback("/jmx", "raw_text.tmpl", JmxHandler); > You are right, the template here is of no use since we do a non-HTML render It seems like we could change raw_callback() to populate a content_type and/or add url_handler->content_type() if we wanted to. But, regardless, I'm fine with this as is. let's leave a comment about the potential improvement and move on. -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 11 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 08 Aug 2018 15:39:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 12: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/234/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 12 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 08 Aug 2018 00:56:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 11: (2 comments) http://gerrit.cloudera.org:8080/#/c/10998/11/be/src/util/default-path-handlers.cc File be/src/util/default-path-handlers.cc: http://gerrit.cloudera.org:8080/#/c/10998/11/be/src/util/default-path-handlers.cc@207 PS11, Line 207: // Parse the JSON string returned from fe. > It's pretty weird that we're doing an extra round of parsing. I think it's It is unfortunate. We use two different libraries to parse the same json, once in C++ and one in Java and there is no intermediate state that is compatible with both to avoid re-parsing. Yea, the whole rendering framework relies on the JSON content underneath that requires the content to be populated that way for it to parse. Clarified. http://gerrit.cloudera.org:8080/#/c/10998/11/be/src/util/default-path-handlers.cc@261 PS11, Line 261: webserver->RegisterUrlCallback("/jmx", "raw_text.tmpl", JmxHandler); > Do you think raw_text.tmpl is ever ever used? I think because we set the "_ You are right, the template here is of no use since we do a non-HTML rendering anyway. I thought about this but it turns out we set a PLAIN content-type for RawCallbacks. Now we need to again fix the following codepath to infer from the arguments map whether it is a JSON and fix the content type accordingly. I felt it was not worth the effort. Thoughts? if (!url_handler->use_templates()) { content_type = PLAIN; url_handler->raw_callback()(arguments, ); <- } else { RenderUrlWithTemplate(arguments, *url_handler, , _type); } -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 11 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 08 Aug 2018 00:26:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Hello Tianyi Wang, Philip Zeyliger, Todd Lipcon, Impala Public Jenkins, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10998 to look at the new patch set (#12). Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics Pause monitor: = This commit adds a stripped down version of Hadoop's JvmPauseMonitor class (https://bit.ly/2O6qSwm) . The core implementaion is borrowed from hadoop-common project and the hadoop dependencies are removed. - Removed dependency on AbstractService. - Not relying on Hadoop's Configuration object for reading confs. - Switched to Guava's implementation of Stopwatch. This utility class can detect both GC/non-GC pauses. In case of GC pauses, the GC metrics during the pause period are logged. Sample Output: = Detected pause in JVM or host machine (eg GC): pause of approximately 2356ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2241ms GC pool 'PS Scavenge' had collection(s): count=3 time=352ms Detected pause in JVM or host machine (eg GC): pause of approximately 1964ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2082ms GC pool 'PS Scavenge' had collection(s): count=1 time=251ms Detected pause in JVM or host machine (eg GC): pause of approximately 2120ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2454ms Detected pause in JVM or host machine (eg GC): pause of approximately 2238ms GC pool 'PS MarkSweep' had collection(s): count=5 time=13464ms Detected pause in JVM or host machine (eg GC): pause of approximately 2233ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2733ms JMX Metrics: JMX metrics are now emmitted for Impala and Catalog JVMs at the web end point /jmx. - Impalad: http(s)://:25000/jmx - Catalogd: http(s)://:25020/jmx Misc: Renamed JvmMetric -> JvmMemoryMetric to make the intent more clear. It doesn't relate to the functionality of the patch in anyway. Testing: === - Tested it manually with kill -SIGSTOP/-SIGCONT . Made sure that the non-GC JVM pauses are logged. - This class' functionality is tested manually by invoking it's main() - Injected a memory leak into the Catalog server code and made sure the GC is detected. Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 --- M be/src/common/init.cc M be/src/service/impala-http-handler.cc M be/src/util/default-path-handlers.cc M be/src/util/jni-util.cc M be/src/util/jni-util.h M be/src/util/memory-metrics.cc M be/src/util/memory-metrics.h M be/src/util/metrics-test.cc M be/src/util/webserver-test.cc M be/src/util/webserver.cc M be/src/util/webserver.h M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/common/JniUtil.java A fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java A fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java A fe/src/test/java/org/apache/impala/util/JMXJsonUtilTest.java A fe/src/test/java/org/apache/impala/util/JniUtilTest.java A tests/custom_cluster/test_pause_monitor.py M tests/webserver/test_web_pages.py 19 files changed, 847 insertions(+), 90 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/10998/12 -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 12 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 11: (2 comments) Thanks! I think the interface for the callbacks is a little bit weird. Specifically, for a given handler, we seem to put "Webserver::ENABLE_RAW_JSON_KEY = "__raw__"" into the JSON-y document so that the template can render it, but we could have done the switch without using a templating language. But that's neither here nor there with your change, so I think it's fine. I think we should maybe be using the RawUrlCallback here since I think we're never using the template. If you disagree, I'm fine to +2 this as is as well, but I want to hear your thoughts on the comments below. http://gerrit.cloudera.org:8080/#/c/10998/11/be/src/util/default-path-handlers.cc File be/src/util/default-path-handlers.cc: http://gerrit.cloudera.org:8080/#/c/10998/11/be/src/util/default-path-handlers.cc@207 PS11, Line 207: // Parse the JSON string returned from fe. It's pretty weird that we're doing an extra round of parsing. I think it's harmless enough, but we should explain it more. Specifically, I think the issue is that our framework for showing nice HTML with these things requires this? The other approach would have been to use RawUrlCallback which returns a byte stream. Anyway, could you expand the comment here to indicate why we have to do this parsing? http://gerrit.cloudera.org:8080/#/c/10998/11/be/src/util/default-path-handlers.cc@261 PS11, Line 261: webserver->RegisterUrlCallback("/jmx", "raw_text.tmpl", JmxHandler); Do you think raw_text.tmpl is ever ever used? I think because we set the "__json__" key in the handler, the template will never be used, so this is really a RawCallback of sorts. -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 11 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 07 Aug 2018 18:56:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/216/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 11 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 07 Aug 2018 06:59:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 11: Also, forgot to add, in PS11 the /jmx endpoint directly returns a plain text page unlike the earlier versions which required the caller to pass ?raw parameter. This makes it compatible with the rest of the Hadoop stack. Modified the commit message and tests to accommodate this. -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 11 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 07 Aug 2018 06:37:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 10: (9 comments) Rebased this on top of IMPALA-7387. The earlier PS was dumping the entire JSON string returned by the fe as a single string. That does not work well with the template rendering logic and application/json content-type. So the latest PS parses the JSON returned by the fe and pretty prints it in the web page with the right content-type. Had to make a few minor changes in the webserver class to get this right. Phil, can you take another quick look at this? http://gerrit.cloudera.org:8080/#/c/10998/10/be/src/util/memory-metrics.cc File be/src/util/memory-metrics.cc: http://gerrit.cloudera.org:8080/#/c/10998/10/be/src/util/memory-metrics.cc@161 PS10, Line 161: JvmMemoryMetric* JvmMemoryMetric::CreateAndRegister(MetricGroup* metrics, const string& key, > line too long (92 > 90) Done http://gerrit.cloudera.org:8080/#/c/10998/10/be/src/util/memory-metrics.cc@166 PS10, Line 166: return metrics->RegisterMetric(new JvmMemoryMetric(MetricDefs::Get(key, pool_name_for_key), > line too long (93 > 90) Done http://gerrit.cloudera.org:8080/#/c/10998/10/be/src/util/memory-metrics.cc@184 PS10, Line 184: JvmMemoryMetric::CreateAndRegister(metrics, "jvm.$0.max-usage-bytes", usage.name, MAX); > line too long (91 > 90) Done http://gerrit.cloudera.org:8080/#/c/10998/10/be/src/util/memory-metrics.cc@187 PS10, Line 187: JvmMemoryMetric::CreateAndRegister(metrics, "jvm.$0.committed-usage-bytes", usage.name, > line too long (91 > 90) Done http://gerrit.cloudera.org:8080/#/c/10998/10/be/src/util/memory-metrics.cc@189 PS10, Line 189: JvmMemoryMetric::CreateAndRegister(metrics, "jvm.$0.init-usage-bytes", usage.name, INIT); > line too long (93 > 90) Done http://gerrit.cloudera.org:8080/#/c/10998/10/be/src/util/memory-metrics.cc@192 PS10, Line 192: JvmMemoryMetric::CreateAndRegister(metrics, "jvm.$0.peak-current-usage-bytes", usage.name, > line too long (94 > 90) Done http://gerrit.cloudera.org:8080/#/c/10998/10/be/src/util/memory-metrics.cc@194 PS10, Line 194: JvmMemoryMetric::CreateAndRegister(metrics, "jvm.$0.peak-committed-usage-bytes", usage.name, > line too long (96 > 90) Done http://gerrit.cloudera.org:8080/#/c/10998/10/be/src/util/memory-metrics.cc@196 PS10, Line 196: JvmMemoryMetric::CreateAndRegister(metrics, "jvm.$0.peak-init-usage-bytes", usage.name, > line too long (91 > 90) Done http://gerrit.cloudera.org:8080/#/c/10998/10/tests/custom_cluster/test_pause_monitor.py File tests/custom_cluster/test_pause_monitor.py: http://gerrit.cloudera.org:8080/#/c/10998/10/tests/custom_cluster/test_pause_monitor.py@23 PS10, Line 23: class TestPauseMonitor(CustomClusterTestSuite): > flake8: E302 expected 2 blank lines, found 1 Didn't know about this. Fixed https://lintlyci.github.io/Flake8Rules/rules/E302.html -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 10 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 07 Aug 2018 06:33:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Hello Tianyi Wang, Philip Zeyliger, Todd Lipcon, Impala Public Jenkins, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10998 to look at the new patch set (#11). Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics Pause monitor: = This commit adds a stripped down version of Hadoop's JvmPauseMonitor class (https://bit.ly/2O6qSwm) . The core implementaion is borrowed from hadoop-common project and the hadoop dependencies are removed. - Removed dependency on AbstractService. - Not relying on Hadoop's Configuration object for reading confs. - Switched to Guava's implementation of Stopwatch. This utility class can detect both GC/non-GC pauses. In case of GC pauses, the GC metrics during the pause period are logged. Sample Output: = Detected pause in JVM or host machine (eg GC): pause of approximately 2356ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2241ms GC pool 'PS Scavenge' had collection(s): count=3 time=352ms Detected pause in JVM or host machine (eg GC): pause of approximately 1964ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2082ms GC pool 'PS Scavenge' had collection(s): count=1 time=251ms Detected pause in JVM or host machine (eg GC): pause of approximately 2120ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2454ms Detected pause in JVM or host machine (eg GC): pause of approximately 2238ms GC pool 'PS MarkSweep' had collection(s): count=5 time=13464ms Detected pause in JVM or host machine (eg GC): pause of approximately 2233ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2733ms JMX Metrics: JMX metrics are now emmitted for Impala and Catalog JVMs at the web end point /jmx. - Impalad: http(s)://:25000/jmx - Catalogd: http(s)://:25020/jmx Misc: Renamed JvmMetric -> JvmMemoryMetric to make the intent more clear. It doesn't relate to the functionality of the patch in anyway. Testing: === - Tested it manually with kill -SIGSTOP/-SIGCONT . Made sure that the non-GC JVM pauses are logged. - This class' functionality is tested manually by invoking it's main() - Injected a memory leak into the Catalog server code and made sure the GC is detected. Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 --- M be/src/common/init.cc M be/src/service/impala-http-handler.cc M be/src/util/default-path-handlers.cc M be/src/util/jni-util.cc M be/src/util/jni-util.h M be/src/util/memory-metrics.cc M be/src/util/memory-metrics.h M be/src/util/metrics-test.cc M be/src/util/webserver-test.cc M be/src/util/webserver.cc M be/src/util/webserver.h M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/common/JniUtil.java A fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java A fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java A fe/src/test/java/org/apache/impala/util/JMXJsonUtilTest.java A fe/src/test/java/org/apache/impala/util/JniUtilTest.java A tests/custom_cluster/test_pause_monitor.py M tests/webserver/test_web_pages.py 19 files changed, 844 insertions(+), 90 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/10998/11 -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 11 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 10: (9 comments) http://gerrit.cloudera.org:8080/#/c/10998/10/be/src/util/memory-metrics.cc File be/src/util/memory-metrics.cc: http://gerrit.cloudera.org:8080/#/c/10998/10/be/src/util/memory-metrics.cc@161 PS10, Line 161: JvmMemoryMetric* JvmMemoryMetric::CreateAndRegister(MetricGroup* metrics, const string& key, line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/10998/10/be/src/util/memory-metrics.cc@166 PS10, Line 166: return metrics->RegisterMetric(new JvmMemoryMetric(MetricDefs::Get(key, pool_name_for_key), line too long (93 > 90) http://gerrit.cloudera.org:8080/#/c/10998/10/be/src/util/memory-metrics.cc@184 PS10, Line 184: JvmMemoryMetric::CreateAndRegister(metrics, "jvm.$0.max-usage-bytes", usage.name, MAX); line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/10998/10/be/src/util/memory-metrics.cc@187 PS10, Line 187: JvmMemoryMetric::CreateAndRegister(metrics, "jvm.$0.committed-usage-bytes", usage.name, line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/10998/10/be/src/util/memory-metrics.cc@189 PS10, Line 189: JvmMemoryMetric::CreateAndRegister(metrics, "jvm.$0.init-usage-bytes", usage.name, INIT); line too long (93 > 90) http://gerrit.cloudera.org:8080/#/c/10998/10/be/src/util/memory-metrics.cc@192 PS10, Line 192: JvmMemoryMetric::CreateAndRegister(metrics, "jvm.$0.peak-current-usage-bytes", usage.name, line too long (94 > 90) http://gerrit.cloudera.org:8080/#/c/10998/10/be/src/util/memory-metrics.cc@194 PS10, Line 194: JvmMemoryMetric::CreateAndRegister(metrics, "jvm.$0.peak-committed-usage-bytes", usage.name, line too long (96 > 90) http://gerrit.cloudera.org:8080/#/c/10998/10/be/src/util/memory-metrics.cc@196 PS10, Line 196: JvmMemoryMetric::CreateAndRegister(metrics, "jvm.$0.peak-init-usage-bytes", usage.name, line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/10998/10/tests/custom_cluster/test_pause_monitor.py File tests/custom_cluster/test_pause_monitor.py: http://gerrit.cloudera.org:8080/#/c/10998/10/tests/custom_cluster/test_pause_monitor.py@23 PS10, Line 23: class TestPauseMonitor(CustomClusterTestSuite): flake8: E302 expected 2 blank lines, found 1 -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 10 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 07 Aug 2018 06:16:04 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Hello Tianyi Wang, Philip Zeyliger, Todd Lipcon, Impala Public Jenkins, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10998 to look at the new patch set (#10). Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics Pause monitor: = This commit adds a stripped down version of Hadoop's JvmPauseMonitor class (https://bit.ly/2O6qSwm) . The core implementaion is borrowed from hadoop-common project and the hadoop dependencies are removed. - Removed dependency on AbstractService. - Not relying on Hadoop's Configuration object for reading confs. - Switched to Guava's implementation of Stopwatch. This utility class can detect both GC/non-GC pauses. In case of GC pauses, the GC metrics during the pause period are logged. Sample Output: = Detected pause in JVM or host machine (eg GC): pause of approximately 2356ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2241ms GC pool 'PS Scavenge' had collection(s): count=3 time=352ms Detected pause in JVM or host machine (eg GC): pause of approximately 1964ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2082ms GC pool 'PS Scavenge' had collection(s): count=1 time=251ms Detected pause in JVM or host machine (eg GC): pause of approximately 2120ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2454ms Detected pause in JVM or host machine (eg GC): pause of approximately 2238ms GC pool 'PS MarkSweep' had collection(s): count=5 time=13464ms Detected pause in JVM or host machine (eg GC): pause of approximately 2233ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2733ms JMX Metrics: JMX metrics are now emmitted for Impala and Catalog JVMs at the web end point /jmx. - Impalad: http(s)://:25000/jmx - Catalogd: http(s)://:25020/jmx Misc: Renamed JvmMetric -> JvmMemoryMetric to make the intent more clear. It doesn't relate to the functionality of the patch in anyway. Testing: === - Tested it manually with kill -SIGSTOP/-SIGCONT . Made sure that the non-GC JVM pauses are logged. - This class' functionality is tested manually by invoking it's main() - Injected a memory leak into the Catalog server code and made sure the GC is detected. Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 --- M be/src/common/init.cc M be/src/service/impala-http-handler.cc M be/src/util/default-path-handlers.cc M be/src/util/jni-util.cc M be/src/util/jni-util.h M be/src/util/memory-metrics.cc M be/src/util/memory-metrics.h M be/src/util/metrics-test.cc M be/src/util/webserver-test.cc M be/src/util/webserver.cc M be/src/util/webserver.h M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/common/JniUtil.java A fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java A fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java A fe/src/test/java/org/apache/impala/util/JMXJsonUtilTest.java A fe/src/test/java/org/apache/impala/util/JniUtilTest.java A tests/custom_cluster/test_pause_monitor.py M tests/webserver/test_web_pages.py 19 files changed, 834 insertions(+), 84 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/10998/10 -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 10 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 9: (2 comments) http://gerrit.cloudera.org:8080/#/c/10998/9/fe/src/test/java/org/apache/impala/util/JMXJsonUtilTest.java File fe/src/test/java/org/apache/impala/util/JMXJsonUtilTest.java: http://gerrit.cloudera.org:8080/#/c/10998/9/fe/src/test/java/org/apache/impala/util/JMXJsonUtilTest.java@20 PS9, Line 20: import static org.junit.Assert.*; > I think our style guide avoids wildcard imports. You're only using assertTr Done http://gerrit.cloudera.org:8080/#/c/10998/9/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/10998/9/tests/webserver/test_web_pages.py@75 PS9, Line 75: assert "text/plain" in response.headers['Content-Type'] > I think this should be application/json. (https://stackoverflow.com/questio It turns out that ?raw is not the right thing to use in this case and ?json is intended for this purpose. I fixed setting the mime type in a separate patch[1]. Will rebase this on top of it once it is committed. I had to make a few other changes to get this one working as expected. [1] https://gerrit.cloudera.org/#/c/0/ -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 9 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 02 Aug 2018 06:31:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 9: Code-Review+2 (2 comments) Thanks! Please see about the wildcard import and see what you think about changing the content-type. http://gerrit.cloudera.org:8080/#/c/10998/9/fe/src/test/java/org/apache/impala/util/JMXJsonUtilTest.java File fe/src/test/java/org/apache/impala/util/JMXJsonUtilTest.java: http://gerrit.cloudera.org:8080/#/c/10998/9/fe/src/test/java/org/apache/impala/util/JMXJsonUtilTest.java@20 PS9, Line 20: import static org.junit.Assert.*; I think our style guide avoids wildcard imports. You're only using assertTrue anyway. http://gerrit.cloudera.org:8080/#/c/10998/9/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/10998/9/tests/webserver/test_web_pages.py@75 PS9, Line 75: assert "text/plain" in response.headers['Content-Type'] I think this should be application/json. (https://stackoverflow.com/questions/477816/what-is-the-correct-json-content-type) Could you see how miserable it would be to get that right? I think it's a minor bug here: if (arguments.find("raw") != arguments.end()) { document.AddMember(ENABLE_RAW_JSON_KEY, "true", document.GetAllocator()); } if (document.HasMember(ENABLE_RAW_JSON_KEY)) { *content_type = PLAIN; } My feelings won't be hurt if we put that change into its own commit. It should be a one-liner unless there are sneaky tests testing text/plain :) -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 9 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 01 Aug 2018 04:14:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/128/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 9 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 01 Aug 2018 02:01:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/127/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 8 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 01 Aug 2018 01:57:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/126/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 7 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 01 Aug 2018 01:50:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Hello Tianyi Wang, Philip Zeyliger, Todd Lipcon, Impala Public Jenkins, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10998 to look at the new patch set (#9). Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics Pause monitor: = This commit adds a stripped down version of Hadoop's JvmPauseMonitor class (https://bit.ly/2O6qSwm) . The core implementaion is borrowed from hadoop-common project and the hadoop dependencies are removed. - Removed dependency on AbstractService. - Not relying on Hadoop's Configuration object for reading confs. - Switched to Guava's implementation of Stopwatch. This utility class can detect both GC/non-GC pauses. In case of GC pauses, the GC metrics during the pause period are logged. Sample Output: = Detected pause in JVM or host machine (eg GC): pause of approximately 2356ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2241ms GC pool 'PS Scavenge' had collection(s): count=3 time=352ms Detected pause in JVM or host machine (eg GC): pause of approximately 1964ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2082ms GC pool 'PS Scavenge' had collection(s): count=1 time=251ms Detected pause in JVM or host machine (eg GC): pause of approximately 2120ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2454ms Detected pause in JVM or host machine (eg GC): pause of approximately 2238ms GC pool 'PS MarkSweep' had collection(s): count=5 time=13464ms Detected pause in JVM or host machine (eg GC): pause of approximately 2233ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2733ms JMX Metrics: JMX metrics are now emmitted for Impala and Catalog JVMs at the web end point /jmx. - Impalad: http(s)://:25000/jmx - Catalogd: http(s)://:25020/jmx A text only version (without HTML formatting) of the JMX metrics can be obtained by passing 'raw' parameter to the /jmx endpoint. For ex: http://:25000/jmx?raw This is useful for downstream tools that want to consume and process this output. Misc: Renamed JvmMetric -> JvmMemoryMetric to make the intent more clear. It doesn't relate to the functionality of the patch in anyway. Testing: === - Tested it manually with kill -SIGSTOP/-SIGCONT . Made sure that the non-GC JVM pauses are logged. - This class' functionality is tested manually by invoking it's main() - Injected a memory leak into the Catalog server code and made sure the GC is detected. Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 --- M be/src/common/init.cc M be/src/util/default-path-handlers.cc M be/src/util/jni-util.cc M be/src/util/jni-util.h M be/src/util/memory-metrics.cc M be/src/util/memory-metrics.h M be/src/util/metrics-test.cc M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/common/JniUtil.java A fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java A fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java A fe/src/test/java/org/apache/impala/util/JMXJsonUtilTest.java A fe/src/test/java/org/apache/impala/util/JniUtilTest.java A tests/custom_cluster/test_pause_monitor.py M tests/webserver/test_web_pages.py 15 files changed, 805 insertions(+), 73 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/10998/9 -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 9 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 5: (6 comments) http://gerrit.cloudera.org:8080/#/c/10998/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10998/6//COMMIT_MSG@68 PS6, Line 68: - Tested it manually with kill -SIGSTOP/-SIGCONT . Made sure that : the non-GC JVM pauses are logged. > This test could be automated into our suite easily enough. Done http://gerrit.cloudera.org:8080/#/c/10998/6/be/src/util/jni-util.cc File be/src/util/jni-util.cc: http://gerrit.cloudera.org:8080/#/c/10998/6/be/src/util/jni-util.cc@214 PS6, Line 214: if (!jni_util_cl_) return Status ("JniUtil::Init() not called."); > please remove the space after Status here. aah, how did I miss it. Done. http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java File fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java: http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java@173 PS5, Line 173: : if (extraSleepTime > warnThresholdMs_) { : LOG.warn(formatMessage( : extraSleepTime, gcTimesAfterSleep, gcTimesBeforeSleep)); : } else if (extraSleepTime > infoThresholdMs_) { : LOG.info(formatMessage( : extraSleepTime, gcTimesAfterSleep, gcTimesBeforeSleep)); : } > Ok. Testing can be made a little bit easier by having a counter for how man I implemented the log-search based approach, but ofcourse once we add these metrics, we can confirm that they are updated too. http://gerrit.cloudera.org:8080/#/c/10998/6/fe/src/test/java/org/apache/impala/util/JniUtilTest.java File fe/src/test/java/org/apache/impala/util/JniUtilTest.java: http://gerrit.cloudera.org:8080/#/c/10998/6/fe/src/test/java/org/apache/impala/util/JniUtilTest.java@55 PS6, Line 55: > Could you move this to JmxJsonUtilTest and test JmxJsonUtil.getJmxJson() di Done http://gerrit.cloudera.org:8080/#/c/10998/6/fe/src/test/java/org/apache/impala/util/JniUtilTest.java@71 PS6, Line 71: > For completeness, we should test that it gets the beans we expect. Somethin Done http://gerrit.cloudera.org:8080/#/c/10998/6/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/10998/6/tests/webserver/test_web_pages.py@73 PS6, Line 73: assert response.status_code == requests.codes.ok\ > Can you see about the content-type that we're getting back here? Done -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 5 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 01 Aug 2018 01:23:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Hello Tianyi Wang, Philip Zeyliger, Todd Lipcon, Impala Public Jenkins, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10998 to look at the new patch set (#8). Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics Pause monitor: = This commit adds a stripped down version of Hadoop's JvmPauseMonitor class (https://bit.ly/2O6qSwm) . The core implementaion is borrowed from hadoop-common project and the hadoop dependencies are removed. - Removed dependency on AbstractService. - Not relying on Hadoop's Configuration object for reading confs. - Switched to Guava's implementation of Stopwatch. This utility class can detect both GC/non-GC pauses. In case of GC pauses, the GC metrics during the pause period are logged. Sample Output: = Detected pause in JVM or host machine (eg GC): pause of approximately 2356ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2241ms GC pool 'PS Scavenge' had collection(s): count=3 time=352ms Detected pause in JVM or host machine (eg GC): pause of approximately 1964ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2082ms GC pool 'PS Scavenge' had collection(s): count=1 time=251ms Detected pause in JVM or host machine (eg GC): pause of approximately 2120ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2454ms Detected pause in JVM or host machine (eg GC): pause of approximately 2238ms GC pool 'PS MarkSweep' had collection(s): count=5 time=13464ms Detected pause in JVM or host machine (eg GC): pause of approximately 2233ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2733ms JMX Metrics: JMX metrics are now emmitted for Impala and Catalog JVMs at the web end point /jmx. - Impalad: http(s)://:25000/jmx - Catalogd: http(s)://:25020/jmx A text only version (without HTML formatting) of the JMX metrics can be obtained by passing 'raw' parameter to the /jmx endpoint. For ex: http://:25000/jmx?raw This is useful for downstream tools that want to consume and process this output. Misc: Renamed JvmMetric -> JvmMemoryMetric to make the intent more clear. It doesn't relate to the functionality of the patch in anyway. Testing: === - Tested it manually with kill -SIGSTOP/-SIGCONT . Made sure that the non-GC JVM pauses are logged. - This class' functionality is tested manually by invoking it's main() - Injected a memory leak into the Catalog server code and made sure the GC is detected. Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 --- M be/src/common/init.cc M be/src/util/default-path-handlers.cc M be/src/util/jni-util.cc M be/src/util/jni-util.h M be/src/util/memory-metrics.cc M be/src/util/memory-metrics.h M be/src/util/metrics-test.cc M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/common/JniUtil.java A fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java A fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java A fe/src/test/java/org/apache/impala/util/JMXJsonUtilTest.java A fe/src/test/java/org/apache/impala/util/JniUtilTest.java A tests/custom_cluster/test_pause_monitor.py M tests/webserver/test_web_pages.py 15 files changed, 824 insertions(+), 73 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/10998/8 -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 8 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Hello Tianyi Wang, Philip Zeyliger, Todd Lipcon, Impala Public Jenkins, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10998 to look at the new patch set (#7). Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics Pause monitor: = This commit adds a stripped down version of Hadoop's JvmPauseMonitor class (https://bit.ly/2O6qSwm) . The core implementaion is borrowed from hadoop-common project and the hadoop dependencies are removed. - Removed dependency on AbstractService. - Not relying on Hadoop's Configuration object for reading confs. - Switched to Guava's implementation of Stopwatch. This utility class can detect both GC/non-GC pauses. In case of GC pauses, the GC metrics during the pause period are logged. Sample Output: = Detected pause in JVM or host machine (eg GC): pause of approximately 2356ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2241ms GC pool 'PS Scavenge' had collection(s): count=3 time=352ms Detected pause in JVM or host machine (eg GC): pause of approximately 1964ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2082ms GC pool 'PS Scavenge' had collection(s): count=1 time=251ms Detected pause in JVM or host machine (eg GC): pause of approximately 2120ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2454ms Detected pause in JVM or host machine (eg GC): pause of approximately 2238ms GC pool 'PS MarkSweep' had collection(s): count=5 time=13464ms Detected pause in JVM or host machine (eg GC): pause of approximately 2233ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2733ms JMX Metrics: JMX metrics are now emmitted for Impala and Catalog JVMs at the web end point /jmx. - Impalad: http(s)://:25000/jmx - Catalogd: http(s)://:25020/jmx A text only version (without HTML formatting) of the JMX metrics can be obtained by passing 'raw' parameter to the /jmx endpoint. For ex: http://:25000/jmx?raw This is useful for downstream tools that want to consume and process this output. Misc: Renamed JvmMetric -> JvmMemoryMetric to make the intent more clear. It doesn't relate to the functionality of the patch in anyway. Testing: === - Tested it manually with kill -SIGSTOP/-SIGCONT . Made sure that the non-GC JVM pauses are logged. - This class' functionality is tested manually by invoking it's main() - Injected a memory leak into the Catalog server code and made sure the GC is detected. Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 --- M be/src/common/init.cc M be/src/util/default-path-handlers.cc M be/src/util/jni-util.cc M be/src/util/jni-util.h M be/src/util/memory-metrics.cc M be/src/util/memory-metrics.h M be/src/util/metrics-test.cc M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/common/JniUtil.java A fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java A fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java A fe/src/test/java/org/apache/impala/util/JMXJsonUtilTest.java A fe/src/test/java/org/apache/impala/util/JniUtilTest.java A tests/custom_cluster/test_pause_monitor.py M tests/webserver/test_web_pages.py 15 files changed, 824 insertions(+), 73 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/10998/7 -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 7 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 6: (6 comments) I think this is very close. I think we could add an end-to-end test to make sure the pause monitor is working; what do you think? Otherwise, very minor stuff left as far as I'm concerned. http://gerrit.cloudera.org:8080/#/c/10998/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10998/6//COMMIT_MSG@68 PS6, Line 68: - Tested it manually with kill -SIGSTOP/-SIGCONT . Made sure that : the non-GC JVM pauses are logged. This test could be automated into our suite easily enough. http://gerrit.cloudera.org:8080/#/c/10998/6/be/src/util/jni-util.cc File be/src/util/jni-util.cc: http://gerrit.cloudera.org:8080/#/c/10998/6/be/src/util/jni-util.cc@214 PS6, Line 214: if (!jni_util_cl_) return Status ("JniUtil::Init() not called."); please remove the space after Status here. http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java File fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java: http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java@173 PS5, Line 173: : if (extraSleepTime > warnThresholdMs_) { : LOG.warn(formatMessage( : extraSleepTime, gcTimesAfterSleep, gcTimesBeforeSleep)); : } else if (extraSleepTime > infoThresholdMs_) { : LOG.info(formatMessage( : extraSleepTime, gcTimesAfterSleep, gcTimesBeforeSleep)); : } > Yep, this came up in the comments so far. We can do a histogram of pauses a Ok. Testing can be made a little bit easier by having a counter for how many times this happens, but I'm fine either way. http://gerrit.cloudera.org:8080/#/c/10998/6/fe/src/test/java/org/apache/impala/util/JniUtilTest.java File fe/src/test/java/org/apache/impala/util/JniUtilTest.java: http://gerrit.cloudera.org:8080/#/c/10998/6/fe/src/test/java/org/apache/impala/util/JniUtilTest.java@55 PS6, Line 55: // Validates the JSON string returned by JniUtil.getJMXJson() Could you move this to JmxJsonUtilTest and test JmxJsonUtil.getJmxJson() directly? i.e., I'm not at all worried about the wrapper function, and it seems like we should be testing the thing that returns a String. http://gerrit.cloudera.org:8080/#/c/10998/6/fe/src/test/java/org/apache/impala/util/JniUtilTest.java@71 PS6, Line 71: assertNotNull("Invalid JSON: " + jmxJson, rootNode.findValue("beans")); For completeness, we should test that it gets the beans we expect. Something like the following is good enough, where 'output' is the JSON string. assertTrue(output.contains("java.lang:type=Memory")); assertTrue(output.contains("java.lang:type=Runtime")); http://gerrit.cloudera.org:8080/#/c/10998/6/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/10998/6/tests/webserver/test_web_pages.py@73 PS6, Line 73: response = requests.get(input_url) Can you see about the content-type that we're getting back here? -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 6 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 31 Jul 2018 18:57:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/107/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 6 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 31 Jul 2018 00:19:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/10998/5/be/src/util/default-path-handlers.cc File be/src/util/default-path-handlers.cc: http://gerrit.cloudera.org:8080/#/c/10998/5/be/src/util/default-path-handlers.cc@243 PS5, Line 243: webserver->RegisterUrlCallback("/jmx", "raw_text_pre.tmpl", JmxHandler); > It is not needed. PS6 uses common-pre.tmpl I see. I don't love it, but it's consistent with the other stuff, so I think we can leave it as you suggest. I haven't yet had the opportunity to pass through the tests. -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 5 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 30 Jul 2018 23:47:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/10998/5/be/src/util/default-path-handlers.cc File be/src/util/default-path-handlers.cc: http://gerrit.cloudera.org:8080/#/c/10998/5/be/src/util/default-path-handlers.cc@243 PS5, Line 243: webserver->RegisterUrlCallback("/jmx", "raw_text_pre.tmpl", JmxHandler); > I still don't see it in Gerrit. It is not needed. PS6 uses common-pre.tmpl https://gerrit.cloudera.org/#/c/10998/6/be/src/util/default-path-handlers.cc Users wouldn't add ?raw. They just do /jmx which is nicely formatted in common-pre.tmpl rather than plain text content-type. common-pre also has a header which helps navigate back. ?raw is only for tools which just want the TEXT version without HTML stuff. http://gerrit.cloudera.org:8080/#/c/10998/6/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/10998/6/tests/webserver/test_web_pages.py@72 PS6, Line 72: response = requests.get(input_url) > that's a good point. If the content-type is appropriate, then I'll get nice ?raw sets the plain content-type. Like I mentioned in the other comment, without ?raw, it is rendered in a common-pre tmpl that lets user navigate back to other places (since it has a header with a list of end points). Anyway if you prefer ?raw to be the default behavior, lemme know. -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 5 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 30 Jul 2018 23:45:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/10998/6/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/10998/6/tests/webserver/test_web_pages.py@72 PS6, Line 72: input_url = self.JMX_URL.format(port) + "?raw" > What is "?raw" doing here? This should be returning JSON even without ?raw, that's a good point. If the content-type is appropriate, then I'll get nicer rendering anyway out of my browser since I have a JSON extension set up. I imagine others do as well. -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 6 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 30 Jul 2018 23:14:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/10998/5/be/src/util/default-path-handlers.cc File be/src/util/default-path-handlers.cc: http://gerrit.cloudera.org:8080/#/c/10998/5/be/src/util/default-path-handlers.cc@243 PS5, Line 243: webserver->RegisterUrlCallback("/jmx", "raw_text_pre.tmpl", JmxHandler); > Ouch, I was experimenting with various template files and settled for commo I still don't see it in Gerrit. That said, I think it makes no sense to require the user to add ?raw here. We're not pretty-printing it or anything in the nice case. Let's just return the data if we at all can. http://gerrit.cloudera.org:8080/#/c/10998/6/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/10998/6/tests/webserver/test_web_pages.py@72 PS6, Line 72: response = requests.get(input_url) What is "?raw" doing here? This should be returning JSON even without ?raw, I would hope. -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 5 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 30 Jul 2018 23:05:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 6: (13 comments) http://gerrit.cloudera.org:8080/#/c/10998/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10998/5//COMMIT_MSG@46 PS5, Line 46: now emmitted > nit: *now* emitted, right? Ouch, you are right. http://gerrit.cloudera.org:8080/#/c/10998/5/be/src/util/default-path-handlers.cc File be/src/util/default-path-handlers.cc: http://gerrit.cloudera.org:8080/#/c/10998/5/be/src/util/default-path-handlers.cc@199 PS5, Line 199: Status status = JniUtil::GetJMXJson(); > is it worth adding something like: Done http://gerrit.cloudera.org:8080/#/c/10998/5/be/src/util/default-path-handlers.cc@199 PS5, Line 199: Status status = JniUtil::GetJMXJson(); > Also, let's log on error here, at least once? Done http://gerrit.cloudera.org:8080/#/c/10998/5/be/src/util/default-path-handlers.cc@243 PS5, Line 243: > Is raw_text_pre.tmpl missing from this review? Ouch, I was experimenting with various template files and settled for common-pre.tmpl. Forgot to git-add. Coming to the HTML question, you can pass "?raw" param to /jmx to fetch raw text contents (example in commit message). http://gerrit.cloudera.org:8080/#/c/10998/5/be/src/util/jni-util.h File be/src/util/jni-util.h: http://gerrit.cloudera.org:8080/#/c/10998/5/be/src/util/jni-util.h@295 PS5, Line 295: /// Gets JMX metrics of the JVM encoded as a JSON string. > "encoded as a JSON string"? Done http://gerrit.cloudera.org:8080/#/c/10998/5/be/src/util/jni-util.cc File be/src/util/jni-util.cc: http://gerrit.cloudera.org:8080/#/c/10998/5/be/src/util/jni-util.cc@214 PS5, Line 214: if (!jni_util_cl_) return Status ("JniUtil::Init() not called."); > Does this fail clang-tidy because of the space after Status? Probably shoul No it didn't. I'm not too familiar with it, so not sure why. Fixed it anyway. http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/common/JniUtil.java File fe/src/main/java/org/apache/impala/common/JniUtil.java: http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/common/JniUtil.java@102 PS5, Line 102: byte[] serializeToThrift(T input, F protocolFactory) throws ImpalaException { > This is boring, but please add a unit test for it; should be easy. There is an open jira to try out more efficient protocols but I'm not able to locate it. So, I'm just leaving the code as is to make it easy for that change. Lemme know if you feel strongly about the renaming and I can do it. Added the test. http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/common/JniUtil.java@211 PS5, Line 211: return serializeToThrift(jvmMetrics, protocolFactory_); > nit: You shouldn't need "JniUtil." here because you're already in the conte Done http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java File fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java: http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java@72 PS5, Line 72: // MBean server instance > instance Done http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java@73 PS5, Line 73: protected static transient MBeanServer mBeanServer = > why is this transient? I haven't put much thought into this, but looking at the hadoop code, I think that is because the JsonFactory implements a serializable interface and we don't want to serialize/persist the entire MBeanServer instance maybe? http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java@151 PS5, Line 151: Removed. http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java File fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java: http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java@60 PS5, Line 60: instance > nit: typo Done http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java@173 PS5, Line 173: : if (extraSleepTime > warnThresholdMs_) { : LOG.warn(formatMessage( : extraSleepTime, gcTimesAfterSleep, gcTimesBeforeSleep)); : } else if (extraSleepTime > infoThresholdMs_) { : LOG.info(formatMessage( : extraSleepTime, gcTimesAfterSleep, gcTimesBeforeSleep)); : } > Perhaps we should add Impala metrics for number of detected pauses? (Please Yep, this came up in the comments so far. We can do a histogram of pauses and expose metrics like - Last 3 pauses (GC/non-GC) - Max/min
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Hello Tianyi Wang, Philip Zeyliger, Todd Lipcon, Impala Public Jenkins, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10998 to look at the new patch set (#6). Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics Pause monitor: = This commit adds a stripped down version of Hadoop's JvmPauseMonitor class (https://bit.ly/2O6qSwm) . The core implementaion is borrowed from hadoop-common project and the hadoop dependencies are removed. - Removed dependency on AbstractService. - Not relying on Hadoop's Configuration object for reading confs. - Switched to Guava's implementation of Stopwatch. This utility class can detect both GC/non-GC pauses. In case of GC pauses, the GC metrics during the pause period are logged. Sample Output: = Detected pause in JVM or host machine (eg GC): pause of approximately 2356ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2241ms GC pool 'PS Scavenge' had collection(s): count=3 time=352ms Detected pause in JVM or host machine (eg GC): pause of approximately 1964ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2082ms GC pool 'PS Scavenge' had collection(s): count=1 time=251ms Detected pause in JVM or host machine (eg GC): pause of approximately 2120ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2454ms Detected pause in JVM or host machine (eg GC): pause of approximately 2238ms GC pool 'PS MarkSweep' had collection(s): count=5 time=13464ms Detected pause in JVM or host machine (eg GC): pause of approximately 2233ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2733ms JMX Metrics: JMX metrics are now emmitted for Impala and Catalog JVMs at the web end point /jmx. - Impalad: http(s)://:25000/jmx - Catalogd: http(s)://:25020/jmx A text only version (without HTML formatting) of the JMX metrics can be obtained by passing 'raw' parameter to the /jmx endpoint. For ex: http://:25000/jmx?raw This is useful for downstream tools that want to consume and process this output. Misc: Renamed JvmMetric -> JvmMemoryMetric to make the intent more clear. It doesn't relate to the functionality of the patch in anyway. Testing: === - Tested it manually with kill -SIGSTOP/-SIGCONT . Made sure that the non-GC JVM pauses are logged. - This class' functionality is tested manually by invoking it's main() - Injected a memory leak into the Catalog server code and made sure the GC is detected. Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 --- M be/src/common/init.cc M be/src/util/default-path-handlers.cc M be/src/util/jni-util.cc M be/src/util/jni-util.h M be/src/util/memory-metrics.cc M be/src/util/memory-metrics.h M be/src/util/metrics-test.cc M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/common/JniUtil.java A fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java A fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java A fe/src/test/java/org/apache/impala/util/JniUtilTest.java M tests/webserver/test_web_pages.py 13 files changed, 731 insertions(+), 73 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/10998/6 -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 6 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 6: Build Started https://jenkins.impala.io/job/gerrit-code-review-checks/107/ Running initial code review checks. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 6 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 30 Jul 2018 22:53:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/10998/5/be/src/util/jni-util.cc File be/src/util/jni-util.cc: http://gerrit.cloudera.org:8080/#/c/10998/5/be/src/util/jni-util.cc@214 PS5, Line 214: if (!jni_util_cl_) return Status ("JniUtil::Init() not called."); Does this fail clang-tidy because of the space after Status? Probably shouldn't be there. As far as I'm concerned, line 214 could be an assertion or a DCHECK, but it doesn't matter much. -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 5 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 30 Jul 2018 16:43:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 5: I should add that I think that passing through /jmx more or less verbatim seems to me like the right answer. So, thanks for doing that! -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 5 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 30 Jul 2018 16:39:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 5: (9 comments) Thanks. I'm very glad this is making it in. I've got a few comments, but nothing too major. http://gerrit.cloudera.org:8080/#/c/10998/5/be/src/util/default-path-handlers.cc File be/src/util/default-path-handlers.cc: http://gerrit.cloudera.org:8080/#/c/10998/5/be/src/util/default-path-handlers.cc@199 PS5, Line 199: if (!JniUtil::GetJMXJson().ok()) return; > is it worth adding something like: Also, let's log on error here, at least once? http://gerrit.cloudera.org:8080/#/c/10998/5/be/src/util/default-path-handlers.cc@243 PS5, Line 243: webserver->RegisterUrlCallback("/jmx", "raw_text_pre.tmpl", JmxHandler); Is raw_text_pre.tmpl missing from this review? When I run curl against this endpoint, I should see "Content-type" header being application/json. Here's what the NN does. Your end-to-end test should validate that the HTTP headers are what we'd expect. I worry that "raw_text_pre.tmpl" is "$text", which means it's HTML, and the tools who read /jmx are expecting JSON and shouldn't be doing any more parsing than JSON-parsing. $curl --silent -v http://.../jmx | head * Trying .. * TCP_NODELAY set * Connected to ... (...) port 20101 (#0) > GET /jmx HTTP/1.1 > Host: ...:20101 > User-Agent: curl/7.52.1 > Accept: */* > < HTTP/1.1 200 OK < Date: Mon, 30 Jul 2018 16:36:01 GMT < Cache-Control: no-cache < Expires: Mon, 30 Jul 2018 16:36:01 GMT < Date: Mon, 30 Jul 2018 16:36:01 GMT < Pragma: no-cache < Content-Type: application/json; charset=utf8 < X-Content-Type-Options: nosniff < X-FRAME-OPTIONS: SAMEORIGIN < X-XSS-Protection: 1; mode=block < Access-Control-Allow-Methods: GET < Access-Control-Allow-Origin: * < Transfer-Encoding: chunked < { [12968 bytes data] { "beans" : [ { "name" : "Hadoop:service=NameNode,name=JvmMetrics", "modelerType" : "JvmMetrics", "tag.Context" : "jvm", "tag.ProcessName" : "NameNode", http://gerrit.cloudera.org:8080/#/c/10998/5/be/src/util/jni-util.h File be/src/util/jni-util.h: http://gerrit.cloudera.org:8080/#/c/10998/5/be/src/util/jni-util.h@295 PS5, Line 295: /// Gets JMX metrics of the JVM. "encoded as a JSON string"? http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/common/JniUtil.java File fe/src/main/java/org/apache/impala/common/JniUtil.java: http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/common/JniUtil.java@102 PS5, Line 102: byte[] serializeToThrift(T input, F protocolFactory) throws ImpalaException { This is boring, but please add a unit test for it; should be easy. Is protocolFactory always the same? I bet it is, so you can remove that argument and call it something about "serializeToThriftWithBinaryProtocol()" http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/common/JniUtil.java@211 PS5, Line 211: return JniUtil.serializeToThrift(jvmMetrics, protocolFactory_); nit: You shouldn't need "JniUtil." here because you're already in the context. Try removing it here and line 238. http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java File fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java: http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java@71 PS5, Line 71: public class JMXJsonUtil { Please add a simple unit test. Make sure it runs, generates a valid JSON, and contains something expected in it. http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java@72 PS5, Line 72: // MBean server isntance instance http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java@73 PS5, Line 73: protected static transient MBeanServer mBeanServer = why is this transient? http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java File fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java: http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java@173 PS5, Line 173: : if (extraSleepTime > warnThresholdMs_) { : LOG.warn(formatMessage( : extraSleepTime, gcTimesAfterSleep, gcTimesBeforeSleep)); : } else if (extraSleepTime > infoThresholdMs_) { : LOG.info(formatMessage( : extraSleepTime, gcTimesAfterSleep, gcTimesBeforeSleep)); : } Perhaps we should add Impala metrics for number of detected pauses? (Please take a look to see if HDFS and/or HBase do something similar.) The idea would be to let a monitoring tool quickly know that
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 5: -Code-Review Can we add a test for this as well? eg just a test which grabs /jmx and makes sure the output looks approximately right? -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 5 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 30 Jul 2018 15:43:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 5: Code-Review+2 (3 comments) http://gerrit.cloudera.org:8080/#/c/10998/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10998/5//COMMIT_MSG@46 PS5, Line 46: not emmitted nit: *now* emitted, right? http://gerrit.cloudera.org:8080/#/c/10998/5/be/src/util/default-path-handlers.cc File be/src/util/default-path-handlers.cc: http://gerrit.cloudera.org:8080/#/c/10998/5/be/src/util/default-path-handlers.cc@199 PS5, Line 199: if (!JniUtil::GetJMXJson().ok()) return; is it worth adding something like: document->AddMember("error", status.ToString()) here so that we would be able to debug the case where GetJMXJson returns an error? http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java File fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java: http://gerrit.cloudera.org:8080/#/c/10998/5/fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java@60 PS5, Line 60: isntance nit: typo -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 5 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 30 Jul 2018 15:42:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/94/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 5 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Sat, 28 Jul 2018 00:19:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 5: (9 comments) http://gerrit.cloudera.org:8080/#/c/10998/4/be/src/util/default-path-handlers.h File be/src/util/default-path-handlers.h: http://gerrit.cloudera.org:8080/#/c/10998/4/be/src/util/default-path-handlers.h@34 PS4, Line 34: MetricGroup* metric_group = NULL); > why did you make the jmx path optional here? Is this because the statestore Yea, its a common code path for all of them. Looks like getJNIEnv() creates a JVM if one doesn't exist. I cleaned this up to add a flag in JniUtil which tells if a JVM exists. http://gerrit.cloudera.org:8080/#/c/10998/4/fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java File fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java: http://gerrit.cloudera.org:8080/#/c/10998/4/fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java@71 PS4, Line 71: public class JMXJsonUtil { > I didn't review this file particularly since it's just cross-ported. Any pa Nothing much. Looks straightforward. I commented a couple of refactors that I did incase you want to diff with the upstream version http://gerrit.cloudera.org:8080/#/c/10998/4/fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java@79 PS4, Line 79: public static String getJMXJson() { Removed a bunch of servlet code. https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/jmx/JMXJsonServlet.java#L170 http://gerrit.cloudera.org:8080/#/c/10998/4/fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java@89 PS4, Line 89: Removed the 'qry' functionality. https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/jmx/JMXJsonServlet.java#L206 http://gerrit.cloudera.org:8080/#/c/10998/4/fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java@198 PS4, Line 198: trace Moved this from upstream debug() to trace() since the default logging in Impala fe is debug(). This was firing for a few attributes and was dumping a bunch of stacks and looks like for the same reason it was made debug in the Hadoop code. https://github.com/apache/hadoop/commit/2d1406e9e7b75a833f79c0159031dae2ba3e6134 http://gerrit.cloudera.org:8080/#/c/10998/4/fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java File fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java: http://gerrit.cloudera.org:8080/#/c/10998/4/fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java@42 PS4, Line 42: public class JvmPauseMonitor { > same here, any changes worth noting to take a look at? Commented the refactoring I did. Nothing major. http://gerrit.cloudera.org:8080/#/c/10998/4/fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java@42 PS4, Line 42: { Removed dependency on AbstractService http://gerrit.cloudera.org:8080/#/c/10998/4/fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java@69 PS4, Line 69: private JvmPauseMonitor() { : this(INFO_THRESHOLD_MS, WARN_THRESHOLD_MS); : } Not relying on Hadoop Conf objects. http://gerrit.cloudera.org:8080/#/c/10998/4/fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java@160 PS4, Line 160: Stopwatch Uses Stopwatch from Guava. -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 5 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 27 Jul 2018 23:46:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Hello Tianyi Wang, Todd Lipcon, Impala Public Jenkins, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10998 to look at the new patch set (#5). Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics Pause monitor: = This commit adds a stripped down version of Hadoop's JvmPauseMonitor class (https://bit.ly/2O6qSwm) . The core implementaion is borrowed from hadoop-common project and the hadoop dependencies are removed. - Removed dependency on AbstractService. - Not relying on Hadoop's Configuration object for reading confs. - Switched to Guava's implementation of Stopwatch. This utility class can detect both GC/non-GC pauses. In case of GC pauses, the GC metrics during the pause period are logged. Sample Output: = Detected pause in JVM or host machine (eg GC): pause of approximately 2356ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2241ms GC pool 'PS Scavenge' had collection(s): count=3 time=352ms Detected pause in JVM or host machine (eg GC): pause of approximately 1964ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2082ms GC pool 'PS Scavenge' had collection(s): count=1 time=251ms Detected pause in JVM or host machine (eg GC): pause of approximately 2120ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2454ms Detected pause in JVM or host machine (eg GC): pause of approximately 2238ms GC pool 'PS MarkSweep' had collection(s): count=5 time=13464ms Detected pause in JVM or host machine (eg GC): pause of approximately 2233ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2733ms JMX Metrics: JMX metrics are not emmitted for Impala and Catalog JVMs at the web end point /jmx. - Impalad: http(s)://:25000/jmx - Catalogd: http(s)://:25020/jmx A text only version (without HTML formatting) of the JMX metrics can be obtained by passing 'raw' parameter to the /jmx endpoint. For ex: http://:25000/jmx?raw This is useful for downstream tools that want to consume and process this output. Misc: Renamed JvmMetric -> JvmMemoryMetric to make the intent more clear. It doesn't relate to the functionality of the patch in anyway. Testing: === - Tested it manually with kill -SIGSTOP/-SIGCONT . Made sure that the non-GC JVM pauses are logged. - This class' functionality is tested manually by invoking it's main() - Injected a memory leak into the Catalog server code and made sure the GC is detected. Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 --- M be/src/common/init.cc M be/src/util/default-path-handlers.cc M be/src/util/jni-util.cc M be/src/util/jni-util.h M be/src/util/memory-metrics.cc M be/src/util/memory-metrics.h M be/src/util/metrics-test.cc M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/common/JniUtil.java A fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java A fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java 11 files changed, 633 insertions(+), 70 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/10998/5 -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 5 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 5: Build Started https://jenkins.impala.io/job/gerrit-code-review-checks/94/ Running initial code review checks. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 5 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 27 Jul 2018 23:45:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 4: (3 comments) http://gerrit.cloudera.org:8080/#/c/10998/4/be/src/util/default-path-handlers.h File be/src/util/default-path-handlers.h: http://gerrit.cloudera.org:8080/#/c/10998/4/be/src/util/default-path-handlers.h@34 PS4, Line 34: MetricGroup* metric_group = NULL, bool init_jmx = false); why did you make the jmx path optional here? Is this because the statestore doesn't use Java or something? In that case could we instead just use getJNIEnv or something else to determine it rather than adding this param? http://gerrit.cloudera.org:8080/#/c/10998/4/fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java File fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java: http://gerrit.cloudera.org:8080/#/c/10998/4/fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java@71 PS4, Line 71: public class JMXJsonUtil { I didn't review this file particularly since it's just cross-ported. Any particular bits that merit a special look that you had to change? http://gerrit.cloudera.org:8080/#/c/10998/4/fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java File fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java: http://gerrit.cloudera.org:8080/#/c/10998/4/fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java@42 PS4, Line 42: public class JvmPauseMonitor { same here, any changes worth noting to take a look at? -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 4 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 27 Jul 2018 22:58:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/92/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 4 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 27 Jul 2018 22:21:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 4: Screenshots: Raw JMX metrics: https://pasteboard.co/HwsbaAD.png Formatted : https://pasteboard.co/HwsbvyI.png -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 4 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 27 Jul 2018 21:56:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/10998/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10998/3//COMMIT_MSG@42 PS3, Line 42: : JMX Metrics: : > Personally I wouldn't find those aggregate counts super useful (doesn't giv Ok, removed them. I think we can do the histogram of GCs etc as a follow on patch. I feel the patch in the current form is pretty useful and makes the metrics available to the thirdparty tools to consume. http://gerrit.cloudera.org:8080/#/c/10998/3/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: http://gerrit.cloudera.org:8080/#/c/10998/3/common/thrift/Frontend.thrift@830 PS3, Line 830: > if you don't take my other suggestion about exposing "/jmx" directly, it wo switched to /jmx -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 4 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 27 Jul 2018 21:53:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Hello Tianyi Wang, Todd Lipcon, Impala Public Jenkins, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10998 to look at the new patch set (#4). Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics Pause monitor: = This commit adds a stripped down version of Hadoop's JvmPauseMonitor class (https://bit.ly/2O6qSwm) . The core implementaion is borrowed from hadoop-common project and the hadoop dependencies are removed. - Removed dependency on AbstractService. - Not relying on Hadoop's Configuration object for reading confs. - Switched to Guava's implementation of Stopwatch. This utility class can detect both GC/non-GC pauses. In case of GC pauses, the GC metrics during the pause period are logged. Sample Output: = Detected pause in JVM or host machine (eg GC): pause of approximately 2356ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2241ms GC pool 'PS Scavenge' had collection(s): count=3 time=352ms Detected pause in JVM or host machine (eg GC): pause of approximately 1964ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2082ms GC pool 'PS Scavenge' had collection(s): count=1 time=251ms Detected pause in JVM or host machine (eg GC): pause of approximately 2120ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2454ms Detected pause in JVM or host machine (eg GC): pause of approximately 2238ms GC pool 'PS MarkSweep' had collection(s): count=5 time=13464ms Detected pause in JVM or host machine (eg GC): pause of approximately 2233ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2733ms JMX Metrics: JMX metrics are not emmitted for Impala and Catalog JVMs at the web end point /jmx. - Impalad: http(s)://:25000/jmx - Catalogd: http(s)://:25020/jmx A text only version (without HTML formatting) of the JMX metrics can be obtained by passing 'raw' parameter to the /jmx endpoint. For ex: http://:25000/jmx?raw This is useful for downstream tools that want to consume and process this output. Misc: Renamed JvmMetric -> JvmMemoryMetric to make the intent more clear. It doesn't relate to the functionality of the patch in anyway. Testing: === - Tested it manually with kill -SIGSTOP/-SIGCONT . Made sure that the non-GC JVM pauses are logged. - This class' functionality is tested manually by invoking it's main() - Injected a memory leak into the Catalog server code and made sure the GC is detected. Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 --- M be/src/catalog/catalogd-main.cc M be/src/common/init.cc M be/src/runtime/exec-env.cc M be/src/util/default-path-handlers.cc M be/src/util/default-path-handlers.h M be/src/util/jni-util.cc M be/src/util/jni-util.h M be/src/util/memory-metrics.cc M be/src/util/memory-metrics.h M be/src/util/metrics-test.cc M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/common/JniUtil.java A fe/src/main/java/org/apache/impala/util/JMXJsonUtil.java A fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java 14 files changed, 629 insertions(+), 73 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/10998/4 -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 4 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility and expose JMX metrics .. Patch Set 4: Build Started https://jenkins.impala.io/job/gerrit-code-review-checks/92/ Running initial code review checks. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 4 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 27 Jul 2018 21:50:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/10998/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10998/3//COMMIT_MSG@42 PS3, Line 42: These metrics are exposed on the daemon's web UI in /memz page under : section "JVM Garbage Collection metrics". They can also be consumed : by tools from the /metrics endpoint. > I have a working patch for /jmx end point. Before I send it out for review, Personally I wouldn't find those aggregate counts super useful (doesn't give a sense of whether the server is *currently* having GC issues) but I dont think there's harm in including them. -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 3 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 27 Jul 2018 19:22:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/10998/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10998/3//COMMIT_MSG@42 PS3, Line 42: These metrics are exposed on the daemon's web UI in /memz page under : section "JVM Garbage Collection metrics". They can also be consumed : by tools from the /metrics endpoint. > Yea, I was thinking maybe you could import JMXJsonServlet.java wholesale fr I have a working patch for /jmx end point. Before I send it out for review, I'm wondering if we should still expose the GC metrics in the /memz page and /metrics as in https://pasteboard.co/Hwc4P5f.png What do you think? -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 3 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 27 Jul 2018 02:17:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/10998/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10998/3//COMMIT_MSG@42 PS3, Line 42: These metrics are exposed on the daemon's web UI in /memz page under : section "JVM Garbage Collection metrics". They can also be consumed : by tools from the /metrics endpoint. > Yea, these are the same as JMX MBean. Yea, I was thinking maybe you could import JMXJsonServlet.java wholesale from another project and just change it to return a string which can be passed to the backend or somesuch, and then take advantage of the fact that other tools know how to parse it -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 3 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 26 Jul 2018 18:20:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/10998/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10998/3//COMMIT_MSG@42 PS3, Line 42: These metrics are exposed on the daemon's web UI in /memz page under : section "JVM Garbage Collection metrics". They can also be consumed : by tools from the /metrics endpoint. > Are these redundant with the metrics that are available in the JMX MBean? I Yea, these are the same as JMX MBean. I thought about it for a while and settled with this because the JVM memory metrics was implemented this way [1](I'm not sure why). Looking at the jmx output, it is way more detailed and can be consumed by a variety of tools. Lemme try doing that and see how it works. Will get back. [1] https://github.com/apache/impala/commit/daf0626dbd4171dc582160af467cf1638271dc60 -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 3 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 26 Jul 2018 18:08:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility .. Patch Set 3: (2 comments) didn't review the code in detail, just one high-level suggestion here http://gerrit.cloudera.org:8080/#/c/10998/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10998/3//COMMIT_MSG@42 PS3, Line 42: These metrics are exposed on the daemon's web UI in /memz page under : section "JVM Garbage Collection metrics". They can also be consumed : by tools from the /metrics endpoint. Are these redundant with the metrics that are available in the JMX MBean? If we expose JMXJsonServlet (or its moral equivalent piped through our C++ webserver) then we'll fit in with the standard of the other Hadoop daemons for exposing the JVM stats. See http://namenode:5070/jmx (eg using the namenode in your minicluster) to see what I mean. The metrics in there are a lot more granular, including info like the heap size after the last collection (a decent estimate of live heap), etc. If possible, it might be nice to include in these metrics some other info that's _not_ available above, eg a histogram of GC times or pause times. Also, would be good to expose the host machine pause time which may not be GC-related. (eg timestamp of last N pauses, length of last pauses) http://gerrit.cloudera.org:8080/#/c/10998/3/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: http://gerrit.cloudera.org:8080/#/c/10998/3/common/thrift/Frontend.thrift@830 PS3, Line 830: 3: required i64 collection_time if you don't take my other suggestion about exposing "/jmx" directly, it would be good to expose the info from "memoryUsageBeforeGc" and "memoryUsageAfterGc" elements in JMX,as well as the timestamp and duration of the most recent GC. -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 3 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 26 Jul 2018 16:09:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility
Bharath Vissapragada has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility .. IMPALA-6857: Add Jvm pause/GC Monitor utility This commit adds a stripped down version of Hadoop's JvmPauseMonitor class (https://bit.ly/2O6qSwm) . The core implementaion is borrowed from hadoop-common project and the hadoop dependencies are removed. - Removed dependency on AbstractService. - Not relying on Hadoop's Configuration object for reading confs. - Switched to Guava's implementation of Stopwatch. This utility class can detect both GC/non-GC pauses. In case of GC pauses, the GC metrics during the pause period are logged. Sample Output: = Detected pause in JVM or host machine (eg GC): pause of approximately 2356ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2241ms GC pool 'PS Scavenge' had collection(s): count=3 time=352ms Detected pause in JVM or host machine (eg GC): pause of approximately 1964ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2082ms GC pool 'PS Scavenge' had collection(s): count=1 time=251ms Detected pause in JVM or host machine (eg GC): pause of approximately 2120ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2454ms Detected pause in JVM or host machine (eg GC): pause of approximately 2238ms GC pool 'PS MarkSweep' had collection(s): count=5 time=13464ms Detected pause in JVM or host machine (eg GC): pause of approximately 2233ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2733ms Metrics: === These metrics are exposed on the daemon's web UI in /memz page under section "JVM Garbage Collection metrics". They can also be consumed by tools from the /metrics endpoint. Testing: === - Tested it manually with kill -SIGSTOP/-SIGCONT . Made sure that the non-GC JVM pauses are logged. - This class' functionality is tested manually by invoking it's main() - Injected a memory leak into the Catalog server code and made sure the GC is detected. Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 --- M be/src/common/init.cc M be/src/util/default-path-handlers.cc M be/src/util/jni-util.cc M be/src/util/jni-util.h M be/src/util/memory-metrics.cc M be/src/util/memory-metrics.h M be/src/util/metrics-test.cc M common/thrift/Frontend.thrift M common/thrift/metrics.json M fe/src/main/java/org/apache/impala/common/JniUtil.java A fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java M www/memz.tmpl 12 files changed, 504 insertions(+), 67 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/10998/3 -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 3 Gerrit-Owner: Bharath Vissapragada
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10998 ) Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility .. Patch Set 3: Build Started https://jenkins.impala.io/job/gerrit-code-review-checks/71/ Running initial code review checks. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 3 Gerrit-Owner: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 26 Jul 2018 04:53:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility
Bharath Vissapragada has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10998 Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility .. IMPALA-6857: Add Jvm pause/GC Monitor utility This commit adds a stripped down version of Hadoop's JvmPauseMonitor class (https://bit.ly/2O6qSwm) . The core implementaion is borrowed from hadoop-common project and the hadoop dependencies are removed. - Removed dependency on AbstractService. - Not relying on Hadoop's Configuration object for reading confs. - Switched to Guava's implementation of Stopwatch. This utility class can detect both GC/non-GC pauses. In case of GC pauses, the difference of GC metrics before and after pause are logged. Sample Output: = Detected pause in JVM or host machine (eg GC): pause of approximately 2356ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2241ms GC pool 'PS Scavenge' had collection(s): count=3 time=352ms Detected pause in JVM or host machine (eg GC): pause of approximately 1964ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2082ms GC pool 'PS Scavenge' had collection(s): count=1 time=251ms Detected pause in JVM or host machine (eg GC): pause of approximately 2120ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2454ms Detected pause in JVM or host machine (eg GC): pause of approximately 2238ms GC pool 'PS MarkSweep' had collection(s): count=5 time=13464ms ^CDetected pause in JVM or host machine (eg GC): pause of approximately 2233ms GC pool 'PS MarkSweep' had collection(s): count=1 time=2733ms Testing: === - Tested it manually with kill -SIGSTOP/-SIGCONT . Made sure that the non-GC JVM pauses are logged. - This class' functionality is tested manually by invoking it's main() - Injected a memory leak into the Catalog server code and made sure the GC is detected. Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 --- M be/src/common/init.cc M be/src/util/jni-util.cc M be/src/util/jni-util.h M fe/src/main/java/org/apache/impala/common/JniUtil.java A fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java 5 files changed, 228 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/10998/2 -- To view, visit http://gerrit.cloudera.org:8080/10998 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341 Gerrit-Change-Number: 10998 Gerrit-PatchSet: 2 Gerrit-Owner: Bharath Vissapragada