dschneider-pivotal commented on code in PR #7751:
URL: https://github.com/apache/geode/pull/7751#discussion_r890569224
##########
geode-core/src/main/java/org/apache/geode/internal/monitoring/ThreadsMonitoringProcess.java:
##########
@@ -119,30 +120,87 @@ private int
checkForStuckThreads(Collection<AbstractExecutor> executors, long cu
return result;
}
- @VisibleForTesting
- public static Map<Long, ThreadInfo> createThreadInfoMap(Set<Long>
stuckThreadIds) {
- /*
- * NOTE: at least some implementations of getThreadInfo(long[], boolean,
boolean)
- * will core dump if the long array contains a duplicate value.
- * That is why stuckThreadIds is a Set instead of a List.
- */
+ /**
+ * If set to true, then the JVM will be asked for what locks a thread holds.
+ * This is extra expensive to ask for on some JVMs so be careful setting
this to true.
+ */
+ private static final boolean SHOW_LOCKS =
Boolean.getBoolean("gemfire.threadmonitor.showLocks");
+ /**
+ * If set to true, then the JVM will be asked for all potential stuck
threads with one call.
+ * Since getThreadInfo on many JVMs, stops ALL threads from running, and
since getting info
+ * on multiple threads with one call is additional work, setting this can
cause an extra long
+ * stop the world that can then cause other problems (like a forced
disconnect).
+ * So be careful setting this to true.
+ */
+ private static final boolean BATCH_CALLS =
Boolean.getBoolean("gemfire.threadmonitor.batchCalls");
+
+ private static Map<Long, ThreadInfo> createThreadInfoMap(Set<Long>
stuckThreadIds) {
+ return createThreadInfoMap(stuckThreadIds, SHOW_LOCKS, BATCH_CALLS);
+ }
+
+ public static Map<Long, ThreadInfo> createThreadInfoMap(Set<Long>
stuckThreadIds,
+ boolean showLocks, boolean batchCalls) {
+ return createThreadInfoMap(ManagementFactory.getThreadMXBean(),
stuckThreadIds, showLocks,
+ batchCalls);
+ }
+
+ static Map<Long, ThreadInfo> createThreadInfoMap(ThreadMXBean threadMXBean,
+ Set<Long> stuckThreadIds,
+ final boolean showLocks, final boolean batchCalls) {
if (stuckThreadIds.isEmpty()) {
return Collections.emptyMap();
}
+ logger.info(
+ "Obtaining ThreadInfo for {} threads. Configuration: showLocks={}
batchCalls={}. This is an expensive operation for the JVM and on most JVMs
causes all threads to be paused.",
Review Comment:
It has always been true. Between these two messages we are making calls that
ask ALL threads to pause at a safe point. Once they are all paused we then get
the ThreadInfo. If one of those threads has a high TTSP (time to safe point)
then this can cause what looks like a stop the world gc but without any gc
activity. Since what we do between these two log message could cause things
like a forced disconnect I thought it was worth logging. Note that if we have
no suspects of being stuck then we don't log the messages.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]