[ https://issues.apache.org/jira/browse/SOLR-16986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767226#comment-17767226 ]
Alex Deparvu commented on SOLR-16986:
-------------------------------------

Very interesting. I was playing with something like this but never managed to clean it up for a proposal.

The aggregation part is interesting. I would have thought per-node CPU is good for understanding 'local' problems (sudden spikes, uneven load distribution), but I don't see how global aggregation would help more. Not against it, just curious.

Just a few random thoughts (you are probably well aware of most of this already):
* Collecting threadUserTime is also useful, and diffing the two gives some more detail: total CPU time, user-mode time, system-mode time.
* Solr uses thread pools, so this collection needs to happen inside a given execution; otherwise the metrics are not useful.
* Along the same lines, I was also playing with per-thread allocated memory. Yes, there are a few hoops you need to go through, but sometimes this metric is available and it looks interesting. (Just an example: https://github.com/scala/compiler-benchmark/blob/f7d789fbada662ed76d351a4ab5fe34b200ec770/compilation/src/main/scala/scala/tools/nsc/ExtendedThreadMxBean.java#L5)
* It is very easy to expose this new data as a `MetricSet` under the `Group.jvm` metrics registry, but the Prometheus exporter needs some updates to include this data.

> Measure and aggregate thread CPU time in distributed search
> -----------------------------------------------------------
>
>                 Key: SOLR-16986
>                 URL: https://issues.apache.org/jira/browse/SOLR-16986
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: David Smiley
>            Priority: Major
>
> Solr responses include "QTime", which in retrospect might have been better
> named "elapsedTime". We propose adding here a "cpuTime" to return the amount
> of time consumed by
> ManagementFactory.getThreadMXBean().[getThreadCpuTime|https://docs.oracle.com/en/java/javase/11/docs/api/java.management/java/lang/management/ThreadMXBean.html]().
> Unlike QTime, this will need to be aggregated across distributed requests.
> This work item will only do the aggregation work for distributed search,
> although it could be extended for other scenarios in future work items.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
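
The comment's first two suggestions (collect user time alongside CPU time, and take the readings inside the pooled execution) can be sketched roughly as follows. This is only an illustration built on the standard `java.lang.management.ThreadMXBean` API; the class `ThreadCpuSketch`, the holder `CpuSample`, and the helpers `measure` and `busyWork` are hypothetical names and are not part of Solr or of any patch attached to this issue.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch, not Solr code: measure CPU and user time of one task
// on the thread that actually runs it.
final class ThreadCpuSketch {

    // Hypothetical holder for one measurement, in nanoseconds.
    static final class CpuSample {
        final long cpuNanos;   // total CPU time (user + system)
        final long userNanos;  // user-mode time only

        CpuSample(long cpuNanos, long userNanos) {
            this.cpuNanos = cpuNanos;
            this.userNanos = userNanos;
        }

        // System-mode time is approximated as total minus user,
        // clamped at zero because the two readings are not taken atomically.
        long systemNanos() {
            return Math.max(0L, cpuNanos - userNanos);
        }
    }

    static volatile long sink; // keeps the busy loop from being optimized away

    // Must be called on the thread doing the work: the "current thread"
    // readings are per-thread, so sampling from the submitting thread
    // would say nothing about a task running in a pool.
    static CpuSample measure(Runnable work) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (!bean.isCurrentThreadCpuTimeSupported() || !bean.isThreadCpuTimeEnabled()) {
            work.run();
            return new CpuSample(-1L, -1L); // measurement unavailable on this JVM
        }
        long cpuStart = bean.getCurrentThreadCpuTime();
        long userStart = bean.getCurrentThreadUserTime();
        work.run();
        long cpuEnd = bean.getCurrentThreadCpuTime();
        long userEnd = bean.getCurrentThreadUserTime();
        return new CpuSample(cpuEnd - cpuStart, userEnd - userStart);
    }

    static void busyWork() {
        long acc = 0;
        for (int i = 0; i < 5_000_000; i++) {
            acc += i % 7;
        }
        sink = acc;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // The measurement wraps the work inside the pooled task itself.
            Callable<CpuSample> task = () -> measure(ThreadCpuSketch::busyWork);
            CpuSample sample = pool.submit(task).get();
            System.out.println("cpu=" + sample.cpuNanos
                    + "ns user=" + sample.userNanos
                    + "ns system~=" + sample.systemNanos() + "ns");
        } finally {
            pool.shutdown();
        }
    }
}
```

A per-request "cpuTime" along the lines of the issue description would presumably sum such samples over every task that ran on behalf of the request. How that sum is carried back across shards is outside what this sketch shows.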