[ https://issues.apache.org/jira/browse/SOLR-15056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17262468#comment-17262468 ]
Andrzej Bialecki commented on SOLR-15056:
-----------------------------------------

{{systemCpuLoad}} is already supported and returned as one of the metrics. This comes from the (somewhat convoluted) code in {{MetricUtils.addMxBeanMetrics}}, which tries all known implementations and accumulates any unique bean properties that they expose. For example:

{code}
http://localhost:8983/solr/admin/metrics?group=jvm&prefix=os
{
  "responseHeader": {
    "status": 0,
    "QTime": 1
  },
  "metrics": {
    "solr.jvm": {
      "os.arch": "x86_64",
      "os.availableProcessors": 12,
      "os.committedVirtualMemorySize": 8402419712,
      "os.freePhysicalMemorySize": 41504768,
      "os.freeSwapSpaceSize": 804519936,
      "os.maxFileDescriptorCount": 8192,
      "os.name": "Mac OS X",
      "os.openFileDescriptorCount": 195,
      "os.processCpuLoad": 0.0017402379609634876,
      "os.processCpuTime": 10492010000,
      "os.systemCpuLoad": 0.1268950796343933,
      "os.systemLoadAverage": 4.00439453125,
      "os.totalPhysicalMemorySize": 34359738368,
      "os.totalSwapSpaceSize": 7516192768,
      "os.version": "10.16"
    }
  }
}
{code}

> CPU circuit breaker needs to use CPU utilization, not Unix load average
> -----------------------------------------------------------------------
>
>                 Key: SOLR-15056
>                 URL: https://issues.apache.org/jira/browse/SOLR-15056
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: metrics
>    Affects Versions: 8.7
>            Reporter: Walter Underwood
>            Priority: Major
>
> The config range, 50% to 95%, assumes that the circuit breaker is triggered
> by a CPU utilization metric that goes from 0% to 100%. But the code uses the
> metric OperatingSystemMXBean.getSystemLoadAverage(). That is an average of
> the count of processes waiting to run. It is effectively unbounded. I've seen
> it as high as 50 to 100. It is not bound by 1.0 (100%).
> A good limit for load average would need to be aware of the number of CPUs
> available to the JVM. A load average of 8 is no problem for a 32 CPU host. It
> is a critical situation for a 2 CPU host.
> Also, load average is a Unix OS metric. I don't know if it is even available
> on Windows.
> Instead, use a CPU utilization metric that goes from 0.0 to 1.0. A good
> choice is OperatingSystemMXBean.getSystemCpuLoad(). This name also uses
> "load", but it is a usage metric.
> From the Javadoc:
> > Returns the "recent cpu usage" for the whole system. This value is a double
> > in the [0.0,1.0] interval. A value of 0.0 means that all CPUs were idle
> > during the recent period of time observed, while a value of 1.0 means that
> > all CPUs were actively running 100% of the time during the recent period
> > being observed. All values betweens 0.0 and 1.0 are possible depending of
> > the activities going on in the system. If the system recent cpu usage is not
> > available, the method returns a negative value.
> https://docs.oracle.com/javase/7/docs/jre/api/management/extension/com/sun/management/OperatingSystemMXBean.html#getSystemCpuLoad()
> Also update the documentation to explain which JMX metrics are used for the
> memory and CPU circuit breakers.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
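
For readers comparing the two metrics discussed in this issue, here is a minimal sketch that reads both from the JVM. It is not Solr's circuit breaker code: the class {{CpuLoadProbe}} and the helpers {{shouldTrip}} and {{loadPerCpu}} are hypothetical names, and the 0.95 threshold is simply the upper end of the config range quoted in the description. Note that {{getSystemCpuLoad()}} lives on the {{com.sun.management}} extension interface, may not be present on every JVM, and is deprecated in newer JDKs in favor of {{getCpuLoad()}}.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of Solr.
class CpuLoadProbe {

    // Trip only when utilization is known (non-negative) and at or above the threshold.
    // A negative value means the JVM could not determine the CPU usage.
    static boolean shouldTrip(double cpuLoad, double threshold) {
        return cpuLoad >= 0.0 && cpuLoad >= threshold;
    }

    // Load average divided by CPU count, so hosts of different sizes can be compared.
    // Returns -1.0 when the load average is unavailable (reported as negative).
    static double loadPerCpu(double loadAverage, int cpus) {
        return loadAverage < 0.0 ? -1.0 : loadAverage / cpus;
    }

    public static void main(String[] args) {
        java.lang.management.OperatingSystemMXBean os =
                ManagementFactory.getOperatingSystemMXBean();

        // Unbounded: average length of the run queue; negative if unavailable.
        double loadAverage = os.getSystemLoadAverage();
        int cpus = os.getAvailableProcessors();

        // Bounded to [0.0, 1.0] when available; only exposed by the
        // com.sun.management extension of the standard bean.
        double cpuLoad = -1.0;
        if (os instanceof com.sun.management.OperatingSystemMXBean) {
            cpuLoad = ((com.sun.management.OperatingSystemMXBean) os).getSystemCpuLoad();
        }

        System.out.println("systemLoadAverage = " + loadAverage);
        System.out.println("loadPerCpu        = " + loadPerCpu(loadAverage, cpus));
        System.out.println("systemCpuLoad     = " + cpuLoad);
        System.out.println("trip at 0.95      = " + shouldTrip(cpuLoad, 0.95));
    }
}
```

With the description's example, a load average of 8 gives {{loadPerCpu}} of 0.25 on a 32 CPU host and 4.0 on a 2 CPU host, which is why a fixed 50% to 95% range only makes sense against the bounded utilization metric.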