Hi All,

Could someone confirm that I’m looking at the correct charts in the .gfs files for org.apache.geode.management.MemberMXBean#getFunctionExecutionRate, org.apache.geode.management.MemberMXBean#getPutsRate and org.apache.geode.management.MemberMXBean#getGetsRate?

I do see in the .gfs files that CachePerfStats “gets”, “puts” and “FunctionExecution.functionExecutionCalls” have non-zero values. Or could it be that these stats are not related to the MXBean methods I’m referring to?

Thanks,
Vahram.

From: Vahram Aharonyan <[email protected]>
Sent: Tuesday, February 12, 2019 9:20 PM
To: [email protected]
Subject: RE: Some Geode management metrics returning 0s after OS upgrade

Hi Darrel,

Thanks for the feedback. Actually, we continuously do region puts/gets and function executions. If I’m not mistaken, we have seen some non-zero values in the CachePerfStats “gets” and “puts” metrics. Are these the ones that get propagated to org.apache.geode.management.MemberMXBean#getPutsRate and org.apache.geode.management.MemberMXBean#getGetsRate, which I mentioned before? And could it be that for “org.apache.geode.management.MemberMXBean#getFunctionExecutionRate” I should refer to “FunctionExecution.functionExecutionCalls” in the .gfs files?

Thanks,
Vahram.

From: Darrel Schneider <[email protected]>
Sent: Tuesday, February 12, 2019 8:08 PM
To: [email protected]
Subject: Re: Some Geode management metrics returning 0s after OS upgrade

You need to actually be executing functions and doing region puts/gets for these stats to be non-zero. The gfs files record the gets/puts in the CachePerfStats. One CachePerfStats represents a combination of all the regions. Other CachePerfStats represent just one region (those have that region name on them). You want to look at the first one as it represents the entire cache.
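
To compare the counters in the .gfs files with what the MXBean reports, the three rate attributes discussed above can be read over JMX. The following is only a minimal sketch, not code from this thread. It assumes it runs inside a Geode member JVM and that member MBeans are registered under an object name matching `GemFire:type=Member,*` (an assumption worth verifying in JConsole). The attribute names follow from the getter names by the standard JMX naming rule. The class name `MemberRateReader` and the helper `readAttributes` are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

public class MemberRateReader {

  // getPutsRate -> "PutsRate", getGetsRate -> "GetsRate", and so on
  // (standard JMX getter-to-attribute naming).
  public static final String[] RATE_ATTRIBUTES =
      {"PutsRate", "GetsRate", "FunctionExecutionRate"};

  // Reads the given attributes from every MBean whose name matches the pattern.
  public static Map<ObjectName, Map<String, Object>> readAttributes(
      MBeanServerConnection server, ObjectName pattern, String... attributes)
      throws Exception {
    Map<ObjectName, Map<String, Object>> result = new LinkedHashMap<>();
    for (ObjectName name : server.queryNames(pattern, null)) {
      Map<String, Object> values = new LinkedHashMap<>();
      for (String attribute : attributes) {
        values.put(attribute, server.getAttribute(name, attribute));
      }
      result.put(name, values);
    }
    return result;
  }

  public static void main(String[] args) throws Exception {
    // Assumption: this JVM is a Geode member, so its MBeans are in the
    // platform MBean server. In any other JVM nothing matches and
    // nothing is printed.
    MBeanServerConnection server = ManagementFactory.getPlatformMBeanServer();
    // Assumption: object name pattern for Geode member MBeans.
    ObjectName members = new ObjectName("GemFire:type=Member,*");
    readAttributes(server, members, RATE_ATTRIBUTES)
        .forEach((name, values) -> System.out.println(name + " -> " + values));
  }
}
```

Printing these values while the application is doing puts/gets, and comparing them with the whole-cache CachePerfStats in the .gfs file, shows whether the counters move while the rates stay at 0.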
On Tue, Feb 12, 2019 at 6:43 AM Vahram Aharonyan <[email protected]> wrote:

Hi All,

Various experiments and long-term monitoring showed that the only real problem remaining is with these 3 metrics:

org.apache.geode.management.MemberMXBean#getFunctionExecutionRate
org.apache.geode.management.MemberMXBean#getPutsRate
org.apache.geode.management.MemberMXBean#getGetsRate

All the others, related to either network or disk, have non-zero values, but these three are constantly 0. These seem to be Geode-internal metrics and should not depend on the operating system, right? Could it be that there is some info on these metrics in the *.gfs files, so we can see whether they have actual values or not?

Thanks,
Vahram.

From: Vahram Aharonyan <[email protected]>
Sent: Thursday, February 7, 2019 5:19 PM
To: [email protected]
Subject: RE: Some Geode management metrics returning 0s after OS upgrade

Hi Kirk,

We were not able to find any error messages from the StatSampler in our log files. Is running these tests straightforward? Is there some doc describing the process? What requirements must be met to be able to run them?

Hi Barry,

Yes, we see values for other MBean attributes reported.
You were right, the thread is there:

INFO | jvm 1 | 2019/02/07 12:15:54 | "Thread-10 StatSampler" #59 daemon prio=10 os_prio=0 tid=0x00007f1fc8951800 nid=0x2d0 in Object.wait() [0x00007f1fb14e3000]
INFO | jvm 1 | 2019/02/07 12:15:54 |    java.lang.Thread.State: TIMED_WAITING (on object monitor)
INFO | jvm 1 | 2019/02/07 12:15:54 |         at java.lang.Object.wait(Native Method)
INFO | jvm 1 | 2019/02/07 12:15:54 |         at org.apache.geode.internal.statistics.HostStatSampler.delay(HostStatSampler.java:520)
INFO | jvm 1 | 2019/02/07 12:15:54 |         - locked <0x0000000651581a68> (a org.apache.geode.internal.statistics.GemFireStatSampler)
INFO | jvm 1 | 2019/02/07 12:15:54 |         at org.apache.geode.internal.statistics.HostStatSampler.run(HostStatSampler.java:208)
INFO | jvm 1 | 2019/02/07 12:15:54 |         at java.lang.Thread.run(Thread.java:748)

Could it be that this is caused by missing privileges to access system resources? Or is there some way to check whether this information is available in the *.gfs stat files from the locator or server? I was looking into these files but was not able to find anything linking them to the metrics mentioned below.

Thanks,
Vahram.

From: Barry Oglesby <[email protected]>
Sent: Wednesday, February 6, 2019 11:21 PM
To: [email protected]
Subject: Re: Some Geode management metrics returning 0s after OS upgrade

Do you see values for other MBean attributes?
If you do a thread dump in your server JVM(s), you should see a thread like this running:

"StatSampler" #39 daemon prio=10 os_prio=31 tid=0x00007fdcbf004000 nid=0x7003 in Object.wait() [0x000070000c50a000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at org.apache.geode.internal.statistics.HostStatSampler.delay(HostStatSampler.java:519)
        - locked <0x00000007a8911160> (a org.apache.geode.internal.statistics.GemFireStatSampler)
        at org.apache.geode.internal.statistics.HostStatSampler.run(HostStatSampler.java:219)
        at java.lang.Thread.run(Thread.java:745)

On Wed, Feb 6, 2019 at 9:40 AM Kirk Lund <[email protected]> wrote:

Phantom OS might have caused the StatSampler to fail or even crash. That's the only explanation I can think of that might result in the non-OS related stats remaining zero. You might want to look through the log to see if the StatSampler logged any problems.

Other than that, you could try running every statistics-related test/integrationTest/distributedTest in Geode on Phantom OS to see how the tests behave.

On Wed, Feb 6, 2019 at 7:49 AM Anthony Baker <[email protected]> wrote:

I wouldn’t be surprised if other OS-related things are broken on Phantom OS as well. We use JNA for most native calls. Look at `git grep Native.register` to see what posix-like things might be affected.

Anthony

On Feb 6, 2019, at 7:28 AM, Jacob Barrett <[email protected]> wrote:

We don’t have any hooks into the stats for this OS.

On Feb 6, 2019, at 7:16 AM, Jens Deppe <[email protected]> wrote:

From SLES 11 to Phantom OS (I had already asked, but my CC got scrambled :( )

On Wed, Feb 6, 2019 at 7:10 AM Anthony Baker <[email protected]> wrote:

Which OS did you upgrade to?
Anthony

On Feb 6, 2019, at 1:25 AM, Vahram Aharonyan <[email protected]> wrote:

Hi All,

For our troubleshooting purposes we have been collecting some data from Geode cluster members using the following APIs:

org.apache.geode.management.MemberMXBean#getFunctionExecutionRate
org.apache.geode.management.MemberMXBean#getPutsRate
org.apache.geode.management.MemberMXBean#getGetsRate
org.apache.geode.management.NetworkMetrics#getBytesReceivedRate
org.apache.geode.management.NetworkMetrics#getBytesSentRate
org.apache.geode.management.DiskMetrics#getDiskFlushAvgLatency
org.apache.geode.management.DiskMetrics#getDiskReadsRate
org.apache.geode.management.DiskMetrics#getDiskWritesRate

Recently we replaced our base OS, and all the values returned by Geode from these calls became 0. Could someone help us understand how these metrics are collected by Geode? Could it be that Geode uses some system utilities or system calls that existed in our previous appliance and were removed in the newer version of the system, causing Geode to return only 0s?

Thanks,
Vahram.
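
Barry's suggestion earlier in the thread, checking a thread dump for the StatSampler thread, can also be done from inside the JVM without taking a full dump. This is only a sketch using plain JDK calls. The class name `StatSamplerThreadCheck` and the helper `describeThreadsNamed` are made up for illustration. The search string "StatSampler" comes from the thread names shown in the dumps above. A live thread only shows that the sampler has not exited. It does not prove that every statistic is being sampled correctly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class StatSamplerThreadCheck {

  // Returns one description per live thread whose name contains the fragment.
  public static List<String> describeThreadsNamed(String fragment) {
    List<String> matches = new ArrayList<>();
    for (Map.Entry<Thread, StackTraceElement[]> entry
        : Thread.getAllStackTraces().entrySet()) {
      Thread thread = entry.getKey();
      if (thread.getName().contains(fragment)) {
        matches.add(thread.getName()
            + " state=" + thread.getState()
            + " daemon=" + thread.isDaemon());
      }
    }
    return matches;
  }

  public static void main(String[] args) {
    // Must run inside the Geode member JVM to find its sampler thread;
    // in any other JVM the list is empty.
    List<String> found = describeThreadsNamed("StatSampler");
    if (found.isEmpty()) {
      System.out.println("No live thread with 'StatSampler' in its name");
    } else {
      found.forEach(System.out::println);
    }
  }
}
```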
