Re: negative ActiveCQCount
Hi,

Thanks for the response!

The decrement is happening on both servers. I added a check so that the count is decremented only on the server on which it was incremented. You can find the changes at https://github.com/apache/geode/pull/5397.

BR,
Mario

From: Anilkumar Gingade
Sent: 17 July 2020, 21:19
To: dev@geode.apache.org
Subject: Re: negative ActiveCQCount

Mario,

Here is how CQ registration behaves when there is a single client and two servers. When a CQ is registered with redundancy 0:
- On a non-partitioned region, the CQ gets registered on one server, through registerCQ().
- On a partitioned region, if the region is hosted on both servers, the CQ gets registered on one server through registerCQ() and on the other through FilterProfile.process*().

In the code, I do see the stat for active CQs getting incremented correctly for both kinds of registration. The decrementing seems to be happening as well, but that needs to be verified. You can add logs to the CqServiceVSDStats.inc/dec methods to see whether they are being called.

If this is working as expected, then the problem could be related to how gfsh/the MBean collects and aggregates the data.

We also need to consider the cases where not all the nodes are cache servers.

-Anil.
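The check mentioned at the top of this message ("decrement only on the server where the increment happened") could be sketched roughly as follows. This is not the code from the pull request: `GuardedDecrementSketch`, `GuardedCq`, and `countedOnThisServer` are hypothetical names, and the `AtomicLong` only stands in for one server's activeCqCount statistic.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative only: every name here is hypothetical, none of it is Geode API. */
public class GuardedDecrementSketch {

  /** Stand-in for a server-side CQ object that may or may not have been counted. */
  static class GuardedCq {
    private final AtomicLong activeCqCount;    // stand-in for one server's activeCqCount stat
    private final boolean countedOnThisServer; // true only if this instance did the increment
    private boolean running = true;

    GuardedCq(AtomicLong activeCqCount, boolean incrementOnRegister) {
      this.activeCqCount = activeCqCount;
      this.countedOnThisServer = incrementOnRegister;
      if (incrementOnRegister) {
        activeCqCount.incrementAndGet();
      }
    }

    void close() {
      if (!running) {
        return; // closing twice must not decrement twice
      }
      running = false;
      if (countedOnThisServer) {
        activeCqCount.decrementAndGet(); // undo only our own increment
      }
    }
  }

  public static void main(String[] args) {
    AtomicLong server1 = new AtomicLong();
    AtomicLong server2 = new AtomicLong();

    GuardedCq executed = new GuardedCq(server1, true);      // registered directly, counted
    GuardedCq deserialized = new GuardedCq(server2, false); // created from a message, not counted

    executed.close();
    deserialized.close();

    System.out.println("server1=" + server1.get() + ", server2=" + server2.get());
  }
}
```

With this guard both counters end at 0 after the close, instead of 0 and -1. Whether a per-instance flag is the right place for the guard in the real code is a separate question that the pull request review would have to settle.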
Re: Re: negative ActiveCQCount
Hi devs,

Just a reminder, in case someone is familiar with this or has an idea how to resolve this issue.

Thanks and BR,
Mario
Re: Re: negative ActiveCQCount
Hi,

Thank you all for the responses!

What I have found so far is that when I register a CQ on one server, the registration is passed on to the other server through FilterProfile (processMessage), and the opType in the message is REGISTER_CQ.

The fromData() method in FilterProfile.java contains the following:

    if (isCqOp(this.opType)) {
      this.serverCqName = in.readUTF();
      if (this.opType == operationType.REGISTER_CQ || this.opType == operationType.SET_CQ_STATE) {
        this.cq = CqServiceProvider.readCq(in);
      }

There the CQ gets registered on the other server without incrementing cqActiveCount, which is OK because redundancy is 0. But the two servers now hold different instances of ServerCqImpl for the same CQ: one created by the constructor with arguments when the CQ is executed, and another created by the empty constructor while deserializing the message with opType=REGISTER_CQ. I think this is OK, because we need to follow all changes on both servers, since an event that fulfills the CQ condition may occur on the other server. Correct me if I'm wrong.

When the CQ is closed, the close is executed on both servers. I think that is also OK: whatever was started should be closed. But the close method decrements the count if stateBeforeClosing is RUNNING. So it would be good if, before closing the instance created by deserialization, we could somehow check the cq_state of the ServerCqImpl instance that was created by the constructor with parameters.

Does anyone have an idea how to get this? Or some other idea to solve this issue?

BR,
Mario
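The sequence described in this message can be modelled in a few lines, which may help when reasoning about where the extra decrement comes from. The sketch below is a simplified model under stated assumptions, not Geode code: `ServerStats`, `ServerCq`, `registerViaExecute`, and `registerViaProfileMessage` are hypothetical stand-ins. The print statements in the inc/dec methods play the role of the debug logging suggested elsewhere in the thread.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Simplified model of the reported behaviour; all names are hypothetical. */
public class NegativeActiveCqCountSketch {

  enum CqState { RUNNING, CLOSED }

  /** Stand-in for one server's CQ statistics, logging every change. */
  static class ServerStats {
    final String serverName;
    final AtomicLong activeCqCount = new AtomicLong();

    ServerStats(String serverName) {
      this.serverName = serverName;
    }

    void incActive() {
      System.out.println(serverName + " inc -> activeCqCount=" + activeCqCount.incrementAndGet());
    }

    void decActive() {
      System.out.println(serverName + " dec -> activeCqCount=" + activeCqCount.decrementAndGet());
    }
  }

  /** Stand-in for the server-side CQ object. */
  static class ServerCq {
    private final ServerStats stats;
    private CqState state;

    private ServerCq(ServerStats stats, CqState state) {
      this.stats = stats;
      this.state = state;
    }

    /** Models registration by executing the CQ: the stat is incremented. */
    static ServerCq registerViaExecute(ServerStats stats) {
      stats.incActive();
      return new ServerCq(stats, CqState.RUNNING);
    }

    /** Models the instance built from a REGISTER_CQ message: RUNNING, but never counted. */
    static ServerCq registerViaProfileMessage(ServerStats stats) {
      return new ServerCq(stats, CqState.RUNNING);
    }

    /** Models the close logic described above: decrement whenever the state was RUNNING. */
    void close() {
      CqState stateBeforeClosing = state;
      state = CqState.CLOSED;
      if (stateBeforeClosing == CqState.RUNNING) {
        stats.decActive();
      }
    }
  }

  public static void main(String[] args) {
    ServerStats server1 = new ServerStats("server1");
    ServerStats server2 = new ServerStats("server2");

    ServerCq onServer1 = ServerCq.registerViaExecute(server1);
    ServerCq onServer2 = ServerCq.registerViaProfileMessage(server2);

    onServer1.close();
    onServer2.close();
  }
}
```

In this model server1 ends at 0 and server2 at -1, which matches the per-server values reported for redundancy 0. The model only shows that an unconditional "decrement if RUNNING" is enough to produce the negative value; it does not prove that this is the only cause in the real code.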
Re: Re: negative ActiveCQCount
Yeah, https://issues.apache.org/jira/browse/GEODE-8293 sounds like a statistic decrement bug for activeCqCount. Somewhere, each Server is decrementing it once too many times. You could find the statistics class containing activeCqCount and try adding some debugging log statements, or even add some breakpoints for the debugger if it's easily reproduced.
Re: negative ActiveCQCount
Hi Kirk, thanks for the response!

I just realized that I described the problem wrongly, as I had tried so many cases. Sorry!

We have a system with two servers. If redundancy is 0, then we correctly have activeCqCount=1 on the first server and activeCqCount=0 on the second. After closing the CQ we get activeCqCount=0 on the first server and activeCqCount=-1 on the second.

gfsh>show metrics --categories=query
Cluster-wide Metrics

Category | Metric           | Value
-------- | ---------------- | -----
query    | activeCQCount    | -1
         | queryRequestRate | 0.0

If we set redundancy to 1, the count increments as expected, by one on both servers. But when the CQ is closed we get activeCqCount=-1 on both servers, and the show metrics command has the following output:

gfsh>show metrics --categories=query
Cluster-wide Metrics

Category | Metric           | Value
-------- | ---------------- | -----
query    | activeCQCount    | -1
         | queryRequestRate | 0.0

What I found is that when a CQ is registered on one server, that server sends a message with opType=REGISTER_CQ to the other servers in the system, and in that case a new instance of ServerCqImpl is created on the second server (with the empty constructor of ServerCqImpl). When we close the CQ there are two different instances on the servers and both of them are closed, but as both are in the RUNNING state before closing, activeCqCount is decremented on both of them.

BR,
Mario

From: Kirk Lund
Sent: 30 June 2020, 19:54
To: dev@geode.apache.org
Subject: Re: negative ActiveCQCount

I think *show metrics --categories=query* is showing you the query stats from DistributedSystemMXBean (see ShowMetricsCommand#writeSystemWideMetricValues). DistributedSystemMXBean aggregates values across all members in the cluster, so I would have expected activeCQCount to initially show a value of 2 after you create a ServerCQImpl in 2 servers. Then after closing the CQ, it should drop to a value of 0.

When you create a CQ on a Server, it should be reflected asynchronously on the CacheServerMXBean in that Server. Each Server has its own CacheServerMXBean. Over on the Locator (JMX Manager), the DistributedSystemMXBean aggregates the count of active CQs in ServerClusterStatsMonitor by invoking DistributedSystemBridge#updateCacheServer when the CacheServerMXBean state is federated to the Locator (JMX Manager).

Based on what I see in the code and in the description on GEODE-8293, I think you might want to see if increment has a problem instead of decrement.

I don't see anything that would limit the activeCQCount to only count the CQs on primaries. So, I would expect redundancy=1 to result in a value of 2. Does anyone else have different info about this?

On Tue, Jun 30, 2020 at 5:31 AM Mario Kevo wrote:
> Hi geode-dev,
>
> I have a question about CQ (https://issues.apache.org/jira/browse/GEODE-8293).
> If we run a CQ, it registers the CQ on one of the servers
> (setPoolSubscriptionRedundancy is 1) and increments activeCQCount.
> As I understand it, the message is then passed (processInputBuffer) to
> another server, where it is deserialized. If the opType is REGISTER_CQ or
> SET_CQ_STATE, it calls readCq from CqServiceProvider, which in the end calls
> the empty constructor of ServerCQImpl that is used for deserialization.
>
> The problem is that when we close the CQ, there is a ServerCqImpl reference
> on both servers; both are closed, and the count is decremented on both of
> them. In that case we get a negative value of activeCQCount in the show
> metrics command.
>
> Does anyone know how to find out, in the close method, which server is the
> primary, and decrement only on it?
> Any advice is welcome!
>
> BR,
> Mario
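The cluster-wide aggregation that Kirk describes can be illustrated with a deliberately naive model. The sketch assumes that the cluster-wide value is a plain sum of the per-server counts. The thread says only that DistributedSystemMXBean "aggregates" the values, so the summation, the class name `ClusterAggregationSketch`, and the method `aggregate` are all assumptions made for illustration.

```java
/** Naive model of cluster-wide aggregation; the plain sum is an assumption. */
public class ClusterAggregationSketch {

  /** Adds up the per-server activeCqCount values. */
  static long aggregate(long... perServerCounts) {
    long total = 0;
    for (long count : perServerCounts) {
      total += count;
    }
    return total;
  }

  public static void main(String[] args) {
    // Kirk's expectation for redundancy 1: one active CQ counted on each of two servers.
    System.out.println("expected after register: " + aggregate(1, 1));

    // Kirk's expectation after the CQ is closed everywhere.
    System.out.println("expected after close: " + aggregate(0, 0));

    // Per-server values Mario reported for redundancy 0 after close.
    System.out.println("reported after close (redundancy 0): " + aggregate(0, -1));
  }
}
```

Under this assumption the three lines print 2, 0, and -1. The last value agrees with the gfsh output quoted for redundancy 0, which suggests that a single over-decremented member is enough to make the cluster-wide metric negative.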