On 27-03-2018 09:03, Sebastian Laskawiec wrote:
> At the moment, the cluster health status checker enumerates all caches
> in the cache manager [1] and checks whether those caches are running and
> not in degraded mode [2].
>
> I'm not sure how counter caches have been implemented. One thing is for
> sure - they should be taken into account in this loop [3].

The private caches aren't listed by CacheManager.getCacheNames(). We have
to check them via InternalCacheRegistry.getInternalCacheNames(). I'll open
a JIRA if you don't mind :)
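To illustrate, here is a rough sketch of the kind of change I have in mind
for the loop in [1]. It is only a sketch: the CacheHealthImpl constructor
and the component lookup shown here are assumptions for illustration, not
the actual implementation.

    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;
    import java.util.stream.Collectors;

    import org.infinispan.health.CacheHealth;
    import org.infinispan.health.impl.CacheHealthImpl;
    import org.infinispan.manager.EmbeddedCacheManager;
    import org.infinispan.registry.InternalCacheRegistry;

    List<CacheHealth> cacheHealth(EmbeddedCacheManager cacheManager) {
       // getCacheNames() only returns the user-visible caches.
       Set<String> names = new HashSet<>(cacheManager.getCacheNames());

       // Private caches (___counter_configuration, org.infinispan.LOCKS, ...)
       // have to be added explicitly via the InternalCacheRegistry.
       InternalCacheRegistry registry = cacheManager.getGlobalComponentRegistry()
             .getComponent(InternalCacheRegistry.class);
       names.addAll(registry.getInternalCacheNames());

       return names.stream()
             .map(name -> new CacheHealthImpl(cacheManager.getCache(name)))
             .collect(Collectors.toList());
    }

That way the counter and lock caches Thomas is seeing in DEGRADED mode
would show up in the cluster health status (and in a Kubernetes readiness
check based on it) as well.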
>
> [1]
> https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/health/impl/ClusterHealthImpl.java#L22
> [2]
> https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/health/impl/CacheHealthImpl.java#L25
> [3]
> https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/health/impl/ClusterHealthImpl.java#L23-L24
>
> On Mon, Mar 26, 2018 at 1:59 PM Thomas SEGISMONT <tsegism...@gmail.com> wrote:
>
> > 2018-03-26 13:16 GMT+02:00 Pedro Ruivo <pe...@infinispan.org>:
> >
> > > On 23-03-2018 15:06, Thomas SEGISMONT wrote:
> > > > Hi Pedro,
> > > >
> > > > 2018-03-23 13:25 GMT+01:00 Pedro Ruivo <pe...@infinispan.org>:
> > > >
> > > > > Hi Thomas,
> > > > >
> > > > > Is the test in question using any counter/lock?
> > > >
> > > > I have seen the problem on a test for counters, on another one for
> > > > locks, as well as caches only.
> > > > But Vert.x starts the ClusteredLockManager and the CounterManager in
> > > > all cases (even if no lock/counter is created/used).
> > > >
> > > > > I did see similar behavior with the counters in our server test
> > > > > suite. The partition handling makes the cache degraded because
> > > > > nodes are starting and stopping concurrently.
> > > >
> > > > As for me, I was able to observe the problem even when stopping
> > > > nodes one after the other and waiting for the cluster to go back to
> > > > HEALTHY status.
> > > > Is it possible that the status of the counter and lock caches is
> > > > not taken into account in cluster health?
> > >
> > > The counter and lock caches are private. So, they aren't in the
> > > cluster health, nor are their names returned by the getCacheNames()
> > > method.
> >
> > Thanks for the details.
> >
> > I'm not concerned with these internal caches not being listed when
> > calling getCacheNames.
> >
> > However, the cluster health status should include their status as well.
> > Cluster status testing is the recommended way to implement readiness
> > checks on Kubernetes, for example.
> >
> > What do you think Sebastian?
> >
> > > > > I'm not sure if there is any JIRA tracking this. Ryan, Dan, do
> > > > > you know? If there is none, it should be created.
> > > > >
> > > > > I improved the counters by making the cache start lazily when you
> > > > > first get or define a counter [1]. This workaround solved the
> > > > > issue for us.
> > > > >
> > > > > As a workaround for your test suite, I suggest making sure the
> > > > > caches (___counter_configuration and org.infinispan.LOCK) have
> > > > > finished their state transfer before stopping the cache managers,
> > > > > by invoking DefaultCacheManager.getCache(*cache-name*) in all the
> > > > > cache managers.
> > > > >
> > > > > Sorry for the inconvenience and the delay in replying.
> > > >
> > > > No problem.
> > > >
> > > > > Cheers,
> > > > > Pedro
> > > > >
> > > > > [1] https://issues.jboss.org/browse/ISPN-8860
> > > > >
> > > > > On 21-03-2018 16:16, Thomas SEGISMONT wrote:
> > > > > > Hi everyone,
> > > > > >
> > > > > > I am working on integrating Infinispan 9.2.Final in
> > > > > > vertx-infinispan. Before merging I wanted to make sure the test
> > > > > > suite passed but it doesn't. It's not always the same test
> > > > > > involved.
> > > > > > In the logs, I see a lot of messages like "After merge (or
> > > > > > coordinator change), cache still hasn't recovered a majority of
> > > > > > members and must stay in degraded mode."
> > > > > > The contexts involved are "___counter_configuration" and
> > > > > > "org.infinispan.LOCKS".
> > > > > >
> > > > > > Most often it's harmless but, sometimes, I also see this
> > > > > > exception: "ISPN000210: Failed to request state of cache".
> > > > > > Again, the cache involved is either "___counter_configuration"
> > > > > > or "org.infinispan.LOCKS".
> > > > > > After this exception, the cache manager is unable to stop. It
> > > > > > blocks in method "terminate" (join on cache future).
> > > > > >
> > > > > > I thought the test suite was too rough (we stop all nodes at
> > > > > > the same time). So I changed it to make sure that:
> > > > > > - nodes start one after the other
> > > > > > - a new node is started only when the previous one indicates
> > > > > >   HEALTHY status
> > > > > > - nodes stop one after the other
> > > > > > - a node is stopped only when it indicates HEALTHY status
> > > > > > Pretty much what we do on Kubernetes for the readiness check
> > > > > > actually. But it didn't get any better.
> > > > > >
> > > > > > Attached are the logs of such a failing test.
> > > > > >
> > > > > > Note that the Vert.x test itself does not fail, it's only when
> > > > > > closing nodes that we have issues.
> > > > > >
> > > > > > Here's our XML config:
> > > > > > https://github.com/vert-x3/vertx-infinispan/blob/ispn92/src/main/resources/default-infinispan.xml
> > > > > >
> > > > > > Does that ring a bell? Do you need more info?
> > > > > >
> > > > > > Regards,
> > > > > > Thomas

_______________________________________________
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev