So, to make sure I understood that: this has no visible impact on the functionality of API methods, correct? For example, any get operation would still successfully retrieve a remote entry if one exists somewhere in the cluster?
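
For reference, the "wait for the cluster to form" step that the tests perform could be sketched roughly as below. This is only an illustration under stated assumptions: ClusterWait and awaitClusterSize are hypothetical names, not part of Infinispan or its test framework, and the view size is passed in as a plain supplier so the sketch stays self-contained. With an EmbeddedCacheManager the supplier would presumably be something like () -> cacheManager.getMembers().size(). Note that this only checks the view as seen by one node; it says nothing about whether the other nodes already treat the joiner as a full member.

```java
import java.util.function.IntSupplier;

/** Hypothetical test helper; the names are illustrative, not an Infinispan API. */
final class ClusterWait {
    private ClusterWait() {}

    /**
     * Polls viewSize until it reports exactly expectedMembers or the timeout elapses.
     * Returns true if the expected size was observed, false on timeout or interruption.
     */
    static boolean awaitClusterSize(IntSupplier viewSize, int expectedMembers,
                                    long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (viewSize.getAsInt() == expectedMembers) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up: the view never reached the expected size
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

A test would call this on every node before making assertions and fail fast if it returns false. The open question in this thread is whether applications should ever need something like it, or whether getCache() alone should be enough.
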
On 5 December 2012 15:42, Dan Berindei <dan.berin...@gmail.com> wrote:
>
> On Wed, Dec 5, 2012 at 4:20 PM, Sanne Grinovero <sa...@infinispan.org>
> wrote:
>>
>> On 5 December 2012 14:01, Bela Ban <b...@redhat.com> wrote:
>> >
>> > On 12/5/12 1:23 PM, Sanne Grinovero wrote:
>> >> On 5 December 2012 11:02, Galder Zamarreño <gal...@redhat.com> wrote:
>> >>> On Dec 4, 2012, at 10:22 AM, Sanne Grinovero <sa...@infinispan.org>
>> >>> wrote:
>> >>>
>> >>>> On 4 December 2012 09:14, Galder Zamarreño <gal...@redhat.com> wrote:
>> >>>>> Hey Dan/Adrian,
>> >>>>>
>> >>>>> Re: https://issues.jboss.org/browse/ISPN-2541
>> >>>>>
>> >>>>> I'm looking at this intermittent failure, and it seems to be caused
>> >>>>> by the fact that the test does not wait for the cluster to be formed
>> >>>>> when the new node is started, which can lead to a replication timeout
>> >>>>> failure from the new joining node.
>> >>>>>
>> >>>>> The test can easily be fixed by waiting for cluster to form, and
>> >>>>> then do the call.
>> >>>>>
>> >>>> [...]
>> >>>>
>> >>>> I don't think the cache should ever be in an illegal state to be used
>> >>>> after being started. So Infinispan should not require tests to wait
>> >>>> for a "cluster to be formed", I'd rather guarantee that after a cache
>> >>>> is started it's usable.
>> >>> Precisely, which is why I raised the flag instead of going down the
>> >>> easy path.
>> >>>
>> >>>> If this is not possible, then any application would also need to wait
>> >>>> for that "cluster formed" event, and we should expose an API for
>> >>>> that.
>> >>> The problem is considering when a cluster is formed. How many nodes
>> >>> should you wait for?
>> >>>
>> >> Why can't we rely on JGroups Discovery to know that, as a user I
>> >> already specified the expected initial group size with
>> >> num_initial_members
>> >> Don't want to repeat that configuration ;-)
>> >
>> >
>> > I don't understand this discussion: when a new node joins, it'll return
>> > from JChannel.connect() when it received a JOIN response from the
>> > coordinator, with the current view... or are you guys talking about
>> > Infinispan's 'service views' ?
>>
>> +1
>>
>> That's why I'm confused too, and not understanding how it is possible
>> that a Cache is returned to the application - which doesn't have a
>> clue about number of expected nodes - in a state for which the
>> "cluster is not formed yet". That should never happen!?
>>
>
> It's simple: getCache() returns once the joiner has received ownership of
> some segments (in distributed mode) and once it received all the data it
> owns (dist and repl). This does not guarantee that the other nodes see the
> joiner as a full member at the time getCache() has returned.
>
> This doesn't mean that the cache is not functional, on the contrary we could
> return even before the joiner had received the data and the cache would
> still work. But because some nodes think state transfer is still in
> progress, the tests do run into state transfer corner cases that aren't
> handled properly (they're getting rarer, but we still have them).
>
>
>>
>> I never understood why the test framework in Infinispan requires this
>> to happen in all tests - even in the cases listed by Mircea that the
>> testsuite is looking for something very specific, I would expect the
>> wait to be unnecessary. (or more precisely, to have been blocked
>> already for long enough)
>>
>
> getCache() only waits enough for the cache to "work", it doesn't wait (and I
> don't think it should wait) for all the other nodes to acknowledge the
> joiner as a full member (i.e. in the "read" consistent hash).
> Because of this, assertions made on nodes other than the joiner can fail
> (in addition to the aforementioned corner cases in state transfer).
>
> It's also possible (and it was quite likely with older JGroups versions)
> that a joiner would actually form a new cluster by itself instead of joining
> the existing nodes in a single cluster. When that happens, getCache()
> definitely returns without the cluster being formed, and we have to wait for
> the separate clusters to find each other and merge before running our test.
>
> Cheers
> Dan
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev