This is currently on my Infinispan fork at github.com/belaban/infinispan. But again, hopefully this will get integrated into master soon. Note that my changes are experimental ...
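The concurrent-startup problem discussed further down the thread — a joiner blocked in waitForJoinToComplete() while its rehasher thread blocks in blockUntilNoJoinsInProgress() — is essentially two parties each waiting for the other's join to finish. A minimal, hypothetical sketch of that shape (this is NOT Infinispan's actual code; the class and latch names are invented for illustration, using only java.util.concurrent):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Illustration only: two "nodes" that each wait for the other's join to
// complete before finishing their own. Neither latch is ever counted down,
// so both time out -- a minimal model of a mutual-wait startup deadlock.
public class JoinDeadlockSketch {

    static boolean bothJoined(long timeoutMs) throws InterruptedException {
        CountDownLatch aDone = new CountDownLatch(1);
        CountDownLatch bDone = new CountDownLatch(1);

        Thread a = new Thread(() -> {
            try {
                // Node A blocks until B's join completes (it never does)
                if (bDone.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                    aDone.countDown();
                }
            } catch (InterruptedException ignored) { }
        });
        Thread b = new Thread(() -> {
            try {
                // Node B symmetrically blocks on A -- neither can proceed
                if (aDone.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                    bDone.countDown();
                }
            } catch (InterruptedException ignored) { }
        });
        a.start();
        b.start();
        a.join();
        b.join();
        return aDone.getCount() == 0 && bDone.getCount() == 0;
    }

    public static void main(String[] args) throws InterruptedException {
        // With both nodes blocking on each other, neither join completes
        System.out.println(bothJoined(200) ? "joined" : "deadlocked (timed out)");
        // prints "deadlocked (timed out)"
    }
}
```

A push-based rebalance breaks the cycle by having the coordinator send state to the joiner instead of the joiner issuing a blocking RPC to fetch it.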
On 4/6/11 3:23 PM, Erik Salter wrote:
> Hi Bela,
>
> I'm interested in your changes too, as concurrent startup vexes my cache
> usage as well. Is there a wiki or JIRA I could look at to understand the
> fundamental differences?
>
> Thanks,
>
> Erik
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Bela Ban
> Sent: Wednesday, April 06, 2011 2:46 AM
> To: [email protected]
> Subject: Re: [infinispan-dev] Infinispan Large Scale support
>
> Hi David,
>
> Dan and I had a talk about integrating my changes to the distribution code
> into 5.x. As I mentioned below, the current code is quite brittle wrt
> concurrent startup, so this will get fixed with my changes. I hope we can
> backport this to the 4.2.x branch as well. As a matter of fact, I actually
> made my changes on a branch off of 4.2.x.
>
> On 4/5/11 2:51 PM, david marion wrote:
>>
>> Bela,
>>
>> Yes, it is a replicated cache, and I used your udp-largecluster.xml file
>> and just modified it slightly. It does appear that the distributed cache
>> is in a deadlock (or there is a race condition): the coordinator comes up,
>> but the other caches do not; they sit there and wait. I was able to get a
>> distributed cache up and running on 100+ nodes; now I cannot get 5 of
>> them running.
>>
>>> Date: Tue, 5 Apr 2011 11:09:54 +0200
>>> From: [email protected]
>>> To: [email protected]
>>> Subject: Re: [infinispan-dev] Infinispan Large Scale support
>>>
>>> On 4/4/11 5:45 PM, david marion wrote:
>>>>
>>>> Good news! I was able to use the system property from ISPN-83 and
>>>> remove the FLUSH from the jgroups config with 4.2.1.FINAL, and
>>>> start-up times are much, much better. We have a replicated cache on
>>>> 420+ nodes up in under 2 minutes.
>>>
>>> Great! Just to confirm: this is 420+ Infinispan instances, with
>>> replication enabled, correct?
>>>
>>> Did you use a specific JGroups config (e.g. udp-largecluster.xml)?
>>>
>>>> I am seeing an issue with the distributed cache, though, with as
>>>> little as 5 nodes.
>>>>
>>>> In the coordinator log I see:
>>>>
>>>> org.infinispan.distribution.DistributionManagerImpl: Detected a view
>>>> change. Member list changed.......
>>>> org.infinispan.distribution.DistributionManagerImpl: This is a JOIN
>>>> event! Wait for notification from new joiner <name>
>>>>
>>>> In the log from the joining node I see:
>>>>
>>>> org.infinispan.distribution.JoinTask: Commencing rehash on
>>>> node: <name>. Before start, distributionManager.joinComplete=false
>>>> org.infinispan.distribution.JoinTask: Requesting old consistent hash
>>>> from coordinator
>>>>
>>>> I jstack'd the joiner: the DefaultCacheManager.getCache() method is
>>>> waiting on
>>>> org.infinispan.distribution.DistributionManagerImpl.waitForJoinToComplete()
>>>> and the Rehasher thread is waiting on:
>>>>
>>>> at org.infinispan.util.concurrent.ReclosableLatch.await(ReclosableLatch.java:75)
>>>> at org.infinispan.remoting.transport.jgroups.JGroupsDistSync.blockUntilNoJoinsInProgress(JGroupsDistSync.java:113)
>>>>
>>>> Any thoughts?
>>>
>>> I recently took a look at the distribution code, and this part is
>>> very brittle with respect to parallel startup and merging. Plus, I
>>> believe the (blocking) RPC to fetch the old CH from the coordinator
>>> might deadlock in certain cases...
>>>
>>> I've got a pull request pending for push-based rather than pull-based
>>> rebalancing. It'll likely make it into 5.x; as a matter of fact, I've
>>> got a chat about this this afternoon.
>>>
>>>>> Date: Wed, 23 Mar 2011 15:58:19 +0100
>>>>> From: [email protected]
>>>>> To: [email protected]
>>>>> Subject: Re: [infinispan-dev] Infinispan Large Scale support
>>>>>
>>>>> On 3/23/11 2:39 PM, david marion wrote:
>>>>>>
>>>>>> Bela,
>>>>>>
>>>>>> Is there a way to start up the JGroups stack on every node without
>>>>>> using Infinispan?
>>>>>
>>>>> You could use ViewDemo [1] or Draw, or write your own small test
>>>>> program; if you take a look at ViewDemo's src, you'll see that it's
>>>>> only a page of code.
>>>>>
>>>>>> Is there some functional test that I can run or something? I know
>>>>>> I can't remove the FLUSH from Infinispan until 5.0.0, and I don't
>>>>>> know if I can upgrade the underlying JGroups jar.
>>>>>
>>>>> I suggest testing with the latest JGroups (2.12.0), both +FLUSH and
>>>>> -FLUSH. The +FLUSH config should be less painful now, with the
>>>>> introduction of view bundling: we need to run flush fewer times than
>>>>> before.
>>>>>
>>>>> [1] http://community.jboss.org/wiki/TestingJBoss
>>>>>
>>>>> --
>>>>> Bela Ban
>>>>> Lead JGroups / Clustering Team
>>>>> JBoss
>>>>> _______________________________________________
>>>>> infinispan-dev mailing list
>>>>> [email protected]
>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
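For readers following along: running "-FLUSH" amounts to deleting the pbcast.FLUSH protocol from the top of the JGroups stack (the last element in a 2.x flat XML config). A minimal sketch of such a stack tail — the protocol and attribute names are standard JGroups 2.x, but the values here are illustrative and are not taken from udp-largecluster.xml:

```xml
<!-- Tail of a hypothetical JGroups 2.x UDP stack (illustrative values).
     GMS with view_bundling="true" batches view installations, so FLUSH
     runs fewer times; a "-FLUSH" config simply omits the last element. -->
<pbcast.GMS print_local_addr="true" join_timeout="3000" view_bundling="true"/>
<FC max_credits="500000" min_threshold="0.20"/>
<FRAG2 frag_size="60000"/>
<pbcast.FLUSH timeout="0"/>
```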
