This is currently on my Infinispan fork at 
github.com/belaban/infinispan. But again, hopefully this will get 
integrated into master soon. Note that my changes are experimental ...

On 4/6/11 3:23 PM, Erik Salter wrote:
> Hi Bela,
>
> I'm interested in your changes as well, since concurrent startup vexes my 
> cache usage too.  Is there a wiki or JIRA I could look at to understand the 
> fundamental differences?
>
> Thanks,
>
> Erik
>
> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Bela Ban
> Sent: Wednesday, April 06, 2011 2:46 AM
> To: [email protected]
> Subject: Re: [infinispan-dev] Infinispan Large Scale support
>
> Hi David,
>
> Dan and I had a talk about integrating my changes to the distribution code 
> into 5.x. As I mentioned below, the current code is quite brittle wrt 
> concurrent startup, so this will get fixed with my changes. I hope we can 
> backport this to the 4.2.x branch as well; as a matter of fact, I made my 
> changes on a branch off of 4.2.x.
>
> On 4/5/11 2:51 PM, david marion wrote:
>>
>> Bela,
>>
>>     Yes, it is a replicated cache; I used your udp-largecluster.xml file 
>> and just modified it slightly. The distributed cache does appear to be in 
>> a deadlock (or there is a race condition): the coordinator comes up, but 
>> the other caches do not; they just sit there and wait. I was able to get a 
>> distributed cache up and running on 100+ nodes; now I cannot get 5 of 
>> them running.
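>>
>> For reference, each node is brought up roughly along these lines (a 
>> minimal sketch against the 4.2.x programmatic API; apart from 
>> udp-largecluster.xml, the class name, REPL_SYNC mode, and other names 
>> here are illustrative assumptions, not our actual code -- declarative 
>> XML configuration would work equally well):
>>
>>     import java.util.Properties;
>>     import org.infinispan.Cache;
>>     import org.infinispan.config.Configuration;
>>     import org.infinispan.config.GlobalConfiguration;
>>     import org.infinispan.manager.DefaultCacheManager;
>>
>>     public class ReplStartup {
>>         public static void main(String[] args) throws Exception {
>>             GlobalConfiguration gc = GlobalConfiguration.getClusteredDefault();
>>             Properties props = new Properties();
>>             // point Infinispan's JGroups transport at the tuned stack
>>             props.setProperty("configurationFile", "udp-largecluster.xml");
>>             gc.setTransportProperties(props);
>>
>>             Configuration cfg = new Configuration();
>>             cfg.setCacheMode(Configuration.CacheMode.REPL_SYNC);
>>
>>             DefaultCacheManager manager = new DefaultCacheManager(gc, cfg);
>>             Cache<String, String> cache = manager.getCache();
>>             System.out.println("Joined as " + manager.getAddress());
>>         }
>>     }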
>>
>>> Date: Tue, 5 Apr 2011 11:09:54 +0200
>>> From: [email protected]
>>> To: [email protected]
>>> Subject: Re: [infinispan-dev] Infinispan Large Scale support
>>>
>>>
>>>
>>> On 4/4/11 5:45 PM, david marion wrote:
>>>>
>>>>
>>>> Good news! I was able to use the system property from ISPN-83 and remove 
>>>> FLUSH from the JGroups config with 4.2.1.FINAL, and start-up times are 
>>>> much, much better. We have a replicated cache up on 420+ nodes in under 
>>>> 2 minutes.
>>>
>>>
>>> Great! Just to confirm: this is 420+ Infinispan instances, with
>>> replication enabled, correct?
>>>
>>> Did you use a specific JGroups config (e.g. udp-largecluster.xml)?
>>>
>>>
>>>> I am seeing an issue with the distributed cache, though, with as few as 
>>>> 5 nodes.
>>>>
>>>> In the coordinator log I see
>>>>
>>>> org.infinispan.distribution.DistributionManagerImpl: Detected a view
>>>> change. Member list changed...
>>>> org.infinispan.distribution.DistributionManagerImpl: This is a JOIN
>>>> event! Wait for notification from new joiner <name>
>>>>
>>>> In the log from the joining node I see:
>>>>
>>>> org.infinispan.distribution.JoinTask: Commencing rehash on
>>>> node:<name>. Before start, distributionManager.joinComplete=false
>>>> org.infinispan.distribution.JoinTask: Requesting old consistent hash
>>>> from coordinator
>>>>
>>>> I jstack'd the joiner: the DefaultCacheManager.getCache() method is
>>>> waiting on
>>>> org.infinispan.distribution.DistributionManagerImpl.waitForJoinToComplete(),
>>>> and the Rehasher thread is waiting on:
>>>>
>>>> at org.infinispan.util.concurrent.ReclosableLatch.await(ReclosableLatch.java:75)
>>>> at org.infinispan.remoting.transport.jgroups.JGroupsDistSync.blockUntilNoJoinsInProgress(JGroupsDistSync.java:113)
>>>>
>>>> Any thoughts?
>>>
>>>
>>> I recently took a look at the distribution code, and this part is
>>> very brittle with respect to parallel startup and merging. Plus, I
>>> believe the (blocking) RPC to fetch the old CH from the coordinator
>>> might deadlock in certain cases...
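>>>
>>> The shape of the suspected circular wait, reduced to a standalone
>>> sketch (this is hypothetical illustration code, not the actual
>>> Infinispan classes):
>>>
>>>     import java.util.concurrent.CountDownLatch;
>>>
>>>     public class JoinDeadlockSketch {
>>>         // stands in for DistributionManagerImpl.waitForJoinToComplete()
>>>         static final CountDownLatch joinComplete = new CountDownLatch(1);
>>>
>>>         public static void main(String[] args) throws InterruptedException {
>>>             Thread rehasher = new Thread(new Runnable() {
>>>                 public void run() {
>>>                     // blocking RPC to the coordinator for the old CH; the
>>>                     // coordinator won't reply while it still sees a join
>>>                     // in progress on our side
>>>                     blockingRpcToCoordinator();
>>>                     joinComplete.countDown(); // never reached
>>>                 }
>>>             });
>>>             rehasher.start();
>>>             joinComplete.await(); // the getCache() caller parks here forever
>>>         }
>>>
>>>         static void blockingRpcToCoordinator() {
>>>             // modelled as waiting on the same latch: a circular dependency
>>>             try { joinComplete.await(); } catch (InterruptedException ignored) {}
>>>         }
>>>     }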
>>>
>>> I've got a pull request pending for push-based rebalancing (versus the
>>> current pull-based rebalancing). It'll likely make it into 5.x; as a
>>> matter of fact, I've got a chat about it this afternoon.
>>>
>>>
>>>
>>>
>>>>> Date: Wed, 23 Mar 2011 15:58:19 +0100
>>>>> From: [email protected]
>>>>> To: [email protected]
>>>>> Subject: Re: [infinispan-dev] Infinispan Large Scale support
>>>>>
>>>>>
>>>>>
>>>>> On 3/23/11 2:39 PM, david marion wrote:
>>>>>>
>>>>>> Bela,
>>>>>>
>>>>>> Is there a way to start up the JGroups stack on every node without using 
>>>>>> Infinispan?
>>>>>
>>>>>
>>>>> You could use ViewDemo [1] or Draw, or write your own small test
>>>>> program; if you take a look at ViewDemo's src, you'll see that it's
>>>>> only a page of code.
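>>>>>
>>>>> Stripped down, such a test program is just this (a sketch against the
>>>>> JGroups 2.x receiver API; the cluster name and config file are
>>>>> placeholders):
>>>>>
>>>>>     import org.jgroups.JChannel;
>>>>>     import org.jgroups.ReceiverAdapter;
>>>>>     import org.jgroups.View;
>>>>>
>>>>>     public class ViewTest extends ReceiverAdapter {
>>>>>         public void viewAccepted(View v) {
>>>>>             System.out.println("** view: " + v);
>>>>>         }
>>>>>
>>>>>         public static void main(String[] args) throws Exception {
>>>>>             JChannel ch = new JChannel(args.length > 0 ? args[0] : "udp.xml");
>>>>>             ch.setReceiver(new ViewTest());
>>>>>             ch.connect("view-test"); // any cluster name; run this on every node
>>>>>             while (true)             // keep the channel open and print views
>>>>>                 Thread.sleep(60000);
>>>>>         }
>>>>>     }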
>>>>>
>>>>>
>>>>>> Is there some functional test that I can run or something? I know
>>>>>> I can't remove the FLUSH from Infinispan until 5.0.0 and I don't know if 
>>>>>> I can upgrade the underlying JGroups jar.
>>>>>
>>>>>
>>>>> I suggest testing with the latest JGroups (2.12.0), both +FLUSH and
>>>>> -FLUSH. The +FLUSH config should be less painful now, with the
>>>>> introduction of view bundling: we need to run flush fewer times than
>>>>> before.
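>>>>>
>>>>> One way to compare the two without maintaining two XML files is to
>>>>> strip FLUSH from the stack programmatically before connecting (a
>>>>> sketch; the config file and cluster name are placeholders):
>>>>>
>>>>>     import org.jgroups.JChannel;
>>>>>
>>>>>     public class FlushProbe {
>>>>>         public static void main(String[] args) throws Exception {
>>>>>             JChannel ch = new JChannel("udp-largecluster.xml");
>>>>>             boolean withFlush = args.length > 0 && args[0].equals("+flush");
>>>>>             if (!withFlush)
>>>>>                 ch.getProtocolStack().removeProtocol("FLUSH");
>>>>>             long start = System.currentTimeMillis();
>>>>>             ch.connect("flush-test"); // time the join, with and without FLUSH
>>>>>             System.out.println("connect() took "
>>>>>                 + (System.currentTimeMillis() - start) + " ms, view=" + ch.getView());
>>>>>         }
>>>>>     }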
>>>>>
>>>>>
>>>>> [1] http://community.jboss.org/wiki/TestingJBoss
>>>>>

-- 
Bela Ban
Lead JGroups / Clustering Team
JBoss
_______________________________________________
infinispan-dev mailing list
[email protected]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
