If you send me the details, I'll take a look. I'm pretty busy with 
message batching, so I can't promise next week, but soon...

On 2/1/13 11:08 AM, Pedro Ruivo wrote:
> Hi,
>
> I had a similar problem when I tried GMU [1] in a "large" cluster (40
> VMs), because the remote gets and the commit messages (I'm talking about
> ISPN commands) must wait for some conditions before being processed.
>
> I solved this problem by adding a feature in JGroups [2] that allows a
> request to be moved to another thread, releasing the OOB thread. The
> other thread then sends the reply to the JGroups request. Of course, I'm
> only moving commands that I know can block.
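The hand-off described above can be sketched roughly as follows; all the names here are illustrative, not the actual JGroups/ISPN API:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of moving a potentially blocking command off the OOB thread:
// the OOB thread only enqueues the work and returns immediately; a
// separate pool runs the command and sends the reply when it completes.
// All names are illustrative, not actual JGroups/ISPN API.
public class AsyncDispatch {
    private final ExecutorService blockingPool = Executors.newFixedThreadPool(8);

    // Called from the (scarce) OOB thread: returns at once, so the OOB
    // thread is free to process the next message while the command runs.
    public Future<String> handOff(Callable<String> blockingCommand) {
        return blockingPool.submit(blockingCommand);
    }

    public void shutdown() {
        blockingPool.shutdown();
    }
}
```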
>
> I can go into some detail if you want =)
>
> Cheers,
> Pedro
>
> [1] http://www.gsd.inesc-id.pt/~romanop/files/papers/icdcs12.pdf
> [2] I would like to talk with Bela about this, because it would make
> supporting total order in ISPN easier for me. I'll try to send an email
> this weekend =)
>
> On 01-02-2013 08:04, Radim Vansa wrote:
>> Hi guys,
>>
>> after dealing with a large cluster for a while, I find the way we use
>> OOB threads in synchronous configurations non-robust.
>> Imagine a situation where a node that is not an owner of the key calls
>> PUT. An RPC is then sent to the primary owner of that key, which
>> forwards the request to all other owners and, after these reply, replies
>> back to the originator.
>> There are two problems:
>> 1) If we simultaneously issue X requests from non-owners to the primary
>> owner, where X is the OOB thread-pool size, all the OOB threads end up
>> waiting for responses and there is no thread left to process an incoming
>> OOB response and release a waiting thread.
>> 2) Node A is primary owner of keyA and non-primary owner of keyB, while
>> B is primary owner of keyB and non-primary owner of keyA. We get many
>> requests for both keyA and keyB from other nodes; therefore, all OOB
>> threads on both nodes are busy calling RPCs to the non-primary owner,
>> and there is no one left to process those requests.
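Problem 1 can be reproduced in miniature with a bounded pool whose threads block waiting for "responses" that only the same pool can deliver. This is a toy model, not Infinispan code:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy model of problem 1 (not Infinispan code): every "request" occupies
// an OOB thread and blocks until its "response" is processed, but the
// response itself needs a free thread from the same pool. With as many
// simultaneous requests as pool threads, no thread is left for responses.
public class OobDepletionDemo {
    public static boolean depletes(int poolSize, int requests) throws Exception {
        ExecutorService oob = Executors.newFixedThreadPool(poolSize);
        CyclicBarrier allBlocked = new CyclicBarrier(requests);
        CountDownLatch replied = new CountDownLatch(requests);
        for (int i = 0; i < requests; i++) {
            oob.submit(() -> {
                try {
                    allBlocked.await(); // every request occupies a thread first
                    CountDownLatch response = new CountDownLatch(1);
                    oob.submit(response::countDown); // response needs a free thread
                    if (response.await(200, TimeUnit.MILLISECONDS))
                        replied.countDown();
                } catch (Exception ignored) { }
            });
        }
        boolean allReplied = replied.await(1, TimeUnit.SECONDS);
        oob.shutdownNow();
        return !allReplied; // true => pool depleted, requests timed out
    }
}
```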
>>
>> While we wait for the requests to time out, the nodes with depleted OOB
>> thread pools start suspecting all other nodes because they can't process
>> heartbeats etc...
>>
>> You can say "increase your OOB tp size", but that's not always an
>> option; I have currently set it to 1000 threads and it's not enough. In
>> the end, I will always be limited by RAM, and something tells me that
>> even nodes with a few gigs of RAM should be able to form a huge cluster.
>> We use 160 HotRod worker threads in JDG, which means that 160 *
>> clusterSize = 10240 parallel requests (with 64 nodes in my cluster) can
>> be executing, and if 10% of them target the same node with 1000 OOB
>> threads, it gets stuck. It's about scaling and robustness.
>>
>> Not that I'd have any good solution, but I'd really like to start a 
>> discussion.
>> Thinking about it a bit, the problem is that a blocking call (calling an
>> RPC on the primary owner from the message handler) can block
>> non-blocking calls (such as an RPC response, or a command that never
>> sends any more messages). Therefore, having a flag on the message saying
>> "this won't send another message" could let such messages be executed in
>> a different thread pool, which can never deadlock. In fact, the pools
>> could share threads, but the non-blocking pool would always have a few
>> spare.
>> It's a poor solution, as tracking which messages could block on the
>> other node is really, really hard (we can be sure only in the case of
>> RPC responses), especially once locks come into play. I will welcome
>> anything better.
>>
>> Radim
>>
>>
>> -----------------------------------------------------------
>> Radim Vansa
>> Quality Assurance Engineer
>> JBoss Datagrid
>> tel. +420532294559 ext. 62559
>>
>> Red Hat Czech, s.r.o.
>> Brno, Purkyňova 99/71, PSČ 612 45
>> Czech Republic
>>
>>
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev@lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev

-- 
Bela Ban, JGroups lead (http://www.jgroups.org)
