On 1 Feb 2013, at 08:04, Radim Vansa wrote:

> Hi guys,
> 
> after dealing with the large cluster for a while, I find the way we use 
> OOB threads in synchronous configuration non-robust.
> Imagine a situation where a node that is not an owner of the key calls PUT. 
> An RPC is then sent to the primary owner of that key, which reroutes the 
> request to all other owners and, after they reply, replies back.

This delegation RPC pattern happens for non-transactional caches only. Do you 
have the same problem with transactional caches as well?

> There are two problems:
> 1) If X requests arrive simultaneously from non-owners at the primary owner, 
> where X is the OOB TP size, all the OOB threads are waiting for responses and 
> there is no thread left to process an OOB response and release a thread.
> 2) Node A is primary owner of keyA and non-primary owner of keyB; B is 
> primary owner of keyB and non-primary owner of keyA. We get many requests for 
> both keyA and keyB from other nodes, so all OOB threads on both nodes call 
> RPC to the non-primary owner, but there's no one who could process the request.
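
Problem 1 can be reproduced outside Infinispan with a plain fixed-size executor. The sketch below is only an illustration under stated assumptions: the class name, the run method, and the use of a java.util.concurrent pool in place of the JGroups OOB pool are all hypothetical. Each "request" occupies a pool thread and then waits for a "response" that has to run on the same pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for the OOB pool; not Infinispan or JGroups code.
public class OobExhaustionSketch {

    // Returns how many requests timed out while waiting for their response.
    public static int run(int poolSize, int requests, long timeoutMs) throws Exception {
        ExecutorService oob = Executors.newFixedThreadPool(poolSize);
        // Makes sure the requests really hold their threads at the same time.
        CountDownLatch started = new CountDownLatch(Math.min(poolSize, requests));
        List<Future<Boolean>> results = new ArrayList<>();
        try {
            for (int i = 0; i < requests; i++) {
                results.add(oob.submit(() -> {
                    started.countDown();
                    started.await();
                    // The "response" needs a thread from the same pool.
                    Future<?> response = oob.submit(() -> { });
                    try {
                        response.get(timeoutMs, TimeUnit.MILLISECONDS);
                        return true;
                    } catch (TimeoutException e) {
                        return false; // no free thread processed the response in time
                    }
                }));
            }
            int timedOut = 0;
            for (Future<Boolean> result : results) {
                if (!result.get()) {
                    timedOut++;
                }
            }
            return timedOut;
        } finally {
            oob.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("pool=4, requests=4, timed out: " + run(4, 4, 200));
        System.out.println("pool=4, requests=3, timed out: " + run(4, 3, 2000));
    }
}
```

With as many simultaneous requests as pool threads, every request should end in a timeout; with one request fewer, the single spare thread is enough to process all responses.
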
> 
> While we wait for the requests to timeout, the nodes with depleted OOB 
> threadpools start suspecting all other nodes because they can't receive 
> heartbeats etc...
> 
> You can say "increase your OOB tp size", but that's not always an option. I 
> currently have it set to 1000 threads and it's not enough. In the end, I will 
> always be limited by RAM, and something tells me that even nodes with a few 
> gigs of RAM should be able to form a huge cluster. We use 160 hotrod worker 
> threads in JDG, which means that 160 * clusterSize = 10240 (64 nodes in my 
> cluster) parallel requests can be executed, and if 10% of them (1024) target 
> the same node with 1000 OOB threads, it gets stuck. It's about scaling and 
> robustness.
> 
> Not that I'd have any good solution, but I'd really like to start a 
> discussion.
> Thinking about it a bit, the problem is that a blocking call (calling an RPC 
> on the primary owner from a message handler) can block non-blocking calls 
> (such as an RPC response or a command that never sends any more messages). 
> Therefore, having a flag on the message saying "this won't send another 
> message" could let the message be executed in a different thread pool, which 
> would never be deadlocked. In fact, the pools could share threads, but the 
> non-blocking one would always have a few threads to spare.
> It's a bad solution, as tracking which messages could block on the other 
> node is really, really hard (we can be sure only in the case of RPC 
> responses), especially when locks are involved. I will welcome anything better.
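
The flag idea above could look roughly like the following. This is a sketch under assumptions, not an existing JGroups or Infinispan API: the Dispatcher class, its dispatch method, and the wontSendAnotherMessage parameter are hypothetical names, and two separate fixed pools stand in for "shared threads with a few kept spare".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration of routing by a "won't send another message" flag.
public class ReservedPoolSketch {

    static final class Dispatcher {
        private final ExecutorService general;  // handlers that may block on further RPCs
        private final ExecutorService reserved; // handlers that never send another message

        Dispatcher(int generalThreads, int reservedThreads) {
            general = Executors.newFixedThreadPool(generalThreads);
            reserved = Executors.newFixedThreadPool(reservedThreads);
        }

        <T> Future<T> dispatch(boolean wontSendAnotherMessage, Callable<T> handler) {
            return (wontSendAnotherMessage ? reserved : general).submit(handler);
        }

        void shutdown() {
            general.shutdownNow();
            reserved.shutdownNow();
        }
    }

    // Returns how many requests timed out while waiting for their response.
    public static int run(int generalThreads, int reservedThreads,
                          int requests, long timeoutMs) throws Exception {
        Dispatcher dispatcher = new Dispatcher(generalThreads, reservedThreads);
        CountDownLatch started = new CountDownLatch(Math.min(generalThreads, requests));
        Callable<Boolean> responseHandler = () -> Boolean.TRUE;
        Callable<Boolean> requestHandler = () -> {
            started.countDown();
            started.await(); // all general threads are now occupied
            Future<Boolean> response = dispatcher.dispatch(true, responseHandler);
            try {
                return response.get(timeoutMs, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                return false;
            }
        };
        List<Future<Boolean>> results = new ArrayList<>();
        try {
            for (int i = 0; i < requests; i++) {
                results.add(dispatcher.dispatch(false, requestHandler));
            }
            int timedOut = 0;
            for (Future<Boolean> result : results) {
                if (!result.get()) {
                    timedOut++;
                }
            }
            return timedOut;
        } finally {
            dispatcher.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("timed out: " + run(4, 1, 4, 2000));
    }
}
```

Even with every general thread held by a waiting request, a single reserved thread is enough to deliver the responses here. The hard part named above remains: the sender has to know for certain which messages will not block on the receiving node.
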
> 
> Radim
> 
> 
> -----------------------------------------------------------
> Radim Vansa
> Quality Assurance Engineer
> JBoss Datagrid
> tel. +420532294559 ext. 62559
> 
> Red Hat Czech, s.r.o.
> Brno, Purkyňova 99/71, PSČ 612 45
> Czech Republic
> 
> 
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev@lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

Cheers,
-- 
Mircea Markus
Infinispan lead (www.infinispan.org)



