On Thu, Feb 7, 2013 at 8:05 PM, Mircea Markus <mmar...@redhat.com> wrote:
> On 1 Feb 2013, at 09:54, Dan Berindei wrote:
>
> > Yeah, I wouldn't call this a "simple" solution...
> >
> > The distribution/replication interceptors are quite high in the
> > interceptor stack, so we'd have to save the state of the interceptor
> > stack (basically the thread's stack) somehow and resume processing it
> > on the thread receiving the responses. In a language that supports
> > continuations that would be a piece of cake, but since we're in Java
> > we'd have to completely change the way the interceptor stack works.
> >
> > Actually we do hold the lock on modified keys while the command is
> > replicated to the other owners. But I think locking wouldn't be a
> > problem: we already allow locks to be owned by transactions instead of
> > threads, so it would just be a matter of creating a "lite transaction"
> > for non-transactional caches. Obviously the
> > TransactionSynchronizerInterceptor would have to go, but I see that as
> > a positive thing ;)
>
> The TransactionSynchronizerInterceptor protected the CacheTransaction
> objects from multiple writes; we'd still need that because of the NBST
> forwarding.

We wouldn't need it if access to the Collection members in CacheTransaction
was properly synchronized. Perhaps "hack" is too strong a word; let's just
say I'm seeing TransactionSynchronizerInterceptor as a temporary solution :)

> > So yeah, it could work, but it would take a huge amount of effort and
> > it's going to obfuscate the code. Plus, I'm not at all convinced that
> > it's going to improve performance that much compared to a new thread
> > pool.
>
> +1
>
> > Cheers
> > Dan
> >
> > On Fri, Feb 1, 2013 at 10:59 AM, Radim Vansa <rva...@redhat.com> wrote:
> >
>> Yeah, that would work if it is possible to break the execution path into
>> the FutureListener from the middle of the interceptor stack - I am really
>> not sure about that, but as in the current design no locks should be held
>> when an RPC is called, it may be possible.
>>
>> Let's see what someone more informed (Dan?) would think about that.
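[Editor's aside] Dan's point about resuming the interceptor stack on the thread that receives the responses can be sketched in Java with `CompletableFuture` chaining. This is an illustrative sketch only; the class and method names below (`AsyncStackSketch`, `invokeRemotely`, `visitPut`) are invented for the example and are not Infinispan's actual interceptor API.

```java
import java.util.concurrent.CompletableFuture;

// Sketch: instead of blocking the thread that carries the command down the
// interceptor stack, each interceptor returns a CompletableFuture, and the
// remainder of the stack runs as a continuation on whichever thread
// completes the future (e.g. the thread receiving the RPC responses).
public class AsyncStackSketch {

    // Stand-in for a remote invocation; completes on another thread.
    static CompletableFuture<String> invokeRemotely(String command) {
        return CompletableFuture.supplyAsync(() -> "response-to-" + command);
    }

    // A "distribution interceptor" that forwards the command and chains
    // the rest of the stack instead of calling future.get().
    static CompletableFuture<String> visitPut(String command) {
        return invokeRemotely(command)
                .thenApply(resp -> "unlocked-after-" + resp); // continuation
    }

    public static void main(String[] args) {
        // Block only at the very edge, for demonstration purposes.
        System.out.println(visitPut("PUT-k1").join());
    }
}
```

The delivering thread returns as soon as the future is created; the continuation (releasing locks, sending the reply) runs on the completing thread, which is exactly the restructuring Dan describes as a huge amount of effort in the real interceptor stack.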
>>
>> Thanks, Bela
>>
>> Radim
>>
>> ----- Original Message -----
>> | From: "Bela Ban" <b...@redhat.com>
>> | To: infinispan-dev@lists.jboss.org
>> | Sent: Friday, February 1, 2013 9:39:43 AM
>> | Subject: Re: [infinispan-dev] Threadpools in a large cluster
>> |
>> | It looks like the core problem is an incoming RPC-1 which triggers
>> | another blocking RPC-2: the thread delivering RPC-1 is blocked waiting
>> | for the response from RPC-2, and can therefore not be used to serve
>> | other requests for the duration of RPC-2. If RPC-2 takes a while, e.g.
>> | waiting to acquire a lock in the remote node, then it is clear that
>> | the thread pool will quickly exceed its max size.
>> |
>> | A simple solution would be to prevent invoking blocking RPCs *from
>> | within* a received RPC. Let's take a look at an example:
>> | - A invokes a blocking PUT-1 on B
>> | - B forwards the request as blocking PUT-2 to C and D
>> | - When PUT-2 returns and B gets the responses from C and D (or the
>> |   first one to respond, don't know exactly how this is implemented),
>> |   it sends the response back to A (PUT-1 terminates now at A)
>> |
>> | We could change this to the following:
>> | - A invokes a blocking PUT-1 on B
>> | - B receives PUT-1. Instead of invoking a blocking PUT-2 on C and D,
>> |   it does the following:
>> |   - B invokes PUT-2 and gets a future
>> |   - B adds itself as a FutureListener, and it also stores the address
>> |     of the original sender (A)
>> |   - When the FutureListener is invoked, B sends back the result as a
>> |     response to A
>> | - Whenever a member leaves the cluster, the corresponding futures are
>> |   cancelled and removed from the hashmaps
>> |
>> | This could probably be done differently (e.g. by sending asynchronous
>> | messages and implementing a finite state machine), but the core of
>> | the solution is the same; namely to avoid having an incoming thread
>> | block on a sync RPC.
>> |
>> | Thoughts ?
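[Editor's aside] Bela's proposed flow on B can be sketched as below. The names (`NonBlockingForward`, `forwardPut`, `onPut1`, `onMemberLeft`) are invented for illustration and are not the real JGroups or Infinispan API; `CompletableFuture` stands in for JGroups' future/FutureListener pair.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;

// B forwards PUT-2 asynchronously and registers a listener; the listener
// thread, not the thread that delivered PUT-1, sends the response back
// to A. Futures are tracked per original sender so they can be cancelled
// when that member leaves the cluster.
public class NonBlockingForward {

    // pending futures keyed by the original sender's address
    static final Map<String, CompletableFuture<String>> pending =
            new ConcurrentHashMap<>();

    // Stand-in for the asynchronous PUT-2 to the other owners.
    static CompletableFuture<String> forwardPut(String key) {
        return CompletableFuture.supplyAsync(() -> "ack:" + key);
    }

    // Called when B receives PUT-1 from 'sender'; returns immediately,
    // freeing the delivering thread to serve other requests.
    static void onPut1(String sender, String key,
                       BiConsumer<String, String> sendResponse) {
        CompletableFuture<String> future = forwardPut(key);
        pending.put(sender, future);
        future.thenAccept(result -> {            // runs when PUT-2 completes
            pending.remove(sender);
            sendResponse.accept(sender, result); // respond to A from here
        });
    }

    // View-change hook: cancel and forget futures of a departed member.
    static void onMemberLeft(String member) {
        CompletableFuture<String> f = pending.remove(member);
        if (f != null) f.cancel(true);
    }
}
```

The key property is that no thread is parked between receiving PUT-1 and sending the response to A, which is the invariant Bela asks for: an incoming thread never blocks on a sync RPC.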
>> |
>> | On 2/1/13 9:04 AM, Radim Vansa wrote:
>> | > Hi guys,
>> | >
>> | > after dealing with the large cluster for a while I find the way we
>> | > use OOB threads in synchronous configuration non-robust.
>> | > Imagine a situation where a node which is not an owner of the key
>> | > calls PUT. Then an RPC is made to the primary owner of that key,
>> | > which reroutes the request to all other owners and, after these
>> | > reply, it replies back.
>> | > There are two problems:
>> | > 1) If we make X simultaneous requests from non-owners to the
>> | >    primary owner, where X is the OOB TP size, all the OOB threads
>> | >    are waiting for the responses and there is no thread left to
>> | >    process the OOB response and release a thread.
>> | > 2) Node A is primary owner of keyA and non-primary owner of keyB;
>> | >    B is primary of keyB and non-primary of keyA. We get many
>> | >    requests for both keyA and keyB from other nodes; therefore,
>> | >    all OOB threads on both nodes call RPCs to the non-primary
>> | >    owner, but there's no one left who could process the request.
>> | >
>> | > While we wait for the requests to time out, the nodes with depleted
>> | > OOB threadpools start suspecting all other nodes because they can't
>> | > receive heartbeats etc.
>> | >
>> | > You can say "increase your OOB TP size", but that's not always an
>> | > option; I have currently set it to 1000 threads and it's not
>> | > enough. In the end, I will always be limited by RAM, and something
>> | > tells me that even nodes with a few gigs of RAM should be able to
>> | > form a huge cluster. We use 160 HotRod worker threads in JDG, which
>> | > means that 160 * clusterSize = 10240 (64 nodes in my cluster)
>> | > parallel requests can be executed, and if 10% target the same node
>> | > with 1000 OOB threads, it gets stuck. It's about scaling and
>> | > robustness.
>> | >
>> | > Not that I'd have any good solution, but I'd really like to start a
>> | > discussion.
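[Editor's aside] A quick back-of-envelope check of the numbers Radim quotes (160 HotRod workers, 64 nodes, a 1000-thread OOB pool); the class and method names are for illustration only:

```java
// Worked arithmetic for the capacity claim: 160 worker threads per node
// times 64 nodes gives 10240 requests that may be in flight at once, so
// even a modest 10% skew toward one node (1024 requests) exceeds that
// node's 1000 OOB threads and every thread ends up blocked.
public class CapacityMath {

    static int parallelRequests(int workersPerNode, int clusterSize) {
        return workersPerNode * clusterSize;
    }

    public static void main(String[] args) {
        int parallel = parallelRequests(160, 64); // 10240 in-flight requests
        int towardsOneNode = parallel / 10;       // ~10% hitting one node
        int oobPoolSize = 1000;
        System.out.println(parallel + " parallel requests, " + towardsOneNode
                + " at one node vs " + oobPoolSize + " OOB threads");
    }
}
```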
>> | > Thinking about it a bit, the problem is that a blocking call
>> | > (calling an RPC on the primary owner from the message handler) can
>> | > block non-blocking calls (such as an RPC response, or a command
>> | > that never sends any more messages). Therefore, having a flag on
>> | > the message saying "this won't send another message" could let the
>> | > message be executed in a different threadpool, which would never be
>> | > deadlocked. In fact, the pools could share the threads, but the
>> | > non-blocking one would always have a few threads spare.
>> | > It's a bad solution, as maintaining which message could block on
>> | > the other node is really, really hard (we can be sure only in the
>> | > case of RPC responses), especially when locks come into play. I
>> | > will welcome anything better.
>> |
>> | --
>> | Bela Ban, JGroups lead (http://www.jgroups.org)
>> |
>> | _______________________________________________
>> | infinispan-dev mailing list
>> | infinispan-dev@lists.jboss.org
>> | https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
> Cheers,
> --
> Mircea Markus
> Infinispan lead (www.infinispan.org)
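[Editor's aside] Radim's flag-based idea can be sketched as a dispatcher that routes messages by whether they may block. The names (`TwoPoolDispatcher`, `dispatch`, `mayBlock`) and the pool sizes are illustrative assumptions, not actual Infinispan or JGroups code:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Messages flagged "this won't send another message" (e.g. RPC responses)
// go to a dedicated pool whose threads never block on further RPCs, so
// responses can always be processed and release the blocked callers in
// the other pool - the deadlock in problems 1) and 2) cannot form there.
public class TwoPoolDispatcher {

    final ExecutorService blockingPool = Executors.newFixedThreadPool(100);
    final ExecutorService nonBlockingPool = Executors.newFixedThreadPool(8);

    // Route by the sender-provided flag; the hard part Radim notes is
    // knowing at the sender whether a message can block on the receiver.
    void dispatch(Runnable message, boolean mayBlock) {
        (mayBlock ? blockingPool : nonBlockingPool).execute(message);
    }

    void shutdown() {
        blockingPool.shutdown();
        nonBlockingPool.shutdown();
    }
}
```

As Radim says, the two pools could also share a common set of threads with a few reserved for the non-blocking class; the sketch keeps them separate only for clarity.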