Very nice design, congratulations. Second picture is missing a final "4" step returning the ACK to N1 right?
Abot the third proposal I was thinking about a possible improvement, not sure if it works: consider the list defining the keys being locked, already ordered. As far as I understood about the ordering on the hashweel, you can now split this list to have the groups which need to be sent to each node: k1, k2, k3, k4, k5 -> N2{k1, k2} N3{k3, k4} N4{k5} i.e. the sequence does not need to be reordered and we can make use of that. Now instead of sending the full list in multicast to each involved node, we send the list only to the first node, and wait for an ACK from the last node involved in the sequence. This does not reduce the number of async rpcs being executed, but sends less packets on the wire, and has the advantage of never needing to go back to recover from a failed node or recover from a potential deadlock, as all keys are always taken in proper order, not "optimistically". [reusing the same picture of "Hybrid incremental approach"] assuming this list of keys, owned by these nodes: k1, k2, k3, k4, k5 -> N2{k1, k2} N3{k3, k4} N4{k5} Now if N2 receives the list, it acquires k1, k2, in order, and sends the full list over to N3 with a message like N2{k1 *taken, k2 *taken} N3{k3, k4} N4{k5} and it will also wait for an async ACK. If it receives an ACK, N2's work is done. If it doesn't receive one in a reasonable time, it will assume N3 is dead: now the list of nodes changes, but the order in which these elements need to be acquired did not. In fact, some of these might be moved to N2 himself, some to N4 since the rehashing will move them to neighbours: N3 owned keys can be taken right away (again not violating the order), the rest of the list is again forwarded to the next alive node. The last node will return an ACK to the originating node to inform that the prepare phase was successful, and when he did it will send an ACK back to it's previous node to free it's resources, this one will propagate back the ACK, etc. No deadlocks, no timeouts, no undo operations. what am I missing? Cheers, Sanne 2011/7/6 Mircea Markus <mircea.mar...@jboss.com>: > > On 6 Jul 2011, at 09:00, Paolo Romano wrote: > >> On 7/5/11 3:47 PM, Mircea Markus wrote: >>> On 5 Jul 2011, at 15:39, Manik Surtani wrote: >>> >>>> Good stuff, shame about the RPC count . ;) >>> yeah. Still very valid when there are deadlocks, guess figures will tell us >>> more precisely what the gain is >>> >> I agree with Manik's observation. >> >> When benchmarking it, it would be interesting to compare it also with >> the total order multicast approach that Pedro has been developing. > +1 >> >> When do you plan to have these locking optimizations available? > We hope to have it in for 5.1, but it has a lower priority compared with the > other lock related changes. > >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev@lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev@lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ infinispan-dev mailing list infinispan-dev@lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev