On Fri, Jun 28, 2013 at 3:41 PM, William Burns <mudokon...@gmail.com> wrote:
> On Fri, Jun 28, 2013 at 5:53 AM, Dan Berindei <dan.berin...@gmail.com> wrote:
> >
> > On Fri, Jun 28, 2013 at 12:51 AM, William Burns <mudokon...@gmail.com> wrote:
> >>
> >> On Thu, Jun 27, 2013 at 4:40 PM, Dan Berindei <dan.berin...@gmail.com> wrote:
> >> >
> >> > On Thu, Jun 27, 2013 at 4:40 PM, William Burns <mudokon...@gmail.com> wrote:
> >> >>
> >> >> Comments that were outstanding on PR:
> >> >>
> >> >> @danberindei:
> >> >>
> >> >> > +1 to move the discussion to the mailing list, could you summarize
> >> >> > your changes (preferably for both non-tx and tx cases) and send an
> >> >> > email to the list?
> >> >> > And now to add some more to this already unwieldy discussion :)
> >> >>
> >> >> > 1. Many interceptors check ownership, I don't think that would be
> >> >> > a problem. Besides, I think creating a new L1ReadSynchronizer for
> >> >> > every read is just as bad for performance as locking the key on
> >> >> > every read, so you'd need that check either way.
> >> >>
> >> >> We can't use a try lock approach for L1 invalidation when gets are
> >> >> locked without possibly dropping an L1 update.
> >> >>
> >> >> We can't use a timed lock approach for L1 invalidation when writes
> >> >> lock the key, as we could get into a timed deadlock situation when
> >> >> another node concurrently writes to a value/stripe.
> >> >>
> >> >> I still don't see how we can get away with locking on a get. What
> >> >> are you proposing?
> >> >
> >> > I'm proposing something like this:
> >> >
> >> > 1. For get commands: acquire local lock + remote get + store in L1 +
> >> > release local lock
> >> > 2. For invalidate commands: acquire local lock + remove entry from L1 +
> >> > release local lock
> >> > 3. For write commands: invoke on primary owner, and the primary owner
> >> > sends the invalidation back to the originator; alternatively, skip the
> >> > invalidation command to the originator and do invoke on primary owner +
> >> > acquire local lock + remove entry from L1 + release local lock
> >>
> >> I do like this and think it would be simpler for non-tx. However this
> >> will still not be as performant, I would think (just a gut feeling).
> >
> > It would indeed be less performant in some cases, e.g. if we have two get
> > commands for a remote key at the same time and the key doesn't exist, my
> > proposal would request the key twice.
> >
> > The writes to L1 *should* be rare enough that the extra lock contention
> > there would be minimal. There is a chance that an invalidation will be
> > stuck waiting for a remote get, so it will again be slower than your
> > approach.
> >
> >> However I am more than certain it will be slower when collisions
> >> occur, which is more likely with lock striping.
> >
> > I'm not that worried about lock striping - the user can always increase
> > the number of stripes to get less contention. And I hope they know better
> > than to use lock striping and transactions with multiple keys, or they'll
> > get deadlocks with or without L1 enabled.
>
> I didn't think of the deadlock problem with tx and lock striping. I am
> wondering whether we should log a warning message so that users get an
> explicit warning?

Yeah, I think a warning would be nice - especially in the documentation, it
looks way too innocuous.
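
To make my proposal above concrete, here is a minimal sketch, assuming a
hypothetical per-key lock registry (the class and helper names are made up;
a real version would need lock timeouts, cleanup of unused locks, and the
actual interceptor plumbing):

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.locks.ReentrantLock;

    // Hypothetical sketch: one local lock per key, so a remote get + L1
    // store and an L1 invalidation for the same key cannot interleave.
    class L1LockingSketch {
        private final ConcurrentHashMap<Object, ReentrantLock> locks =
            new ConcurrentHashMap<>();

        private ReentrantLock acquire(Object key) {
            // A real implementation would use tryLock with the configured
            // timeout and evict unused locks instead of growing the map.
            ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
            lock.lock();
            return lock;
        }

        // 1. get: acquire local lock + remote get + store in L1 + release
        Object get(Object key) {
            ReentrantLock lock = acquire(key);
            try {
                Object value = remoteGet(key);
                if (value != null) {
                    storeInL1(key, value); // invalidations wait on the same lock
                }
                return value;
            } finally {
                lock.unlock();
            }
        }

        // 2. invalidate: acquire local lock + remove entry from L1 + release
        void invalidate(Object key) {
            ReentrantLock lock = acquire(key);
            try {
                removeFromL1(key);
            } finally {
                lock.unlock();
            }
        }

        // Placeholders for the real remote get and data container operations.
        private Object remoteGet(Object key) { return null; }
        private void storeInL1(Object key, Object value) { }
        private void removeFromL1(Object key) { }
    }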
> >> I don't think this however is feasible for tx caches.
> >>
> >> Optimistic - we don't want to have gets being blocked while the tx is
> >> being committed - that could be quite a performance hit, especially if
> >> it has to do a 2-phase commit. Updating the L1 on remoteGetBeforeWrites
> >> could help remedy this (since the get wouldn't need to lock), however
> >> then you would be almost implementing what I have and still have the
> >> locking overhead. Maybe this would occur so infrequently it wouldn't
> >> matter though?
> >
> > Yes, I would propose removing the L1 commit in EntryWrappingInterceptor
> > as well, and moving it to the L1 interceptor - immediately after the
> > value was retrieved from the remote owner. The local lock would also be
> > released immediately.
> >
> > The part I wanted to avoid from your proposal was
> > L1ReadSync/L1ReadSynchronizer, because I found it kind of hard to
> > understand. I guess I'd find it more approachable if all the
> > synchronization (the map of L1ReadSynchronizers and the interaction with
> > it) were in a separate class, with its own tests, and the L1 interceptor
> > only knew about the high-level stuff. Like how LockManager hides a map
> > of locks; not like the LockManager unit tests, because they're kind of
> > lacking, but LockManager has the advantage that almost all the tests
> > touch it indirectly.
>
> True, I agree I like it as a separate class better - much better
> separation. Maybe I can move that over and add tests and see if that
> makes it more clear? If not we can still always use the locking approach.

+1
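
For what it's worth, here is a rough sketch of what I'd imagine that
separate class looking like (hypothetical names, not the PR's actual
L1ReadSynchronizer API): concurrent readers of a key are corralled onto a
single remote get, and the map stays hidden behind a small API, the way
LockManager hides its map of locks.

    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.function.Function;

    // Hypothetical corralling registry: at most one in-flight remote get
    // per key; everyone else waits for that thread's result.
    class RemoteGetCorraller<K, V> {
        private final ConcurrentHashMap<K, CompletableFuture<V>> inFlight =
            new ConcurrentHashMap<>();

        V get(K key, Function<K, V> remoteGet) {
            CompletableFuture<V> mine = new CompletableFuture<>();
            CompletableFuture<V> existing = inFlight.putIfAbsent(key, mine);
            if (existing != null) {
                // Another thread is already fetching this key: wait for its
                // result instead of issuing a second remote get.
                return existing.join();
            }
            try {
                V value = remoteGet.apply(key); // only this thread goes remote
                mine.complete(value);
                return value;
            } catch (RuntimeException e) {
                mine.completeExceptionally(e);
                throw e;
            } finally {
                inFlight.remove(key, mine); // the next get can go remote again
            }
        }
    }

The L1 interceptor would then only call something like
corraller.get(key, k -> remoteGetAndStoreInL1(k)) (remoteGetAndStoreInL1
standing in for whatever the interceptor already does), and the map and its
races could be unit tested in isolation.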
> >> Pessimistic - I haven't looked into pessimistic closely yet, but in
> >> that case I don't think it ever acquires the local locks, so it would
> >> have to acquire a remote lock on a get, which would be pretty
> >> disastrous (gets would indirectly have a weak FORCE_WRITE_LOCK
> >> enabled). Making it acquire local locks might work as well, but it
> >> would still have the same issue as optimistic when committing - I
> >> really am not familiar enough with pessimistic to say for sure.
> >
> > Nope, acquiring a local lock should be enough. We don't want to prevent
> > other txs from modifying the key at all, we just want to delay the
> > invalidation commands triggered by those modifications on the local
> > node. And the lock would only be held for the duration of the remote
> > get + L1 update, not all the way to the commit.
>
> Okay, cool. I didn't know for sure.
>
> >> >
> >> > All the lock acquisitions would use a regular timeout, not a try
> >> > lock approach.
> >>
> >> In this case they could, yes.
> >>
> >> >> > 2. By default L1 invalidations are sent as multicasts, so I'm not
> >> >> > sure ISPN-3273 really matters here. BTW, I wonder if we have a
> >> >> > check to only send L1 invalidations from one node if the threshold
> >> >> > is 0...
> >> >>
> >> >> I agree that is the default, but we should support the operation,
> >> >> although it doesn't matter for this discussion. Also I am curious as
> >> >> to why the multicast threshold for L1 isn't set to, say, 2 by
> >> >> default. It seems wasteful to send a multicast that all members have
> >> >> to process when only one of them would do anything about it. Do you
> >> >> know why this is like that?
> >> >
> >> > I suppose it's because we don't have good perf numbers for different
> >> > L1 invalidation threshold numbers...
> >> >
> >> > The problem is, we don't have a way to count all the requestors of a
> >> > key in the cluster, so it's reasonably likely that with a threshold
> >> > of 2 you'd get 1 unicast invalidation from one owner + 1 multicast
> >> > invalidation from the other owner, making it less efficient than a
> >> > single multicast invalidation.
> >>
> >> It is a nitpick anyways, and really shouldn't make that big of a
> >> difference.
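
As an aside, for anyone following the thread: the threshold is configurable
per cache. A sketch with the programmatic API (I believe the builder method
is invalidationThreshold, but check the javadoc of your version for the
exact name and semantics):

    import org.infinispan.configuration.cache.CacheMode;
    import org.infinispan.configuration.cache.Configuration;
    import org.infinispan.configuration.cache.ConfigurationBuilder;

    public class L1ConfigExample {
        public static Configuration distWithL1() {
            // Roughly: unicast invalidations while few nodes have requested
            // the key, a single multicast once the threshold is exceeded.
            return new ConfigurationBuilder()
                .clustering().cacheMode(CacheMode.DIST_SYNC)
                .l1().enable().invalidationThreshold(2)
                .build();
        }
    }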
> >> >> > 3a. Right, for put commands we can't hold the local lock while
> >> >> > executing the remote put, or we'll have a deadlock. But I think a
> >> >> > shorter lock, held only after the remote put completed (or after
> >> >> > the lock on the primary owner was acquired, with txs) should work.
> >> >>
> >> >> Same point under 1
> >> >
> >> > I don't see how we could get a deadlock if we don't hold the local
> >> > lock during the remote write invocation.
> >>
> >> Agree.
> >>
> >> >> > 3b. We'd also have an ownership check before, so we'd only
> >> >> > serialize the get commands that need to go remotely for the same
> >> >> > key. I think it would be almost the same as your solution
> >> >> > (although it does have one disadvantage - if the key doesn't exist
> >> >> > in the cache, all the get commands will go remotely). The number
> >> >> > of L1 writes should be very small compared to the number of L1
> >> >> > reads anyway, otherwise it would be more efficient to get the key
> >> >> > from the owner every time.
> >> >>
> >> >> You are suggesting an optimization for owner nodes, so they don't do
> >> >> the "corralling" for keys they own? I like that. Also I don't think
> >> >> it has the disadvantage, since it only goes remote if it isn't an
> >> >> owner.
> >> >
> >> > I meant your corralling strategy means if you have 2 concurrent get
> >> > commands and one of them retrieves a null from the entry owners, the
> >> > other command will return null directly. With regular locking, the
> >> > other command wouldn't find anything in L1 and it would do another
> >> > remote get.
> >> >
> >> > I don't think there's any disadvantage in skipping the corralling for
> >> > key owners, in fact I think we need to skip it if the key already
> >> > exists in L1, too.
> >>
> >> +1
> >>
> >> >> > It would be nice to agree on what guarantees we want to provide
> >> >> > for L1 invalidation in non-tx caches, I'm not sure if we can do
> >> >> > anything to prevent this scenario:
> >> >>
> >> >> Actually this scenario doesn't occur with non-tx since writes don't
> >> >> update the L1 with their value, they just invalidate. Tx caches are
> >> >> fine with this because they acquire the primary owner lock for the
> >> >> duration of the write, including the L1 update, so you can't have
> >> >> this ordering.
> >> >
> >> > Sounds good.
> >> >
> >> >> > A initiates a put(k, v1) to the primary owner B
> >> >> > B performs the put(k, v1), invalidates every non-owner and returns
> >> >> > B performs another put(k, v2), invalidating every non-owner
> >> >> > A receives the result from B and puts k=v1 in its L1
> >> >>
> >> >> @pruivo:
> >> >>
> >> >> > The invalidation does not need to wait for the remote get. When
> >> >> > you receive an invalidation, you can mark the current remote get
> >> >> > invalid. The invalidation command can return immediately and the
> >> >> > remote get can be repeated. Also, it removes the key from the data
> >> >> > container (if it exists).
> >> >>
> >> >> Dan hit it right on the head. Unfortunately there is no guarantee
> >> >> the cancellation can work properly, so it is best effort; if it
> >> >> fails, we wait until we know we will invalidate properly.
> >> >>
> >> >> > The writes can update the L1 through your L1Synchronized by adding
> >> >> > a simple method like updateL1(newValue). The blocking threads will
> >> >> > immediately return the new value and they don't need to wait for
> >> >> > the reply.
> >> >>
> >> >> Non-tx cache write operations aren't safe to update the L1 with the
> >> >> value, since they don't acquire the owning lock while updating the
> >> >> L1, which means you could have interleaved writes. That is the
> >> >> primary reason I rejected ISPN-3214. For tx caches we can't do this
> >> >> since the update has to be part of the tx, while the get would be
> >> >> updating the L1 outside of a transaction.
> >> >>
> >> >> > I see... However, I think that all the events should synchronize
> >> >> > at some point (update by remote get, update by local put and
> >> >> > invalidation).
> >> >>
> >> >> I was hoping that would cover this. Other than the outstanding issue
> >> >> in ISPN-2965.
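
To illustrate the best-effort cancellation we keep coming back to, a sketch
building on the corralling idea above (again, hypothetical names): each
in-flight remote get carries a flag the invalidation can set to veto the L1
write, plus a second future completed only after the L1 write is finished,
so an invalidation that arrives too late waits for the write instead of
racing with it.

    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.function.BiConsumer;
    import java.util.function.Consumer;
    import java.util.function.Function;

    // Hypothetical sketch of best-effort cancellation of pending L1 writes.
    class CancellableL1Writes<K, V> {

        private static final class PendingGet<V> {
            final CompletableFuture<V> value = new CompletableFuture<>();
            final CompletableFuture<Void> l1WriteDone = new CompletableFuture<>();
            volatile boolean cancelled; // set by an invalidation to veto the L1 write
        }

        private final ConcurrentHashMap<K, PendingGet<V>> inFlight =
            new ConcurrentHashMap<>();

        // Called by the single corralled reader for a key.
        V remoteGetAndUpdateL1(K key, Function<K, V> remoteGet,
                               BiConsumer<K, V> storeInL1) {
            PendingGet<V> pending = new PendingGet<>();
            inFlight.put(key, pending);
            try {
                V v = remoteGet.apply(key);
                pending.value.complete(v);    // waiters get the value right away
                if (!pending.cancelled) {     // an invalidation racing with this
                    storeInL1.accept(key, v); // check is the "very small window" -
                }                             // but it waits on l1WriteDone below,
                return v;                     // so nothing stale survives
            } catch (RuntimeException e) {
                pending.value.completeExceptionally(e);
                throw e;
            } finally {
                pending.l1WriteDone.complete(null);
                inFlight.remove(key, pending);
            }
        }

        // Called when an invalidation arrives for a key.
        void invalidate(K key, Consumer<K> removeFromL1) {
            PendingGet<V> pending = inFlight.get(key);
            if (pending != null) {
                pending.cancelled = true;   // try to cancel the pending L1 write...
                pending.l1WriteDone.join(); // ...and if too late, wait for it
            }
            removeFromL1.accept(key);       // safe now: no concurrent L1 write
        }
    }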
> >> >> On Thu, Jun 27, 2013 at 9:18 AM, William Burns <mudokon...@gmail.com> wrote:
> >> >>
> >> >> > First off I apologize for the length.
> >> >> >
> >> >> > There have been a few Jiras recently that have identified L1
> >> >> > consistency issues with both tx and non-tx sync caches. Async
> >> >> > caches with L1 have their own issues as well, but I only wanted to
> >> >> > talk about sync caches.
> >> >> >
> >> >> > https://issues.jboss.org/browse/ISPN-3197
> >> >> > https://issues.jboss.org/browse/ISPN-2965
> >> >> > https://issues.jboss.org/browse/ISPN-2990
> >> >> >
> >> >> > I have proposed a solution in
> >> >> > https://github.com/infinispan/infinispan/pull/1922 which should
> >> >> > start L1 consistency down the right track. There are quite a few
> >> >> > comments on it if you want to look into it more, but because of
> >> >> > that I am moving this to the dev mailing list.
> >> >> >
> >> >> > The key changes in the PR are the following (non-tx):
> >> >> >
> >> >> > 1. Concurrent reads for a key that can retrieve a remote value are
> >> >> > "corralled" into a single thread of execution for that given key.
> >> >> > This would reduce network traffic with concurrent gets for the
> >> >> > same key. Note the "corralling" only happens on a per-key basis.
> >> >> > 2. The single thread that is doing the remote get would update the
> >> >> > L1 if able (without locking) and make the value available to all
> >> >> > the requests waiting on the get.
> >> >> > 3. Invalidations that are received would first check to see if
> >> >> > there is a current remote get occurring for their keys. If there
> >> >> > is, they will attempt to cancel the L1 write(s) before they occur.
> >> >> > If the L1 write cannot be cancelled, the invalidation must wait
> >> >> > for the current remote get to complete and subsequently run. Note
> >> >> > the cancellation would only fail when the remote get is done and
> >> >> > in the middle of updating the L1, so this would be a very small
> >> >> > window.
> >> >> > 4. Local writes will also do the same thing as the invalidation,
> >> >> > with cancelling or waiting. Note that non-tx local writes only do
> >> >> > L1 invalidations and don't write the value to the data container.
> >> >> > The reasons why are described at
> >> >> > https://issues.jboss.org/browse/ISPN-3214
> >> >> > 5. Writes that require the previous value and don't have it in the
> >> >> > L1 would also do their get operations using the same "corralling"
> >> >> > method.
> >> >> >
> >> >> > 4/5 are not currently implemented in the PR.
> >> >> >
> >> >> > This approach would use no locking for non-tx caches for any L1
> >> >> > operations. The synchronization point would be done through the
> >> >> > "corralling" method and invalidations/writes communicating with it.
> >> >> >
> >> >> > Transactional caches would do almost the same thing as non-tx.
> >> >> > Note these changes are not done in any way yet.
> >> >> >
> >> >> > 1. Gets would now update the L1 immediately after retrieving the
> >> >> > value, without locking, but still using the "corralling" technique
> >> >> > that non-tx does. Previously the L1 update from a get was
> >> >> > transactional. This actually would remedy issue [1].
> >> >> > 2. Writes currently acquire the remote lock when committing, which
> >> >> > is why tx caches are able to update the L1 with the value. Writes
> >> >> > would do the same cancellation/wait method as non-tx.
> >> >> > 3. Writes that require the previous value and don't have it in the
> >> >> > L1 would also do their get operations using the same method.
> >> >> > 4. For tx caches, [2] would also have to be done.
> >> >> >
> >> >> > [1] - https://issues.jboss.org/browse/ISPN-2965?focusedCommentId=12779780&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12779780
> >> >> > [2] - https://issues.jboss.org/browse/ISPN-1540
> >> >> >
> >> >> > Rehashing is another issue, but we should be able to acquire the
> >> >> > state transfer lock before updating the L1 on a get, just like
> >> >> > when an entry is committed to the data container.
> >> >> >
> >> >> > Any comments/concerns would be appreciated.
> >> >> >
> >> >> > Thanks,
> >> >> >
> >> >> >  - Will
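
Presumably Will's points 5 (non-tx) and 3 (tx) above would reuse the same
entry point as plain reads; with the hypothetical registry sketched earlier
in the thread it would be a one-liner:

    // Hypothetical: a write that needs the previous value (e.g. a replace)
    // and doesn't find it in L1 fetches it through the same corraller, so
    // it still triggers at most one remote get per key at a time.
    Object previous = corraller.get(key, k -> remoteGetAndStoreInL1(k));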
_______________________________________________
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev