On May 13, 2014, at 14:58, Dan Berindei <dan.berin...@gmail.com> wrote:
> > > > On Mon, May 12, 2014 at 1:54 PM, Radim Vansa <rva...@redhat.com> wrote: > @Dan: It's absolutely correct to do the further writes in order to make > the cache consistent, I am not arguing against that. You've fixed the > outcome (state of cache) well. My point was that we should let the user > know that the value he gets is not 100% correct when we already know > that - and given the API, the only option to do that seems to me as > throwing an exception. > > The problem, as I see it, is that users also expect methods that throw an > exception to *not* modify the cache. > So we would break some of the users' expectations anyway. I don't see how we can guarantee that if a method throws an exception nothing has been applies without a 2PC/TX. I think this should be a expectation for non-tx caches. i.e. if an operation throws an exception, then the state of the data is inconsistent. > > > @Sanne: I was not suggesting that for now - sure, value versioning is (I > hope) on the roadmap. But that's more complicated, I though just about > making an adjustment to the current implementation. > > > Actually, just keeping a history of values would not fix the the return value > in all cases. > > When retrying a put on the new primary owner, the primary owner would still > have to compare our value with the latest value, and return the previous > value if they are equal. So we could have something like this: > > A is the originator, B is the primary owner, k = v0 > A -> B: put(k, v1) > B dies before writing v, C is now primary owner > D -> C: put(k, v1) // another put operation from D, with the same value > C -> D: null > A -> C: retry_put(k, v1) > C -> A: v0 // C assumes A is overwriting its own value, so it's returning the > previous one > > To fix that, we'd need a unique version generated by the originator - kind of > like a transaction id ;) > And to fix the HotRod use case, the HotRod client would have to be the one > generating the version. > > Cheers > Dan > > > > Radim > > On 05/12/2014 12:02 PM, Sanne Grinovero wrote: > > I don't think we are in a position to decide what is a reasonable > > compromise; we can do better. > > For example - as Radim suggested - it might seem reasonable to have > > the older value around for a little while. We'll need a little bit of > > history of values and tombstones anyway for many other reasons. > > > > > > Sanne > > > > On 12 May 2014 09:37, Dan Berindei <dan.berin...@gmail.com> wrote: > >> Radim, I would contend that the first and foremost guarantee that put() > >> makes is to leave the cache in a consistent state. So we can't just throw > >> an > >> exception and give up, leaving k=v on one owner and k=null on another. > >> > >> Secondly, put(k, v) being atomic means that it either succeeds, it writes > >> k=v in the cache, and it returns the previous value, or it doesn't succeed, > >> and it doesn't write k=v in the cache. Returning the wrong previous value > >> is > >> bad, but leaving k=v in the cache is just as bad, even if the all the > >> owners > >> have the same value. > >> > >> And last, we can't have one node seeing k=null, then k=v, then k=null > >> again, > >> when the only write we did on the cache was a put(k, v). So trying to undo > >> the write would not help. > >> > >> In the end, we have to make a compromise, and I think returning the wrong > >> value in some of the cases is a reasonable compromise. Of course, we should > >> document that :) > >> > >> I also believe ISPN-2956 could be fixed so that HotRod behaves just like > >> embedded mode after the ISPN-3422 fix, by adding a RETRY flag to the HotRod > >> protocol and to the cache itself. > >> > >> Incidentally, transactional caches have a similar problem when the > >> originator leaves the cluster: ISPN-3421 [1] > >> And we can't handle transactional caches any better than non-transactional > >> caches until we expose transactions to the HotRod client. > >> > >> [1] https://issues.jboss.org/browse/ISPN-2956 > >> > >> Cheers > >> Dan > >> > >> > >> > >> > >> On Mon, May 12, 2014 at 10:21 AM, Radim Vansa <rva...@redhat.com> wrote: > >>> Hi, > >>> > >>> recently I've stumbled upon one already expected behaviour (one instance > >>> is [1]), but which did not got much attention. > >>> > >>> In non-tx cache, when the primary owner fails after the request has been > >>> replicated to backup owner, the request is retried in the new topology. > >>> Then, the operation is executed on the new primary (the previous > >>> backup). The outcome has been already fixed in [2], but the return value > >>> may be wrong. For example, when we do a put, the return value for the > >>> second attempt will be the currently inserted value (although the entry > >>> was just created). Same situation may happen for other operations. > >>> > >>> Currently, it's not possible to return the correct value (because it has > >>> already been overwritten and we don't keep a history of values), but > >>> shouldn't we rather throw an exception if we were not able to fulfil the > >>> API contract? > >>> > >>> Radim > >>> > >>> [1] https://issues.jboss.org/browse/ISPN-2956 > >>> [2] https://issues.jboss.org/browse/ISPN-3422 > >>> > >>> -- > >>> Radim Vansa <rva...@redhat.com> > >>> JBoss DataGrid QA > >>> > >>> _______________________________________________ > >>> infinispan-dev mailing list > >>> infinispan-dev@lists.jboss.org > >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev > >> > >> > >> _______________________________________________ > >> infinispan-dev mailing list > >> infinispan-dev@lists.jboss.org > >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > _______________________________________________ > > infinispan-dev mailing list > > infinispan-dev@lists.jboss.org > > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > -- > Radim Vansa <rva...@redhat.com> > JBoss DataGrid QA > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev@lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev@lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev Cheers, -- Mircea Markus Infinispan lead (www.infinispan.org) _______________________________________________ infinispan-dev mailing list infinispan-dev@lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev