On Oct 22, 2010, at 7:41 AM, Jérôme Verstrynge wrote:

> Let's imagine that A initiates its column write at: 334450 ms with 'AAA' and
> timestamp 334450 ms
> Let's imagine that E initiates its column write at: 334451 ms with 'ZZZ' and
> timestamp 334450 ms
> (E is the latest write)
>
> Let's imagine that A reaches C at 334455 ms and performs its write.
> Let's imagine that E reaches C at 334456 ms and attempts to perform its
> write. It will lose the timestamp-tie ('AAA' is greater than 'ZZZ').

How is this any different from E's perspective than if A had come along a moment later with timestamp 334452? What you describe is an application in *desperate* need of either a serious redesign or a distributed locking mechanism. This really isn't a Cassandra-specific problem; Cassandra just happens to be the distributed storage system at issue. Any such system without a locking mechanism will present some form of this problem, and the answer will be the same: avoid it in the application design, or incorporate a locking mechanism into the application.

> If there is a timestamp-tie, then the context becomes uncertain for E, out of
> the blue.
> If application E can't be sure about what has been saved in Cassandra, it
> cannot rely on what it has in memory. It is a vicious circle. It can't
> anticipate on the potential actions of A on the column too.

And how is this different from E's data being overwritten with a later timestamp? Either way, what E thinks is in Cassandra really isn't. If you need to make sure you have consistency at this level, you *need* a locking mechanism.

> This is unusual for any application, but may be this is the price to pay for
> using Cassandra. Fair enough.

Hardly. Any non-serial application that doesn't use some form of locking has this exact same problem at all levels of storage, possibly even in its internal variables.

> If E is not informed of the timestamp tie, then it is left alone in the dark.
> Hence, this is why I say Cassandra is not deterministic to E. The result of a
> write is potentially non-deterministic in what it actually performs.

Cassandra is deterministic for a given input. What you're saying is you aren't properly controlling the input that your application is giving it.

> If E was aware that it lost a timestamp-tie, it would know that there is a
> possible gap between its internal memory representation and what it tried to
> save into Cassandra.
> That is, EVEN if there is no further write on that same
> column (or, in other words, regardless of any potential subsequent races).

What is the significance of this?

> If E was informed it lost a timestamp-tie, it could re-read the column (and
> let's assume that there is no further write in between, but this does not
> change anything to the argument). It could spot that its write for timestamp
> value 334450 ms failed, and also the reason why ('AAA' greater than 'ZZZ'). It
> could operate a new write, which eventually could result in another
> timestamp-tie, but at least it would be informed about it too... It would
> have a safety net.

To what end? A and E would apparently get into some sort of never-ending fight. The application as described is broken and needs to be fixed.

> The case I am trying to cover is the case where the context for application E
> becomes invalid because of a successful write call to Cassandra without
> registration of 'ZZZ'. How can Cassandra call it a successful write, when in
> fact, it isn't for application E? I believe Cassandra should notify
> application E one way or another. This is why I mentioned an extra
> timestamp-tie flag in the write ACK sent by nodes back to node E.

Here's part of the problem. You're seeing E as a distinct application from A which can behave completely independently. You need to stop thinking like that. It leads to broken architectures. Even if the E and A processes come from entirely different code bases, you need to start by thinking of them as one application. That application is broken.

> The subsequent question I have is:
>
> If 'value breaks timestamp-tie', how does Cassandra behave in case of
> updates? If there is a column with value 'AAA' at 334450 ms and an
> application explicitly wants to update this value to 'ZZZ' for 334450 ms, it
> seems like the timestamp-tie will prevent that. Hence, the update/mutation
> would be non-deterministic to E.
> It seems like one should first delete the
> existing record and write a new one (and that could lead to race conditions
> and timestamp-ties too).

You need a locking mechanism. Timestamps aren't the droids you're looking for.

> I think this should be documented, because engineers will hit that 'local'
> non-deterministic issue for sure if two instances of their applications perform
> 'completed writes' in the same column family. Completed does not mean
> successful, even with quorum (or ALL). They ought to know it.

I'm honestly not sure why they wouldn't. One need only perform a very cursory investigation of Cassandra to realize that addition of a locking mechanism is necessary for many applications, such as the one described here.

-NK
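
To make the "deterministic for a given input" point concrete, here is a minimal Python sketch of last-write-wins reconciliation. It is not Cassandra's code: the `Column` type and `reconcile` function are hypothetical names, and the tie-break rule (byte-wise comparison, larger value kept) is an assumption chosen for illustration, so which of 'AAA' and 'ZZZ' survives here should not be read as a statement about Cassandra itself. What the sketch does show is that the surviving column depends only on the two (value, timestamp) pairs, not on which write reaches the node first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for a stored column."""
    value: bytes
    timestamp: int  # client-supplied timestamp, in ms


def reconcile(a: Column, b: Column) -> Column:
    """Return the column that survives when two writes hit the same cell."""
    if a.timestamp != b.timestamp:
        # The write carrying the higher timestamp wins.
        return a if a.timestamp > b.timestamp else b
    # ASSUMPTION: on a timestamp tie, keep the larger value (byte-wise).
    # The direction of this comparison is illustrative only.
    return a if a.value >= b.value else b


# The two writes from the scenario: same timestamp, different values.
from_a = Column(b"AAA", 334450)
from_e = Column(b"ZZZ", 334450)

# Arrival order at the node does not change the outcome.
assert reconcile(from_a, from_e) == reconcile(from_e, from_a)

# A later timestamp from A overwrites E's value just as silently as a tie can.
later_from_a = Column(b"AAA", 334452)
assert reconcile(from_e, later_from_a) == later_from_a
```

In both cases E's in-memory view can diverge from what is stored, and nothing in the write path tells it so, which is why the coordination has to come from the application design or from a lock.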