Peter, thanks for the extensive feedback. Much appreciated.

On 26/10/2010 0:47, Peter Schuller wrote:
This doesn't mean that your problem is somehow invalid; but it doesn't
sound like QUORUM consistency (over-writing) writes is the solution.

What is the difference, from your application's perspective, between
the timestamp tie and a write simply happening a millisecond later by
an un-coordinated concurrent writer? In both cases, the data in
Cassandra will no longer match your client's view of it.
I may have been unclear about the meaning of timestamps in Cassandra. I was under the impression that writing the same key with two different timestamps would result in two 'rows'. From what you say, that does not seem to be the case. Can you confirm? (In other words, whoever has the greatest timestamp overwrites the previous records with lower timestamps.)

I'm repeating myself but just to be clear: So again, it seems to me
such an ACK would not be useful since you would not be made aware of
any change that happens later on anyway. It does not seem semantically
"relevant" except perhaps as a probabilistic optimization. As soon as
your write completes, you have no idea what is in Cassandra,
regardless of timestamp ties (assuming you have the potential for
concurrent writers).
Assuming the latest timestamp erases/overwrites previous entries, I agree.

If 'value breaks timestamp-tie', how does Cassandra behave in case of
updates? If there is a column with value 'AAA' at 334450 ms and an
application explicitly wants to update this value to 'ZZZ' for 334450 ms,
it seems like the timestamp-tie will prevent that. Hence, the
update/mutation would be nondeterministic to E. It seems like one should
first delete the existing record and write a new one (and that could lead to
race conditions and timestamp-ties too).
A single client wishing to make multiple logically subsequent writes
should ensure that the same timestamp is not used for such writes.
Makes sense if the latest timestamp erases/overwrites previous data.
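
To make sure I understand the advice, here is a minimal sketch of how a single client could avoid reusing a timestamp for logically subsequent writes. The class name `MonotonicTimestamps` and its `clock` parameter are my own inventions for illustration, not part of any Cassandra client library, and I am assuming microsecond timestamps. It only guarantees uniqueness within one process; it does nothing about ties between un-coordinated writers.

```python
import threading
import time


class MonotonicTimestamps:
    """Hands out strictly increasing timestamps within one process.

    Hypothetical helper, not a Cassandra API. If the wall clock has not
    advanced (or has gone backwards) since the last call, the previous
    value plus one is returned instead.
    """

    def __init__(self, clock=None):
        # Assumption: timestamps are microseconds since the epoch.
        self._clock = clock or (lambda: int(time.time() * 1_000_000))
        self._last = 0
        self._lock = threading.Lock()

    def next(self):
        with self._lock:
            now = self._clock()
            # Never return a value <= the one handed out previously.
            self._last = max(now, self._last + 1)
            return self._last
```

The client would call `next()` once per write and pass the result as the column timestamp, so that a later write from the same client always carries a higher timestamp than an earlier one.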

I think this should be documented, because engineers will hit that 'local'
nondeterministic issue for sure if two instances of their applications
perform 'completed writes' in the same column family. Completed does not
mean successful, even with quorum (or ALL). They ought to know it.
I think it does. I believe the results you are describing as
unexpected are fully expected fundamentally, and there is no real
difference implied in receiving a timestamp ACK flag back. I'm totally
open to being wrong or having misunderstood something (or both), but
right now I don't see it. If on the other hand I'm not wrong then
perhaps we can figure out how to document or present the functionality
of Cassandra better :)
I know this is a corner case, but I have not seen in the documentation that the latest timestamp erases/overwrites previous data. Now, I may have missed something here. Maybe I did not rub my eyes enough or the coffee had not kicked in yet.

If not, I would suggest adding a short note to the wiki explaining:

i) That the most recent timestamp overwrites previous entries with lower timestamps.
ii) That in case of timestamp ties, the value breaks the tie.
iii) What about ColumnFamilies and SuperColumnFamilies? Do we have the guarantee that, in case of timestamp ties, the whole record of the winner is registered? (I would assume yes, of course.)
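
For what it is worth, this is how I currently picture points i) and ii), written as a small sketch. `Column` and `reconcile` are hypothetical names of mine, not Cassandra's actual code, and the direction of the tie-break (the greater value by byte comparison wins) is my assumption about what "value breaks the tie" means. Point iii) is not covered, since I do not know the answer.

```python
from typing import NamedTuple


class Column(NamedTuple):
    # Hypothetical stand-in for a stored column, for illustration only.
    name: str
    value: bytes
    timestamp: int


def reconcile(a: Column, b: Column) -> Column:
    """Pick the surviving version of the same column.

    i)  The higher timestamp wins.
    ii) On a timestamp tie, the value decides. Assumption: the greater
        value by byte comparison wins.
    """
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    return a if a.value >= b.value else b


old = Column("col", b"AAA", 334450)
tie = Column("col", b"ZZZ", 334450)
later = Column("col", b"AAA", 334451)

# Same result whichever write arrives first.
winner_on_tie = reconcile(old, tie)
# A later timestamp wins regardless of the value.
winner_later = reconcile(tie, later)
```

If this picture is right, the outcome of a tie does not depend on arrival order, but a client that writes 'AAA' over 'ZZZ' with the same timestamp would see its write complete and still find 'ZZZ' stored.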

I believe something 'official' and explicit from the Cassandra leaders would close the gap left by the assumptions and interpretations of newbies like me. The timestamp really looks like a 'key' to me.

Thanks,

Jérôme
