[ https://issues.apache.org/jira/browse/CASSANDRA-8672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390761#comment-14390761 ]
Sylvain Lebresne commented on CASSANDRA-8672:
---------------------------------------------

I don't understand your arguments. You seem to suggest that you'd be content with a new prepare flag, but it wouldn't change when we use {{WriteType.SIMPLE}}. All I'm saying is that adding a new {{CAS_PREPARE}} is only about splitting the {{CAS}} case in two, and I don't understand why you're saying that distinguishing CAS and SIMPLE is pointless while it would somehow become great with a CAS_PREPARE.

> Ambiguous WriteTimeoutException while completing pending CAS commits
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-8672
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8672
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Stefan Podkowinski
>            Assignee: Tyler Hobbs
>            Priority: Minor
>              Labels: CAS
>             Fix For: 3.0
>
>
> Any CAS update has a chance to trigger a pending/stalled commit of any
> previously agreed-on CAS update. After completing the pending commit, the CAS
> operation resumes executing the actual update and may also create a
> new commit. See StorageProxy.cas().
> There are two possible execution paths that might end up throwing a
> WriteTimeoutException:
> cas() -> beginAndRepairPaxos() -> commitPaxos()
> cas() -> commitPaxos()
> Unfortunately, clients catching a WriteTimeoutException won't be able to tell
> at which stage the commit failed. My guess would be that most developers are
> not aware that beginAndRepairPaxos() could also trigger a write, and
> assume that write timeouts refer to a timeout while writing the actual
> CAS update. It's therefore not safe to assume that successive CAS or SERIAL
> read operations will cause a timed-out CAS write to get
> eventually applied, although some [best-practices
> advice|http://www.datastax.com/dev/blog/cassandra-error-handling-done-right]
> claims otherwise.
> At this point the safest bet is possibly to retry the complete business
> transaction in case of a WriteTimeoutException. However, as there's a chance
> that the timeout occurred while writing the actual CAS operation, another
> write could potentially complete it, and our CAS condition will get a
> different result upon retry.
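
To make the two execution paths from the description concrete, here is a minimal, self-contained Java sketch. It is a toy model, not Cassandra's code: {{Stage}}, {{StagedWriteTimeout}} and {{describe()}} are hypothetical names, the method bodies are stand-ins for the real StorageProxy methods, and the boolean suppliers merely simulate whether enough replicas acknowledged in time. It only illustrates how tagging the timeout with the stage it came from would let a caller tell the two paths apart.

```java
import java.util.function.BooleanSupplier;

class CasTimeoutSketch {
    // Hypothetical stage tag; this is NOT Cassandra's WriteType enum.
    enum Stage { REPAIR_PENDING_COMMIT, COMMIT_OWN_UPDATE }

    // Hypothetical exception carrying the stage in which the timeout happened.
    static class StagedWriteTimeout extends RuntimeException {
        final Stage stage;
        StagedWriteTimeout(Stage stage) {
            super("write timeout during " + stage);
            this.stage = stage;
        }
    }

    // Stand-in for commitPaxos(): 'acked' simulates whether replicas answered in time.
    static void commitPaxos(BooleanSupplier acked, Stage stage) {
        if (!acked.getAsBoolean()) {
            throw new StagedWriteTimeout(stage);
        }
    }

    // Stand-in for beginAndRepairPaxos(): completes an earlier pending commit, if any.
    static void beginAndRepairPaxos(boolean pendingCommit, BooleanSupplier repairAcked) {
        if (pendingCommit) {
            commitPaxos(repairAcked, Stage.REPAIR_PENDING_COMMIT);
        }
    }

    // Stand-in for cas(): first repair, then commit the caller's own update.
    static void cas(boolean pendingCommit, BooleanSupplier repairAcked, BooleanSupplier ownAcked) {
        beginAndRepairPaxos(pendingCommit, repairAcked);  // path 1 can time out here
        commitPaxos(ownAcked, Stage.COMMIT_OWN_UPDATE);   // path 2 can time out here
    }

    // Client-side view: with a stage tag the two timeouts are distinguishable.
    static String describe(boolean pendingCommit, boolean repairOk, boolean ownOk) {
        try {
            cas(pendingCommit, () -> repairOk, () -> ownOk);
            return "applied";
        } catch (StagedWriteTimeout e) {
            return e.stage == Stage.REPAIR_PENDING_COMMIT
                    ? "timeout before own update was proposed"
                    : "timeout while committing own update";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(true, false, true));
        System.out.println(describe(false, true, false));
        System.out.println(describe(true, true, true));
    }
}
```

In this model, a timeout while repairing a pending commit means the caller's own update was never proposed, so retrying the whole operation is unambiguous; a timeout while committing the caller's own update leaves the outcome unknown, which is the case the description warns about.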