Christian Spriegel created CASSANDRA-14480:
----------------------------------------------

             Summary: Digest mismatch requires all replicas to be responsive
                 Key: CASSANDRA-14480
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14480
             Project: Cassandra
          Issue Type: Bug
          Components: Core
            Reporter: Christian Spriegel


I ran across a scenario where a digest mismatch triggers a read-repair that 
requires all up nodes to respond. If one of these nodes does not 
respond, the read-repair is reported to the client as a 
ReadTimeoutException.

 

My expectation would be that a CL=QUORUM read always succeeds as long as 2 nodes 
are responding. Unfortunately, the third node being "up" in the ring but 
unable to respond does lead to an RTE.
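
The expectation above follows from the usual quorum arithmetic. A minimal 
sketch, assuming a replication factor of 3 (not stated explicitly in this 
report, but implied by a 3 node cluster where QUORUM requires 2 responses). 
The function name is illustrative and is not Cassandra's API:

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for CL=QUORUM: a strict majority."""
    return replication_factor // 2 + 1


rf = 3                   # assumed replication factor
required = quorum(rf)    # majority of 3 replicas
responsive = 2           # node3 is marked up but does not answer

# With 2 responsive replicas the quorum requirement is met on paper.
print(required, responsive >= required)
```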

 

 

I came up with a scenario that reproduces the issue:
 # set up a 3 node cluster using ccm
 # increase the phi_convict_threshold to 16, so that nodes are permanently 
reported as up
 # create the attached schema
 # run the attached reader and writer (which only connect to node1 and node2). 
This should already produce digest mismatches
 # do a "ccm node3 pause"
 # The reader will report a read-timeout with consistency QUORUM (2 responses 
were required but only 1 replica responded). Within the DigestMismatchException 
catch-block it can be seen that the repairHandler is waiting for 3 responses, 
even though the exception says that 2 responses are required.
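
The mismatch described in the last step can be illustrated with a toy model. 
This is only a sketch under stated assumptions, not Cassandra's code: 
Replica and await_responses are hypothetical names, a paused node is modelled 
as one that never answers, and the model does not try to reproduce the exact 
counts in the exception message.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    responsive: bool  # False models a paused node that is still marked up


def await_responses(contacted, block_for):
    """Return True if at least block_for contacted replicas answer,
    False if the wait would end in a timeout."""
    received = sum(1 for replica in contacted if replica.responsive)
    return received >= block_for


replicas = [
    Replica("node1", True),
    Replica("node2", True),
    Replica("node3", False),  # paused via "ccm node3 pause"
]
quorum = len(replicas) // 2 + 1  # 2 of 3

# Blocking for a quorum succeeds with node1 and node2 alone.
repair_expected = await_responses(replicas, quorum)

# Blocking for every contacted replica, as observed for the repair handler,
# cannot succeed while node3 is silent.
repair_as_reported = await_responses(replicas, len(replicas))

print(repair_expected, repair_as_reported)
```

In this model the only difference between success and timeout is the number 
of responses the handler blocks for, which is the behaviour the report 
questions.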

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
