Ankit Patel created CASSANDRA-6608:
--------------------------------------

             Summary: Cassandra timeout on node failure
                 Key: CASSANDRA-6608
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6608
             Project: Cassandra
          Issue Type: Bug
          Components: Core
            Reporter: Ankit Patel

We are seeing a strange issue with our Cassandra cluster (version 1.0.10). We have 6 nodes (DC1:3, DC2:3) in our cluster, so all 6 nodes are replicas of each other. All reads and writes use LOCAL_QUORUM. When one of the nodes in DC1 fails, we see timeout errors. With DEBUG-level logging turned on, we see the following error in the Cassandra logs:

DEBUG [Thrift:322] 2013-12-20 14:30:20,123 StorageProxy.java (line 676) Read timeout: java.util.concurrent.TimeoutException: Operation timed out - received only 2 responses from / xxx.xxx.xxx.IP1, xxx.xxx.xxx.IP2, .

Since LOCAL_QUORUM only needs 2 of the 3 nodes in the DC to respond, I am surprised we are seeing this issue. Interestingly, when we connect to the third node after the second node has returned the timeout error, the request works as expected.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
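
For reference, the quorum arithmetic behind the expectation in the report above (2 of 3 local nodes should be enough) can be sketched as follows. This is only an illustrative sketch, not Cassandra's own code: the class and method names are hypothetical, and it assumes a replication factor of 3 in each data center, which the report implies but does not state outright.

```java
public class LocalQuorumSketch {
    // Quorum size for a replication factor: floor(rf / 2) + 1.
    public static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // True if the live replicas in the local DC are enough for LOCAL_QUORUM.
    public static boolean canSatisfyLocalQuorum(int localReplicationFactor,
                                                int liveLocalReplicas) {
        return liveLocalReplicas >= quorum(localReplicationFactor);
    }

    public static void main(String[] args) {
        int localRf = 3;       // assumption: RF = 3 in DC1
        int liveReplicas = 2;  // one of the three DC1 nodes is down
        System.out.println("required=" + quorum(localRf));
        System.out.println("satisfiable="
                + canSatisfyLocalQuorum(localRf, liveReplicas));
    }
}
```

Under these assumptions a single node failure in DC1 still leaves enough replicas for LOCAL_QUORUM, which is why the timeout is unexpected.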