Hiroyuki Yamada created CASSANDRA-15138:
-------------------------------------------
             Summary: A cluster (RF=3) not recovering after two nodes are stopped
                 Key: CASSANDRA-15138
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15138
             Project: Cassandra
          Issue Type: Bug
          Components: Cluster/Membership
            Reporter: Hiroyuki Yamada

I faced a weird issue when recovering a cluster after two nodes are stopped. It is easily reproducible and looks like a bug or an issue to fix. The following are the steps to reproduce it.

=== STEPS TO REPRODUCE ===
* Create a 3-node cluster with RF=3
 - node1 (seed), node2, node3
* Start requests to the cluster with cassandra-stress (it keeps running until the end)
 - what we did: cassandra-stress mixed cl=QUORUM duration=10m -errors ignore -node node1,node2,node3 -rate threads\>=16 threads\<=256
 - (It doesn't have to be this many threads. Even 1 is enough.)
* Stop node3 normally (with systemctl stop, or kill without -9)
 - the system is still available, as expected, because a quorum of nodes is still available
* Stop node2 normally (with systemctl stop, or kill without -9)
 - as expected, the system is NOT available after node2 is stopped
 - the client gets UnavailableException: Not enough replicas available for query at consistency QUORUM
 - the client gets the errors right away (within a few ms)
 - so far everything is as expected
* Wait for 1 minute
* Bring node2 back up
 - {color:#FF0000}The issue happens here.{color}
 - the client gets ReadTimeoutException or WriteTimeoutException (depending on whether the request is a read or a write) even after node2 is up
 - the client gets the errors after about 5000 ms or 2000 ms, which are the request timeouts for reads and writes respectively
 - what node1 reports with `nodetool status` and what node2 reports are not consistent (node2 thinks node1 is down)
 - it takes a very long time to recover from this state

=== ADDITIONAL NOTES ===
Some additional important information to note:
* If we don't run cassandra-stress, the issue does not occur.
* Restarting node1 makes it recover its state right after the restart.
* Setting a lower value for dynamic_snitch_reset_interval_in_ms (60000 or so) fixes the issue.
* If we `kill -9` the nodes instead, the issue does not occur.
* Hints seem unrelated: I tested with hints disabled and it made no difference.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
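The dynamic_snitch_reset_interval_in_ms workaround mentioned above goes in cassandra.yaml. A sketch of the change (60000 is the value from the report; the shipped default is much longer, on the order of ten minutes):

```yaml
# cassandra.yaml -- workaround from this report: reset the dynamic snitch
# scores more frequently so stale latency scores for the restarted node
# are discarded sooner. Value below is the one mentioned in the report.
dynamic_snitch_reset_interval_in_ms: 60000
```

This pointing at the dynamic snitch is consistent with the symptom: the coordinator keeps using outdated replica scores accumulated while the nodes were down.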
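For context, the expected availability in the reproduction steps follows directly from the QUORUM arithmetic (quorum = floor(RF/2) + 1). This is a minimal sketch of that math, not Cassandra code; the bug is precisely that after node2 rejoins, two replicas are live again, yet requests still time out instead of succeeding:

```python
def quorum(rf: int) -> int:
    """Number of replicas that must respond for a QUORUM read/write."""
    return rf // 2 + 1

def quorum_available(rf: int, live_replicas: int) -> bool:
    """Can a QUORUM request be served with this many live replicas?"""
    return live_replicas >= quorum(rf)

rf = 3
print(quorum_available(rf, 3))  # all three nodes up -> True
print(quorum_available(rf, 2))  # node3 stopped -> True (2 >= quorum of 2)
print(quorum_available(rf, 1))  # node2 also stopped -> False (UnavailableException)
```

After node2 is brought back, `quorum_available(3, 2)` holds again, so the continued Read/WriteTimeoutExceptions indicate the coordinator is still routing to (or waiting on) replicas it wrongly considers alive or dead.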