We have three datacenters (EU, AP, US) in AWS and are having problems bootstrapping new nodes in the AP datacenter:
java.lang.RuntimeException: A node required to move the data consistently is down (/SOME_NODE). If you wish to move the data from a potentially inconsistent replica, restart the node with -Dcassandra.consistent.rangemovement=false

Per the code, increasing ring_delay fixed the bootstrap sleep race condition, but my research raised some other questions. We are generally frustrated with AWS networking, but it is what it is, and I was wondering whether we could tune gossip further.

We have been getting this message intermittently:

Gossip stage has {} pending tasks; skipping status check (no nodes will be marked down)

That is the GossipStage queue backing up, and another page indicated that gossip has a single thread in its pool.

For large clusters (more than 30 nodes, from what we've seen on crappy AWS), wouldn't it be good to increase the thread count so that communication happens more frequently and a slow cross-region gossip connection doesn't slow everything down? Can GossipStage be increased from one thread to two? Or is the Gossiper single-threaded / not reentrant?

Is the random ordering used to select a node to gossip-check a shuffle of the known IPs to contact, or a completely random pick-one every second? The one-second interval is based on a comment in the code; is that the gossip interval, and can it be changed?

Would it be a bad idea to create a prioritized gossip request, so that nodes that have been up for a while respond first to bootstrapping nodes and to nodes known to be restarting? Is that even possible? It might also get hinted handoff processing moving more quickly.

For multi-datacenter clusters, would it also make sense to have gossip process local-datacenter requests more quickly (or dedicate a thread to fast-tracking them) and handle the "far" nodes in a different thread? Again, the goal is to keep cross-Pacific gossip requests from holding up other requests that can be handled more quickly.
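
For anyone hitting the same bootstrap error, here is a minimal sketch of how the two settings mentioned above can be passed to the JVM. It assumes the options are appended to JVM_OPTS in conf/cassandra-env.sh, and the 60000 ms value is purely illustrative (the default ring delay is 30000 ms); it is not a recommendation and not necessarily the value we ended up with.

```shell
# conf/cassandra-env.sh (assumed location for JVM options)

# Raise the ring delay so the bootstrapping node waits longer for gossip
# state to settle before it starts. 60000 is an illustrative value only.
JVM_OPTS="$JVM_OPTS -Dcassandra.ring_delay_ms=60000"

# Escape hatch quoted in the exception text above. It allows streaming from
# a potentially inconsistent replica, so it is left commented out here.
# JVM_OPTS="$JVM_OPTS -Dcassandra.consistent.rangemovement=false"
```

These options only need to be present on the node being bootstrapped, and can be removed again once it has joined the ring.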