RE: postmortem on 2.2.13 scale out difficulties

ZAIDI, ASAD A Wed, 12 Jun 2019 09:45:50 -0700


Is adding one node at a time successful?
Check the value of the streaming_socket_timeout_in_ms parameter in 
cassandra.yaml and increase it if needed.
Have you tried nodetool bootstrap resume, and the JVM option 
JVM_OPTS="$JVM_OPTS -Dcassandra.consistent.rangemovement=false"?

From: Carl Mueller [mailto:carl.muel...@smartthings.com.INVALID]
Sent: Wednesday, June 12, 2019 11:35 AM
To: user@cassandra.apache.org
Subject: Re: postmortem on 2.2.13 scale out difficulties

We were only able to scale out by four nodes before failures started 
occurring, including multiple instances of nodes joining the cluster without 
streaming.

Sigh.

On Tue, Jun 11, 2019 at 3:11 PM Carl Mueller 
<carl.muel...@smartthings.com<mailto:carl.muel...@smartthings.com>> wrote:
We had a three-DC (asia-tokyo/europe/us) Cassandra 2.2.13 cluster on AWS, 
using IPv6.

We needed to scale out the asia datacenter, which was 5 nodes; europe and us 
were 25 nodes.

We were running into bootstrapping issues where the new node failed to 
bootstrap/stream. It failed with

"java.lang.RuntimeException: A node required to move the data consistently is 
down"

...even though they were all up based on nodetool status prior to adding the 
node.
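
A pre-flight check along these lines can make the "all up" observation 
explicit before each join. This is a minimal sketch, not the procedure used in 
this thread: nodes_not_up_normal is a hypothetical helper, and it assumes that 
node rows in nodetool status output begin with a two-character state code 
(UN, DN, UJ, ...) followed by the address. Verify that against the output of 
the version in use.

```python
import re

# Assumption: node rows look like "UN  <address>  <load> ...".
# Header and separator lines do not match this pattern.
_NODE_ROW = re.compile(r"^([UD][NLJM])\s+(\S+)")


def nodes_not_up_normal(status_output):
    """Return (state, address) pairs for every node row that is not UN."""
    problems = []
    for line in status_output.splitlines():
        match = _NODE_ROW.match(line.strip())
        if match and match.group(1) != "UN":
            problems.append((match.group(1), match.group(2)))
    return problems
```

An empty result only says that the node where nodetool status was run sees 
everyone as up and normal; other nodes may hold a different view.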

First we increased the phi_convict_threshold to 12, and that did not help.

CASSANDRA-12281 appeared similar to what we had problems with, but I don't 
think we hit that. Somewhere in there someone wrote

"For us, the workaround is either deleting the data (then bootstrap again), or 
increasing the ring_delay_ms. And the larger the cluster is, the longer 
ring_delay_ms is needed. Based on our tests, for a 40 nodes cluster, it 
requires ring_delay_ms to be >50seconds. For a 70 nodes cluster, >100seconds. 
Default is 30seconds."
Given the WAN nature of our DCs, we set ring_delay_ms to 100 seconds and it 
finally worked.

side note:

During the rolling restarts for setting phi_convict_threshold we observed quite 
a lot of status map variance between nodes (we have a program to poll all of a 
datacenter's or cluster's view of the gossipinfo and statuses). AWS appears to 
have variance in networking, judging by the phi_convict_threshold advice. I'm 
not sure if our difficulties were typical in that regard and/or if our IPv6 
and/or globally distributed datacenters were exacerbating factors.
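
The polling program itself is not shown here. As a rough illustration of what 
"status map variance" means, the sketch below compares per-node views once 
they have been collected. Everything in it is an assumption: 
status_disagreements is a hypothetical name, and the input is assumed to be 
already parsed into a mapping of observer -> {endpoint: status}, since the 
exact gossipinfo output format differs between versions.

```python
from collections import defaultdict


def status_disagreements(views):
    """Find endpoints on which the observers do not agree.

    views maps observer -> {endpoint: status} as that observer sees the ring.
    Returns {endpoint: {status: [observers]}} for each endpoint with more
    than one reported status. An observer that does not list an endpoint
    at all is recorded under "MISSING".
    """
    endpoints = set()
    for seen in views.values():
        endpoints.update(seen)

    disagreements = {}
    for endpoint in sorted(endpoints):
        by_status = defaultdict(list)
        for observer in sorted(views):
            status = views[observer].get(endpoint, "MISSING")
            by_status[status].append(observer)
        if len(by_status) > 1:
            disagreements[endpoint] = dict(by_status)
    return disagreements
```

An empty result means every polled node reports the same status for every 
endpoint; any entry points at a node whose view has not converged.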

We could not reproduce this in loadtest, although loadtest is only eu and us 
(but is IPv6).
