Answers inline

-- 
Jeff Jirsa


> On Aug 12, 2017, at 2:58 PM, brian.spind...@gmail.com wrote:
> 
> Hi folks, hopefully a quick one:
> 
> We are running a 12 node cluster (2.1.15) in AWS with Ec2Snitch.  It's all in 
> one region but spread across 3 availability zones.  It was nicely balanced 
> with 4 nodes in each.
> 
> But with a couple of failures and subsequent provisions to the wrong az we 
> now have a cluster with : 
> 
> 5 nodes in az A
> 5 nodes in az B
> 2 nodes in az C
> 
> Not sure why, but when adding a third node in AZ C, streaming fails after 
> getting all the way to completion, with no apparent error in the logs.  I've 
> looked at a couple of bugs referring to scrubbing and possible OOM bugs due to 
> metadata writing at the end of streaming (sorry, don't have the ticket handy).  
> I'm worried I might not be able to do much with these since disk space usage 
> is high and they are under a lot of load given the small number of nodes in 
> this rack.

You'll definitely have higher load on the AZ C instances with rf=3 at this ratio

Streaming should still work - are you sure the node isn't busy doing something 
else, like building a secondary index? A jstack thread dump would be useful, or 
at least nodetool tpstats.
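
For example, from the joining node (a rough sketch; the pid file path is 
whatever your service manager uses):

    nodetool netstats          # is it still streaming, and from which peers?
    nodetool compactionstats   # pending secondary index builds / compactions
    nodetool tpstats           # pending or blocked thread pool stages
    jstack $(cat /var/run/cassandra/cassandra.pid) > /tmp/cassandra-jstack.txt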


> 
> Rather than troubleshoot this further, what I was thinking about doing was:
> - drop the replication factor on our keyspace to two

Repair before you do this, or you'll lose your consistency guarantees
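
A rough sketch, assuming the keyspace uses NetworkTopologyStrategy (keyspace 
and DC names below are placeholders; with Ec2Snitch the DC is the region name):

    # on each node, one at a time, before touching RF:
    nodetool repair -pr my_keyspace

    # then, in cqlsh:
    ALTER KEYSPACE my_keyspace WITH replication =
      {'class': 'NetworkTopologyStrategy', 'us-east': 2};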

> - hopefully this would reduce load on these two remaining nodes 

It should. Rack awareness guarantees one replica per rack when rf == number of 
racks, so right now those 2 C machines hold 2.5x as much data as the others. 
Dropping the RF drops that requirement and reduces the load significantly.
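
To put numbers on it: with rf=3 and 3 racks, each rack stores a full copy of 
the data, so each of the 5 nodes in rack A or B holds roughly 1/5 of it while 
each of the 2 nodes in rack C holds roughly 1/2; (1/2) / (1/5) = 2.5x.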

> - run repairs/cleanup across the cluster 
> - then shoot these two nodes in the 'c' rack

Why shoot the C instances? Why not drop the RF, add 2 more C instances, 
increase the RF back to 3, run repair, and then decommission the extra 
instances in A and B?
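
Continuing from the RF drop sketched above (same placeholder names), the rest 
of that sequence would look roughly like:

    # after bootstrapping two new instances into AZ C, raise RF back (cqlsh):
    ALTER KEYSPACE my_keyspace WITH replication =
      {'class': 'NetworkTopologyStrategy', 'us-east': 3};
    # repair on every node, one at a time:
    nodetool repair -pr my_keyspace
    # then, on each extra instance in A and B:
    nodetool decommission
    # finally, on the remaining nodes:
    nodetool cleanup my_keyspace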


> - run repairs/cleanup across the cluster
> 
> Would this work with minimal/no disruption? 

The big risk of running rf=2 is that quorum == all: any GC pause or node 
restart will cost you either availability or your strong consistency 
guarantees.
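
For the arithmetic: quorum = floor(rf/2) + 1, so at rf=2 that is 2 replicas, 
i.e. every replica; at rf=3 it is 2 of 3, so one replica can be down and 
quorum reads/writes still succeed.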

> Should I update their "rack" before hand or after ?

You can't change a node's rack once it's in the cluster; it SHOULD refuse to 
start if you try. With Ec2Snitch the rack comes from the instance's 
availability zone anyway, so "changing" it really means replacing the node 
with one in a different AZ.

> What else am I not thinking about? 
> 
> My main goal atm is to get back to where the cluster is in a clean consistent 
> state that allows nodes to properly bootstrap.
> 
> Thanks for your help in advance.
