Option 1 is the cheaper option because the cluster doesn't have to
rebalance twice: once after the decommission (with the loss of a replica)
and again when you add a new node.
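
For reference, here is a rough dry-run sketch of the Option 1 sequence from
the quoted message. It only builds the command strings, it doesn't run
anything. The instance ID, the IP address, the systemd service name and the
way the JVM flag gets appended to JVM_OPTS are all placeholders and
assumptions on my part, so adapt them to your own setup:

```python
import shlex


def option1_commands(instance_id, node_ip, service="cassandra"):
    """Return the Option 1 steps as shell command strings (dry run only).

    instance_id, node_ip and service are hypothetical placeholders.
    """
    iid = shlex.quote(instance_id)
    return [
        # Flush memtables and stop accepting writes on this node.
        "nodetool drain",
        # Assumes Cassandra runs as a systemd service with this name.
        f"sudo systemctl stop {shlex.quote(service)}",
        # Stop/start (not reboot) so the instance comes back on other hardware.
        f"aws ec2 stop-instances --instance-ids {iid}",
        f"aws ec2 start-instances --instance-ids {iid}",
        # Assumed way of passing the flag, e.g. a line in cassandra-env.sh.
        f'JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address={node_ip}"',
    ]


if __name__ == "__main__":
    for cmd in option1_commands("i-0123456789abcdef0", "10.0.0.12"):
        print(cmd)
```

The ephemeral data directory is gone after the stop/start, which is why the
node has to come back with the replace flag rather than a plain restart.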

The hints directory on EBS is irrelevant because it would only contain
mutations to replay to down replicas if the node was a coordinator. In the
scenario where the node itself goes down, other nodes will be storing hints
for this down node. The saved_caches are also useless if you're
bootstrapping the node into the cluster because the cache entries are only
valid for the previous data files, not the newly streamed files from the
bootstrap. Similarly, your commitlog directory will be empty -- that's the
whole point of running nodetool drain. :)
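
If you want to convince yourself of that before stopping the instance, a
small check along these lines would do. It is only a sketch: the directory
path in the usage example is hypothetical, and the CommitLog-*.log pattern
is my assumption about how the segment files are named on your version:

```python
from pathlib import Path


def leftover_commitlog_segments(commitlog_dir):
    """Return the names of commitlog segment files still in commitlog_dir.

    Assumes segments are named CommitLog-*.log; a missing directory is
    treated the same as an empty one.
    """
    directory = Path(commitlog_dir)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob("CommitLog-*.log"))


if __name__ == "__main__":
    # Hypothetical mount point for the EBS-backed commitlog directory.
    leftovers = leftover_commitlog_segments("/mnt/ebs/cassandra/commitlog")
    print("segments left after drain:", leftovers)
```

An empty list after the drain means there is nothing on the EBS volume that
the replacement node would need to replay.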

A little off-topic but *personally* I would co-locate the commitlog on the
same 950GB NVMe SSD as the data files. You would get much better write
performance from the nodes compared to EBS, and it shouldn't hurt your
reads since the NVMe disks have very high IOPS. I think they can sustain
400K+ IOPS (don't quote me). I'm sure others will comment if they have a
different experience. And of course, YMMV. Cheers!



On Fri, 14 Feb 2020 at 14:16, Sergio <lapostadiser...@gmail.com> wrote:

> We have i3xlarge instances with data directory in the XFS filesystem that
> is ephemeral and *hints*, *commit_log* and *saved_caches* in the EBS
> volume.
> Whenever AWS is going to retire the instance due to degraded hardware
> performance is it better:
>
> Option 1)
>    - Nodetool drain
>    - Stop cassandra
>    - Restart the machine from aws-cli to be restored in a different VM
> from the hypervisor
>    - Start Cassandra with -Dcassandra.replace_address
>    - We lose only the ephemeral but the commit_logs, hints, saved_cache
> will be there
>
>
> OR
>
> Option 2)
>  - Add a new node and wait for the NORMAL status
>  - Decommission the one that is going to be retired
>  - Run cleanup with cstar across the datacenters
>
> ?
>
> Thanks,
>
> Sergio
>
