We use backup/restore for our implementation of this concept. It has the added 
benefit that the backup/restore path gets exercised much more regularly than 
it would in normal operations, surfacing edge-case bugs at a time when you still 
have other ways of recovering, rather than in a full disaster scenario.

Cheers
Ben




From: Jordan West <jorda...@gmail.com>
Date: Sunday, 21 April 2024 at 05:38
To: dev@cassandra.apache.org <dev@cassandra.apache.org>
Subject: Re: [DISCUSS] CEP-40: Data Transfer Using Cassandra Sidecar for Live 
Migrating Instances


I do really like the framing that replacing a node is restoring a node and then 
kicking off a replace. That is effectively what we do internally.

I also agree we should be able to do data movement well both internal to 
Cassandra and externally for a variety of reasons.

We’ve seen great performance with “ZCS+TLS” even though it’s not full zero copy: 
nodes that previously took *days* to replace now take a few hours. But we 
have seen it put pressure on nodes and drive up latencies, which is the main 
reason we still rely on an external data movement system by default, falling 
back to ZCS+TLS as needed.

Jordan

On Fri, Apr 19, 2024 at 19:15 Jon Haddad 
<j...@jonhaddad.com> wrote:
Jeff, this is probably the best explanation and justification of the idea that 
I've heard so far.

I like it because

1) we really should have something official for backups
2) backups / object store would be great for analytics
3) it solves a much bigger problem than the single goal of moving instances.

I'm a huge +1 in favor of this perspective, with live migration being one use 
case for backup / restore.

Jon


On Fri, Apr 19, 2024 at 7:08 PM Jeff Jirsa 
<jji...@gmail.com> wrote:
I think Jordan and German had an interesting insight, or at least their comment 
made me think about this slightly differently, and I’m going to repeat it so 
it’s not lost in the discussion about zerocopy / sendfile.

The CEP treats this as “move a live instance from one machine to another”. I 
know why the author wants to do this.

If you think of it instead as “change backup/restore mechanism to be able to 
safely restore from a running instance”, you may end up with a cleaner 
abstraction that’s easier to think about (and may also be easier to generalize 
in clouds where you have other tools available).

I’m not familiar enough with the sidecar to know the state of orchestration for 
backup/restore, but “ensure the original source node isn’t running”, “migrate 
the config”, “choose and copy a snapshot”, maybe “forcibly exclude the 
original instance from the cluster” are all things the restore code is going to 
need to do anyway, and if restore doesn’t do that today, it seems like we can 
solve it once.

Backup probably needs to be generalized to support many sources, too. Object 
storage is obvious (s3 download). Block storage is obvious (snapshot and 
reattach). Reading sstables from another sidecar seems reasonable, too.

It accomplishes the original goal in largely the same fashion; it just makes 
the logic reusable for other purposes?






On Apr 19, 2024, at 5:52 PM, Dinesh Joshi 
<djo...@apache.org> wrote:

On Thu, Apr 18, 2024 at 12:46 PM Ariel Weisberg 
<ar...@weisberg.ws> wrote:

If there is a faster/better way to replace a node why not have Cassandra 
support that natively without the sidecar so people who aren’t running the 
sidecar can benefit?

I am not the author of the CEP so take whatever I say with a pinch of salt. 
Scott and Jordan have pointed out some benefits of doing this in the Sidecar vs 
Cassandra.

Today Cassandra is able to do fast node replacements. However, this CEP is 
addressing an important corner case when Cassandra is unable to start up due to 
old / ailing hardware. Can we fix it in Cassandra so it doesn't die on old 
hardware? Sure. However, you would still need operator intervention to start it 
up in some special mode both on the old and new node so the new node can peer 
with the old node, copy over its data and join the ring. This would still 
require some orchestration outside the database. The Sidecar can do that 
orchestration for the operator. The point I'm making here is that the CEP 
addresses a real issue. The way it is currently built can improve over time 
with improvements in Cassandra.

Dinesh
