Re: [DISCUSS] CEP-40: Data Transfer Using Cassandra Sidecar for Live Migrating Instances

2024-04-18 Thread Francisco Guerrero
My understanding from the proposal is that Sidecar would be able to migrate
from a Cassandra instance that is already dead and cannot recover. That
scenario is possible, and Sidecar should still be able to migrate to a new
instance when it happens.

Alternatively, Cassandra itself could have some flag to start up with limited
subsystems enabled to allow live migration.

In any case, we'll need to weigh the pros and cons of each alternative and
decide if the live migration process can be handled within the C* process itself
or if we allow this functionality to be handled by Sidecar.

I am looking forward to this feature though, as it will be of great value for
many users across the ecosystem.

On 2024/04/18 22:25:23 Jon Haddad wrote:
> Hmm... I guess if you're using encryption you can't use ZCS so there's that.
> 
> It probably makes sense to implement kernel TLS:
> https://www.kernel.org/doc/html/v5.7/networking/tls.html
> 
> Then we can get ZCS all the time, for bootstrap & replacements.
> 
> Jon
> 
> 
> On Thu, Apr 18, 2024 at 12:50 PM Jon Haddad  wrote:
> 
> > Ariel, having it in C* process makes sense to me.
> >
> > Please correct me if I'm wrong here, but shouldn't using ZCS to transfer
> > have no distinguishable difference in overhead from doing it using the
> > sidecar?  Since the underlying call is sendfile, never hitting userspace, I
> > can't see why we'd opt for the transfer in sidecar.  What's the
> > advantage of duplicating the work that's already been done?
> >
> > I can see using the sidecar for coordination to start and stop instances
> > or do things that require something out of process.
> >
> > Jon
> >
> >
> > On Thu, Apr 18, 2024 at 12:44 PM Ariel Weisberg  wrote:
> >
> >> Hi,
> >>
> >> If there is a faster/better way to replace a node why not  have Cassandra
> >> support that natively without the sidecar so people who aren’t running the
> >> sidecar can benefit?
> >>
> >> Copying files over a network shouldn’t be slow in C* and it would also
> >> already have all the connectivity issues solved.
> >>
> >> Regards,
> >> Ariel
> >>
> >> On Fri, Apr 5, 2024, at 6:46 AM, Venkata Hari Krishna Nukala wrote:
> >>
> >> Hi all,
> >>
> >> I have filed CEP-40 [1] for live migrating Cassandra instances using the
> >> Cassandra Sidecar.
> >>
> >> When someone needs to move all or a portion of the Cassandra nodes
> >> belonging to a cluster to different hosts, the traditional approach of
> >> Cassandra node replacement can be time-consuming due to repairs and the
> >> bootstrapping of new nodes. Depending on the volume of the storage service
> >> load, replacements (repair + bootstrap) may take anywhere from a few hours
> >> to days.
> >>
> >> Proposing a Sidecar based solution to address these challenges. This
> >> solution proposes transferring data from the old host (source) to the new
> >> host (destination) and then bringing up the Cassandra process at the
> >> destination, to enable fast instance migration. This approach would help to
> >> minimise node downtime, as it is based on a Sidecar solution for data
> >> transfer and avoids repairs and bootstrap.
> >>
> >> Looking forward to the discussions.
> >>
> >> [1]
> >> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-40%3A+Data+Transfer+Using+Cassandra+Sidecar+for+Live+Migrating+Instances
> >>
> >> Thanks!
> >> Hari
> >>
> >>
> >>
> 


Re: [DISCUSS] CEP-40: Data Transfer Using Cassandra Sidecar for Live Migrating Instances

2024-04-18 Thread Jon Haddad
Hmm... I guess if you're using encryption you can't use ZCS so there's that.

It probably makes sense to implement kernel TLS:
https://www.kernel.org/doc/html/v5.7/networking/tls.html

Then we can get ZCS all the time, for bootstrap & replacements.

Jon


On Thu, Apr 18, 2024 at 12:50 PM Jon Haddad  wrote:

> Ariel, having it in C* process makes sense to me.
>
> Please correct me if I'm wrong here, but shouldn't using ZCS to transfer
> have no distinguishable difference in overhead from doing it using the
> sidecar?  Since the underlying call is sendfile, never hitting userspace, I
> can't see why we'd opt for the transfer in sidecar.  What's the
> advantage of duplicating the work that's already been done?
>
> I can see using the sidecar for coordination to start and stop instances
> or do things that require something out of process.
>
> Jon
>
>
> On Thu, Apr 18, 2024 at 12:44 PM Ariel Weisberg  wrote:
>
>> Hi,
>>
>> If there is a faster/better way to replace a node why not  have Cassandra
>> support that natively without the sidecar so people who aren’t running the
>> sidecar can benefit?
>>
>> Copying files over a network shouldn’t be slow in C* and it would also
>> already have all the connectivity issues solved.
>>
>> Regards,
>> Ariel
>>
>> On Fri, Apr 5, 2024, at 6:46 AM, Venkata Hari Krishna Nukala wrote:
>>
>> Hi all,
>>
>> I have filed CEP-40 [1] for live migrating Cassandra instances using the
>> Cassandra Sidecar.
>>
>> When someone needs to move all or a portion of the Cassandra nodes
>> belonging to a cluster to different hosts, the traditional approach of
>> Cassandra node replacement can be time-consuming due to repairs and the
>> bootstrapping of new nodes. Depending on the volume of the storage service
>> load, replacements (repair + bootstrap) may take anywhere from a few hours
>> to days.
>>
>> Proposing a Sidecar based solution to address these challenges. This
>> solution proposes transferring data from the old host (source) to the new
>> host (destination) and then bringing up the Cassandra process at the
>> destination, to enable fast instance migration. This approach would help to
>> minimise node downtime, as it is based on a Sidecar solution for data
>> transfer and avoids repairs and bootstrap.
>>
>> Looking forward to the discussions.
>>
>> [1]
>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-40%3A+Data+Transfer+Using+Cassandra+Sidecar+for+Live+Migrating+Instances
>>
>> Thanks!
>> Hari
>>
>>
>>


Re: [DISCUSS] CEP-40: Data Transfer Using Cassandra Sidecar for Live Migrating Instances

2024-04-18 Thread Jon Haddad
Ariel, having it in C* process makes sense to me.

Please correct me if I'm wrong here, but shouldn't using ZCS to transfer
have no distinguishable difference in overhead from doing it using the
sidecar?  Since the underlying call is sendfile, never hitting userspace, I
can't see why we'd opt for the transfer in sidecar.  What's the
advantage of duplicating the work that's already been done?
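
To illustrate the sendfile point above, here is a minimal Python sketch of a
zero-copy file send. It is not Cassandra's ZCS code path (Cassandra is Java);
the helper names `send_file_zero_copy` and `demo` are made up for this
example. Python's `socket.sendfile()` uses `os.sendfile()` where the platform
supports it, so the file bytes do not pass through a userspace buffer, and it
falls back to plain `send()` otherwise.

```python
import os
import socket
import tempfile
import threading


def send_file_zero_copy(path: str, sock: socket.socket) -> int:
    """Send a whole file over a connected stream socket; return bytes sent.

    socket.sendfile() delegates to os.sendfile() when available, otherwise
    it falls back to ordinary send() calls.
    """
    with open(path, "rb") as f:
        return sock.sendfile(f)


def demo() -> bool:
    """Round-trip a temporary file over a local socket pair (illustration only)."""
    payload = os.urandom(256 * 1024)
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)

    sender, receiver = socket.socketpair()
    received = bytearray()

    def drain() -> None:
        # Read until the sender closes its end of the pair.
        while chunk := receiver.recv(65536):
            received.extend(chunk)

    reader = threading.Thread(target=drain)
    reader.start()
    try:
        sent = send_file_zero_copy(tmp.name, sender)
    finally:
        sender.close()  # signals EOF to the reader thread
        reader.join()
        receiver.close()
        os.unlink(tmp.name)
    return sent == len(payload) and bytes(received) == payload
```

Whichever process makes the call, the copy itself is done by the kernel, which
is why the overhead should look the same from C* or from the sidecar.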

I can see using the sidecar for coordination to start and stop instances or
do things that require something out of process.

Jon


On Thu, Apr 18, 2024 at 12:44 PM Ariel Weisberg  wrote:

> Hi,
>
> If there is a faster/better way to replace a node why not  have Cassandra
> support that natively without the sidecar so people who aren’t running the
> sidecar can benefit?
>
> Copying files over a network shouldn’t be slow in C* and it would also
> already have all the connectivity issues solved.
>
> Regards,
> Ariel
>
> On Fri, Apr 5, 2024, at 6:46 AM, Venkata Hari Krishna Nukala wrote:
>
> Hi all,
>
> I have filed CEP-40 [1] for live migrating Cassandra instances using the
> Cassandra Sidecar.
>
> When someone needs to move all or a portion of the Cassandra nodes
> belonging to a cluster to different hosts, the traditional approach of
> Cassandra node replacement can be time-consuming due to repairs and the
> bootstrapping of new nodes. Depending on the volume of the storage service
> load, replacements (repair + bootstrap) may take anywhere from a few hours
> to days.
>
> Proposing a Sidecar based solution to address these challenges. This
> solution proposes transferring data from the old host (source) to the new
> host (destination) and then bringing up the Cassandra process at the
> destination, to enable fast instance migration. This approach would help to
> minimise node downtime, as it is based on a Sidecar solution for data
> transfer and avoids repairs and bootstrap.
>
> Looking forward to the discussions.
>
> [1]
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-40%3A+Data+Transfer+Using+Cassandra+Sidecar+for+Live+Migrating+Instances
>
> Thanks!
> Hari
>
>
>


Re: discuss: add to_human_size function

2024-04-18 Thread Ariel Weisberg
Hi,

I think it’s a good quality of life improvement, but I am someone who believes 
in a rich set of built-in functions being a good thing.

A format function is a bit more scope and kind of orthogonal. It would still be 
good to have shorthand functions for things like size.

Ariel

On Tue, Apr 9, 2024, at 8:09 AM, Štefan Miklošovič wrote:
> Hi,
> 
> I want to propose CASSANDRA-19546. It would be possible to convert raw 
> numbers to something human-friendly. 
> There are cases when we write just a number of bytes in our system tables but 
> these numbers are just hard to parse visually. Users can indeed use this for 
> their tables too if they find it useful.
> 
> Also, a user can indeed write a UDF for this but I would prefer if we had 
> something baked in.
> 
> Does this make sense to people? Are there any other approaches to do this? 
> 
> https://issues.apache.org/jira/browse/CASSANDRA-19546
> https://github.com/apache/cassandra/pull/3239/files
> 
> Regards
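
For readers who want a feel for the conversion discussed above, here is a
small Python sketch. It is only an illustration: the unit names (binary,
1024-based) and the two-decimal rounding are assumptions of this example, not
the behaviour of the function proposed in CASSANDRA-19546.

```python
def to_human_size(num_bytes: int) -> str:
    """Format a byte count using binary (1024-based) units.

    Hypothetical helper: unit names and rounding are assumptions made for
    this sketch, not taken from the CASSANDRA-19546 patch.
    """
    if num_bytes < 0:
        raise ValueError("size must be non-negative")
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = float(num_bytes)
    for unit in units:
        # Stop once the value fits the current unit or units run out.
        if value < 1024 or unit == units[-1]:
            break
        value /= 1024
    if unit == "B":
        return f"{num_bytes} B"
    return f"{value:.2f} {unit}"
```

For example, `to_human_size(1536)` gives `1.50 KiB`, which is easier to scan
in a system table than the raw number.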


Re: [DISCUSS] CEP-40: Data Transfer Using Cassandra Sidecar for Live Migrating Instances

2024-04-18 Thread Ariel Weisberg
Hi,

If there is a faster/better way to replace a node why not have Cassandra 
support that natively without the sidecar so people who aren’t running the 
sidecar can benefit?

Copying files over a network shouldn’t be slow in C* and it would also already 
have all the connectivity issues solved.

Regards,
Ariel

On Fri, Apr 5, 2024, at 6:46 AM, Venkata Hari Krishna Nukala wrote:
> Hi all,
> 
> I have filed CEP-40 [1] for live migrating Cassandra instances using the 
> Cassandra Sidecar.
> 
> When someone needs to move all or a portion of the Cassandra nodes belonging 
> to a cluster to different hosts, the traditional approach of Cassandra node 
> replacement can be time-consuming due to repairs and the bootstrapping of new 
> nodes. Depending on the volume of the storage service load, replacements 
> (repair + bootstrap) may take anywhere from a few hours to days.
> 
> Proposing a Sidecar based solution to address these challenges. This solution 
> proposes transferring data from the old host (source) to the new host 
> (destination) and then bringing up the Cassandra process at the destination, 
> to enable fast instance migration. This approach would help to minimise node 
> downtime, as it is based on a Sidecar solution for data transfer and avoids 
> repairs and bootstrap.
> 
> Looking forward to the discussions.
> 
> [1] 
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-40%3A+Data+Transfer+Using+Cassandra+Sidecar+for+Live+Migrating+Instances
> 
> Thanks!
> Hari


Re: Welcome Alexandre Dutra, Andrew Tolbert, Bret McGuire, Olivier Michallat as Cassandra Committers

2024-04-18 Thread Paulo Motta
Congratulations Alexandre, Andy, Bret and Olivier! :-)

On Wed, Apr 17, 2024 at 1:11 PM Benjamin Lerer  wrote:

> The Apache Cassandra PMC is pleased to announce that Alexandre Dutra,
> Andrew Tolbert, Bret McGuire and Olivier Michallat have accepted the
> invitation to become committers on the java driver sub-project.
>
> Thanks for your contributions to the Java driver during all those years!
> Congratulations and welcome!
>
> The Apache Cassandra PMC members
>


Re: Welcome Alexandre Dutra, Andrew Tolbert, Bret McGuire, Olivier Michallat as Cassandra Committers

2024-04-18 Thread Tolbert, Andy
Thanks everyone!

On Thu, Apr 18, 2024 at 8:43 AM Alexander DEJANOVSKI wrote:
>
> Congratulations folks! And thanks for all the hard work throughout the years 
> on the Java Driver!
>
> On Thu, Apr 18, 2024 at 08:39, Jean-Armel Luce wrote:
>>
>> Congratulations everyone !!!
>>
>> On Thu, Apr 18, 2024 at 07:37, Berenguer Blasi wrote:
>>>
>>> Congrats all!
>>>
>>> On 17/4/24 23:23, Jeremiah Jordan wrote:
>>>
>>> Congrats all!
>>>
>>>
>>> On Apr 17, 2024 at 12:10:11 PM, Benjamin Lerer  wrote:

>>>> The Apache Cassandra PMC is pleased to announce that Alexandre Dutra,
>>>> Andrew Tolbert, Bret McGuire and Olivier Michallat have accepted the
>>>> invitation to become committers on the java driver sub-project.
>>>>
>>>> Thanks for your contributions to the Java driver during all those years!
>>>> Congratulations and welcome!
>>>>
>>>> The Apache Cassandra PMC members


Re: Welcome Alexandre Dutra, Andrew Tolbert, Bret McGuire, Olivier Michallat as Cassandra Committers

2024-04-18 Thread Alexander DEJANOVSKI
Congratulations folks! And thanks for all the hard work throughout the
years on the Java Driver!

On Thu, Apr 18, 2024 at 08:39, Jean-Armel Luce wrote:

> Congratulations everyone !!!
>
> On Thu, Apr 18, 2024 at 07:37, Berenguer Blasi wrote:
>
>> Congrats all!
>> On 17/4/24 23:23, Jeremiah Jordan wrote:
>>
>> Congrats all!
>>
>>
>> On Apr 17, 2024 at 12:10:11 PM, Benjamin Lerer  wrote:
>>
>>> The Apache Cassandra PMC is pleased to announce that Alexandre Dutra,
>>> Andrew Tolbert, Bret McGuire and Olivier Michallat have accepted the
>>> invitation to become committers on the java driver sub-project.
>>>
>>> Thanks for your contributions to the Java driver during all those years!
>>> Congratulations and welcome!
>>>
>>> The Apache Cassandra PMC members
>>>
>>


Re: [DISCUSS] CEP-40: Data Transfer Using Cassandra Sidecar for Live Migrating Instances

2024-04-18 Thread Claude Warren, Jr via dev
I think this solution would solve one of the problems that Aiven has with
node replacement currently.  Though TCM will probably help as well.

On Mon, Apr 15, 2024 at 11:47 PM German Eichberger via dev <
dev@cassandra.apache.org> wrote:

> Thanks for the proposal. I second Jordan that we need more abstraction in
> (1), e.g. most cloud providers allow for disk snapshots and starting nodes
> from a snapshot, which would be a good mechanism if you find yourself there.
>
> German
> --
> *From:* Jordan West 
> *Sent:* Sunday, April 14, 2024 12:27 PM
> *To:* dev@cassandra.apache.org 
> *Subject:* [EXTERNAL] Re: [DISCUSS] CEP-40: Data Transfer Using Cassandra
> Sidecar for Live Migrating Instances
>
> Thanks for proposing this CEP! We have something like this internally so I
> have some familiarity with the approach and the challenges. After reading
> the CEP a couple things come to mind:
>
> 1. I would like to see more abstraction of how the files get moved / put
> in place with the proposed solution being the default implementation. That
> would allow others to plug in alternative means of data movement like
> pulling down backups from S3 or rsync, etc.
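
One way to picture the abstraction asked for in point 1 is a small plug-in
interface with a default implementation behind it. The Python sketch below is
purely hypothetical: `DataMover`, `LocalCopyMover`, `migrate` and the file
name are invented for this example and do not come from CEP-40 or the Sidecar
code base. A local directory copy stands in for the default transfer.

```python
import shutil
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class DataMover(ABC):
    """Hypothetical plug-in point: how a file reaches the destination."""

    @abstractmethod
    def fetch(self, relative_path: str, destination_root: Path) -> Path:
        """Place one file under destination_root and return its path."""


class LocalCopyMover(DataMover):
    """Stand-in default that copies from a local source directory."""

    def __init__(self, source_root: Path) -> None:
        self.source_root = source_root

    def fetch(self, relative_path: str, destination_root: Path) -> Path:
        target = destination_root / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(self.source_root / relative_path, target)
        return target


def migrate(files: list[str], mover: DataMover, destination_root: Path) -> list[Path]:
    """Move every listed file with whichever mover was plugged in."""
    return [mover.fetch(name, destination_root) for name in files]


def demo() -> bytes:
    """Copy one made-up SSTable file and return what landed at the destination."""
    with tempfile.TemporaryDirectory() as src_dir, tempfile.TemporaryDirectory() as dst_dir:
        src, dst = Path(src_dir), Path(dst_dir)
        (src / "ks" / "tbl").mkdir(parents=True)
        (src / "ks" / "tbl" / "nb-1-big-Data.db").write_bytes(b"sstable bytes")
        copied = migrate(["ks/tbl/nb-1-big-Data.db"], LocalCopyMover(src), dst)
        return copied[0].read_bytes()
```

An S3 or rsync based mover would implement the same `fetch` method, leaving
the orchestration in `migrate` unchanged.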
>
> 2. I do agree with Jon’s last email that the lifecycle / orchestration
> portion is the more challenging aspect. It would be nice to address that as
> well so we don’t end up with something like repair where the building
> blocks are there but the hard parts are left to the operator. I do,
> however, see that portion being done in a follow-on CEP to limit the scope
> of CEP-40 and have a higher chance for success by incrementally adding
> these features.
>
> Jordan
>
> On Thu, Apr 11, 2024 at 12:31 Jon Haddad  wrote:
>
> First off, let me apologize for my initial reply, it came off harsher than
> I had intended.
>
> I know I didn't say it initially, but I like the idea of making it easier
> to replace a node.  I think it's probably not obvious to folks that you can
> use rsync (with stunnel, or alternatively rclone), and for a lot of teams
> it's intimidating to do so.  Whether it actually is easy or not to do with
> rsync is irrelevant.  Having tooling that does it right is better than duct
> taping things together.
>
> So with that said, if you're looking to get feedback on how to make the
> CEP more generally useful, I have a couple thoughts.
>
> > Managing the Cassandra processes like bringing them up or down while
> migrating the instances.
>
> Maybe I missed this, but I thought we already had support for managing the
> C* lifecycle with the sidecar?  Maybe I'm misremembering.  It seems to me
> that adding the ability to make this entire workflow self managed would be
> the biggest win, because having a live migrate *feature* instead of what's
> essentially a runbook would be far more useful.
>
> > To verify whether the desired file set matches with source, only file
> path and size is considered at the moment. Strict binary level verification
> is deferred for later.
>
> Scott already mentioned this is a problem and I agree, we cannot simply
> rely on file path and size.
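
The concern above can be shown with a small Python sketch, assuming two
directories whose files have the same relative path and length but different
bytes. The function names and the `stats.db` file are made up for the
example; SHA-256 is just one possible digest, not something the CEP
specifies.

```python
import hashlib
import tempfile
from pathlib import Path


def manifest_by_size(root: Path) -> dict[str, int]:
    """Map each relative file path to its size in bytes."""
    return {
        str(p.relative_to(root)): p.stat().st_size
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def manifest_by_digest(root: Path) -> dict[str, str]:
    """Map each relative file path to the SHA-256 of its contents."""
    result = {}
    for p in sorted(root.rglob("*")):
        if p.is_file():
            h = hashlib.sha256()
            with p.open("rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            result[str(p.relative_to(root))] = h.hexdigest()
    return result


def demo() -> tuple[bool, bool]:
    """Return (size manifests match, digest manifests match) for a same-length edit."""
    with tempfile.TemporaryDirectory() as src_dir, tempfile.TemporaryDirectory() as dst_dir:
        src, dst = Path(src_dir), Path(dst_dir)
        # Same path, same length, different content.
        (src / "stats.db").write_bytes(b"level=0")
        (dst / "stats.db").write_bytes(b"level=1")
        return (
            manifest_by_size(src) == manifest_by_size(dst),
            manifest_by_digest(src) == manifest_by_digest(dst),
        )
```

Here the path-and-size comparison reports a match while the digest comparison
does not, which is the gap a binary-level check would close.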
>
> TL;DR: I like the intention of the CEP.  I think it would be better if it
> managed the entire lifecycle of the migration, but you might not have an
> appetite to implement all that.
>
> Jon
>
>
> On Thu, Apr 11, 2024 at 10:01 AM Venkata Hari Krishna Nukala <
> n.v.harikrishna.apa...@gmail.com> wrote:
>
> Thanks Jon & Scott for taking time to go through this CEP and providing
> inputs.
>
> I completely agree with what Scott mentioned earlier (I would have added
> more details into the CEP). Adding a few more points to the same.
>
> Having a solution with Sidecar can make the migration easy without
> depending on rsync. At least in the cases I have seen, rsync is not enabled
> by default and most of them want to run OS/images with as minimal
> requirements as possible. Installing rsync requires admin privileges and
> syncing data is a manual operation. If an API is provided with Sidecar,
> then tooling can be built around it reducing the scope for manual errors.
>
> Performance-wise, at least in the cases I have seen, the File
> Streaming API in Sidecar performs a lot better. To give an idea on the
> performance, I would like to quote "up to 7 Gbps/instance writes (depending
> on hardware)" from CEP-28 as this CEP proposes to leverage the same.
>
> For:
>
> >When enabled for LCS, single sstable uplevel will mutate only the level
> of an SSTable in its stats metadata component, which wouldn't alter the
> filename and may not alter the length of the stats metadata component. A
> change to the level of an SSTable on the source via single sstable uplevel
> may not be caught by a digest based only on filename and length.
>
> In this case file size may not change, but the last-modified timestamp
> would change, right? It is addressed in section MIGRATING ONE
> INSTANCE, point 2.b.ii which says "If a file is present at the destination
> but did not 

Re: Welcome Alexandre Dutra, Andrew Tolbert, Bret McGuire, Olivier Michallat as Cassandra Committers

2024-04-18 Thread Jean-Armel Luce
Congratulations everyone !!!

On Thu, Apr 18, 2024 at 07:37, Berenguer Blasi wrote:

> Congrats all!
> On 17/4/24 23:23, Jeremiah Jordan wrote:
>
> Congrats all!
>
>
> On Apr 17, 2024 at 12:10:11 PM, Benjamin Lerer  wrote:
>
>> The Apache Cassandra PMC is pleased to announce that Alexandre Dutra,
>> Andrew Tolbert, Bret McGuire and Olivier Michallat have accepted the
>> invitation to become committers on the java driver sub-project.
>>
>> Thanks for your contributions to the Java driver during all those years!
>> Congratulations and welcome!
>>
>> The Apache Cassandra PMC members
>>
>