Re: Failed disks - correct procedure

2023-01-16 Thread Tolbert, Andy
Hi Joe, I'd recommend just doing a replacement, bringing up a new node with -Dcassandra.replace_address_first_boot=ip.you.are.replacing as described here: https://cassandra.apache.org/doc/4.1/cassandra/operating/topo_changes.html#replacing-a-dead-node Before you do that, you will want to make sur

Re: Failed disks - correct procedure

2023-01-16 Thread Joe Obernberger
Thank you Andy. Is there a way to just remove the drive from the cluster and replace it later? Ordering replacement drives isn't a fast process... What I've done so far is: (1) stop the node; (2) remove the drive reference from /etc/cassandra/conf/cassandra.yaml; (3) restart the node; (4) run repair. Will that work?  Right n
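
For illustration only, the "remove drive reference" step described in this message could be sketched in Python as below. This is a minimal sketch, assuming data_file_directories is written as a plain block-style YAML list with unquoted entries; the function name and the /data/N/cassandra paths are hypothetical and do not come from the thread. It edits text only and makes no claim about whether the overall procedure is safe for a given Cassandra version.

```python
def drop_data_directory(yaml_text: str, failed_dir: str) -> str:
    """Return yaml_text with failed_dir removed from data_file_directories.

    Assumes a block-style list ("- /path" entries, unquoted) directly
    under the data_file_directories key. Hypothetical helper.
    """
    out = []
    in_list = False
    for line in yaml_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("data_file_directories:"):
            in_list = True
            out.append(line)
            continue
        if in_list:
            if stripped.startswith("- "):
                if stripped[2:].strip() == failed_dir:
                    continue  # skip the entry for the failed drive
            elif stripped and not stripped.startswith("#"):
                in_list = False  # the next key ends the list
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical configuration fragment with two data directories.
sample = (
    "cluster_name: test\n"
    "data_file_directories:\n"
    "    - /data/1/cassandra\n"
    "    - /data/2/cassandra\n"
    "commitlog_directory: /commitlog\n"
)
result = drop_data_directory(sample, "/data/2/cassandra")
```

Working on a copy of cassandra.yaml and diffing the result before restarting the node keeps the change easy to review and to undo.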

Re: Failed disks - correct procedure

2023-01-16 Thread Tolbert, Andy
Hi Joe, Reading it back I realized I misunderstood that part of your email, so you must be using data_file_directories with 16 drives? That's a lot of drives! I imagine this may happen from time to time given that disks like to fail. That's a bit of an interesting scenario that I would have to

Re: Failed disks - correct procedure

2023-01-16 Thread Jeff Jirsa
Prior to CASSANDRA-6696 you’d have to treat one missing disk as a failed machine, wipe all the data and re-stream it, as a tombstone for a given value may be on one disk and data on another (effectively resurrecting data). So the answer has to be version dependent, too - which version were you usi
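
The tombstone concern in this message can be illustrated with a toy model. The sketch below is a simplified last-write-wins merge, not Cassandra's actual read path; the disk names, timestamps, and value are made up for the example. It shows why losing only the disk that holds a tombstone can make older data on another disk visible again.

```python
from typing import Optional

TOMBSTONE = None  # stand-in marker for a deleted value in this toy model


def read_value(disks: dict) -> Optional[str]:
    """Merge all (timestamp, value) cells for one key; the newest cell wins."""
    cells = [cell for cells_on_disk in disks.values() for cell in cells_on_disk]
    if not cells:
        return None
    _, value = max(cells, key=lambda cell: cell[0])
    return value


# One key whose write and later delete landed on different disks.
disks = {
    "disk1": [(100, "alice@example.com")],  # original write
    "disk2": [(200, TOMBSTONE)],            # later delete (tombstone)
}

before = read_value(disks)  # tombstone is newest, so the key reads as deleted
del disks["disk2"]          # the disk holding the tombstone fails
after = read_value(disks)   # only the old write remains visible
```

In this model the delete is forgotten as soon as disk2 is gone, which is the situation the message describes for versions where a tombstone and the data it covers could sit on different disks.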

Re: Failed disks - correct procedure

2023-01-16 Thread Joe Obernberger
I'm using 4.1.0-1. I've been doing a lot of truncates lately before the drive failed (research project).  Current drives have about 100 GB of data each, although the actual amount of data in Cassandra is much less (because of truncates and snapshots).  The cluster is not homogeneous; some no

RE: Failed disks - correct procedure

2023-01-17 Thread Marc Hoppins

RE: Failed disks - correct procedure

2023-01-17 Thread Durity, Sean R via user
Sean R. Durity. On Tuesday, January 17, 2023 3:57 AM, Marc Hoppins wrote: Hi all, I was pondering this very situation. We have a node with a crapped-out disk (not the first time). Removenode vs repairnode

Re: Failed disks - correct procedure

2023-01-17 Thread Joe Obernberger
On Tuesday, January 17, 2023 3:57 AM, Marc Hoppins wrote: Hi all, I was pondering this very situation. We have a node with a crapped-out disk (not the first time). Removenode vs repairnode: in regard to time, there is

Re: Failed disks - correct procedure

2023-01-17 Thread C. Scott Andreas

Re: Failed disks - correct procedure

2023-01-23 Thread Joe Obernberger
esented as one file system for data, which is not what the original question was asking. Sean R. Durity