RE: Backups in Cassandra
Just to add to this: we take snapshot, incremental, and commitlog backups, along with schema and config backups. Everything is copied to S3, although we also keep a small number of snapshots, incremental backups, and commitlogs on the local node for the rare occasions they are needed. We have written some Ansible to restore the whole cluster. If your cluster has more than a trivial number of nodes, some form of manageable automation is required.

Cheers,
R

From: cclive1601你
Sent: 08 August 2019 04:30
To: user@cassandra.apache.org
Subject: Re: Backups in Cassandra

We have also built backup and restore for Apache Cassandra.

Backup process:
1. Take incremental backups of flushed SSTables and of the commitlog.
2. Snapshot the cluster periodically; meta info (tokens and table info) also needs to be backed up.
3. For events such as a node joining, moving (if that happens), or leaving, refresh the meta info backup.

Restore:
1. Use the incremental SSTables to reduce the number of commitlogs that must be restored, because log replay takes a long time.
2. All SSTables can be bulk loaded with just a node refresh. The restore cluster therefore needs the same number of nodes as the backup. sstableloader does not need the node count to match, but it takes much longer than this method.

Connor Lin <linba...@gmail.com> wrote on Thu, Aug 8, 2019 at 10:17 AM:

Hi Krish,

It is recommended to have backups. Although I haven't practiced it myself, I think this might be helpful:
https://thelastpickle.com/blog/2018/04/03/cassandra-backup-and-restore-aws-ebs.html

Sincerely yours,
Connor Lin

On Thu, Aug 8, 2019 at 5:47 AM Krish Donald <gotomyp...@gmail.com> wrote:

Hi Folks,

First question: do you take backups of your Cassandra cluster?
If the answer is yes, then these questions follow:
1. How do you take backups?
   1.1) Is it only snapshots?
   1.2) We are on AWS with a very large cluster: around 51 nodes with 1 TB of data on each node.
   1.3) Do you take backups and move them to S3?
2. If you take backups, how did the restore process work for you?

Thanks,
Krish

--
you are the apple of my eye!
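
For readers who want a starting point, the snapshot-plus-S3 flow described in this reply can be sketched roughly as below. This is not the poster's Ansible. It only builds the command lines for one table on one node, and it assumes the standard data layout in which snapshots live under <table dir>/snapshots/<tag> and incremental backups under <table dir>/backups. The bucket name, node name, table directory, and S3 key layout are all hypothetical.

```python
from pathlib import PurePosixPath


def snapshot_command(tag, keyspace):
    # `nodetool snapshot -t <tag> <keyspace>` hard-links the current SSTables.
    return ["nodetool", "snapshot", "-t", tag, keyspace]


def s3_sync_command(local_dir, bucket, prefix):
    # `aws s3 sync` uploads only files that are missing or changed at the destination.
    return ["aws", "s3", "sync", str(local_dir), f"s3://{bucket}/{prefix}/"]


def backup_plan(data_dir, keyspace, table_dir, tag, bucket, node):
    """Return the commands for one table on one node, in execution order."""
    table = PurePosixPath(data_dir) / keyspace / table_dir
    # Hypothetical key layout: <node>/<keyspace>/<table dir>/...
    prefix = f"{node}/{keyspace}/{table_dir}"
    return [
        snapshot_command(tag, keyspace),
        s3_sync_command(table / "snapshots" / tag, bucket, f"{prefix}/snapshots/{tag}"),
        s3_sync_command(table / "backups", bucket, f"{prefix}/incremental"),
    ]


if __name__ == "__main__":
    # All names below are placeholders for illustration.
    plan = backup_plan("/var/lib/cassandra/data", "ks1", "users-abc123",
                       "daily-2019-08-08", "example-backups", "node01")
    for cmd in plan:
        print(" ".join(cmd))
```

Building the commands as lists rather than running them keeps the sketch easy to review and to hand to whatever automation (Ansible, cron, and so on) actually executes them.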
Repairs/compactions on tables with solr indexes
Hello,

We are using the DSE Search workload, with Search and Cassandra running on the same nodes/JVM.

1. When repairs are run, do they trigger rebuilds of the Solr indexes? Or are the indexes rebuilt only when data is actually repaired?
2. What about compactions: do they trigger any search index rebuilds? I guess not, since the data is not changing, but I'm not sure. Or perhaps they do when tombstones are cleaned up; how does Solr handle deleted data?
4. Is it generally a good idea to run both Cassandra and Search on the same node/JVM? Are there any potential issues with such a setup, or is it a good way to set things up, since the data is colocated on the same nodes?

Regards,
Ayub
Re: Backups in Cassandra
We have also built backup and restore for Apache Cassandra.

Backup process:
1. Take incremental backups of flushed SSTables and of the commitlog.
2. Snapshot the cluster periodically; meta info (tokens and table info) also needs to be backed up.
3. For events such as a node joining, moving (if that happens), or leaving, refresh the meta info backup.

Restore:
1. Use the incremental SSTables to reduce the number of commitlogs that must be restored, because log replay takes a long time.
2. All SSTables can be bulk loaded with just a node refresh. The restore cluster therefore needs the same number of nodes as the backup. sstableloader does not need the node count to match, but it takes much longer than this method.

Connor Lin wrote on Thu, Aug 8, 2019 at 10:17 AM:

> Hi Krish,
>
> It is recommended to have backups. Although I haven't practiced it myself,
> I think this might be helpful:
>
> https://thelastpickle.com/blog/2018/04/03/cassandra-backup-and-restore-aws-ebs.html
>
> Sincerely yours,
>
> Connor Lin
>
>
> On Thu, Aug 8, 2019 at 5:47 AM Krish Donald wrote:
>
>> Hi Folks,
>>
>> First question: do you take backups of your Cassandra cluster?
>> If the answer is yes, then these questions follow:
>> 1. How do you take backups?
>>    1.1) Is it only snapshots?
>>    1.2) We are on AWS with a very large cluster: around 51 nodes
>>         with 1 TB of data on each node.
>>    1.3) Do you take backups and move them to S3?
>>
>> 2. If you take backups, how did the restore process work for you?
>>
>> Thanks,
>> Krish
>>
>
--
you are the apple of my eye!
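
The trade-off between the two restore paths in this message can be sketched as follows. The sketch assumes that "node refresh" means nodetool refresh run after the backed-up SSTable files have been copied into each table's data directory, and that the target nodes are also given the backed-up tokens. The function names and the selection rule are illustrative only, not the poster's tooling.

```python
def refresh_commands(keyspace, tables):
    """One `nodetool refresh` per table, run after the SSTable files are copied in."""
    return [["nodetool", "refresh", keyspace, table] for table in tables]


def sstableloader_command(contact_hosts, sstable_dir):
    """Stream SSTables from a <keyspace>/<table> directory into a live cluster."""
    return ["sstableloader", "-d", ",".join(contact_hosts), sstable_dir]


def pick_restore_method(backup_node_count, target_node_count):
    # Per-node copy plus refresh only lines up when the topology matches the
    # backup; otherwise the data has to be streamed to whichever nodes own it now.
    if backup_node_count == target_node_count:
        return "refresh"
    return "sstableloader"


if __name__ == "__main__":
    # Hypothetical keyspace, tables, hosts, and path.
    print(pick_restore_method(51, 51))
    print(refresh_commands("ks1", ["users", "events"]))
    print(sstableloader_command(["10.0.0.1", "10.0.0.2"], "/restore/ks1/users"))
```

The refresh path avoids network streaming entirely, which fits the poster's remark that sstableloader takes much longer.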
Re: Backups in Cassandra
Hi Krish,

It is recommended to have backups. Although I haven't practiced it myself, I think this might be helpful:

https://thelastpickle.com/blog/2018/04/03/cassandra-backup-and-restore-aws-ebs.html

Sincerely yours,
Connor Lin

On Thu, Aug 8, 2019 at 5:47 AM Krish Donald wrote:

> Hi Folks,
>
> First question: do you take backups of your Cassandra cluster?
> If the answer is yes, then these questions follow:
> 1. How do you take backups?
>    1.1) Is it only snapshots?
>    1.2) We are on AWS with a very large cluster: around 51 nodes
>         with 1 TB of data on each node.
>    1.3) Do you take backups and move them to S3?
>
> 2. If you take backups, how did the restore process work for you?
>
> Thanks,
> Krish
>
Backups in Cassandra
Hi Folks,

First question: do you take backups of your Cassandra cluster?
If the answer is yes, then these questions follow:
1. How do you take backups?
   1.1) Is it only snapshots?
   1.2) We are on AWS with a very large cluster: around 51 nodes with 1 TB of data on each node.
   1.3) Do you take backups and move them to S3?
2. If you take backups, how did the restore process work for you?

Thanks,
Krish
Re: Datafile Corruption
Repairs run during an upgrade have caused corruption too. So has dropping a column and re-adding it with the same name but a different type.

Regards,
Nitan
Cell: 510 449 9629

> On Aug 7, 2019, at 2:42 PM, Jeff Jirsa wrote:
>
> Is compression enabled?
>
> If not, bit flips on disk can corrupt data files, and reads plus repair may
> send that corruption to other hosts in the cluster.
>
>
>> On Aug 7, 2019, at 3:46 AM, Philip Ó Condúin wrote:
>>
>> Hi All,
>>
>> I am currently experiencing multiple datafile corruptions across most nodes
>> in my cluster, and there seems to be no pattern to the corruption. I'm
>> starting to think it might be a bug; we're using Cassandra 2.2.13.
>>
>> Without going into detail about the issue, I just want to confirm something.
>>
>> Can someone share with me a list of scenarios that would cause corruption?
>>
>> 1. OS failure
>> 2. Cassandra interrupted while writing
>>
>> etc.
>>
>> I need to investigate each scenario and don't want to leave any out.
>>
>> --
>> Regards,
>> Phil
Re: Datafile Corruption
Is compression enabled?

If not, bit flips on disk can corrupt data files, and reads plus repair may send that corruption to other hosts in the cluster.

> On Aug 7, 2019, at 3:46 AM, Philip Ó Condúin wrote:
>
> Hi All,
>
> I am currently experiencing multiple datafile corruptions across most nodes
> in my cluster, and there seems to be no pattern to the corruption. I'm
> starting to think it might be a bug; we're using Cassandra 2.2.13.
>
> Without going into detail about the issue, I just want to confirm something.
>
> Can someone share with me a list of scenarios that would cause corruption?
>
> 1. OS failure
> 2. Cassandra interrupted while writing
>
> etc.
>
> I need to investigate each scenario and don't want to leave any out.
>
> --
> Regards,
> Phil
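
The reasoning behind the compression question, as far as I understand it, is that compressed SSTables store a checksum alongside each compressed chunk, so a flipped bit is caught on read instead of being served to clients or propagated by repair. The toy sketch below illustrates only that idea. It uses zlib and CRC32 from the Python standard library and a made-up chunk layout, not Cassandra's actual on-disk format or checksum algorithm.

```python
import zlib


def write_chunk(payload: bytes) -> bytes:
    """Compress a payload and append a 4-byte CRC32 of the compressed bytes."""
    compressed = zlib.compress(payload)
    checksum = zlib.crc32(compressed)
    return compressed + checksum.to_bytes(4, "big")


def read_chunk(stored: bytes) -> bytes:
    """Verify the trailing checksum before decompressing; raise on mismatch."""
    compressed, expected = stored[:-4], int.from_bytes(stored[-4:], "big")
    if zlib.crc32(compressed) != expected:
        raise ValueError("chunk checksum mismatch: data is corrupt")
    return zlib.decompress(compressed)


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Simulate an on-disk bit flip by inverting a single bit."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


if __name__ == "__main__":
    stored = write_chunk(b"row data " * 100)
    assert read_chunk(stored) == b"row data " * 100
    try:
        read_chunk(flip_bit(stored, 42))
    except ValueError as err:
        print("detected:", err)
```

Without such a check, the flipped bit would simply be read back as if it were valid data, which is the failure mode the reply warns about.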
Re: Datafile Corruption
A few possible reasons:

Sudden power cut
Disk full
A bug in the Cassandra version in use, such as CASSANDRA-13752

On Wed, Aug 7, 2019, 4:16 PM Philip Ó Condúin wrote:

> Hi All,
>
> I am currently experiencing multiple datafile corruptions across most
> nodes in my cluster, and there seems to be no pattern to the corruption.
> I'm starting to think it might be a bug; we're using Cassandra 2.2.13.
>
> Without going into detail about the issue, I just want to confirm something.
>
> Can someone share with me a list of scenarios that would cause corruption?
>
> 1. OS failure
> 2. Cassandra interrupted while writing
>
> etc.
>
> I need to investigate each scenario and don't want to leave any out.
>
> --
> Regards,
> Phil
>
Point in time restore not working when primary key is blob
Hi,

I'm trying to do a point-in-time restore using commit logs. It works fine if the primary key is text, but if the primary key is a blob (the default when a keyspace is created with cassandra-stress), it does not work and simply restores all the rows. Is this a known issue?

Thanks
--
Shaurya Gupta
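
For context, a point-in-time restore from commit logs is normally driven by conf/commitlog_archiving.properties. The sketch below only shows how the restore settings might be generated; it does not explain or reproduce the blob-key behaviour reported above. It assumes the restore_command, restore_directories, and restore_point_in_time keys, with the timestamp in GMT in the yyyy:MM:dd HH:mm:ss format. The directory path and the cp-based restore command are placeholders.

```python
from datetime import datetime, timezone


def restore_properties(restore_dir, point_in_time):
    """Render the restore section of commitlog_archiving.properties."""
    if point_in_time.tzinfo is None:
        raise ValueError("point_in_time must be timezone-aware")
    # The restore point is expressed in GMT as yyyy:MM:dd HH:mm:ss.
    stamp = point_in_time.astimezone(timezone.utc).strftime("%Y:%m:%d %H:%M:%S")
    lines = [
        "restore_command=cp -f %from %to",       # placeholder copy command
        f"restore_directories={restore_dir}",    # where archived commit logs live
        f"restore_point_in_time={stamp}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical archive directory and restore point.
    print(restore_properties("/backup/commitlog",
                             datetime(2019, 8, 7, 12, 0, 0, tzinfo=timezone.utc)))
```

When comparing the text-key and blob-key cases, it may help to confirm that both runs used the same rendered restore point.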
Datafile Corruption
Hi All,

I am currently experiencing multiple datafile corruptions across most nodes in my cluster, and there seems to be no pattern to the corruption. I'm starting to think it might be a bug; we're using Cassandra 2.2.13.

Without going into detail about the issue, I just want to confirm something.

Can someone share with me a list of scenarios that would cause corruption?

1. OS failure
2. Cassandra interrupted while writing

etc.

I need to investigate each scenario and don't want to leave any out.

--
Regards,
Phil