I am sorry! That is true: I forgot the word "*not*"! The corrected point is:

1. It's *not* recommended to use the commit log after a node failure. Cassandra has other options, such as the replication factor, as a substitute solution.
*VafaTech.com - A Total Solution for Data Gathering & Analysis*

On Tue, Dec 3, 2019 at 10:42 AM Adarsh Kumar <adarsh0...@gmail.com> wrote:

> Thanks Hossein,
>
> Just one more question: is there any special SOP or consideration we have
> to take into account for multi-site backup?
>
> Please share any helpful link, blog, or documented steps.
>
> Regards,
> Adarsh Kumar
>
> On Sun, Dec 1, 2019 at 10:40 PM Hossein Ghiyasi Mehr
> <ghiyasim...@gmail.com> wrote:
>
>> 1. It's recommended to use commit log after one node failure. Cassandra
>>    has many options such as replication factor as substitute solution.
>> 2. Yes, right.
>>
>> On Fri, Nov 29, 2019 at 9:33 AM Adarsh Kumar <adarsh0...@gmail.com>
>> wrote:
>>
>>> Thanks Ahu and Hussein,
>>>
>>> So my understanding is:
>>>
>>> 1. Commit log backup is not documented for Apache Cassandra, hence
>>>    not standard. But it can be used for restore on the same machine
>>>    (by taking a backup from commit_log_dir). If used on other
>>>    machine(s), they have to be in the same topology. Can it be used
>>>    for a replacement node?
>>> 2. For periodic backup, snapshot + incremental backup is the best
>>>    option.
>>>
>>> Thanks,
>>> Adarsh Kumar
>>>
>>> On Fri, Nov 29, 2019 at 7:28 AM guo Maxwell <cclive1...@gmail.com>
>>> wrote:
>>>
>>>> Hossein is right. But for us, we restore to the same Cassandra
>>>> topology, so it is usable for replay. When restoring to the same
>>>> machine it is also usable.
>>>> Using sstableloader costs too much time and more storage (though
>>>> the storage will shrink after the restore).
>>>>
>>>> Hossein Ghiyasi Mehr <ghiyasim...@gmail.com> wrote on Thu, Nov 28,
>>>> 2019 at 7:40 PM:
>>>>
>>>>> Commitlog backup isn't usable on another machine.
>>>>> The backup solution depends on what you want to do: periodic
>>>>> backup, or backup to restore on another machine?
>>>>> Periodic backup is a combination of snapshot and incremental
>>>>> backup. Remove incremental backups after a new snapshot.
>>>>> Backup to restore on another machine: you can use a snapshot after
>>>>> flushing the memtable, or use sstableloader.
>>>>>
>>>>> On Thu, Nov 28, 2019 at 6:05 AM guo Maxwell <cclive1...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> In Cassandra's and DataStax's documentation, commitlog backup is
>>>>>> not mentioned; only snapshot and incremental backup are described.
>>>>>>
>>>>>> Though commitlog archiving per keyspace/table is not supported,
>>>>>> commitlog replay (you must put the logs into commitlog_dir and
>>>>>> restart the process) supports a keyspace/table replay filter
>>>>>> (using -Dcassandra.replayList with the
>>>>>> keyspace1.table1,keyspace1.table2 format to replay the specified
>>>>>> keyspaces/tables).
>>>>>>
>>>>>> Snapshots do affect storage. We take a snapshot once a week during
>>>>>> the business off-peak period, and making the snapshot is
>>>>>> throttled. You may want to look at this issue
>>>>>> (https://issues.apache.org/jira/browse/CASSANDRA-13019).
>>>>>>
>>>>>> Adarsh Kumar <adarsh0...@gmail.com> wrote on Thu, Nov 28, 2019 at
>>>>>> 1:00 AM:
>>>>>>
>>>>>>> Thanks Guo and Eric for replying,
>>>>>>>
>>>>>>> I have some confusion about commit log backup:
>>>>>>>
>>>>>>> 1. The commit log archival technique
>>>>>>>    (https://support.datastax.com/hc/en-us/articles/115001593706-Manual-Backup-and-Restore-with-Point-in-time-and-table-level-restore-)
>>>>>>>    is as good as an incremental backup, as it also captures
>>>>>>>    commit logs after memtable flush.
>>>>>>> 2. If we go for "Snapshot + Incremental backup + Commit log", we
>>>>>>>    have to take the commit logs from the commit log directory (is
>>>>>>>    there any SOP for this?). As commit logs are not per table or
>>>>>>>    keyspace, we will have a challenge in restoring selective
>>>>>>>    tables.
>>>>>>> 3. Snapshot-based backups are easy to manage and operate due to
>>>>>>>    their simplicity, but they are heavy on storage. Any views on
>>>>>>>    this?
>>>>>>> 4. Please share any successful strategy that someone is using in
>>>>>>>    production. We are still in the design phase and want to
>>>>>>>    implement the best solution.
>>>>>>>
>>>>>>> Thanks Eric for sharing the link for Medusa.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Adarsh Kumar
>>>>>>>
>>>>>>> On Wed, Nov 27, 2019 at 5:16 PM guo Maxwell
>>>>>>> <cclive1...@gmail.com> wrote:
>>>>>>>
>>>>>>>> For me, I think the last one:
>>>>>>>>   Snapshot + Incremental + commitlog
>>>>>>>> is the most meaningful way to do backup and restore, when you
>>>>>>>> back up the data to somewhere else like AWS S3.
>>>>>>>>
>>>>>>>> - Snapshot based backup // incremental data will not be backed
>>>>>>>>   up, and data may be lost when restoring to a time later than
>>>>>>>>   the snapshot time;
>>>>>>>> - Incremental backups // better than snapshot backup, but with
>>>>>>>>   insufficient data accuracy, because data remaining in the
>>>>>>>>   memtable will be lost;
>>>>>>>> - Snapshot + incremental
>>>>>>>> - Snapshot + commitlog archival // better data precision than
>>>>>>>>   incremental backup, but the data in non-archived commitlogs
>>>>>>>>   (not archived, and commitlog not closed) will not be restored
>>>>>>>>   and will be lost. Also, when there are too many logs, log
>>>>>>>>   replay will cost very much time.
>>>>>>>>
>>>>>>>> We use snapshot + incremental + commitlog archive. We read
>>>>>>>> snapshot data and incremental data. The log is also backed up,
>>>>>>>> but we do not back up logs whose data has been flushed to
>>>>>>>> sstables, since that data will be backed up by the incremental
>>>>>>>> backup.
>>>>>>>>
>>>>>>>> This way, the data will exist in sstable format through snapshot
>>>>>>>> backup and incremental backup. The number of logs will be very
>>>>>>>> small, and log replay will not cost much time.
>>>>>>>>
>>>>>>>> Eric LELEU <e...@strapdata.com> wrote on Wed, Nov 27, 2019 at
>>>>>>>> 4:13 PM:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>> TheLastPickle & Spotify have released Medusa as a Cassandra
>>>>>>>>> backup tool.
>>>>>>>>>
>>>>>>>>> See:
>>>>>>>>> https://thelastpickle.com/blog/2019/11/05/cassandra-medusa-backup-tool-is-open-source.html
>>>>>>>>>
>>>>>>>>> Hope this link will help you.
>>>>>>>>>
>>>>>>>>> Eric
>>>>>>>>>
>>>>>>>>> On 27/11/2019 at 08:10, Adarsh Kumar wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I was looking into backup strategies for Cassandra. After some
>>>>>>>>> study I came to know that there are the following options:
>>>>>>>>>
>>>>>>>>> - Snapshot based backup
>>>>>>>>> - Incremental backups
>>>>>>>>> - Snapshot + incremental
>>>>>>>>> - Snapshot + commitlog archival
>>>>>>>>> - Snapshot + Incremental + commitlog
>>>>>>>>>
>>>>>>>>> Which is the most suitable and feasible approach? Also, which
>>>>>>>>> of these is used most?
>>>>>>>>> Please let me know if there is any other option or tool
>>>>>>>>> available.
>>>>>>>>>
>>>>>>>>> Thanks in advance.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Adarsh Kumar
>>>>>>>>
>>>>>>>> --
>>>>>>>> you are the apple of my eye !
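
As a rough illustration of two points from the thread above (removing incremental backups once a new snapshot exists, and restricting commit log replay with -Dcassandra.replayList), here is a minimal Python sketch. It is not taken from any message in the thread: the function names, file names, and timestamps are hypothetical, and it assumes the snapshot was taken after a memtable flush, so that every incremental backup file written at or before the snapshot time is already covered by the snapshot.

```python
from typing import Iterable, List, Mapping, Tuple


def incrementals_to_prune(backup_mtimes: Mapping[str, float],
                          snapshot_time: float) -> List[str]:
    """Return incremental-backup files written at or before the new snapshot.

    backup_mtimes maps a file name to its modification time (epoch seconds).
    Assumption: such files are already covered by the snapshot.
    """
    return sorted(name for name, mtime in backup_mtimes.items()
                  if mtime <= snapshot_time)


def replay_list_option(tables: Iterable[Tuple[str, str]]) -> str:
    """Build the replay filter in the keyspace1.table1,keyspace1.table2 format."""
    return "-Dcassandra.replayList=" + ",".join(
        f"{keyspace}.{table}" for keyspace, table in tables)


if __name__ == "__main__":
    # Hypothetical file names and timestamps, for illustration only.
    files = {"md-1-big-Data.db": 100.0, "md-2-big-Data.db": 250.0}
    print(incrementals_to_prune(files, snapshot_time=200.0))
    print(replay_list_option([("keyspace1", "table1"),
                              ("keyspace1", "table2")]))
```

The sketch only decides which files are candidates for removal and formats the JVM option; actually deleting files or restarting a node is left out on purpose.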