Re: Optimal backup strategy

guo Maxwell Thu, 28 Nov 2019 17:59:09 -0800

Hossein is right. But in our case we restore to the same Cassandra topology,
so commitlog replay is usable. It is also usable when restoring to the
same machine.
Using sstableloader costs too much time and more storage (though the storage
use drops after the restore).


Hossein Ghiyasi Mehr <ghiyasim...@gmail.com> wrote on Thu, Nov 28, 2019 at 7:40 PM:

> Commitlog backup isn't usable on another machine.
> The backup solution depends on what you want to do: periodic backup, or
> backup to restore on another machine?
> Periodic backup is a combination of snapshot and incremental backup. Remove
> the incremental backups after taking a new snapshot.
> Backup to restore on another machine: you can use a snapshot taken after
> flushing the memtables, or use sstableloader.
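
A minimal sketch of the "remove incremental backups after a new snapshot" step, in Python. Cassandra places incremental backup hard links in a `backups/` directory under each table directory; the function name, the `snapshot_time` parameter, and the use of file modification time as the cutoff are assumptions made for illustration, not something the thread prescribes. It should only be run once the new snapshot has been copied off the node.

```python
import os
from pathlib import Path
from typing import List


def prune_incremental_backups(table_dir: Path, snapshot_time: float) -> List[Path]:
    """Delete files in <table_dir>/backups that are older than snapshot_time.

    snapshot_time is seconds since the epoch (hypothetical parameter: the
    moment the latest snapshot was taken). Returns the removed paths.
    """
    backups = table_dir / "backups"
    removed: List[Path] = []
    if not backups.is_dir():
        # Incremental backups not enabled, or nothing flushed yet.
        return removed
    for entry in sorted(backups.iterdir()):
        # Assumption: an sstable written before the snapshot is already
        # covered by that snapshot, so its backup link can go.
        if entry.is_file() and entry.stat().st_mtime < snapshot_time:
            entry.unlink()
            removed.append(entry)
    return removed
```

Files written at or after the snapshot time are kept, since they hold data the snapshot does not contain.
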
>
>
> ----
> VafaTech.com - A Total Solution for Data Gathering & Analysis
>
> On Thu, Nov 28, 2019 at 6:05 AM guo Maxwell <cclive1...@gmail.com> wrote:
>
>> In the Cassandra and DataStax documentation, commitlog backup is not
>> mentioned.
>> Only snapshot and incremental backup are described as ways to back up.
>>
>> Although commitlog archiving per keyspace/table is not supported,
>> commitlog replay (for which you must put the logs in commitlog_dir and
>> restart the process)
>> supports a keyspace/table replay filter (use
>> -Dcassandra.replayList with the keyspace1.table1,keyspace1.table2 format to
>> replay only the specified keyspaces/tables).
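
As an illustration of the replay filter format described above, here is a small Python sketch that builds the JVM option string. Only the `-Dcassandra.replayList` option and the `keyspace.table` comma-separated format come from the message; the helper name and the validation rule (letters, digits, underscore) are assumptions for the example.

```python
import re
from typing import Iterable, List

# Assumed naming rule for the sketch: unquoted identifiers only.
_NAME = re.compile(r"^[A-Za-z0-9_]+$")


def replay_list_option(tables: Iterable[str]) -> str:
    """Build the -Dcassandra.replayList JVM option from keyspace.table names."""
    entries: List[str] = []
    for item in tables:
        keyspace, dot, table = item.partition(".")
        if not dot or not _NAME.match(keyspace) or not _NAME.match(table):
            raise ValueError(f"expected keyspace.table, got {item!r}")
        entries.append(f"{keyspace}.{table}")
    if not entries:
        raise ValueError("at least one keyspace.table entry is required")
    return "-Dcassandra.replayList=" + ",".join(entries)


print(replay_list_option(["keyspace1.table1", "keyspace1.table2"]))
# -> -Dcassandra.replayList=keyspace1.table1,keyspace1.table2
```

The resulting string would be passed to the JVM when restarting the node with the archived logs placed in commitlog_dir.
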
>>
>> Snapshots do affect storage. We take a snapshot once a week,
>> during the low-traffic period, and snapshot creation is throttled. You
>> may want to
>> see the issue (https://issues.apache.org/jira/browse/CASSANDRA-13019).
>>
>>
>>
>> Adarsh Kumar <adarsh0...@gmail.com> wrote on Thu, Nov 28, 2019 at 1:00 AM:
>>
>>> Thanks Guo and Eric for replying,
>>>
>>> I have some confusion about commit log backup:
>>>
>>>    1. Is the commit log archival technique (
>>>    https://support.datastax.com/hc/en-us/articles/115001593706-Manual-Backup-and-Restore-with-Point-in-time-and-table-level-restore-
>>>    ) as good as an incremental backup, given that it also captures commit
>>>    logs after a memtable flush?
>>>    2. If we go for "Snapshot + Incremental backup + Commit log", we
>>>    have to take the commit logs from the commit log directory (is there any
>>>    SOP for this?). As commit logs are not per table or keyspace, we will
>>>    have a challenge in restoring selected tables.
>>>    3. Snapshot-based backups are easy to manage and operate due to their
>>>    simplicity, but they are heavy on storage. Any views on this?
>>>    4. Please share any successful strategy that someone is using in
>>>    production. We are still in the design phase and want to implement the
>>>    best solution.
>>>
>>> Thanks Eric for sharing the link to Medusa.
>>>
>>> Regards,
>>> Adarsh Kumar
>>>
>>> On Wed, Nov 27, 2019 at 5:16 PM guo Maxwell <cclive1...@gmail.com>
>>> wrote:
>>>
>>>> For me, I think the last one,
>>>>  Snapshot + Incremental + commitlog,
>>>> is the most meaningful way to do backup and restore when you back the
>>>> data up to somewhere else, like AWS S3.
>>>>
>>>>    - Snapshot based backup // data written after the snapshot is not
>>>>    backed up, so you may lose data when restoring to a time later than
>>>>    the snapshot time;
>>>>    - Incremental backups // better than snapshot backup, but
>>>>    with insufficient data accuracy, because data remaining in the
>>>>    memtable will be lost;
>>>>    - Snapshot + incremental
>>>>    - Snapshot + commitlog archival // better data precision than
>>>>    incremental backup, but the data in non-archived commitlogs (segments
>>>>    not yet archived or not yet closed) cannot be restored and will be
>>>>    lost. Also, when there are too many logs, log replay will take a very
>>>>    long time.
>>>>
>>>> As for us, we use snapshot + incremental + commitlog archive. We read the
>>>> snapshot data and the incremental data. The logs are also backed up, but
>>>> we do not back up
>>>> logs whose data has been flushed to sstables, because that data is
>>>> already backed up by the incremental backup.
>>>>
>>>> This way, the data exists in sstable format through the snapshot
>>>> backup and incremental backup. The number of logs is very small, and log
>>>> replay does not take much time.
>>>>
>>>>
>>>>
>>>> Eric LELEU <e...@strapdata.com> wrote on Wed, Nov 27, 2019 at 4:13 PM:
>>>>
>>>>> Hi,
>>>>> The Last Pickle & Spotify have released Medusa, a Cassandra backup tool.
>>>>>
>>>>> See :
>>>>> https://thelastpickle.com/blog/2019/11/05/cassandra-medusa-backup-tool-is-open-source.html
>>>>>
>>>>> Hope this link will help you.
>>>>>
>>>>> Eric
>>>>>
>>>>>
>>>>> On 27/11/2019 at 08:10, Adarsh Kumar wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I was looking into backup strategies for Cassandra. After some study
>>>>> I learned that there are the following options:
>>>>>
>>>>>    - Snapshot based backup
>>>>>    - Incremental backups
>>>>>    - Snapshot + incremental
>>>>>    - Snapshot + commitlog archival
>>>>>    - Snapshot + Incremental + commitlog
>>>>>
>>>>> Which is the most suitable and feasible approach? Also, which of these
>>>>> is used most?
>>>>> Please let me know if there is any other option or tool available.
>>>>>
>>>>> Thanks in advance.
>>>>>
>>>>> Regards,
>>>>> Adarsh Kumar
>>>>>
>>>>>
>>>>
>>>> --
>>>> you are the apple of my eye !
>>>>
>>>
>>
>> --
>> you are the apple of my eye !
>>
>

-- 
you are the apple of my eye !
