If you are considering Amazon S3 for storage, I wrote a tool that takes
snapshots and runs backups across multiple nodes.
Backups are stored compressed on S3:
https://github.com/tbarbugli/cassandra_snapshotter

Cheers,
Tommaso


2014-05-02 10:42 GMT+02:00 Artur Kronenberg <artur.kronenb...@openmarket.com>:

> Hi,
>
> We are running a 7-node cluster with an RF of 5. Each node holds about 70%
> of the data, and we are now wondering about the backup process.
>
> 1. Is there a best-practice procedure or a tool that we can use to have
> one backup that holds 100% of the data, or is it necessary for us to take
> multiple backups?
>
> 2. If we have to use multiple backups, is there a way to combine them? We
> would like to be able to start up a 1-node cluster that holds 100% of the
> data if necessary. Can we just drop all sstables into the data directory
> and let cassandra figure out the rest?
>
> 3. How do we handle the commitlog files from all of our nodes? Given we'd
> like to restore to a certain point in time and we have all the commitlogs,
> can we put commitlogs from multiple nodes in the commitlog folder and
> expect cassandra to pick them up and replay them correctly?
>
> 4. If all of the above would work, could we, in case of emergency, set up a
> massive 1-node cluster that holds 100% of the data and repair the rest of
> our cluster based off this? E.g. have the 1 node run with the correct data,
> and then hook it into our existing cluster and call repair on it to restore
> data on the rest of our nodes?
>
> Thanks for your help!
>
> Cheers,
>
> Artur
>
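
The figures in the question (7 nodes, RF of 5) can be checked with a short
sketch. This is only an illustration under stated assumptions: a single
datacenter, one token per node, and replicas placed on consecutive ring
positions in the style of SimpleStrategy. The helper names (replicas,
covers_all) are hypothetical and not part of cassandra or of any tool
mentioned in this thread.

```python
from itertools import combinations

NODES = 7   # cluster size from the question
RF = 5      # replication factor from the question


def replicas(range_index, nodes=NODES, rf=RF):
    """Nodes holding a token range, assuming one token per node and
    replicas on the next rf consecutive ring positions."""
    return {(range_index + i) % nodes for i in range(rf)}


def covers_all(backed_up, nodes=NODES, rf=RF):
    """True if every token range has at least one replica in backed_up."""
    chosen = set(backed_up)
    return all(replicas(r, nodes, rf) & chosen for r in range(nodes))


# Share of the data on each node: RF / NODES, the "about 70%" in the question.
fraction_per_node = RF / NODES

# A range can be missed only if all RF replicas sit outside the backed-up
# set, so any NODES - RF + 1 nodes together hold every range.
min_nodes = NODES - RF + 1

# Brute-force check of that claim for this ring layout.
all_triples_ok = all(covers_all(c) for c in combinations(range(NODES), 3))
some_pair_fails = any(not covers_all(c) for c in combinations(range(NODES), 2))
```

Under these assumptions, backups from any 3 of the 7 nodes contain every
token range at least once, while some pairs of nodes do not. The counting
argument for 3 nodes does not depend on the ring layout, but which pairs
fall short does.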
