Hardware failures are just one possible cause. If you value your data you will 
have a backup, preferably going to some sort of removable media that can be 
taken offsite, like those things that everybody keeps saying are dead... what 
are they called... oh yeah, tapes. :) An online copy of your data on some sort of 
large JBOD or second Ceph cluster is a good idea if you need faster access, but I 
wouldn’t rely on it as my only backup.

 

There are many things that can cause data loss; failing hardware is just one. 
As can be seen from many posts on this list, bugs in Ceph and user error are 
much more common causes of data loss, and triple replication won’t protect you 
from them. Thought should also be given to malicious actions by internal staff 
with grievances or external hackers (e.g. ransomware). In these cases even online 
backups made with rsync etc. might not protect you, as that data can be accessed and 
deleted at the same time as the live data. I predict these sorts of incidents 
will become more common in the near future.

 

 

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
????????????, ????????
Sent: 14 February 2017 09:56
To: Götz Reinicke <goetz.reini...@filmakademie.de>
Cc: ceph new <ceph-users@lists.ceph.com>
Subject: Re: [ceph-users] To backup or not to backup the classic way - How to 
backup hundreds of TB?

 

Hello!

 

  The answer pretty much depends on your fears. If you are afraid of hardware 
failures, you could keep more than the standard 3 copies, configure your failure 
domain properly and so on. If you are afraid of some big disaster that can hurt all 
of your hardware, you could consider making an async replica to a cluster in 
another datacenter on another continent. If you are afraid of some kind of cluster 
software issue, then you can build another cluster and use third-party 
tools to back up data there, but as you correctly noticed it will not be too 
convenient.
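
As a rough illustration of the "back up to another cluster" option for RBD images, here is a minimal Python sketch that only builds the command lines for one snapshot-based incremental cycle using rbd snap create and rbd export-diff. This is one possible technique, not something prescribed in this thread. The pool name, image name, snapshot names and destination path are hypothetical placeholders, and the function does not run anything.

```python
from typing import List, Optional


def rbd_backup_commands(pool: str, image: str, new_snap: str,
                        prev_snap: Optional[str],
                        dest_path: str) -> List[List[str]]:
    """Build (but do not run) the rbd commands for one backup cycle."""
    spec = f"{pool}/{image}@{new_snap}"
    # Step 1: take a point-in-time snapshot of the image.
    cmds = [["rbd", "snap", "create", spec]]
    # Step 2: export only the changes since the previous snapshot,
    # or everything up to the new snapshot on the very first run.
    export = ["rbd", "export-diff"]
    if prev_snap is not None:
        export += ["--from-snap", prev_snap]
    export += [spec, dest_path]
    cmds.append(export)
    return cmds


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for cmd in rbd_backup_commands("rbd", "films", "2017-02-15",
                                   "2017-02-14",
                                   "/backup/films-2017-02-15.diff"):
        print(" ".join(cmd))
```

The resulting diff file could then be applied to a copy of the image on the second cluster with rbd import-diff, or simply written to tape. Keeping the commands as lists makes it easy to pass them to subprocess.run once they have been reviewed.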

 

As a common solution I would suggest using the same cluster for backups as 
well (maybe just a different pool/OSD tree with less expensive drives) - in 
most cases it's enough.


Best regards,

Vladimir

 

2017-02-14 14:15 GMT+05:00 Götz Reinicke <goetz.reini...@filmakademie.de 
<mailto:goetz.reini...@filmakademie.de> >:

Hi,

I guess that's a question that pops up in different places, but I could not 
find an answer that fits my situation.

Currently we are starting to use Ceph for file shares of the films produced by our 
students and for some Xen/VMware VMs. The VM data is already backed up; the films' 
original footage is stored in other places.

We are starting with some 100 TB of RBD and mount SMB/NFS shares from the clients. Maybe 
we will look into CephFS soon.

The question is: how would someone handle a backup of 100 TB of data? Rsyncing 
that to another system or buying a commercial backup solution does not look that 
good, e.g. regarding the price.
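
To put the 100 TB figure in perspective, a back-of-the-envelope estimate of how long one full copy takes at a given sustained rate can be sketched as below. The throughput values are illustrative assumptions, not measurements from any cluster in this thread, and decimal units (1 TB = 1,000,000 MB) are assumed.

```python
def transfer_hours(size_tb: float, throughput_mb_s: float) -> float:
    """Hours needed to copy size_tb terabytes at a sustained rate in MB/s."""
    size_mb = size_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return size_mb / throughput_mb_s / 3600


if __name__ == "__main__":
    # Assumed sustained rates in MB/s, purely for illustration.
    for rate in (100, 500, 1000):
        print(f"{rate} MB/s: {transfer_hours(100, rate):.1f} h")
```

At an assumed 100 MB/s a full copy of 100 TB takes about 278 hours (roughly 11.6 days), and even at 1000 MB/s it takes about 28 hours, which is why incremental approaches matter at this scale.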

One thought is: is there some sort of best practice in the Ceph world, e.g. 
replicating to another physically independent cluster? Or using more replicas, 
OSDs and nodes, and doing snapshots in one cluster?

Having production data and backup on the same hardware currently does not make me feel 
that good either... But the world changes :)

Long story short: how do you back up hundreds of TB?

        Curious for suggestions and thoughts .. Thanks and Regards . Götz


_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com <mailto:ceph-users@lists.ceph.com> 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

 
