On 4 June 2017 at 23:23, Roger Brown <rogerpbr...@gmail.com> wrote:

> I'm a n00b myself, but I'll go on record with my understanding.
>
> On Sun, Jun 4, 2017 at 3:03 PM Benoit GEORGELIN - yulPa <benoit.george...@yulpa.io> wrote:
>
>> Hi ceph users,
>>
>> Ceph has very good documentation about technical usage, but a lot of
>> conceptual material is missing (from my point of view). It's not easy to
>> understand everything at once, but little by little it's working.
>>
>> Here are some questions about Ceph; I hope someone can take a little time
>> to point me to where I can find answers.
>>
>> - Backup:
>> Do you back up data from a Ceph cluster, or do you consider a replica to
>> be a backup of that file? Let's say I have a replica size of 3, and my
>> CRUSH map keeps 2 copies in my main rack and 1 copy in another rack in
>> another datacenter. Can I consider the third copy a backup? What would be
>> your position?
>
> Replicas are not backups. Just ask GitLab after their accidental deletion.
> Source: https://www.theregister.co.uk/2017/02/01/gitlab_data_loss/
>
>> - Writing process of Ceph object storage using radosgw:
>> Simple question, but I'm not sure about it. The more replicas, the slower
>> my cluster will be? Does Ceph have to get acknowledgements from all the
>> replicas before reporting a write as good? From what I read, Ceph writes
>> to and acknowledges on the primary OSD of the pool. If that's the case,
>> it would not matter how many replicas I want or how far away the other
>> OSDs are; it would work the same. Can I choose the primary OSD myself in
>> my zone 1, have a copy in zone 2 (same rack), and a third in zone 3 in
>> another datacenter that might have some latency?
>
> More replicas make for a slower cluster, because the cluster waits for all
> replicas to acknowledge a write before reporting back. Source: ?
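On Benoît's "2 copies in my main rack, 1 copy in another datacenter" layout:
that placement is exactly what a CRUSH rule expresses, and since the first
OSD CRUSH selects for a placement group acts as the primary, taking the local
rack first also keeps the primaries local. A rough sketch of what the
decompiled rule could look like; the bucket names rack1/rack2 and the rule
name are placeholders, not something from your cluster:

    # Export, edit, compile, and re-inject the CRUSH map:
    #   ceph osd getcrushmap -o crush.bin
    #   crushtool -d crush.bin -o crush.txt    (edit crush.txt)
    #   crushtool -c crush.txt -o crush.new
    #   ceph osd setcrushmap -i crush.new

    rule rack1_primary_2plus1 {
        ruleset 1
        type replicated
        min_size 2
        max_size 3
        step take rack1                      # local rack; primary lands here
        step chooseleaf firstn 2 type host   # 2 copies on distinct local hosts
        step emit
        step take rack2                      # remote rack / second DC
        step chooseleaf firstn -2 type host  # the rest: pool size minus 2
        step emit
    }

You would then point the pool at the rule, e.g. "ceph osd pool set mypool
crush_ruleset 1" on releases of that era (mypool being a placeholder name).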
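On the write path itself: the client sends the write to the primary OSD, but
the primary acknowledges the client only once every replica in the acting set
has the data, so Roger has it right; both the replica count and the latency
to the farthest replica end up in your write latency. You can't pin a
particular OSD as the primary for an object, but besides the rule ordering
above you can bias primary selection with primary affinity. A minimal sketch
(mypool and osd.7 are example names; older releases also want
"mon osd allow primary affinity = true" set on the monitors):

    # Replica count is a per-pool setting:
    ceph osd pool get mypool size
    ceph osd pool set mypool size 3    # client ack waits on all 3 copies

    # Keep a slow remote-DC OSD out of the primary role (range 0.0 - 1.0):
    ceph osd primary-affinity osd.7 0.0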
I'd say stick with 3 replicas in one DC; then, if you want to add another DC
for better data protection (note: not backup), you'll just add asynchronous
mirroring between DCs (http://docs.ceph.com/docs/master/rbd/rbd-mirroring/)
with another cluster there (see the command sketch at the end of this mail).
That way you'll have a quick cluster (especially if you use awesome disks
like NVMe SSD journals + SSD storage or better) with location redundancy.

>> - Data persistence / availability:
>> If the CRUSH map is by host and I have 3 hosts with a replication of 3,
>> this means I will have 1 copy on each host. Does that mean I can lose 2
>> hosts and still have my cluster working, at least in read mode, and
>> eventually in write mode too if I set "osd pool default min size = 1"?
>
> Yes, I think. But best practice is to have at least 5 hosts (N+2) so you
> can lose 2 hosts and still keep 3 replicas.
> Keep in mind that you ”should” have enough storage free as well to be able
> to lose 2 nodes. If you fill 5 nodes to 80% and lose 2 nodes, you won't be
> able to repair it all until you get them up and running again.
>
>> Thanks for your help.
>> -
>> Benoît G
>
> Roger
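P.S. The one-way rbd-mirror setup I mentioned boils down to roughly the
following. This is an untested sketch: "primary" and "backup" are example
cluster names (both ceph.conf files assumed present on the node running the
commands), "rbd" is the default pool, and "myimage" is a placeholder image:

    # Mirroring replays the RBD journal, so images need the journaling feature:
    rbd feature enable rbd/myimage journaling

    # Enable pool-level mirroring on both clusters:
    rbd --cluster primary mirror pool enable rbd pool
    rbd --cluster backup mirror pool enable rbd pool

    # On the backup site, register the primary cluster as a peer:
    rbd --cluster backup mirror pool peer add rbd client.admin@primary

    # Finally, run the rbd-mirror daemon on the backup site; it replays the
    # journal asynchronously, so WAN latency never blocks primary writes.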
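And to put numbers on the last two answers: "osd pool default min size" only
applies to pools created afterwards; an existing pool is changed per pool, as
sketched below (mypool is a placeholder). Note there is no separate read-only
mode: a placement group that drops below min_size blocks both reads and
writes. The free-space caveat is simple arithmetic: 5 hosts filled to 80%
hold 4.0 hosts' worth of raw data, so losing 2 hosts leaves 1.6 hosts' worth
to re-replicate onto survivors that only have 3 x 20% = 0.6 free, and
recovery stalls until the dead nodes return.

    # Let I/O continue with a single surviving replica (no redundancy while
    # recovery runs, so treat it as an emergency setting):
    ceph osd pool set mypool min_size 1

    # Cluster-wide defaults for new pools, in ceph.conf [global]:
    #   osd pool default size = 3
    #   osd pool default min size = 1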
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com