Re: [ceph-users] CEPH backup strategy and best practices

2017-06-04 Thread Benoit GEORGELIN - yulPa
----- Original Message -----
> From: "David" 
> To: "ceph-users" 
> Sent: Sunday, June 4, 2017 17:59:55
> Subject: Re: [ceph-users] CEPH backup strategy and best practices

> On June 4, 2017 at 23:23, Roger Brown <rogerpbr...@gmail.com> wrote:
> 
> I'm a n00b myself, but I'll go on record with my understanding.
> 
> On Sun, Jun 4, 2017 at 3:03 PM Benoit GEORGELIN - yulPa
> <benoit.george...@yulpa.io> wrote:
> 
> 
> Hi ceph users,
> 
> Ceph has very good documentation about technical usage, but a lot of
> conceptual material is missing (from my point of view).
> It's not easy to understand everything at once, but little by little it's
> coming together.
> 
> Here are some questions about Ceph; I hope someone can take a little time to
> point me to where I can find answers:
> 
> - Backup:
> Do you back up data from a Ceph cluster, or do you consider a replica to be
> a backup of that data?
> Let's say I have a replica size of 3. My CRUSH map keeps 2 copies in my main
> rack and 1 copy in another rack in another datacenter.
> Can I consider the third copy a backup? What would be your position?
> 
> 
> 
> 
> Replicas are not backups. Just ask GitLab after accidental deletion. source:
> https://www.theregister.co.uk/2017/02/01/gitlab_data_loss/
> 
> 
> 
> 
> - Writing process of Ceph object storage using radosgw
> Simple question, but I'm not sure about it.
> Will more replicas make my cluster slower? Does Ceph have to wait for
> acknowledgement from all the replicas before saying the write is good?
> From what I read, Ceph writes to and acknowledges from the primary OSD of
> the pool. If that is the case, it would not matter how many replicas I want
> or how far away the other OSDs are; it would work the same.
> Can I choose the primary OSD myself in zone 1, have a copy in zone 2 (same
> rack) and a third copy in zone 3 in another datacenter that might have some
> latency?
> 
> 
> 
> More replicas make for a slower cluster because Ceph waits for all replicas
> to acknowledge the write before reporting back. source: ?
> 
> 
> 
> I’d say stick with 3 replicas in one DC, then if you want to add another DC
> for better data protection (note, not backup), you’ll just add asynchronous
> mirroring between DCs (http://docs.ceph.com/docs/master/rbd/rbd-mirroring/)
> with another cluster there.
> That way you’ll have a quick cluster (especially if you use awesome disks
> like NVMe SSD journals + SSD storage or better) with location redundancy.
> 
> 
> 
> 
> 
> - Data persistence / availability
> If my CRUSH map is by host and I have 3 hosts with a replication of 3, this
> means I will have 1 copy on each host.
> Does that mean I can lose 2 hosts and still have my cluster working, at
> least in read mode? And eventually in write mode too, if I set
> osd pool default min size = 1?
> 
> 
> 
> Yes, I think. But best practice is to have at least 5 hosts (N+2) so you can
> lose 2 hosts and still keep 3 replicas.
> 
> 
> 
> Keep in mind that you ”should” have enough storage free as well to be able
> to lose 2 nodes. If you fill 5 nodes to 80% and lose 2 nodes you won’t be
> able to repair it all until you get them up and running again.
> 

This applies only if the CRUSH map is by OSD and not by host, right?
Because if I have 5 nodes with replica 3 and all nodes are 80% used, and I
lose 2 nodes, I still have 3 nodes up at 80%; the cluster would just need to
be told not to recover the missing copies. That's the only thing.


 
> 
> 
> 
>
> 
> 
> Thanks for your help.
> -
> 
> Benoît G
> 
> 
> Roger
> 
> 
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CEPH backup strategy and best practices

2017-06-04 Thread David


> On June 4, 2017 at 23:23, Roger Brown wrote:
> 
> I'm a n00b myself, but I'll go on record with my understanding.
> 
> On Sun, Jun 4, 2017 at 3:03 PM Benoit GEORGELIN - yulPa
> <benoit.george...@yulpa.io> wrote:
> Hi ceph users, 
> 
> Ceph has very good documentation about technical usage, but a lot of
> conceptual material is missing (from my point of view).
> It's not easy to understand everything at once, but little by little it's
> coming together.
> 
> Here are some questions about Ceph; I hope someone can take a little time to
> point me to where I can find answers:
> 
> - Backup:
> Do you back up data from a Ceph cluster, or do you consider a replica to be
> a backup of that data?
> Let's say I have a replica size of 3. My CRUSH map keeps 2 copies in my main
> rack and 1 copy in another rack in another datacenter.
> Can I consider the third copy a backup? What would be your position?
> 
> Replicas are not backups. Just ask GitLab after accidental deletion. source: 
> https://www.theregister.co.uk/2017/02/01/gitlab_data_loss/ 
> 
> 
> 
> - Writing process of Ceph object storage using radosgw
> Simple question, but I'm not sure about it.
> Will more replicas make my cluster slower? Does Ceph have to wait for
> acknowledgement from all the replicas before saying the write is good?
> From what I read, Ceph writes to and acknowledges from the primary OSD of
> the pool. If that is the case, it would not matter how many replicas I want
> or how far away the other OSDs are; it would work the same.
> Can I choose the primary OSD myself in zone 1, have a copy in zone 2 (same
> rack) and a third copy in zone 3 in another datacenter that might have some
> latency?
> 
> More replicas make for a slower cluster because Ceph waits for all replicas
> to acknowledge the write before reporting back. source: ?

I’d say stick with 3 replicas in one DC, then if you want to add another DC
for better data protection (note, not backup), you’ll just add asynchronous
mirroring between DCs (http://docs.ceph.com/docs/master/rbd/rbd-mirroring/)
with another cluster there.
That way you’ll have a quick cluster (especially if you use awesome disks like
NVMe SSD journals + SSD storage or better) with location redundancy.
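Roughly, the setup looks like this (a sketch only; the pool name "rbd", image
name "vm-disk1" and peer name "client.mirror@remote" are placeholders, see the
rbd-mirroring doc above for the actual procedure):

    # on both clusters, enable mirroring for the pool:
    rbd mirror pool enable rbd pool
    # mirrored images need the journaling feature:
    rbd feature enable rbd/vm-disk1 exclusive-lock journaling
    # register the other cluster as a peer:
    rbd mirror pool peer add rbd client.mirror@remote
    # then run an rbd-mirror daemon at the remote site to replay the journals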

> 
> 
> - Data persistence / availability
> If my CRUSH map is by host and I have 3 hosts with a replication of 3, this
> means I will have 1 copy on each host.
> Does that mean I can lose 2 hosts and still have my cluster working, at
> least in read mode? And eventually in write mode too, if I set
> osd pool default min size = 1?
> 
> Yes, I think. But best practice is to have at least 5 hosts (N+2) so you can 
> lose 2 hosts and still keep 3 replicas.
>  

Keep in mind that you ”should” have enough storage free as well to be able to
lose 2 nodes. If you fill 5 nodes to 80% and lose 2 nodes you won’t be able
to repair it all until you get them up and running again.
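Rough numbers to illustrate (assuming 5 equal nodes of 10 TB raw each, which
is made up): 5 x 10 TB = 50 TB raw, filled to 80% = 40 TB used. Lose 2 nodes
and only 30 TB of raw capacity remains, which cannot hold the 40 TB of
replicas Ceph will try to restore, so the cluster stays degraded (and risks
hitting the full ratios) until the nodes come back.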

> 
> Thanks for your help. 
> - 
> 
> Benoît G
> 
> 
> Roger
>  

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CEPH backup strategy and best practices

2017-06-04 Thread Roger Brown
I'm a n00b myself, but I'll go on record with my understanding.

On Sun, Jun 4, 2017 at 3:03 PM Benoit GEORGELIN - yulPa <
benoit.george...@yulpa.io> wrote:

> Hi ceph users,
>
> Ceph has very good documentation about technical usage, but a lot of
> conceptual material is missing (from my point of view).
> It's not easy to understand everything at once, but little by little it's
> coming together.
>
> Here are some questions about Ceph; I hope someone can take a little time to
> point me to where I can find answers:
>
> - Backup:
> Do you back up data from a Ceph cluster, or do you consider a replica to be
> a backup of that data?
> Let's say I have a replica size of 3. My CRUSH map keeps 2 copies in my main
> rack and 1 copy in another rack in another datacenter.
> Can I consider the third copy a backup? What would be your position?
>

Replicas are not backups. Just ask GitLab after accidental deletion.
source: https://www.theregister.co.uk/2017/02/01/gitlab_data_loss/
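If you do want actual backups, one common approach (just a sketch; the pool
"rbd", image "vm-disk1" and paths are made-up names) is to snapshot RBD images
and export the diffs somewhere outside the cluster:

    rbd snap create rbd/vm-disk1@backup-2017-06-04
    rbd export-diff rbd/vm-disk1@backup-2017-06-04 /backups/full.diff
    # later, an incremental relative to the previous snapshot:
    rbd snap create rbd/vm-disk1@backup-2017-06-11
    rbd export-diff --from-snap backup-2017-06-04 rbd/vm-disk1@backup-2017-06-11 /backups/incr.diff
    # restore with rbd import-diff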


> - Writing process of Ceph object storage using radosgw
> Simple question, but I'm not sure about it.
> Will more replicas make my cluster slower? Does Ceph have to wait for
> acknowledgement from all the replicas before saying the write is good?
> From what I read, Ceph writes to and acknowledges from the primary OSD of
> the pool. If that is the case, it would not matter how many replicas I want
> or how far away the other OSDs are; it would work the same.
> Can I choose the primary OSD myself in zone 1, have a copy in zone 2 (same
> rack) and a third copy in zone 3 in another datacenter that might have some
> latency?
>

More replicas make for a slower cluster because Ceph waits for all replicas to
acknowledge the write before reporting back. source: ?
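To illustrate (a sketch; "mypool" and osd.7 are made-up names): the replica
count is a per-pool setting, min_size only controls when I/O is still allowed
in a degraded state (not when the write is acknowledged), and as far as I
know you can't pin a specific primary per object, though primary affinity
lets you make an OSD less likely to be chosen as primary:

    ceph osd pool set mypool size 3       # number of replicas
    ceph osd pool set mypool min_size 2   # allow I/O while >= 2 replicas are up
    ceph osd primary-affinity osd.7 0     # make osd.7 (e.g. a remote OSD) rarely primary
    # (older releases may require "mon osd allow primary affinity = true")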


> - Data persistence / availability
> If my CRUSH map is by host and I have 3 hosts with a replication of 3, this
> means I will have 1 copy on each host.
> Does that mean I can lose 2 hosts and still have my cluster working, at
> least in read mode? And eventually in write mode too, if I set
> osd pool default min size = 1?
>

Yes, I think. But best practice is to have at least 5 hosts (N+2) so you
can lose 2 hosts and still keep 3 replicas.
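For reference, the "1 copy per host" behaviour comes from a CRUSH rule that
chooses leaves by host, something like the stock replicated rule sketched
below (the rule name and "mypool" are illustrative), and min_size can be set
per pool, though running writes on a single remaining replica is risky:

    rule replicated_by_host {
            ruleset 1
            type replicated
            min_size 1
            max_size 10
            step take default
            step chooseleaf firstn 0 type host
            step emit
    }

    ceph osd pool set mypool min_size 1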


>
> Thanks for your help.
> -
>
> Benoît G


Roger
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com