Re: [ceph-users] CEPH backup strategy and best practices

2017-06-04 Thread Benoit GEORGELIN - yulPa
----- Original Message -----
> From: "David" <da...@visions.se>
> To: "ceph-users" <ceph-users@lists.ceph.com>
> Sent: Sunday, June 4, 2017 17:59:55
> Subject: Re: [ceph-users] CEPH backup strategy and best practices

> On 4 June 2017 at 23:23, Roger Brown <rogerpbr...@gmail.com> wrote:
> 
> I'm a n00b myself, but I'll go on record with my understanding.
> 
> On Sun, Jun 4, 2017 at 3:03 PM Benoit GEORGELIN - yulPa <benoit.george...@yulpa.io> wrote:
> 
> Hi ceph users,
> 
> Ceph has very good documentation about technical usage, but from my point of
> view a lot of the conceptual material is missing.
> It's not easy to understand everything at once, but little by little it's
> coming together.
> 
> Here are some questions about Ceph; I hope someone can take a little time to
> point me to where I can find answers:
> 
> - Backup:
> Do you back up data from a Ceph cluster, or do you consider a replica to be a
> backup of that file?
> Let's say I have a replica size of 3, and my CRUSH map keeps 2 copies in
> my main rack and 1 copy in another rack in another datacenter.
> Can I consider the third copy a backup? What would your position be?
> 
> Replicas are not backups. Just ask GitLab after accidental deletion. source:
> https://www.theregister.co.uk/2017/02/01/gitlab_data_loss/
> 
> 
> - Writing process of Ceph object storage using radosgw
> Simple question, but I'm not sure about it.
> Will more replicas make my cluster slower? Does Ceph have to wait for
> acknowledgement from every replica before saying the write is good?
> From what I read, Ceph writes to the primary OSD of the pool and acknowledges
> from there, so if that is the case it would not matter how many replicas I
> want or how far away the other OSDs are; it would work the same.
> Can I choose the primary OSD myself in zone 1, have a copy in zone 2 (same
> rack) and a third copy in zone 3 in another datacenter that might have some latency?
> 
> More replicas make for a slower cluster because it waits for all devices to
> acknowledge the write before reporting back. source: ?
> 
> I’d say stick with 3 replicas in one DC; then if you want to add another DC for
> better data protection (note: not backup), you can add asynchronous
> mirroring between DCs (http://docs.ceph.com/docs/master/rbd/rbd-mirroring/)
> with another cluster there.
> That way you’ll have a quick cluster (especially if you use fast disks like
> NVMe SSD journals + SSD storage or better) with location redundancy.
> 
> - Data persistence / availability
> If my CRUSH map is by host and I have 3 hosts with a replication of 3,
> that means I will have 1 copy on each host.
> Does that mean I can lose 2 hosts and still have my cluster working, at least
> in read mode, and possibly in write mode too if I set osd pool default min size = 1?
> 
> Yes, I think. But best practice is to have at least 5 hosts (N+2) so you can
> lose 2 hosts and still keep 3 replicas.
> 
> Keep in mind that you ”should” have enough free storage as well to be able to
> lose 2 nodes. If you fill 5 nodes to 80% and lose 2 nodes, you won’t be able
> to repair it all until you get them up and running again.
> 

This applies only if the CRUSH failure domain is OSD and not host, right?
Because if I have 5 nodes, replica 3, and all nodes are 80% used, then if I lose
2 nodes I still have 3 nodes up at 80%; the cluster would just have to be told
not to recover the missing copies.. that's the only thing
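
For reference, a minimal sketch of the two pieces involved (the rule fragment
and flag are the standard ones, the rest is illustrative): the failure domain
is whatever the chooseleaf step in the CRUSH rule picks, and recovery can be
held off manually with the noout flag:

    # CRUSH rule fragment, failure domain = host
    step take default
    step chooseleaf firstn 0 type host   # "type osd" would only spread across OSDs
    step emit

    # keep the cluster from marking OSDs out (and so from starting recovery)
    # while the two nodes are down:
    ceph osd set noout
    # once they are back:
    ceph osd unset noout

With noout set the cluster stays degraded but keeps serving I/O as long as each
PG still has at least min_size copies; it will not try to re-replicate onto the
remaining, already 80%-full nodes.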


 
> Thanks for your help.
> -
> 
> Benoît G
> 
> 
> Roger
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CEPH backup strategy and best practices

2017-06-04 Thread David


> On 4 June 2017 at 23:23, Roger Brown wrote:
> 
> I'm a n00b myself, but I'll go on record with my understanding.
> 
> On Sun, Jun 4, 2017 at 3:03 PM Benoit GEORGELIN - yulPa wrote:
> Hi ceph users, 
> 
> Ceph has very good documentation about technical usage, but from my point of
> view a lot of the conceptual material is missing.
> It's not easy to understand everything at once, but little by little it's
> coming together.
> 
> Here are some questions about Ceph; I hope someone can take a little time to
> point me to where I can find answers:
> 
> - Backup:
> Do you back up data from a Ceph cluster, or do you consider a replica to be a
> backup of that file?
> Let's say I have a replica size of 3, and my CRUSH map keeps 2 copies
> in my main rack and 1 copy in another rack in another datacenter.
> Can I consider the third copy a backup? What would your position be?
> 
> Replicas are not backups. Just ask GitLab after accidental deletion. source: 
> https://www.theregister.co.uk/2017/02/01/gitlab_data_loss/ 
> 
> 
> 
> - Writing process of Ceph object storage using radosgw
> Simple question, but I'm not sure about it.
> Will more replicas make my cluster slower? Does Ceph have to wait for
> acknowledgement from every replica before saying the write is good?
> From what I read, Ceph writes to the primary OSD of the pool and acknowledges
> from there, so if that is the case it would not matter how many replicas I
> want or how far away the other OSDs are; it would work the same.
> Can I choose the primary OSD myself in zone 1, have a copy in zone 2
> (same rack) and a third copy in zone 3 in another datacenter that might have
> some latency?
> 
> More replicas make for a slower cluster because it waits for all devices to
> acknowledge the write before reporting back. source: ?

I’d say stick with 3 replicas in one DC; then if you want to add another DC for
better data protection (note: not backup), you can add asynchronous
mirroring between DCs (http://docs.ceph.com/docs/master/rbd/rbd-mirroring/)
with another cluster there.
That way you’ll have a quick cluster (especially if you use fast disks like
NVMe SSD journals + SSD storage or better) with location redundancy.
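
Roughly, and assuming the RBD use case with journal-based mirroring (the pool
and client names below are just placeholders), the setup looks something like
this, with an rbd-mirror daemon running on the backup site:

    # enable mirroring for a whole pool (run against both clusters)
    rbd mirror pool enable mypool pool

    # on the backup cluster, register the primary cluster as a peer
    rbd mirror pool peer add mypool client.mirror@primary

    # check replication status/lag from the backup side
    rbd mirror pool status mypool --verbose

Images need the journaling feature enabled to be picked up, and since rbd-mirror
replays the journal asynchronously, inter-DC latency only affects how far the
backup site lags behind, not client write latency on the primary cluster.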

> 
> 
> - Data persistence / availability
> If my CRUSH map is by host and I have 3 hosts with a replication of 3,
> that means I will have 1 copy on each host.
> Does that mean I can lose 2 hosts and still have my cluster working, at least
> in read mode, and possibly in write mode too if I set osd pool default min
> size = 1?
> 
> Yes, I think. But best practice is to have at least 5 hosts (N+2) so you can 
> lose 2 hosts and still keep 3 replicas.
>  

Keep in mind that you ”should” have enough free storage as well to be able to
lose 2 nodes. If you fill 5 nodes to 80% and lose 2 nodes, you won’t be able
to repair it all until you get them up and running again.
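
A rough back-of-the-envelope version of that, assuming 5 equal nodes of raw
capacity C, replica 3 and failure domain = host:

    raw capacity                  : 5 x C
    used at 80%                   : 4 x C
    raw left after losing 2 nodes : 3 x C, of which ~2.4 x C is already used
    free on the 3 survivors       : ~0.6 x C
    replicas to re-create         : ~1.6 x C (what lived on the 2 lost nodes)

So the survivors cannot hold the missing copies and the cluster stays degraded
(or hits its full ratios while trying) until the failed nodes come back or you
add capacity. A common rule of thumb is to keep enough free space to absorb the
loss of at least one full node.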

> 
> Thanks for your help. 
> - 
> 
> Benoît G
> 
> 
> Roger
>  

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CEPH backup strategy and best practices

2017-06-04 Thread Roger Brown
I'm a n00b myself, but I'll go on record with my understanding.

On Sun, Jun 4, 2017 at 3:03 PM Benoit GEORGELIN - yulPa <
benoit.george...@yulpa.io> wrote:

> Hi ceph users,
>
> Ceph has very good documentation about technical usage, but from my point of
> view a lot of the conceptual material is missing.
> It's not easy to understand everything at once, but little by little it's
> coming together.
>
> Here are some questions about Ceph; I hope someone can take a little time
> to point me to where I can find answers:
>
> - Backup:
> Do you back up data from a Ceph cluster, or do you consider a replica to be
> a backup of that file?
> Let's say I have a replica size of 3, and my CRUSH map keeps 2
> copies in my main rack and 1 copy in another rack in another datacenter.
> Can I consider the third copy a backup? What would your position be?
>

Replicas are not backups. Just ask GitLab after accidental deletion.
source: https://www.theregister.co.uk/2017/02/01/gitlab_data_loss/
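
For an actual backup you want a copy that lives outside the cluster's failure
(and fat-finger) domain. A hedged sketch for RBD images only (pool, image and
paths are placeholders; RGW or CephFS data would need their own tooling):

    # full export of a snapshot to a file on separate storage
    rbd snap create mypool/myimage@backup-20170604
    rbd export mypool/myimage@backup-20170604 /backup/myimage-20170604.img

    # later, an incremental diff between two snapshots
    rbd snap create mypool/myimage@backup-20170611
    rbd export-diff --from-snap backup-20170604 \
        mypool/myimage@backup-20170611 /backup/myimage-0604-to-0611.diff

Whatever the tooling, the point is that a bad client write, a mass deletion or
a cluster-wide failure cannot take the backup out together with the primary
copies.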


> - Writing process of Ceph object storage using radosgw
> Simple question, but I'm not sure about it.
> Will more replicas make my cluster slower? Does Ceph have to wait for
> acknowledgement from every replica before saying the write is good?
> From what I read, Ceph writes to the primary OSD of the pool and acknowledges
> from there, so if that is the case it would not matter how many replicas I
> want or how far away the other OSDs are; it would work the same.
> Can I choose the primary OSD myself in zone 1, have a copy in zone 2
> (same rack) and a third copy in zone 3 in another datacenter that might have
> some latency?
>

More replicas make for a slower cluster because it waits for all devices to
acknowledge the write before reporting back. source: ?
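
On the "choose the primary myself" part: you can't pin a primary per object,
but you can bias which OSDs become primaries. A hedged sketch (the OSD ids are
made up, and depending on release you may also need
mon osd allow primary affinity = true in ceph.conf):

    # make osd.7 (say, in the remote high-latency zone) unlikely to be primary
    ceph osd primary-affinity osd.7 0

    # leave local OSDs at the default weight
    ceph osd primary-affinity osd.1 1.0

Reads are served by the primary, so this mainly helps read latency; writes are
only acknowledged once every replica in the acting set has the data, which is
exactly why a distant replica slows writes down.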


> - Data persistence / availability
> If my CRUSH map is by host and I have 3 hosts with a replication of 3,
> that means I will have 1 copy on each host.
> Does that mean I can lose 2 hosts and still have my cluster working, at
> least in read mode, and possibly in write mode too if I set osd pool
> default min size = 1?
>

Yes, I think. But best practice is to have at least 5 hosts (N+2) so you
can lose 2 hosts and still keep 3 replicas.
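
For reference, a minimal sketch of the knobs involved (the pool name is a
placeholder; the ceph.conf options only set defaults for newly created pools):

    # defaults for new pools, in ceph.conf
    [global]
    osd pool default size = 3
    osd pool default min size = 2

    # or per existing pool
    ceph osd pool set mypool size 3
    ceph osd pool set mypool min_size 1

With size=3 and min_size=1, PGs keep accepting I/O with a single surviving
copy, at the cost of a real risk of losing data if that last copy dies before
recovery finishes, which is why min_size=2 is the usual recommendation.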


>
> Thanks for your help.
> -
>
> Benoît G


Roger
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] CEPH backup strategy and best practices

2017-06-04 Thread Benoit GEORGELIN - yulPa
Hi ceph users, 

Ceph has very good documentation about technical usage, but from my point of
view a lot of the conceptual material is missing.
It's not easy to understand everything at once, but little by little it's
coming together.

Here are some questions about Ceph; I hope someone can take a little time to
point me to where I can find answers:

- Backup:
Do you back up data from a Ceph cluster, or do you consider a replica to be a
backup of that file?
Let's say I have a replica size of 3, and my CRUSH map keeps 2 copies in
my main rack and 1 copy in another rack in another datacenter.
Can I consider the third copy a backup? What would your position be?

- Writing process of Ceph object storage using radosgw
Simple question, but I'm not sure about it.
Will more replicas make my cluster slower? Does Ceph have to wait for
acknowledgement from every replica before saying the write is good?
From what I read, Ceph writes to the primary OSD of the pool and acknowledges
from there, so if that is the case it would not matter how many replicas I want
and how far away the other OSDs are; it would work the same.
Can I choose the primary OSD myself in zone 1, have a copy in zone 2 (same
rack) and a third copy in zone 3 in another datacenter that might have some latency?

- Data persistence / availability
If my CRUSH map is by host and I have 3 hosts with a replication of 3,
that means I will have 1 copy on each host.
Does that mean I can lose 2 hosts and still have my cluster working, at least in
read mode, and possibly in write mode too if I set osd pool default min size = 1?

Thanks for your help. 
- 

Benoît G 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com