Re: [ceph-users] Major ceph disaster

Lionel Bouton Mon, 13 May 2019 13:52:11 -0700

On 13/05/2019 at 16:20, Kevin Flöh wrote:
> Dear ceph experts,
>
> [...] We have 4 nodes with 24 osds each and use 3+1 erasure coding. [...]
> Here is what happened: One osd daemon could not be started and
> therefore we decided to mark the osd as lost and set it up from
> scratch. Ceph started recovering and then we lost another osd with the
> same behavior. We did the same as for the first osd.


With 3+1 you can tolerate only a single OSD failure per pg at any given
time. You have 4096 pgs and 96 osds, so having 2 OSDs fail at the same
time on 2 separate servers (assuming standard crush rules) is a death
sentence for the data on the pgs that use both of those OSDs (the ones
not fully recovered before the second failure).
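
To give a rough idea of how many pgs are exposed, here is a
back-of-the-envelope sketch. It assumes things not stated in the
original message: that crush puts exactly one shard of each pg on each
of the 4 servers, and that placement inside a server is uniform over
its 24 OSDs. The function name is made up for illustration, and the
result is only an upper bound on the expected damage, since pgs that
finished recovering before the second failure are safe.

```python
from fractions import Fraction

def expected_shared_pgs(num_pgs, osds_per_host):
    """Expected number of pgs that use two given OSDs on two different hosts.

    Assumption: each pg has exactly one shard per host and picks the OSD
    on that host uniformly at random, independently for each host.
    """
    # Chance that a pg uses one particular OSD on a host is 1 / osds_per_host;
    # for two particular OSDs on two different hosts the chances multiply.
    p_both = Fraction(1, osds_per_host) ** 2
    return num_pgs * p_both

shared = expected_shared_pgs(4096, 24)
print(f"pgs expected to use both failed OSDs: {float(shared):.1f}")  # about 7.1
```

So under these assumptions a handful of pgs out of 4096 would be
expected to depend on both OSDs, which matches "some pgs" rather than
the whole pool.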

Depending on the data stored (CephFS?) you can probably recover most of
it, but some of it is irretrievably lost.

If you can recover the data from the failed OSDs as it was at the time
they failed, you might be able to recover some of your lost data (with
the help of Ceph devs); if not, there's nothing to do.

In the latter case I'd add a new server so that a fresh pool can use at
least 3+2 instead of 3+1, and begin moving the data to it.
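
The trade-off between the two profiles can be sketched as below. This
is only arithmetic on k and m, not a Ceph API; the helper name is
hypothetical, and the host count assumes a crush rule with one shard
per server (failure domain = host), which is why a fifth server is
needed for 3+2.

```python
def ec_profile(k, m):
    """Summarise an erasure-coding profile with k data and m coding shards."""
    return {
        # A pg stays readable as long as at most m of its shards are missing.
        "tolerated_failures": m,
        # Raw space consumed per unit of usable data.
        "raw_per_usable": (k + m) / k,
        # Assumption: one shard per host, so k + m hosts are needed.
        "min_hosts": k + m,
    }

for k, m in [(3, 1), (3, 2)]:
    p = ec_profile(k, m)
    print(f"{k}+{m}: survives {p['tolerated_failures']} OSD failure(s) per pg, "
          f"{p['raw_per_usable']:.2f}x raw space, {p['min_hosts']} hosts")
```

With these assumptions 3+2 costs about 1.67x raw space instead of
1.33x, in exchange for surviving two simultaneous OSD failures per pg.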

The 12.2 + 13.2 mix is a potential problem in addition to the one above,
but it's a separate issue.

Best regards,

Lionel

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
