Could someone help me understand why it's a bad idea to set min_size of
erasure-coded pools to k?

From what I've read, the argument for k+1 is that if min_size is k and you
lose an OSD during recovery after a failure of m OSDs, data will become
unavailable. But how does setting min_size to k+1 help? With m=2, if you
experience a double failure followed by another failure during recovery, you
have still lost 3 OSDs, and therefore your data, because the pool wasn't set
up to handle 3 concurrent failures. In that case the value of min_size seems
irrelevant.
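
To make the scenario concrete, here is a minimal sketch of the arithmetic as I
understand it. It is a simplified model, not Ceph code: it assumes a PG's data
can be reconstructed while at least k shards survive, and that the PG serves
I/O only while at least min_size shards survive. The function name pg_state
and the k=4, m=2 profile are hypothetical and chosen only for illustration.

```python
def pg_state(k, m, failed, min_size):
    """Simplified model of one erasure-coded PG (an assumption, not Ceph code)."""
    surviving = k + m - failed
    recoverable = surviving >= k        # enough shards left to reconstruct the data
    serving_io = surviving >= min_size  # PG stays active and accepts I/O
    return recoverable, serving_io


k, m = 4, 2  # hypothetical erasure-code profile
for min_size in (k, k + 1):
    for failed in range(0, m + 2):
        recoverable, serving_io = pg_state(k, m, failed, min_size)
        print(
            f"min_size={min_size} failed={failed} "
            f"recoverable={recoverable} serving_io={serving_io}"
        )
```

In this model the two settings differ only at failed = m: with min_size = k
the PG keeps serving I/O with no redundancy left, while with min_size = k+1 it
stops. At failed = m+1 both settings give the same result, which is the case
my question is about.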

https://github.com/ceph/ceph/pull/8008 mentions an inability to peer if
min_size=k, but I don't understand why. Does that mean that if min_size=k
and I lose m OSDs, and then another OSD is restarted during recovery, PGs
will not peer even after the restarted OSD comes back online?


Vlad
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
