I think we're deviating quite a bit from the original thread. I would never argue that in a production environment with plenty of OSDs you should go for R=2 or K+1, so my example cluster, which happens to be 2+1, is a bit unlucky.

However, I'm interested in the following:

On 11/16/20 11:31 AM, Janne Johansson wrote:
> So while one could always say "one more drive is better than your
> amount", there are people losing data with repl=2 or K+1 because some
> more normal operation was in flight and _then_ a single surprise
> happens.  So you can have a weird reboot, causing those PGs needing
> backfill later, and if one of the uptodate hosts have any single
> surprise during the recovery, the cluster will lack some of the current
> data even if two disks were never down at the same time.

I'm not sure I follow. From a logical perspective they *are* down at the same time, right? In your scenario one up-to-date replica was left, but even that one had a surprise. Okay, that's the risk you take with R=2, but it's not intrinsically different from R=3.
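
To make the scenario concrete, here is a toy sketch. It is a simplification for illustration only, not how Ceph actually tracks PG state or performs recovery. The function name and its parameters are hypothetical, and it assumes the pool keeps accepting writes while one replica is down (for R=2 that would mean min_size=1).

```python
def latest_version_survives(replica_count, surprises_during_recovery=1):
    """Toy model: True if at least one live replica holds the newest data.

    Hypothetical helper for illustration; not part of Ceph.
    """
    # All replicas start in sync at version 1.
    versions = [1] * replica_count

    # Replica 0 reboots. A write lands only on the remaining replicas.
    for i in range(1, replica_count):
        versions[i] = 2

    # Replica 0 is back up but stale, so it needs backfill.
    # Before backfill completes, some up-to-date replicas fail.
    alive = [True] * replica_count
    for i in range(1, min(replica_count, 1 + surprises_during_recovery)):
        alive[i] = False

    newest = max(versions)
    return any(a and v == newest for a, v in zip(alive, versions))


for r in (2, 3):
    print(f"R={r}: newest data survives one surprise: {latest_version_survives(r)}")
```

In this model the rebooted replica is physically up during the second failure, yet it is "down" as far as the newest write is concerned. R=2 loses that write after a single surprise, while R=3 needs two surprises during recovery before the same thing happens.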
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io