Many of us deploy Ceph as a solution for highly available storage.

Over time, I've run into a couple of situations where Ceph refused to
deliver I/O to VMs even though only a tiny fraction of the PGs were stuck
in non-active states due to problems on the OSDs. (Since RBD images stripe
their objects across most of a pool's PGs, even a single inactive PG can
block I/O for nearly every VM.) So I found myself in very unpleasant
situations where an entire cluster went down because of a single node,
even though that cluster was supposed to be fault-tolerant.
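
For anyone debugging the same thing, this is roughly how I check whether
inactive PGs are what's blocking I/O (standard Ceph CLI; exact output
varies by release, and <pgid> below is a placeholder):

    # overall health, including which PGs are not active+clean
    ceph health detail

    # list PGs stuck in inactive states (peering, down, incomplete, ...)
    ceph pg dump_stuck inactive

    # inspect a problematic PG and its acting OSDs, then find the host
    ceph pg <pgid> query
    ceph osd tree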

Regardless of the reason, the cluster itself can be a single point of
failure, even if it has a lot of nodes.

How do you segment your deployments so that your business doesn't
get jeopardised if your Ceph cluster misbehaves?

Does anyone run Ceph as one very large cluster, or do you prefer to
split everything into smaller clusters?