This is my planned OSD configuration:

root
    room1
        OSD host1
        OSD host2
    room2
        OSD host3
        OSD host4

There are 6 OSDs per host.
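
For reference, this is roughly how I was planning to create that 
hierarchy (the bucket names are just the placeholders from the tree 
above, and I am assuming the CRUSH root is the default bucket 
"default"):

    ceph osd crush add-bucket room1 room
    ceph osd crush add-bucket room2 room
    ceph osd crush move room1 root=default
    ceph osd crush move room2 root=default
    ceph osd crush move host1 room=room1
    ceph osd crush move host2 room=room1
    ceph osd crush move host3 room=room2
    ceph osd crush move host4 room=room2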

Is it possible to configure the CRUSH map so that it tolerates the 
failure of an entire room? In my case there is one network switch and 
one power supply per room, which makes each room a single point of 
failure. This is what I would like to mitigate.

I could not find any CRUSH rule that would make this configuration 
redundant and safe.
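
The closest I have come is a rule along these lines (only a sketch; the 
name and id are placeholders, I am assuming a replicated pool of size 4 
and a root bucket named "default", and the exact syntax may differ 
between Ceph versions):

    rule replicated_two_rooms {
        id 1
        type replicated
        min_size 1
        max_size 10
        step take default
        step choose firstn 2 type room
        step chooseleaf firstn 2 type host
        step emit
    }

As far as I understand, this places two replicas in each room, but it 
does not address the failure scenario below.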

Namely, to tolerate a sudden room failure (switch or power), there 
would have to be a rule that acknowledges a write only after BOTH rooms 
have acknowledged it. The problem is that such a rule only works while 
both rooms are up: as soon as one room goes down, the cluster can no 
longer accept writes at all, because the rule can never be satisfied. 
It looks like an impossible task with a fixed CRUSH rule; the cluster 
would somehow need to switch rules to stay both redundant and writable. 
What am I missing?
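
For completeness, the pool-level settings I have been looking at while 
thinking about this are size and min_size (the pool name "mypool" is 
just a placeholder, and I may well be misunderstanding how these 
interact with the rule above):

    ceph osd pool set mypool size 4      # total number of replicas
    ceph osd pool set mypool min_size 2  # replicas that must be available for I/O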

In general: can Ceph tolerate the sudden loss of half of its OSDs?
If not, what is the best redundancy I can get out of this configuration?
Is there a workaround, perhaps with some external tool, to detect such 
a failure and reconfigure Ceph automatically?

regards,
Zoran