On 10/01/2014 09:21 AM, Indra Pramana wrote:
> Dear all,
>
> Anyone using CloudStack with Ceph RBD as primary storage? I am using
> CloudStack 4.2.0 with KVM hypervisors and Ceph latest stable version of
> dumpling.
>
I am :)

> Based on what I see, when Ceph cluster is in degraded state (not
> active+clean), for example due to one node is down and in recovering
> process, it might affect CloudStack operations. For example:
>
> - Stopped VM cannot be started, because it says cannot find suitable
> storage pool.
>
> - Disconnected host cannot be reconnected easily, even after restarting
> agent and libvirt on agent side, and restarting management server on the
> server side. Need to keep on trying and suddenly it will be connected/up by
> itself.
>

It really depends on the size of the cluster. It could be that the Ceph cluster is so busy with recovery that it can't process the I/O coming from CloudStack, and thus stalls.

This is not a Ceph or CloudStack problem as such, but probably a matter of the size of your cluster.

Wido

> Once Ceph has recovered and back to active+clean state, then CloudStack
> operations will be back to normal. Host agents will be up, and VMs can be
> started.
>
> Anyone seeing similar behaviour?
>
> Looking forward to your reply, thank you.
>
> Cheers.
>
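
As a rough illustration of the active+clean condition discussed in this thread, here is a minimal Python sketch that decides whether a cluster is fully healthy from a count of placement groups per state. The function names and the input shape (a plain mapping of state name to PG count) are assumptions made for this example, not part of Ceph or CloudStack; how the counts are collected is left open.

```python
def is_active_clean(pg_states):
    """Return True only if every placement group is active+clean.

    pg_states: mapping of PG state name -> number of PGs in that state.
    This input shape is hypothetical; adapt it to however you gather
    the counts from your cluster.
    """
    total = sum(pg_states.values())
    # An empty mapping means we know nothing, so do not report healthy.
    return total > 0 and pg_states.get("active+clean", 0) == total


def degraded_summary(pg_states):
    """Return (state, count) pairs for all non-clean states, sorted by name."""
    return sorted(
        (state, count)
        for state, count in pg_states.items()
        if state != "active+clean" and count > 0
    )


# Example values are made up for illustration only.
healthy = {"active+clean": 512}
recovering = {"active+clean": 400, "active+degraded": 100, "active+recovering": 12}

print(is_active_clean(healthy))
print(is_active_clean(recovering))
print(degraded_summary(recovering))
```

A check like this could be run before retrying a VM start or a host reconnect, so that failures during recovery are not mistaken for a CloudStack fault.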