On Mon, 4 Aug 2014 11:03:37 +0800 飞 wrote:

> Hello, I am running a Ceph cluster (RBD) in a production environment to
> host 200 VMs. Under normal circumstances, Ceph's performance is quite
> good, but when I delete a snapshot or image, the cluster shows a lot of
> blocked requests (generally more than 1000), then the whole cluster
> slows down and many VMs become very slow. Any idea? Thank you.
> The hardware of my cluster:
> My cluster has 3 nodes; every node has 10 x 2TB SATA disks and 1 x 120GB SSD.

I suspect your cluster is pretty close to full capacity when operating
normally and overwhelmed when something very intensive like an image
deletion (that has to touch every last object of the image) comes along.
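
To get a feel for how much work "every last object" means, here is a rough
sketch of the arithmetic. It assumes the default RBD object size of 4 MiB;
the 100 GiB image size and the replication factor of 3 are placeholders,
not values from your cluster. The function names are made up for this
example, and the result is an upper bound, since a sparsely written image
has fewer objects actually present.

```python
import math


def rbd_object_count(image_gib: float, object_mib: int = 4) -> int:
    """Upper bound on the number of RADOS objects backing one RBD image."""
    return math.ceil(image_gib * 1024 / object_mib)


def replica_deletes(image_gib: float, replicas: int, object_mib: int = 4) -> int:
    """Object removals across all replicas when the image is deleted."""
    return rbd_object_count(image_gib, object_mib) * replicas


if __name__ == "__main__":
    # Placeholder values: a 100 GiB image and 3 replicas (both assumed).
    print(rbd_object_count(100))    # 25600
    print(replica_deletes(100, 3))  # 76800
```

With only 30 HDDs in total, tens of thousands of removals arriving in a
short time compete directly with the client I/O of the VMs.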

It would be nice if operations like these would have (more and
better) configuration options like with scrub (load) and recovery

Monitor your cluster with atop on all 3 nodes in parallel, observe the
utilization of your HDDs and SSDs, CPU and network during a time of normal

Compare that to what you see when you delete an image (use a small one

About your cluster, what OS, Ceph version, replication factor? 
What CPU, memory and network configuration? 

A single 120GB SSD (which model?) as journal for 10 HDDs will definitely
be the limiting factor when it comes to write speed, but should hopefully
handle the IOPS well enough.
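
As a back-of-the-envelope illustration of why the journal SSD caps write
speed: every write goes through the journal first, so a node cannot
sustain more than the smaller of the SSD's write rate and the combined
write rate of its HDDs. The throughput figures below are assumptions
picked for illustration, not measurements of your hardware, and the
function name is made up for this sketch.

```python
def node_write_ceiling_mb_s(ssd_mb_s: float, hdd_mb_s: float, hdd_count: int) -> float:
    """Sustained write ceiling of one node with a single shared journal SSD."""
    return min(ssd_mb_s, hdd_mb_s * hdd_count)


if __name__ == "__main__":
    # Assumed figures: 200 MB/s for the SSD, 100 MB/s per HDD, 10 HDDs.
    ceiling = node_write_ceiling_mb_s(ssd_mb_s=200, hdd_mb_s=100, hdd_count=10)
    print(ceiling)  # 200
```

Plug in the real sequential write speed of your SSD model to see how far
below the aggregate HDD speed each node is held.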

Christian Balzer        Network/Systems Engineer                
ch...@gol.com           Global OnLine Japan/Fusion Communications
ceph-users mailing list
