On 04/12/2018 04:36 AM, Yao Zongyou wrote:
> Hi, 
> 
> For anybody who may be interested, here I'd like to share how we tracked 
> down the cause of a Ceph cluster performance slowdown in our environment.
> 
> Internally, we have a cluster with 1.1PB of capacity, of which 800TB is 
> used; raw user data is about 500TB. Each day, 3TB of data is uploaded and 
> the oldest 3TB is expired (we are using the S3 object store with bucket 
> lifecycle enabled). Over time the cluster became slower, and we suspected 
> XFS fragmentation was the culprit.
> 
> After some testing, we did find that XFS fragmentation slows down 
> FileStore's performance: at 15% fragmentation, performance drops to 85% of 
> the original, and at 25% fragmentation, to 74.73% of the original.
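> 
> For reference, here is a minimal sketch (in Python) of how the 
> fragmentation factor can be read; the device path is a placeholder for the 
> filestore data partition, and xfs_db must be run read-only:
> 
>     import re
>     import subprocess
> 
>     # Read the XFS fragmentation factor with xfs_db. The device path is
>     # a placeholder; -r opens the filesystem read-only.
>     def xfs_frag_percent(device: str = "/dev/sdb1") -> float:
>         out = subprocess.run(
>             ["xfs_db", "-r", "-c", "frag", device],
>             capture_output=True, text=True, check=True,
>         ).stdout
>         # Typical output:
>         # "actual 908035, ideal 174283, fragmentation factor 80.81%"
>         match = re.search(r"fragmentation factor ([\d.]+)%", out)
>         if match is None:
>             raise RuntimeError("unexpected xfs_db output: " + out)
>         return float(match.group(1))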
> 
> However, XFS fragmentation turned out not to be the main reason for our 
> cluster's performance deterioration.
> 
> Initially, our Ceph cluster contained only OSDs backed by 4TB disks. As 
> time went by, we scaled out the cluster by adding new OSDs backed by 8TB 
> disks. Since the new disks have twice the capacity of the old ones, each 
> new OSD's CRUSH weight is twice that of an old OSD. Accordingly, each new 
> OSD holds twice as many PGs and stores twice as much data as an old one. 
> Everything looked fine.
> 
> But even though a new OSD has twice the capacity of an old one, it does 
> not have twice the performance. After digging into our internal system 
> stats, we found that the newly added disks' I/O utilization was about 
> twice that of the old ones, and from time to time it rose to 100%. The 
> newly added OSDs were the performance killer: they slowed down the whole 
> cluster.
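> 
> A back-of-the-envelope sketch of the effect, with illustrative numbers 
> (the exact weights below are placeholders): PGs, and therefore I/O, are 
> distributed in proportion to CRUSH weight, but a single 8TB spindle has 
> roughly the same random IOPS as a 4TB one:
> 
>     # Illustrative numbers only: CRUSH weight defaults to the disk size
>     # in TiB, so the 8TB OSDs get about twice the weight of the 4TB ones.
>     old_weight, new_weight = 3.64, 7.28
>     pg_ratio = new_weight / old_weight   # new OSD receives ~2x the PGs
>     iops_ratio = 1.0                     # spindle IOPS did not double
>     util_ratio = pg_ratio / iops_ratio
>     print(f"new disk IO utilization ~ {util_ratio:.1f}x the old")  # ~2.0x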
> 
> Once the cause was found, the solution was simple: after lowering the 
> newly added OSDs' weight, the annoying slow request warnings died away.
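> 
> A sketch of the fix, using the standard "ceph osd crush reweight" command; 
> the OSD ids and the target weight below are placeholders, not our actual 
> values:
> 
>     import subprocess
> 
>     # Hypothetical ids of the 8TB OSDs and an illustrative target weight
>     # somewhere between the 4TB weight (~3.64) and the 8TB one (~7.28).
>     new_osds = [24, 25, 26, 27]
>     target_weight = 5.5
> 
>     for osd_id in new_osds:
>         # Lower the CRUSH weight so the OSD receives fewer PGs and less IO.
>         subprocess.run(
>             ["ceph", "osd", "crush", "reweight",
>              f"osd.{osd_id}", str(target_weight)],
>             check=True,
>         )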
> 

This is to be expected. However, lowering the weight of new disks means
that you can't fully use their storage capacity.

This is the nature of a heterogeneous Ceph cluster: with disks of
different sizes, performance will fluctuate.

Wido

> So the conclusion is: in a cluster where OSDs have disks of different 
> sizes, an OSD's weight should be determined not only by its capacity but 
> also by its performance.
> Best wishes,
> Yao Zongyou
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
