> I think the short answer is "because you have so wildly varying sizes
> both for drives and hosts".

Arguably OP's OSDs *are* balanced in that their PGs are roughly in line with 
their sizes, but indeed the size disparity is problematic in some ways.

Notably, the 500GB OSD should just be removed.  I don't think the balancer
accounts for WAL/DB and other overhead, so such a small OSD won't be weighted
accurately, and it can't hold much data anyway.
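
For what it's worth, a rough sketch of draining and removing it (osd.12 is a
placeholder id, and the systemctl line assumes a plain systemd deployment
rather than cephadm):

    # Let CRUSH drain the PGs off the small OSD; at weight 0 nothing maps to it
    ceph osd crush reweight osd.12 0

    # Watch recovery and wait until the OSD holds no PGs
    ceph osd df tree
    ceph -s

    # Once empty, take it out and remove it for good
    ceph osd out 12
    systemctl stop ceph-osd@12
    ceph osd purge 12 --yes-i-really-mean-it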

This cluster shows evidence of reweight-by-utilization having been run, but 
only on two of the hosts.  If the balancer module is active, those override 
weights will confound it.
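
Something along these lines should surface and clear the leftover overrides
(osd id 7 is just an example; resetting to 1.0 assumes you want the balancer
module, not reweight-by-utilization, to own the distribution):

    # A REWEIGHT value below 1.0000 is a leftover override weight
    ceph osd df tree

    # Reset the override on each affected OSD
    ceph osd reweight 7 1.0

    # Confirm the balancer is enabled and using upmap
    ceph balancer status
    ceph balancer mode upmap
    ceph balancer on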


> 
> If your drive sizes span from 0.5 to 9.5, there will naturally be
> skewed data, and it is not a huge surprise that the automation has
> some troubles getting it "good". When the balancer places a PG on a
> 0.5-sized drive compared to a 9.5-sized one, it eats up 19x more of
> the "free space" on the smaller one, so there are very few good
> options when the sizes are so different. Even if you placed all PGs
> correctly due to size, the 9.5-sized disk would end up getting 19x
> more IO than the small drive and for hdd, it seldom is possible to
> gracefully handle a 19-fold increase in IO, most of the time will
> probably be spent on seeks.
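
Agreed.  For anyone following along, the per-OSD PG counts and utilization
that make this skew easy to see are in the PGS and %USE columns of:

    ceph osd df tree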
