[ceph-users] Re: Upmap balancer after node failure

2021-04-02 Thread Dan van der Ster
Hi Andras. Assuming that you've already tightened the mgr/balancer/upmap_max_deviation to 1, I suspect that this cluster already has too many upmaps. Last time I checked, the balancer implementation is not able to improve a pg-upmap-items entry if one already exists for a PG. (It can add an OSD m…
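
For reference, a minimal sketch of the commands involved (the PG id 2.3a below is an illustrative placeholder, not from this thread):

    # tell the balancer to aim for at most 1 PG of deviation per OSD
    ceph config set mgr mgr/balancer/upmap_max_deviation 1
    # list PGs that already carry a pg-upmap-items exception
    ceph osd dump | grep pg_upmap_items
    # remove an existing exception so the balancer can recompute it
    ceph osd rm-pg-upmap-items 2.3a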

[ceph-users] Re: Upmap balancer after node failure

2021-04-02 Thread Dan van der Ster
Hi again, Oops, I'd missed the part about some PGs being degraded, which prevents the balancer from continuing. So I assume that you have PGs which are simultaneously undersized+backfill_toofull? That case does indeed sound tricky. To solve that, you would either need to move PGs out of the toofull…
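
A sketch of how one might confirm that diagnosis (state names and commands as in recent Ceph releases):

    # PGs stuck in the combination Dan describes
    ceph pg ls undersized backfill_toofull
    # spot OSDs past the backfillfull threshold
    ceph osd df
    ceph health detail | grep -i toofull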

[ceph-users] Re: Upmap balancer after node failure

2021-04-02 Thread Janne Johansson
On Fri 2 Apr 2021 at 11:23, Dan van der Ster wrote: > > Hi again, > > Oops, I'd missed the part about some PGs being degraded, which > prevents the balancer from continuing. > any upmaps which are directing PGs *to* those toofull OSDs. Or maybe > it will be enough to just reweight those OSDs to 0…
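
The two remedies being discussed might look roughly like this (osd.42 and the weight 0.9 are hypothetical values, not from the thread):

    # option 1: drop upmap exceptions that map PGs onto the toofull OSD
    ceph osd dump | grep pg_upmap_items | grep -w 42
    ceph osd rm-pg-upmap-items <pgid>    # repeat for each matching PG
    # option 2: reweight the toofull OSD so PGs backfill elsewhere
    ceph osd reweight 42 0.9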

[ceph-users] Re: Upmap balancer after node failure

2021-04-02 Thread Andras Pataki
Lowering the weight is what I ended up doing. But this isn't ideal, since afterwards the balancer will remove too many PGs from the OSD now that it has a lower weight. So I'll have to put the weight back once the cluster recovers and the balancer goes back to its business. But in any case …
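
In other words, the workaround is a temporary reweight that has to be reverted by hand (osd.42 and 0.9 again hypothetical):

    # drain some PGs off the full OSD
    ceph osd reweight 42 0.9
    # later, once recovery completes, restore the default so the
    # balancer doesn't strip too many PGs from this OSD
    ceph osd reweight 42 1.0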