Hi Dan,

I guess you replied before my e-mails with output from a test cluster arrived. 
I gave an example with reweight=0.5 that had similarly disastrous results. It 
looks like applying reweights in the crush map is seriously broken. If I 
understand the intention of the reweight correctly, then

effective-weight = crush-weight * reweight,

but it is clearly not implemented this way. Please take a look at the specific 
re-mapping examples I posted from a test cluster, with effective weights 0.5*1 
and 1*0.5.
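
To illustrate what I mean by invariance: as I understand it, a straw2 bucket picks, for each PG, the OSD with the largest draw ln(u)/w, where u is a per-(PG, OSD) hash and w is the OSD's weight. Multiplying every w by the same positive factor scales every draw by the same constant and so cannot change which OSD wins. Here is a toy sketch in Python (an idealized model, not Ceph's actual implementation; the real straw2 uses a Jenkins hash and fixed-point lookup tables, and the names `draw` and `place` are made up for illustration):

```python
import hashlib
import math

def draw(pg_id, osd_id, weight):
    # straw2-style draw: ln(u) / w, with u a deterministic per-(PG, OSD)
    # hash in (0, 1]. Real CRUSH uses a Jenkins hash and fixed-point math.
    h = hashlib.sha256(f"{pg_id}:{osd_id}".encode()).digest()
    u = (int.from_bytes(h[:8], "big") + 1) / 2**64  # uniform in (0, 1]
    return math.log(u) / weight

def place(pg_id, weights):
    # Map a PG to the OSD with the highest draw (the straw2 winner).
    return max(weights, key=lambda osd: draw(pg_id, osd, weights[osd]))

# 24 large OSDs as in the pool described below, before and after a
# uniform reweight by 0.052.
weights = {f"osd.{i}": 1.74699 for i in range(24)}
scaled = {osd: w * 0.052 for osd, w in weights.items()}

# With exact arithmetic, no PG should move under a uniform rescale:
assert all(place(pg, weights) == place(pg, scaled) for pg in range(1024))
```

In this idealized floating-point model the assertion holds, so if the real code behaved this way a uniform reweight would remap nothing; fixed-point rounding is the obvious place where the real implementation could deviate.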

Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Dan van der Ster <dvand...@gmail.com>
Sent: 15 November 2022 11:23:44
To: Frank Schilder
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] OSDs down after reweight

Hi Frank,

Just a guess, but I wonder if, for small values, rounding/precision
starts to impact the placement as you observed.

Do you see the same issue if you reweight to 2x the original?

-- Dan

On Tue, Nov 15, 2022 at 10:09 AM Frank Schilder <fr...@dtu.dk> wrote:
>
> Hi all,
>
> I re-weighted all OSDs in a pool down from 1.0 to the same value 0.052 (see 
> reason below). After this, all hell broke loose. OSDs were marked down, slow 
> OPS all over the place and the MDSes started complaining about slow 
> ops/requests. Basically all PGs were remapped. After setting all re-weights 
> back to 1.0 the situation went back to normal.
>
> Expected behaviour: No (!!!) PGs are remapped and everything continues to 
> work. Why did things go down?
>
> More details: We have 24 OSDs with weight=1.74699 in a pool. I wanted to add 
> OSDs with weight=0.09099 in such a way that the small OSDs receive 
> approximately the same number of PGs as the large ones. Setting a re-weight 
> factor of 0.052 for the large ones should achieve just that: 
> 1.74699*0.052=0.09084. So, the procedure was:
>
> - ceph osd crush reweight osd.N 0.052 for all OSDs in that pool
> - add the small disks and re-balance
>
> I would expect that the crush mapping is invariant under a uniform change of 
> weight. That is, if I apply the same relative weight-change to all OSDs 
> (new_weight=old_weight*common_factor) in a pool, the mappings should be 
> preserved. However, this is not what I observed. How is it possible that PG 
> mappings change if the relative weight of all OSDs to each other stays the 
> same (the probabilities of picking an OSD are unchanged over all OSDs)?
>
> Thanks for any hints.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
> _______________________________________________
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
