This is related to https://tracker.ceph.com/issues/42341 and to 
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-October/037017.html

After closer inspection yesterday we found that PGs are not being removed from 
OSDs, which then leads to nearfull errors and explains why reweights don't work. 
This is a BIG issue because I have to constantly intervene manually to keep the 
cluster from dying.

This is 14.2.4, a fresh setup, all defaults. The PG balancer is turned off now; 
I'm beginning to wonder if it's at fault.
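
To keep an eye on it between interventions I've been running a small script along 
these lines. It's only a rough sketch: it shells out to 'ceph osd df -f json', and 
the field names (utilization, pgs, crush_weight, reweight) plus the 0.85 nearfull 
ratio are assumptions based on what my 14.2.4 output looks like, so double-check 
them against your own cluster.

#!/usr/bin/env python3
# Rough sketch: list OSDs by utilization with their PG counts and weights.
# Assumes 'ceph osd df -f json' exposes nodes[].utilization (percent),
# .pgs, .crush_weight and .reweight, as it does for me on 14.2.4.
import json
import subprocess

NEARFULL = 0.85  # assumed nearfull_ratio; check 'ceph osd dump | grep ratio'

def osd_df_nodes():
    out = subprocess.check_output(["ceph", "osd", "df", "-f", "json"])
    return json.loads(out)["nodes"]

def main():
    osds = sorted(osd_df_nodes(), key=lambda o: o.get("utilization", 0.0),
                  reverse=True)
    print(f"{'osd':>8} {'util%':>7} {'pgs':>5} {'crush_w':>8} {'reweight':>8}")
    for o in osds:
        util = o.get("utilization", 0.0)
        flag = "  <-- nearfull" if util / 100.0 >= NEARFULL else ""
        print(f"{o['name']:>8} {util:7.2f} {o.get('pgs', 0):>5} "
              f"{o.get('crush_weight', 0.0):8.3f} {o.get('reweight', 0.0):8.3f}{flag}")

if __name__ == "__main__":
    main()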

My crush map: https://termbin.com/3t8l

It was mentioned that the bucket weights are WEIRD; I never touched them. The 
unusual crush weights are on the nearfull osd.53, and some are set to 10 from a 
previous manual intervention.
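
To spot which weights are actually out of line (rather than eyeballing the 
decompiled map), I use something like the sketch below. It assumes 
'ceph osd tree -f json' exposes crush_weight and reweight on the OSD nodes, as it 
does for me on 14.2.4, and the 20% threshold is arbitrary.

#!/usr/bin/env python3
# Rough sketch: flag OSDs whose crush weight or reweight looks out of line.
import json
import statistics
import subprocess

def osd_nodes():
    tree = json.loads(subprocess.check_output(["ceph", "osd", "tree", "-f", "json"]))
    return [n for n in tree["nodes"] if n.get("type") == "osd"]

def main():
    osds = osd_nodes()
    median_w = statistics.median(o.get("crush_weight", 0.0) for o in osds)
    print(f"median crush weight: {median_w:.3f}")
    for o in osds:
        w = o.get("crush_weight", 0.0)
        rw = o.get("reweight", 1.0)
        notes = []
        if median_w and abs(w - median_w) / median_w > 0.2:  # arbitrary 20% cutoff
            notes.append(f"crush_weight {w:.3f} far from median")
        if rw < 1.0:
            notes.append(f"reweight {rw:.3f} != 1.0")
        if notes:
            print(f"{o['name']}: " + ", ".join(notes))

if __name__ == "__main__":
    main()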
That the PGs are not being purged is one issue; the original issue is why on 
earth Ceph fills ONLY my nearfull OSDs in the first place. It always seems to 
pick the fullest OSD to write more data onto. If I reweight it, I start getting 
alerts for another almost-full OSD because Ceph intends to write everything 
there, despite everything else being only at about 60%.
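
Since I no longer trust what the cluster tells me, I also compare what CRUSH says 
should be mapped to each OSD with the PG count the OSD itself reports; a reported 
count well above the mapped one would point at PGs that were moved away but never 
purged. This is only a sketch and the interpretation is my guess: it assumes 
'ceph pg dump -f json' exposes pg_stats with up/acting sets (on some releases 
under a pg_map wrapper) and that the pgs column of 'ceph osd df' counts PGs still 
physically present on the OSD.

#!/usr/bin/env python3
# Rough sketch: compare PGs mapped to each OSD (per CRUSH) with the count
# the OSD itself reports; a surplus may be PGs that were never purged.
import json
import subprocess
from collections import Counter

def ceph_json(*args):
    return json.loads(subprocess.check_output(["ceph", *args, "-f", "json"]))

def mapped_pgs_per_osd():
    dump = ceph_json("pg", "dump")
    stats = dump.get("pg_map", dump).get("pg_stats", [])  # layout varies by release
    counts = Counter()
    for pg in stats:
        for osd in set(pg.get("up", [])) | set(pg.get("acting", [])):
            counts[osd] += 1
    return counts

def reported_pgs_per_osd():
    return {n["id"]: n.get("pgs", 0) for n in ceph_json("osd", "df")["nodes"]}

def main():
    mapped = mapped_pgs_per_osd()
    reported = reported_pgs_per_osd()
    for osd_id in sorted(reported):
        diff = reported[osd_id] - mapped.get(osd_id, 0)
        if diff > 0:
            print(f"osd.{osd_id}: reports {reported[osd_id]} PGs, "
                  f"{mapped.get(osd_id, 0)} mapped (+{diff} possibly not purged)")

if __name__ == "__main__":
    main()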
Beyond that I don't know how to debug this; it's a MAJOR PITA.


Hope someone has an idea, because I can't fight this 24/7 and I'm getting pretty 
tired of it.
Thanks