Hi,

On 2019-11-20 15:55, thoralf schulze wrote:
hi,

we were able to track this down to the auto balancer: disabling the auto
balancer and cleaning out old (and probably not very meaningful)
upmap-entries via ceph osd rm-pg-upmap-items brought back stable mgr
daemons and a usable dashboard.

I can confirm that. In our case I see this on a 14.2.4 cluster (which
started its life with an earlier Nautilus version
and developed this issue over the past weeks), and running:
 ceph balancer off
was sufficient to make the mgrs stable again (i.e. I left the upmap-items
in place).
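
For anyone wanting to try both workarounds mentioned in this thread, here is a rough sketch. It assumes that `ceph osd dump` lists upmap exceptions on lines of the form `pg_upmap_items <pgid> [...]`. The helper names list_upmap_pgs and print_rm_commands are made up for illustration and are not part of Ceph. The helpers only print the removal commands, so they can be reviewed before anything is changed:

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Ceph.

# Print the PG id of every pg_upmap_items entry found in `ceph osd dump`
# output read from stdin. Assumes lines of the form:
#   pg_upmap_items 1.2f [3,7]
list_upmap_pgs() {
    awk '$1 == "pg_upmap_items" { print $2 }'
}

# Dry run: print the removal command for each PG id read from stdin.
print_rm_commands() {
    while read -r pgid; do
        echo "ceph osd rm-pg-upmap-items $pgid"
    done
}

# Usage on a live cluster (check the printed commands before running them):
#   ceph balancer off
#   ceph balancer status
#   ceph osd dump | list_upmap_pgs | print_rm_commands
```

As noted above, turning the balancer off alone was enough in our case, so the removal step may not be needed at all.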

Interestingly, we did not see this with Luminous or Mimic on different clusters
(which, however, have a more stable number of OSDs).

@devs: If there's any more info needed to track this down, please let us know.

Cheers,
        Oliver


the not-so-sensible upmap-entries might or might not have been caused by
us updating from mimic to nautilus - it's too late to debug this now.
this seems to be consistent with bryan stillwell's findings ("mgr hangs
with upmap balancer").

thank you very much & with kind regards,
thoralf.


_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


