Hello list,

I'm having a serious issue: my Ceph cluster has become unresponsive. I 
was upgrading my cluster (3 servers, 3 monitors) from 13.2.1 to 13.2.2, which 
shouldn't be a problem.

However, after its reboot my first host reported:

starting mon.ceph01 rank -1 at 192.168.200.197:6789/0 mon_data 
/var/lib/ceph/mon/ceph-ceph01 fsid 27dd45f1-28b5-4ac6-81ab-c62bc581130c
mon.cephxx@-1(probing) e5 preinit fsid 27dd45f1-28b5-4ac6-81ab-c62bc581130c
mon.cephxx@-1(probing) e5 not in monmap and have been in a quorum before; must 
have been removed
-1 mon.cephxx@-1(probing) e5 commit suicide!
-1 failed to initialize

I thought that perhaps the monitor refused to accept the monmap of the other 
two because of the version difference. Unfortunately, I then upgraded and 
rebooted the second server as well.

Since then the cluster has been unresponsive, because more than half of the 
monitors are offline or out of quorum. The log on my second host keeps spamming:

2018-10-04 14:39:06.802 7fed0058f700 -1 mon.ceph02@1(probing) e6 
get_health_metrics reporting 14 slow ops, oldest is auth(proto 0 27 bytes epoch 
6)
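
For reference, this is the quorum arithmetic as I understand it: the monitors 
need a strict majority to form a quorum, so with 3 monitors at least 2 must be 
up and in agreement. A minimal sketch of that rule (the function names are my 
own, purely illustrative, and not part of any Ceph tool):

```python
def quorum_size(total_monitors: int) -> int:
    """Smallest number of monitors that forms a strict majority."""
    return total_monitors // 2 + 1


def has_quorum(total_monitors: int, monitors_in_quorum: int) -> bool:
    """True if enough monitors are participating to form a quorum."""
    return monitors_in_quorum >= quorum_size(total_monitors)


# My cluster: 3 monitors in total.
print(quorum_size(3))    # monitors needed for a quorum
print(has_quorum(3, 2))  # one monitor down
print(has_quorum(3, 1))  # two monitors down or out of quorum
```

With only one of the three monitors left, the majority is lost, which matches 
what I am seeing.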

Any help is VERY MUCH appreciated, this sucks.

Thanks
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
