Quoting by morphin (morphinwith...@gmail.com):
> Good news... :)
>
> After I tried everything. I decide to re-create my MONs from OSD's and
> I used the script:
> https://paste.ubuntu.com/p/rNMPdMPhT5/
>
> And it worked!!!

Congrats!

> I think when 2 server crashed and come back same time some how MON's
> confused and the maps just corrupted.
> After re-creation all the MONs was have the same map so it worked.
> But still I dont know how to hell the mons can cause endless %95 I/O ???
> This a bug anyway and if you dont want to leave the problem then do
> not "enable" your mons. Just start them manual! Another tough lesson.

The only time we needed to manually start the mons was at "bootstrap"
time. After a reboot they are brought up by systemd ... and it keeps on
working. Have you rebooted your mon(s) after the manual start?

> ceph -s: https://paste.ubuntu.com/p/m3hFF22jM9/
>
> As you can see below some of the OSDs are still down. And when I start
> them they dont start.
> Check start log: https://paste.ubuntu.com/p/ZJQG4khdbx/
> Debug log: https://paste.ubuntu.com/p/J3JyGShHym/
>
> What we can do for the problem?

Apply PR https://github.com/ceph/ceph/pull/24064

I see that you are running Mimic 13.2.1 ... 13.2.2 was released a few
days ago. Not sure if this fix has made it into 13.2.2.

> What is the cause of the problem?

Somehow it looks like you hit this issue:
https://tracker.ceph.com/issues/24866

Gr. Stefan

-- 
| BIT BV  http://www.bit.nl/        Kamer van Koophandel 09090351
| GPG: 0xD14839C6                   +31 318 648 688 / i...@bit.nl
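
On the 13.2.1 versus 13.2.2 question: once it is known which release carries the fix, it helps to see which daemon types are still running something older. The following is only a rough sketch, assuming the JSON shape printed by `ceph versions` (daemon type mapped to version string mapped to count). The helper names `parse_versions` and `daemons_older_than` and the `SAMPLE` data are made up for illustration; the sample is not output from the cluster in this thread, and the minimum version passed in is a placeholder, not a statement about where the fix landed.

```python
import json
import re

# Matches the numeric release in strings such as
# "ceph version 13.2.1 (<hash>) mimic (stable)".
VERSION_RE = re.compile(r"ceph version (\d+)\.(\d+)\.(\d+)")


def parse_versions(versions_json):
    """Map daemon type -> sorted list of (major, minor, patch) tuples."""
    data = json.loads(versions_json)
    result = {}
    for daemon_type, counts in data.items():
        found = set()
        for version_string in counts:
            match = VERSION_RE.search(version_string)
            if match:
                found.add(tuple(int(part) for part in match.groups()))
        result[daemon_type] = sorted(found)
    return result


def daemons_older_than(versions_json, minimum):
    """Return daemon types with at least one daemon below `minimum`."""
    parsed = parse_versions(versions_json)
    return sorted(
        daemon_type
        for daemon_type, versions in parsed.items()
        if any(version < minimum for version in versions)
    )


# Hypothetical sample in the assumed `ceph versions` layout.
SAMPLE = json.dumps({
    "mon": {"ceph version 13.2.1 (abc) mimic (stable)": 3},
    "osd": {
        "ceph version 13.2.1 (abc) mimic (stable)": 10,
        "ceph version 13.2.2 (def) mimic (stable)": 2,
    },
})

if __name__ == "__main__":
    # (13, 2, 2) is a placeholder threshold for the example only.
    print(daemons_older_than(SAMPLE, (13, 2, 2)))
```

On a live cluster the input would be the text printed by `ceph versions` instead of `SAMPLE`.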