Hi,

I'm still working on a troubled Ceph cluster running 0.61.2-1raring. It currently has 4 monitors (a, b, c, g); g is a newly added monitor that fails to sync up, so consider that one down. Now mon a and mon b have died: for some (currently unknown) reason Linux wrote a core dump to the root partition (/core), which filled the partition to 0 bytes free, and consequently the mons died. I tried restarting them, but they seem to be deadlocked in the following situation:
The corresponding ceph-mon.X logs show various cephx errors, such as:

"cephx: verify_authorizer could not decrypt ticket info: error: NSS AES final round failed: -8190"
"cephx: verify_reply coudln't decrypt with error: error decoding block for decryption"

I can see that the /usr/sbin/ceph-create-keys process is stuck (it is still running 20 minutes later). Running it manually prints:

INFO:ceph-create-keys:ceph-mon is not in quorum: u'probing'

So the monitors don't start up (they are stuck probing) because they can't communicate, they can't communicate because they need new keys, and the keys cannot be generated because there is no quorum.

Is there a way to fix this?

Kind regards,
Marc
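
For reference, here is a minimal sketch of the quorum check that the "not in quorum: u'probing'" message seems to reflect. It assumes the JSON comes from the monitor's admin socket (for example `ceph --admin-daemon /var/run/ceph/ceph-mon.a.asok mon_status`, where the socket path is the default and may differ on other setups). The helper name `quorum_report`, the sample JSON, and the assumption that only the "leader" and "peon" states count as being in quorum are mine, not taken from the ceph-create-keys source.

```python
import json

# Assumption: a monitor is part of a quorum only in these two states.
# Anything else ("probing", "electing", "synchronizing", ...) is treated
# as "not in quorum".
QUORUM_STATES = ("leader", "peon")


def quorum_report(mon_status_json):
    """Summarise the JSON text printed by a monitor's `mon_status` command."""
    status = json.loads(mon_status_json)
    state = status.get("state", "unknown")
    return {
        "name": status.get("name"),
        "state": state,
        "in_quorum": state in QUORUM_STATES,
        # Ranks of the monitors currently forming the quorum (empty if none).
        "quorum": status.get("quorum", []),
    }


if __name__ == "__main__":
    # Hypothetical, heavily trimmed sample resembling the stuck monitors.
    sample = '{"name": "a", "rank": 0, "state": "probing", "quorum": []}'
    print(quorum_report(sample))
```

Running this against each monitor's output should at least show which mons are probing and whether any quorum exists at all.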