Just to emphasize that I don't think it's clock skew, here is the NTP state of all three monitors:

# ansible ceph_mons -m command -a "ntpq -p" -kK
SSH password:
sudo password [defaults to SSH password]:
ceph0 | success | rc=0 >>
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*controller-10g  198.60.73.8      2 u   43   64  377    0.236    0.057   0.097

ceph1 | success | rc=0 >>
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*controller-10g  198.60.73.8      2 u   39   64  377    0.273    0.035   0.064

ceph2 | success | rc=0 >>
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*controller-10g  198.60.73.8      2 u   30   64  377    0.201   -0.063   0.063

I think they are pretty well in synch.

 - Travis

On Tue, Mar 25, 2014 at 11:09 AM, Travis Rhoden <trho...@gmail.com> wrote:

> Hello,
>
> I just deployed a new Emperor cluster using ceph-deploy 1.4.  All went
> very smoothly, until I rebooted all the nodes.  After reboot, the monitors
> no longer form a quorum.
>
> I followed the troubleshooting steps here:
> http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-mon/
>
> Specifically, I'm in the state described in this section:
> http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-mon/#most-common-monitor-issues
>
> The state for all the monitors is "electing".  The docs say this is most
> likely clock skew, but I do have all nodes synch'd with NTP.  I've
> confirmed this multiple times.  I've also confirmed the monitors can reach
> each other (by telnetting to IP:PORT, and I can see established connections
> via netstat).
>
> I'm baffled.
>
> here is a sample mon_status output:
>
> root@ceph0:~# ceph daemon mon.ceph0 quorum_status
> { "election_epoch": 31,
>   "quorum": [],
>   "quorum_names": [],
>   "quorum_leader_name": "",
>   "monmap": { "epoch": 2,
>       "fsid": "XXX",   (redacted)
>       "modified": "2014-03-24 14:35:22.332646",
>       "created": "0.000000",
>       "mons": [
>             { "rank": 0,
>               "name": "ceph0",
>               "addr": "10.10.30.0:6789\/0"},
>             { "rank": 1,
>               "name": "ceph1",
>               "addr": "10.10.30.1:6789\/0"},
>             { "rank": 2,
>               "name": "ceph2",
>               "addr": "10.10.30.2:6789\/0"}]}}
>
> They all look identical to that.
>
> Any ideas what I can look at besides NTP?  The docs really stress that it
> should be clock skew, so I'll keep looking at that...
>
>  - Travis
>
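
For anyone who wants to check the NTP claim mechanically rather than by eye, here is a minimal sketch in Python that parses `ntpq -p` text like the output at the top of this message and compares each offset with a drift limit. Assumptions, not taken from the thread: `ntpq` reports the offset column in milliseconds, and the Ceph monitor tolerance (`mon clock drift allowed`) is taken to be its usual default of 0.05 s; the names `peer_offsets_ms`, `within_drift` and `ALLOWED_DRIFT_MS` are made up for illustration.

```python
# Hypothetical helper: compare `ntpq -p` offsets against a drift limit.
# Assumption: 50 ms corresponds to Ceph's default "mon clock drift allowed"
# of 0.05 s; adjust it if your cluster overrides that setting.
ALLOWED_DRIFT_MS = 50.0

SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*controller-10g  198.60.73.8      2 u   30   64  377    0.201   -0.063   0.063
"""


def peer_offsets_ms(ntpq_output):
    """Return {peer_name: offset_in_ms} parsed from `ntpq -p` text."""
    offsets = {}
    for line in ntpq_output.splitlines():
        fields = line.split()
        # Peer rows have exactly 10 columns; this skips the separator line,
        # blank lines, and (via the second test) the column header.
        if len(fields) != 10 or fields[0] == "remote":
            continue
        try:
            offset = float(fields[8])  # 9th column is the offset
        except ValueError:
            continue
        # Drop the common tally marks (*, +, -, #) in front of the peer name.
        peer = fields[0].lstrip("*+-#")
        offsets[peer] = offset
    return offsets


def within_drift(ntpq_output, limit_ms=ALLOWED_DRIFT_MS):
    """True if at least one peer was parsed and every offset is inside the limit."""
    offsets = peer_offsets_ms(ntpq_output)
    return bool(offsets) and all(abs(v) <= limit_ms for v in offsets.values())


if __name__ == "__main__":
    print(peer_offsets_ms(SAMPLE))
    print(within_drift(SAMPLE))
```

Each host's offset is measured against the same NTP server, so the skew between any two monitors is at most the sum of their absolute offsets. With the values above that is a small fraction of a millisecond, which supports the reading that clock skew is not the cause here.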
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com