After upgrading my cluster everything looked good. Then I rebooted the farm and 
all hell broke loose.

I have 3 monitors, but none of them is able to start. On all of them the 
'/usr/bin/python /usr/sbin/ceph-create-keys' command hangs because the 
monitors cannot form a quorum.


All ceph tools are producing the following fault:
# ceph -w
2013-05-10 15:00:55.259382 7f6b68e0e700  0 -- :/20337 >> 10.1.1.21:6789/0 
pipe(0x2fdc520 sd=4 :0 s=1 pgs=0 cs=0 l=1).fault
....


Using monmaptool I removed all but one monitor from the monmap, made the same 
change in ceph.conf, and then tried running the remaining monitor interactively.

Here's the mon output:
# /usr/bin/ceph-mon -i a --pid-file /var/run/ceph/mon.a.pid -c 
/etc/ceph/ceph.conf  -d
2013-05-10 14:54:23.405324 7f0750a61780  0 ceph version 0.61 
(237f3f1e8d8c3b85666529860285dcdffdeda4c5), process ceph-mon, pid 29289
starting mon.a rank 0 at 10.1.1.21:6789/0 mon_data /var/lib/ceph/mon/ceph-a 
fsid 969f28c3-5ee1-4451-9b5b-97c52b724a06
2013-05-10 14:54:23.455975 7f0750a61780  1 mon.a@-1(probing) e1 preinit fsid 
969f28c3-5ee1-4451-9b5b-97c52b724a06
2013-05-10 14:54:23.820160 7f0750a61780  1 mon.a@-1(probing).osd e6666 e6666: 
96 osds: 96 up, 96 in
2013-05-10 14:54:23.820372 7f0750a61780  1 mon.a@-1(probing).osd e6667 e6667: 
96 osds: 96 up, 96 in
2013-05-10 14:54:23.820618 7f0750a61780  1 mon.a@-1(probing).osd e6668 e6668: 
96 osds: 96 up, 96 in
2013-05-10 14:54:23.820802 7f0750a61780  1 mon.a@-1(probing).osd e6669 e6669: 
96 osds: 96 up, 96 in
2013-05-10 14:54:23.820995 7f0750a61780  1 mon.a@-1(probing).osd e6670 e6670: 
96 osds: 96 up, 96 in
2013-05-10 14:54:23.821180 7f0750a61780  1 mon.a@-1(probing).osd e6671 e6671: 
96 osds: 96 up, 96 in
2013-05-10 14:54:23.821368 7f0750a61780  1 mon.a@-1(probing).osd e6672 e6672: 
96 osds: 96 up, 96 in
2013-05-10 14:54:23.821549 7f0750a61780  1 mon.a@-1(probing).osd e6673 e6673: 
96 osds: 96 up, 96 in
2013-05-10 14:54:23.821735 7f0750a61780  1 mon.a@-1(probing).osd e6674 e6674: 
96 osds: 96 up, 96 in
2013-05-10 14:54:23.821981 7f0750a61780  1 mon.a@-1(probing).osd e6675 e6675: 
96 osds: 96 up, 96 in
2013-05-10 14:54:23.822173 7f0750a61780  1 mon.a@-1(probing).osd e6676 e6676: 
96 osds: 96 up, 96 in
2013-05-10 14:54:23.822353 7f0750a61780  1 mon.a@-1(probing).osd e6677 e6677: 
96 osds: 96 up, 96 in
2013-05-10 14:54:23.822529 7f0750a61780  1 mon.a@-1(probing).osd e6678 e6678: 
96 osds: 96 up, 96 in
2013-05-10 14:54:23.822698 7f0750a61780  1 mon.a@-1(probing).osd e6679 e6679: 
96 osds: 96 up, 96 in
2013-05-10 14:54:23.822879 7f0750a61780  1 mon.a@-1(probing).osd e6680 e6680: 
96 osds: 96 up, 96 in
2013-05-10 14:54:23.823056 7f0750a61780  1 mon.a@-1(probing).osd e6681 e6681: 
96 osds: 96 up, 96 in
2013-05-10 14:54:23.823229 7f0750a61780  1 mon.a@-1(probing).osd e6682 e6682: 
96 osds: 96 up, 96 in
2013-05-10 14:54:23.823403 7f0750a61780  1 mon.a@-1(probing).osd e6683 e6683: 
96 osds: 96 up, 96 in
2013-05-10 14:54:23.823580 7f0750a61780  1 mon.a@-1(probing).osd e6684 e6684: 
96 osds: 96 up, 96 in
2013-05-10 14:54:23.823749 7f0750a61780  1 mon.a@-1(probing).osd e6685 e6685: 
96 osds: 96 up, 96 in
2013-05-10 14:54:23.823915 7f0750a61780  1 mon.a@-1(probing).osd e6686 e6686: 
96 osds: 92 up, 96 in
2013-05-10 14:54:23.824088 7f0750a61780  1 mon.a@-1(probing).osd e6687 e6687: 
96 osds: 88 up, 96 in
2013-05-10 14:54:23.824261 7f0750a61780  1 mon.a@-1(probing).osd e6688 e6688: 
96 osds: 83 up, 96 in
2013-05-10 14:54:23.824434 7f0750a61780  1 mon.a@-1(probing).osd e6689 e6689: 
96 osds: 71 up, 96 in
2013-05-10 14:54:23.824610 7f0750a61780  1 mon.a@-1(probing).osd e6690 e6690: 
96 osds: 69 up, 96 in
2013-05-10 14:54:23.824793 7f0750a61780  1 mon.a@-1(probing).osd e6691 e6691: 
96 osds: 56 up, 96 in
2013-05-10 14:54:23.838611 7f0750a61780  0 mon.a@-1(probing).osd e6691 crush 
map has features 33816576, adjusting msgr requires
2013-05-10 14:54:23.838630 7f0750a61780  0 mon.a@-1(probing).osd e6691 crush 
map has features 33816576, adjusting msgr requires
2013-05-10 14:54:23.838634 7f0750a61780  0 mon.a@-1(probing).osd e6691 crush 
map has features 33816576, adjusting msgr requires
2013-05-10 14:54:23.838636 7f0750a61780  0 mon.a@-1(probing).osd e6691 crush 
map has features 33816576, adjusting msgr requires
2013-05-10 14:54:23.841335 7f0750a61780  0 mon.a@-1(probing) e1  my rank is now 
0 (was -1)
2013-05-10 14:54:23.842481 7f0748ff9700  0 -- 10.1.1.21:6789/0 >> 
10.1.1.33:6789/0 pipe(0x204ba00 sd=41 :0 s=1 pgs=0 cs=0 l=0).fault
2013-05-10 14:54:23.842493 7f07490fa700  0 -- 10.1.1.21:6789/0 >> 
10.1.1.22:6789/0 pipe(0x204bc80 sd=40 :0 s=1 pgs=0 cs=0 l=0).fault
2013-05-10 14:54:28.841438 7f074aaff700  1 mon.a@0(probing) e1 discarding 
message auth(proto 0 27 bytes epoch 1) v1 and sending client elsewhere
2013-05-10 14:54:28.841472 7f074aaff700  1 mon.a@0(probing) e1 discarding 
message auth(proto 0 27 bytes epoch 1) v1 and sending client elsewhere
2013-05-10 14:54:28.841483 7f074aaff700  1 mon.a@0(probing) e1 discarding 
message auth(proto 0 30 bytes epoch 1) v1 and sending client elsewhere
2013-05-10 14:54:28.841491 7f074aaff700  1 mon.a@0(probing) e1 discarding 
message auth(proto 0 27 bytes epoch 1) v1 and sending client elsewhere
2013-05-10 14:54:28.841499 7f074aaff700  1 mon.a@0(probing) e1 discarding 
message auth(proto 0 27 bytes epoch 1) v1 and sending client elsewhere
2013-05-10 14:54:28.841507 7f074aaff700  1 mon.a@0(probing) e1 discarding 
message auth(proto 0 27 bytes epoch 1) v1 and sending client elsewhere
2013-05-10 14:54:28.841515 7f074aaff700  1 mon.a@0(probing) e1 discarding 
message auth(proto 0 27 bytes epoch 1) v1 and sending client elsewhere
2013-05-10 14:54:28.841526 7f074aaff700  1 mon.a@0(probing) e1 discarding 
message auth(proto 0 27 bytes epoch 1) v1 and sending client elsewhere
2013-05-10 14:54:28.841540 7f074aaff700  1 mon.a@0(probing) e1 discarding 
message auth(proto 0 27 bytes epoch 1) v1 and sending client elsewhere
2013-05-10 14:54:28.841549 7f074aaff700  1 mon.a@0(probing) e1 discarding 
message auth(proto 0 27 bytes epoch 1) v1 and sending client elsewhere
2013-05-10 14:54:28.841556 7f074aaff700  1 mon.a@0(probing) e1 discarding 
message auth(proto 0 48 bytes epoch 1) v1 and sending client elsewhere
2013-05-10 14:54:28.841567 7f074aaff700  1 mon.a@0(probing) e1 discarding 
message auth(proto 0 27 bytes epoch 1) v1 and sending client elsewhere
2013-05-10 14:54:28.841578 7f074aaff700  1 mon.a@0(probing) e1 discarding 
message auth(proto 0 27 bytes epoch 1) v1 and sending client elsewhere
2013-05-10 14:54:28.841585 7f074aaff700  1 mon.a@0(probing) e1 discarding 
message auth(proto 0 27 bytes epoch 1) v1 and sending client elsewhere
....
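
For anyone following along, the monmap edit mentioned above generally has the 
shape sketched below. This is an illustration, not a copy of my shell history: 
the monitor names b and c and the /tmp/monmap path are assumptions, and only 
mon.a comes from the output above. The script prints the commands by default 
(DRY_RUN=1) instead of running them. The monitor should be stopped before 
extracting or injecting a map, and it is worth checking that your ceph-mon 
version supports --extract-monmap and --inject-monmap.

```shell
#!/bin/sh
# Sketch: shrink the monmap so that only one monitor remains.
# Assumptions (not from the original post): the removed monitors are
# named "b" and "c", and the map is staged at /tmp/monmap.

# With DRY_RUN=1 (the default) commands are only printed, never executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# usage: shrink_monmap KEEP REMOVE...
shrink_monmap() {
    keep="$1"
    shift
    map=/tmp/monmap   # hypothetical staging path

    # Pull the current monmap out of the surviving monitor's store.
    run ceph-mon -i "$keep" --extract-monmap "$map"

    # Drop every other monitor from the map.
    for name in "$@"; do
        run monmaptool "$map" --rm "$name"
    done

    # Show the result, then write it back into the monitor's store.
    run monmaptool --print "$map"
    run ceph-mon -i "$keep" --inject-monmap "$map"
}

shrink_monmap a b c
```

Running it with DRY_RUN=0 would execute the commands for real, so review the 
printed list first.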


Nelson Jeppesen
