It's odd: the cluster seems to be partly working. I can't bring the down OSDs
back online, but the nodes that were not restarted still work.

ceph -w hangs,
ceph --admin-daemon /var/run/ceph/ceph-mon.FOO.asok mon_status hangs,
and there is nothing in /var/log/ceph/*
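
Since both commands hang rather than fail, wrapping them in a timeout at least makes it possible to probe each monitor in turn without losing the shell. This is only a minimal sketch, assuming Python 3 is available on the monitor host. The helper name run_with_timeout and the 10-second limit are my own choices, not anything Ceph provides, and FOO is the same placeholder as in the command above.

```python
import subprocess

# The admin-socket query quoted above; FOO is a placeholder for the monitor id.
MON_STATUS_CMD = ["ceph", "--admin-daemon",
                  "/var/run/ceph/ceph-mon.FOO.asok", "mon_status"]


def run_with_timeout(cmd, timeout=10.0):
    """Run cmd and return (status, output).

    status is one of:
      "ok"      - the command exited with code 0
      "failed"  - the command exited with a non-zero code
      "timeout" - the command did not finish within `timeout` seconds
      "missing" - the executable could not be found
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=timeout,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return "timeout", ""
    except FileNotFoundError:
        return "missing", ""
    status = "ok" if proc.returncode == 0 else "failed"
    return status, proc.stdout
```

Calling run_with_timeout(MON_STATUS_CMD) on each monitor host should then report "timeout" for a monitor whose admin socket is unresponsive, instead of blocking indefinitely.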


Mon03 output
==== 57+0+0 (2052948678 0 0) 0x3fa7240 con 0x42c7580
2013-06-10 15:56:17.781158 7f91279d3700 20 mon.3@2(probing) e1 have connection
2013-06-10 15:56:17.781163 7f91279d3700  5 mon.3@2(probing) e1 waitlisting 
message auth(proto 0 27 bytes epoch 1) v1
2013-06-10 15:56:17.923463 7f91279d3700 10 mon.3@2(probing) e1 ms_handle_reset 
0x51429a0 10.198.141.36:6801/8552
2013-06-10 15:56:17.962012 7f9124db5700  1 -- 10.198.141.203:6789/0 >> :/0 
pipe(0x31d3780 sd=45 :6789 s=0 pgs=0 cs=0 l=0).accept sd=45 
10.198.141.39:54206/0
2013-06-10 15:56:17.962212 7f9124db5700 10 mon.3@2(probing) e1 
ms_verify_authorizer 10.198.141.39:6801/7360 osd protocol 0
2013-06-10 15:56:17.962584 7f91279d3700  1 -- 10.198.141.203:6789/0 <== osd.73 
10.198.141.39:6801/7360 1 ==== auth(proto 0 27 bytes epoch 1) v1 ==== 57+0+0 
(2243780706 0 0) 0x3fa7480 con 0x42c79a0
2013-06-10 15:56:17.962609 7f91279d3700 20 mon.3@2(probing) e1 have connection


Mon2 output (lots of the same)
2013-06-10 15:56:56.501807 7fb506fc2700  1 mon.2@1(electing) e1 discarding 
message auth(proto 0 26 bytes epoch 1) v1 and sending client elsewhere
2013-06-10 15:56:56.501826 7fb506fc2700  1 mon.2@1(electing) e1 discarding 
message auth(proto 0 26 bytes epoch 1) v1 and sending client elsewhere
2013-06-10 15:56:56.501847 7fb506fc2700  1 mon.2@1(electing) e1 discarding 
message auth(proto 0 27 bytes epoch 1) v1 and sending client elsewhere
2013-06-10 15:56:56.501865 7fb506fc2700  1 mon.2@1(electing) e1 discarding 
message auth(proto 0 27 bytes epoch 1) v1 and sending client elsewhere
2013-06-10 15:56:56.561414 7fb506fc2700  0 log [INF] : mon.2@1 won leader 
election with quorum 1,2

Mon01 output (lots of the same)
2013-06-10 15:56:29.456421 7fb8de0b2700 10 mon.1@0(synchronizing sync( 
requester state start )) e1 ms_verify_authorizer 10.198.141.32:6800/16748 osd 
protocol 0
2013-06-10 15:56:29.585180 7fb8ddfb1700  1 -- 10.198.141.201:6789/0 >> :/0 
pipe(0x315ba00 sd=259 :6789 s=0 pgs=0 cs=0 l=0).accept sd=259 
10.198.141.35:51166/0
2013-06-10 15:56:29.585483 7fb8ddfb1700 10 mon.1@0(synchronizing sync( 
requester state start )) e1 ms_verify_authorizer 10.198.141.35:6801/9214 osd 
protocol 0
2013-06-10 15:56:29.658574 7fb8dd3a5700  1 -- 10.198.141.201:6789/0 >> :/0 
pipe(0x3198280 sd=747 :6789 s=0 pgs=0 cs=0 l=0).accept sd=747 
10.198.141.32:49135/0
2013-06-10 15:56:29.658867 7fb8dd3a5700 10 mon.1@0(synchronizing sync( 
requester state start )) e1 ms_verify_authorizer 10.198.141.32:6801/17221 osd 
protocol 0
2013-06-10 15:56:29.787631 7fb8dd0a2700  1 -- 10.198.141.201:6789/0 >> :/0 
pipe(0x3198a00 sd=361 :6789 s=0 pgs=0 cs=0 l=0).accept sd=361 
10.198.141.32:49136/0
2013-06-10 15:56:29.787893 7fb8dd0a2700 10 mon.1@0(synchronizing sync( 
requester state start )) e1 ms_verify_authorizer 10.198.141.32:6803/18346 osd 
protocol 0
2013-06-10 15:56:30.025106 7fb8e02d4700  1 -- 10.198.141.201:6789/0 >> :/0 
pipe(0x3198000 sd=556 :6789 s=0 pgs=0 cs=0 l=0).accept sd=556 
10.198.141.25:40773/0
2013-06-10 15:56:30.025391 7fb8e02d4700 10 mon.1@0(synchronizing sync( 
requester state start )) e1 ms_verify_authorizer 10.198.141.25:6801/12417 osd 
protocol 0
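
To compare the three monitors at a glance, the state that each log line reports in parentheses (probing, electing, synchronizing) can be tallied. The following is a rough sketch that assumes the log lines look like the excerpts above. The names MON_STATE and tally_states are hypothetical, and only the first word of the state is captured, so "synchronizing sync( requester state start )" is counted as "synchronizing".

```python
import re
from collections import Counter

# Matches e.g. "mon.3@2(probing)" and captures name, rank and the first
# word of the state. Lines such as "mon.2@1 won leader" have no
# parenthesised state and are skipped.
MON_STATE = re.compile(r"mon\.(\w+)@(\d+)\((\w+)")


def tally_states(lines):
    """Count log lines per (monitor name, rank, state)."""
    counts = Counter()
    for line in lines:
        match = MON_STATE.search(line)
        if match:
            name, rank, state = match.groups()
            counts[(name, int(rank), state)] += 1
    return counts
```

Feeding it a monitor log, for example tally_states(open(path)) with path pointing at that monitor's log file, gives a count per state, which makes it easier to see whether a monitor ever leaves the probing or synchronizing state.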

Nelson Jeppesen
   Disney Technology Solutions and Services
   Phone 206-588-5001
