Hi,
You only have one OSD? I've seen similarly strange behavior in test pools with 
only one OSD, and my working explanation was that OSDs need peers (other OSDs 
sharing the same PGs) to behave correctly. Install a second OSD and see how it 
goes...
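
If it helps, here's a rough sketch of manually bringing up a second OSD 
(assuming osd.1 on one of your existing servers and the default data path; 
adjust ids and paths for your layout):

  # allocate a new OSD id (the command prints it, e.g. 1)
  ceph osd create

  # create and initialize the data directory for the new OSD
  mkdir -p /var/lib/ceph/osd/ceph-1
  ceph-osd -i 1 --mkfs --mkkey

  # register the new OSD's key with the monitor
  ceph auth add osd.1 osd 'allow *' mon 'allow profile osd' \
      -i /var/lib/ceph/osd/ceph-1/keyring

  # start it; with the default "osd crush update on start" behavior the
  # init script does the crush create-or-move, as in your log below
  /etc/init.d/ceph start osd.1

You'll also want an [osd.1] section with a host = line in ceph.conf so the 
sysvinit script knows which host it runs on.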
Cheers, Dan


On 21 Aug 2014, at 02:59, Bruce McFarland 
<bruce.mcfarl...@taec.toshiba.com> wrote:

I have a cluster with 1 monitor and 3 OSD servers, each running multiple OSDs. 
When I start an OSD with /etc/init.d/ceph start osd.0, I see the expected 
interaction between the OSD and the monitor (key authentication etc.), and 
finally the OSD starts.

Watching the cluster with 'ceph -w' on the monitor, I never see the INFO 
messages I expect: no message from osd.0 for the boot event, and none of the 
expected INFO messages from the osdmap and pgmap as the OSD and its placement 
groups are added to those maps. All I see is output from the last time the 
monitor booted, when it won the monitor election and reported monmap, pgmap, 
and mdsmap info.

The firewalls are disabled: SELinux is set to disabled and iptables is turned 
off. All hosts can ssh into each other without passwords, and I've verified 
traffic between hosts with tcpdump captures. Any ideas on what I'd need to add 
to ceph.conf, or what I might have overlooked, would be greatly appreciated.
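
For reference, the minimal ceph.conf I'd expect for a layout like this is 
something along these lines (the fsid is a placeholder; the mon name and 
address match the 'ceph -w' output below):

  [global]
  fsid = <your cluster fsid>
  mon initial members = ceph-mon01
  mon host = 209.243.160.84
  auth cluster required = cephx
  auth service required = cephx
  auth client required = cephx

  [osd.0]
  host = ceph0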
Thanks,
Bruce

[root@ceph0 ceph]# /etc/init.d/ceph restart osd.0
=== osd.0 ===
=== osd.0 ===
Stopping Ceph osd.0 on ceph0...kill 15676...done
=== osd.0 ===
2014-08-20 17:43:46.456592 7fa51a034700  1 -- :/0 messenger.start
2014-08-20 17:43:46.457363 7fa51a034700  1 -- :/1025971 --> 
209.243.160.84:6789/0 -- auth(proto 0 26 bytes epoch 0) v1 -- ?+0 
0x7fa51402f9e0 con 0x7fa51402f570
2014-08-20 17:43:46.458229 7fa5189f0700  1 -- 209.243.160.83:0/1025971 learned 
my addr 209.243.160.83:0/1025971
2014-08-20 17:43:46.459664 7fa5135fe700  1 -- 209.243.160.83:0/1025971 <== 
mon.0 209.243.160.84:6789/0 1 ==== mon_map v1 ==== 200+0+0 (3445960796 0 0) 
0x7fa508000ab0 con 0x7fa51402f570
2014-08-20 17:43:46.459849 7fa5135fe700  1 -- 209.243.160.83:0/1025971 <== 
mon.0 209.243.160.84:6789/0 2 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 
33+0+0 (536914167 0 0) 0x7fa508000f60 con 0x7fa51402f570
2014-08-20 17:43:46.460180 7fa5135fe700  1 -- 209.243.160.83:0/1025971 --> 
209.243.160.84:6789/0 -- auth(proto 2 32 bytes epoch 0) v1 -- ?+0 
0x7fa4fc0012d0 con 0x7fa51402f570
2014-08-20 17:43:46.461341 7fa5135fe700  1 -- 209.243.160.83:0/1025971 <== 
mon.0 209.243.160.84:6789/0 3 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 
206+0+0 (409581826 0 0) 0x7fa508000f60 con 0x7fa51402f570
2014-08-20 17:43:46.461514 7fa5135fe700  1 -- 209.243.160.83:0/1025971 --> 
209.243.160.84:6789/0 -- auth(proto 2 165 bytes epoch 0) v1 -- ?+0 
0x7fa4fc001cf0 con 0x7fa51402f570
2014-08-20 17:43:46.462824 7fa5135fe700  1 -- 209.243.160.83:0/1025971 <== 
mon.0 209.243.160.84:6789/0 4 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 
393+0+0 (2134012784 0 0) 0x7fa5080011d0 con 0x7fa51402f570
2014-08-20 17:43:46.463011 7fa5135fe700  1 -- 209.243.160.83:0/1025971 --> 
209.243.160.84:6789/0 -- mon_subscribe({monmap=0+}) v2 -- ?+0 0x7fa51402bbc0 
con 0x7fa51402f570
2014-08-20 17:43:46.463073 7fa5135fe700  1 -- 209.243.160.83:0/1025971 --> 
209.243.160.84:6789/0 -- auth(proto 2 2 bytes epoch 0) v1 -- ?+0 0x7fa4fc0025d0 
con 0x7fa51402f570
2014-08-20 17:43:46.463329 7fa51a034700  1 -- 209.243.160.83:0/1025971 --> 
209.243.160.84:6789/0 -- mon_subscribe({monmap=2+,osdmap=0}) v2 -- ?+0 
0x7fa514030490 con 0x7fa51402f570
2014-08-20 17:43:46.463363 7fa51a034700  1 -- 209.243.160.83:0/1025971 --> 
209.243.160.84:6789/0 -- mon_subscribe({monmap=2+,osdmap=0}) v2 -- ?+0 
0x7fa5140309b0 con 0x7fa51402f570
2014-08-20 17:43:46.463564 7fa5135fe700  1 -- 209.243.160.83:0/1025971 <== 
mon.0 209.243.160.84:6789/0 5 ==== mon_map v1 ==== 200+0+0 (3445960796 0 0) 
0x7fa508001100 con 0x7fa51402f570
2014-08-20 17:43:46.463639 7fa5135fe700  1 -- 209.243.160.83:0/1025971 <== 
mon.0 209.243.160.84:6789/0 6 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0 
(540052875 0 0) 0x7fa5080013e0 con 0x7fa51402f570
2014-08-20 17:43:46.463707 7fa5135fe700  1 -- 209.243.160.83:0/1025971 <== 
mon.0 209.243.160.84:6789/0 7 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 
194+0+0 (1040860857 0 0) 0x7fa5080015d0 con 0x7fa51402f570
2014-08-20 17:43:46.468877 7fa51a034700  1 -- 209.243.160.83:0/1025971 --> 
209.243.160.84:6789/0 -- mon_command({"prefix": "get_command_descriptions"} v 
0) v1 -- ?+0 0x7fa514030e20 con 0x7fa51402f570
2014-08-20 17:43:46.469862 7fa5135fe700  1 -- 209.243.160.83:0/1025971 <== 
mon.0 209.243.160.84:6789/0 8 ==== osd_map(554..554 src has 1..554) v3 ==== 
59499+0+0 (2180258623 0 0) 0x7fa50800f980 con 0x7fa51402f570
2014-08-20 17:43:46.470428 7fa5135fe700  1 -- 209.243.160.83:0/1025971 <== 
mon.0 209.243.160.84:6789/0 9 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0 
(540052875 0 0) 0x7fa50800fc40 con 0x7fa51402f570
2014-08-20 17:43:46.475021 7fa5135fe700  1 -- 209.243.160.83:0/1025971 <== 
mon.0 209.243.160.84:6789/0 10 ==== osd_map(554..554 src has 1..554) v3 ==== 
59499+0+0 (2180258623 0 0) 0x7fa508001100 con 0x7fa51402f570
2014-08-20 17:43:46.475081 7fa5135fe700  1 -- 209.243.160.83:0/1025971 <== 
mon.0 209.243.160.84:6789/0 11 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0 
(540052875 0 0) 0x7fa508001310 con 0x7fa51402f570
2014-08-20 17:43:46.477559 7fa5135fe700  1 -- 209.243.160.83:0/1025971 <== 
mon.0 209.243.160.84:6789/0 12 ==== mon_command_ack([{"prefix": 
"get_command_descriptions"}]=0  v0) v1 ==== 72+0+29681 (1092875540 0 
3117897362) 0x7fa5080012b0 con 0x7fa51402f570
2014-08-20 17:43:46.592859 7fa51a034700  1 -- 209.243.160.83:0/1025971 --> 
209.243.160.84:6789/0 -- mon_command({"prefix": "osd crush create-or-move", 
"args": ["host=ceph0", "root=default"], "id": 0, "weight": 3.6400000000000001} 
v 0) v1 -- ?+0 0x7fa514030e20 con 0x7fa51402f570
2014-08-20 17:43:46.594426 7fa5135fe700  1 -- 209.243.160.83:0/1025971 <== 
mon.0 209.243.160.84:6789/0 13 ==== mon_command_ack([{"prefix": "osd crush 
create-or-move", "args": ["host=ceph0", "root=default"], "id": 0, "weight": 
3.6400000000000001}]=0 create-or-move updated item name 'osd.0' weight 3.64 at 
location {host=ceph0,root=default} to crush map v554) v1 ==== 254+0+0 
(748268703 0 0) 0x7fa508001100 con 0x7fa51402f570
create-or-move updated item name 'osd.0' weight 3.64 at location 
{host=ceph0,root=default} to crush map
2014-08-20 17:43:46.602415 7fa51a034700  1 -- 209.243.160.83:0/1025971 
mark_down 0x7fa51402f570 -- 0x7fa51402f300
2014-08-20 17:43:46.602500 7fa51a034700  1 -- 209.243.160.83:0/1025971 
mark_down_all
2014-08-20 17:43:46.602666 7fa51a034700  1 -- 209.243.160.83:0/1025971 shutdown 
complete.
Starting Ceph osd.0 on ceph0...
starting osd.0 at :/0 osd_data /var/lib/ceph/osd/ceph-0 
/var/lib/ceph/osd/ceph-0/journal
[root@ceph0 ceph]#
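
A quick way to confirm whether osd.0 actually registered after this restart is 
to run the standard ceph CLI checks on the monitor; osd.0 should show up as 
up/in if all is well:

  ceph osd tree                 # CRUSH hierarchy with each OSD's up/down, in/out state
  ceph osd dump | grep osd.0    # osd.0's state and addresses in the osdmap
  ceph -s                       # overall status, including "X osds: Y up, Z in"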


ceph -w output from ceph-mon01:
2014-08-20 17:20:24.648538 7f326ebfd700  0 monclient: hunting for new mon
2014-08-20 17:20:24.648857 7f327455f700  0 -- 209.243.160.84:0/1005462 >> 
209.243.160.84:6789/0 pipe(0x7f3264020300 sd=3 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f3264020570).fault
2014-08-20 17:20:26.077687 mon.0 [INF] mon.ceph-mon01@0 won leader election 
with quorum 0
2014-08-20 17:20:26.077810 mon.0 [INF] monmap e1: 1 mons at 
{ceph-mon01=209.243.160.84:6789/0}
2014-08-20 17:20:26.077931 mon.0 [INF] pgmap v555: 192 pgs: 192 creating; 0 
bytes data, 0 kB used, 0 kB / 0 kB avail
2014-08-20 17:20:26.078032 mon.0 [INF] mdsmap e1: 0/0/1 up
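
Note the pgmap line above: "192 pgs: 192 creating" with 0 kB used and 0 kB 
available is what you'd expect to see when no OSDs have ever reported in. One 
quick check from the monitor:

  ceph osd stat    # prints the osdmap epoch and "X osds: Y up, Z in";
                   # Y and Z stay at 0 if the OSDs never register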

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
