On Tue, Apr 8, 2014 at 3:33 AM, Diedrich Ehlerding
<diedrich.ehlerd...@ts.fujitsu.com> wrote:
> Hi,
>
>>
>> Have you increased the verbosity for the monitors, restarted them, and
>> looked at the log output?
>
> First of all: The bug is still there, and the logs do not help. But I
> seem to have found a workaround (just for myself, not generally).
>
> As for the bug:
>
> I appended "debug log=20" to ceph-deploy's generated ceph.conf, but I
> don't see much in the logs (and they do not get larger with this
> option). Here is one of the monitor logs from /var/log/ceph; the
> other ones look identical.

You would probably need to up the verbosity for the monitors, so that
it looks like this in the [global] section:

  debug mon = 20
  debug ms = 10

Then restart the mons and check the output.
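
For example, a sketch of what I mean, assuming the sysvinit script
that ceph-deploy sets up and a mon id matching the hostname; note
that "ceph tell mon.* injectargs" is not an option here, since it
needs a working quorum:

  # /etc/ceph/ceph.conf on each monitor host
  [global]
  debug mon = 20
  debug ms = 10

  # then restart the monitor and watch its log
  service ceph restart mon.hvrrzceph2
  tail -f /var/log/ceph/ceph-mon.hvrrzceph2.log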
>
> 2014-04-08 08:26:09.405714 7fd1a0a94780  0 ceph version 0.72.2
> (a913ded2ff138aefb8cb84d347d72164099cfd60), process ceph-mon, pid
> 28842
> 2014-04-08 08:26:09.851227 7f66ecd06780  0 ceph version 0.72.2
> (a913ded2ff138aefb8cb84d347d72164099cfd60), process ceph-mon, pid
> 28943
> 2014-04-08 08:26:09.933034 7f66ecd06780  0 mon.hvrrzceph2 does not
> exist in monmap, will attempt to join an existing cluster
> 2014-04-08 08:26:09.933417 7f66ecd06780  0 using public_addr
> 10.111.3.2:0/0 -> 10.111.3.2:6789/0
> 2014-04-08 08:26:09.934003 7f66ecd06780  1 mon.hvrrzceph2@-1(probing)
> e0 preinit fsid c847e327-1bc5-445f-9c7e-de0551bfde06
> 2014-04-08 08:26:09.934149 7f66ecd06780  1 mon.hvrrzceph2@-1(probing)
> e0  initial_members hvrrzceph1,hvrrzceph2,hvrrzceph3, filtering seed
> monmap
> 2014-04-08 08:26:09.937302 7f66ecd06780  0 mon.hvrrzceph2@-1(probing)
> e0  my rank is now 0 (was -1)
> 2014-04-08 08:26:09.938254 7f66e63c9700  0 -- 10.111.3.2:6789/0 >>
> 0.0.0.0:0/2 pipe(0x15fba00 sd=21 :0 s=1 pgs=0 cs=0 l=0
> c=0x15c9c60).fault
> 2014-04-08 08:26:09.938442 7f66e61c7700  0 -- 10.111.3.2:6789/0 >>
> 10.112.3.2:6789/0 pipe(0x1605280 sd=22 :0 s=1 pgs=0 cs=0 l=0
> c=0x15c99a0).fault
> 2014-04-08 08:26:09.939001 7f66ecd04700  0 -- 10.111.3.2:6789/0 >>
> 0.0.0.0:0/1 pipe(0x15fb280 sd=25 :0 s=1 pgs=0 cs=0 l=0
> c=0x15c9420).fault
> 2014-04-08 08:26:09.939120 7f66e62c8700  0 -- 10.111.3.2:6789/0 >>
> 10.112.3.1:6789/0 pipe(0x1605780 sd=24 :0 s=1 pgs=0 cs=0 l=0
> c=0x15c9b00).fault
> 2014-04-08 08:26:09.941140 7f66e60c6700  0 -- 10.111.3.2:6789/0 >>
> 10.112.3.3:6789/0 pipe(0x1605c80 sd=23 :0 s=1 pgs=0 cs=0 l=0
> c=0x15c9840).fault
> 2014-04-08 08:27:09.934720 7f66e7bcc700  0
> mon.hvrrzceph2@0(probing).data_health(0) update_stats avail 70% total
> 15365520 used 3822172 avail 10762804
> 2014-04-08 08:28:09.935036 7f66e7bcc700  0
> mon.hvrrzceph2@0(probing).data_health(0) update_stats avail 70% total
> 15365520 used 3822172 avail 10762804
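
Those ".fault" lines are the interesting part: your monitor binds to
10.111.3.2:6789 but keeps failing to reach peers at 10.112.3.x, so
either the mon addresses ended up on the wrong network or those
connections are being dropped. I would first verify plain TCP
reachability on the monitor port, e.g. (a sketch; it assumes netcat
is installed and a peer mon is listening):

  # from hvrrzceph2: can we open a TCP connection to the peers?
  nc -zv 10.112.3.1 6789
  nc -zv 10.111.3.1 6789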
>
> Since ceph-deploy complained about not getting an answer from
> ceph-create-keys:
>
> [hvrrzceph3][DEBUG ] Starting ceph-create-keys on hvrrzceph3...
> [hvrrzceph3][WARNIN] No data was received after 7 seconds,
> disconnecting...
>
> I therefore tried to create the keys manually:
>
> hvrrzceph2:~ # ceph-create-keys --id client.admin
> admin_socket: exception getting command descriptions: [Errno 2] No
> such file or directory
> INFO:ceph-create-keys:ceph-mon admin socket not ready yet.
> admin_socket: exception getting command descriptions: [Errno 2] No
> such file or directory
> INFO:ceph-create-keys:ceph-mon admin socket not ready yet.
> admin_socket: exception getting command descriptions: [Errno 2] No
> such file or directory
> INFO:ceph-create-keys:ceph-mon admin socket not ready yet.
> admin_socket: exception getting command descriptions: [Errno 2] No
> such file or directory
> INFO:ceph-create-keys:ceph-mon admin socket not ready yet.
>
> [etc.]
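
That output just means ceph-create-keys is waiting for the monitor's
admin socket; it is a symptom, not the cause. Note also that, as far
as I know, --id expects the mon id (e.g. hvrrzceph2) rather than
client.admin, so as invoked it polls a socket that will never exist.
You can query the socket directly (default path assumed; substitute
your cluster name and mon id):

  ceph --admin-daemon /var/run/ceph/ceph-mon.hvrrzceph2.asok mon_status

If that keeps reporting "probing", the mon is up but cannot find a
quorum, which again points at the network.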
>
> As for the workaround: Here is what I wanted to do: I have three
> servers, two NICs, and three IP addresses per server. The NICs are
> bonded; the bond has an IP address in one network (untagged), and
> additionally, two tagged VLANs are also on the bond. The bug occurred
> when I tried to use a dedicated cluster network (i.e. one of the
> tagged VLANs) and a dedicated public network (the other tagged VLAN).
> At that time, I had both networks defined in ceph.conf.
>
> I then tried leaving "cluster network" and "public network" out of
> ceph.conf ... and now I could create the cluster.
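
For reference, once you reintroduce the split, the settings would
look roughly like this (the subnets are my guess from the addresses
in your log; check them against your VLANs):

  [global]
  public network = 10.111.3.0/24
  cluster network = 10.112.3.0/24

Keep in mind that the monitors only use the public network; the
cluster network carries OSD replication and heartbeat traffic. The
mon addresses in the initial monmap therefore have to resolve to
public-network addresses, and judging by the 10.112.3.x peers in
your log, that may be exactly what went wrong.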
>
> So it seems to be a network problem, as you (and Brian) supposed.
> However, ssh etc. work properly on all three networks. I don't
> really understand what is going on there, but at least I can
> continue to learn.
>
> Thank you.
>
> best regards
> Diedrich
>
>
>
>
> --
> Diedrich Ehlerding, Fujitsu Technology Solutions GmbH,
> FTS CE SC PS&IS W, Hildesheimer Str 25, D-30880 Laatzen
> Fon +49 511 8489-1806, Fax -251806, Mobil +49 173 2464758
> Firmenangaben: http://de.ts.fujitsu.com/imprint.html
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
