Monitors use the public network, not the cluster network; only OSDs use the
cluster network. The cluster network exists because OSDs generate a lot of
heartbeat, replication, recovery, and rebalancing traffic among themselves,
so it will see more traffic than the front-end public network. See
http://ceph.com/docs/master/rados/configuration/mon-osd-interaction/

By contrast, Ceph clients connect to monitors and OSDs, so monitors and OSDs
must both be reachable on the public network. See the diagram here:
http://ceph.com/docs/master/rados/configuration/network-config-ref/

Notice that all daemons use the public network. This is because clients
connect over the public network. Only OSDs also use the cluster network.

In your configuration, you specified the following:

[mon.controller1]
  host = controller1
  mon addr = 10.100.10.1:6789
  public addr = 10.100.0.150
  cluster addr = 10.100.10.1
  cluster network = 10.100.10.0/24
  public network = 10.100.0.0/21

The address for mon.controller1 is set to a cluster network IP address,
namely 10.100.10.1:6789. Monitors are meant to listen only on the public
network, but you have explicitly told this one to use a cluster network
address, which is why it binds there. Your monitor address should be in the
public network range (10.100.0.0/21), something like 10.100.0.155:6789.
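
As a quick sanity check, here is a minimal Python sketch that reports which
of your two configured networks a given address falls in. The network values
are copied from your ceph.conf; the classify() helper is a name I made up for
illustration and is not part of Ceph. It assumes IPv4 "ip[:port]" strings.

```python
import ipaddress

# Values copied from the ceph.conf quoted in this thread.
PUBLIC_NETWORK = ipaddress.ip_network("10.100.0.0/21")
CLUSTER_NETWORK = ipaddress.ip_network("10.100.10.0/24")


def classify(addr):
    """Return 'public', 'cluster', or 'neither' for an IPv4 'ip[:port]' string.

    Hypothetical helper for illustration only; not a Ceph API.
    """
    ip = ipaddress.ip_address(addr.split(":")[0])  # drop the optional port
    if ip in PUBLIC_NETWORK:
        return "public"
    if ip in CLUSTER_NETWORK:
        return "cluster"
    return "neither"


if __name__ == "__main__":
    # The current mon addr, then the example public address suggested above.
    for addr in ("10.100.10.1:6789", "10.100.0.155:6789"):
        print(addr, "->", classify(addr))
```

Any address you pick for mon addr should come back as "public" here.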

However, now that you have a monitor IP address, changing it can be a bit
troublesome too. See the following:

http://ceph.com/docs/master/rados/operations/add-or-rm-mons/#changing-a-monitor-s-ip-address





On Wed, Jan 15, 2014 at 1:13 PM, Jeff Bachtel <
jbach...@bericotechnologies.com> wrote:

>  If I understand correctly then, I should either not specify mon addr or
> set it to an external IP?
>
> Thanks for the clarification,
>
> Jeff
>
>
> On 01/15/2014 03:58 PM, John Wilkins wrote:
>
> Jeff,
>
>  First, if you've specified the public and cluster networks in [global],
> you don't need to specify them anywhere else. If you do, they get overridden.
> That's not the issue here. It appears from your ceph.conf file that you've
> specified an address on the cluster network. Specifically, you specified mon
> addr = 10.100.10.1:6789, but you indicated elsewhere that this IP address
> belongs to the cluster network.
>
>
> On Mon, Jan 13, 2014 at 11:29 AM, Jeff Bachtel <
> jbach...@bericotechnologies.com> wrote:
>
>> I've got a cluster with 3 mons, all of which are binding solely to a
>> cluster network IP, and neither to 0.0.0.0:6789 nor a public IP. I
>> hadn't noticed the problem until now because it makes little difference in
>> how I normally use Ceph (rbd and radosgw), but now that I'm trying to use
>> cephfs it's obviously suboptimal.
>>
>> [global]
>>   auth cluster required = cephx
>>   auth service required = cephx
>>   auth client required = cephx
>>   keyring = /etc/ceph/keyring
>>   cluster network = 10.100.10.0/24
>>   public network = 10.100.0.0/21
>>   public addr = 10.100.0.150
>>   cluster addr = 10.100.10.1
>>    fsid = de10594a-0737-4f34-a926-58dc9254f95f
>>
>> [mon]
>>   cluster network = 10.100.10.0/24
>>   public network = 10.100.0.0/21
>>   mon data = /var/lib/ceph/mon/mon.$id
>>
>> [mon.controller1]
>>   host = controller1
>>   mon addr = 10.100.10.1:6789
>>   public addr = 10.100.0.150
>>   cluster addr = 10.100.10.1
>>   cluster network = 10.100.10.0/24
>>   public network = 10.100.0.0/21
>>
>> And then with /usr/bin/ceph-mon -i controller1 --debug_ms 12 --pid-file
>> /var/run/ceph/mon.controller1.pid -c /etc/ceph/ceph.conf I get in logs
>>
>> 2014-01-13 14:19:13.578458 7f195e6d97a0  0 ceph version 0.72.2
>> (a913ded2ff138aefb8cb84d347d72164099cfd60), process ceph-mon, pid 7559
>> 2014-01-13 14:19:13.641639 7f195e6d97a0 10 -- :/0 rank.bind
>> 10.100.10.1:6789/0
>> 2014-01-13 14:19:13.641668 7f195e6d97a0 10 accepter.accepter.bind
>> 2014-01-13 14:19:13.642773 7f195e6d97a0 10 accepter.accepter.bind bound
>> to 10.100.10.1:6789/0
>> 2014-01-13 14:19:13.642800 7f195e6d97a0  1 -- 10.100.10.1:6789/0 learned
>> my addr 10.100.10.1:6789/0
>> 2014-01-13 14:19:13.642808 7f195e6d97a0  1 accepter.accepter.bind
>> my_inst.addr is 10.100.10.1:6789/0 need_addr=0
>>
>> With no mention of public addr (10.100.2.1) or public network (
>> 10.100.0.0/21) found. mds (on this host) and osd (on other hosts) bind
>> to 0.0.0.0 and a public IP, respectively.
>>
>> At this point public/cluster addr/network are WAY overspecified in
>> ceph.conf, but the problem appeared with far less specification.
>>
>> Any ideas? Thanks,
>>
>> Jeff
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>
>
>  --
> John Wilkins
> Senior Technical Writer
> Inktank
> john.wilk...@inktank.com
> (415) 425-9599
> http://inktank.com
>
>
>
>
>


-- 
John Wilkins
Senior Technical Writer
Inktank
john.wilk...@inktank.com
(415) 425-9599
http://inktank.com