Hey Ben,

Take a look at the OSD log for another OSD whose IP you did not change.

What errors does it show related to the re-IP'd OSD?

Is the other OSD trying to communicate with the re-IP'd OSD's old IP
address?
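
If it helps, here is a rough Python sketch for pulling the peer addresses
out of the heartbeat_check lines, so you can see at a glance which IP the
other OSDs are still trying to reach. The regex is based only on the log
line you pasted, and the function names are my own (hypothetical), so treat
it as a starting point rather than anything official.

```python
import re
from collections import Counter

# Pattern modelled on the heartbeat_check line quoted in this thread, e.g.
# "heartbeat_check: no reply from 10.1.1.100:6818 osd.14 ever on either ..."
# Assumption: other Ceph releases may format this line differently.
HEARTBEAT_RE = re.compile(
    r"heartbeat_check: no reply from "
    r"(?P<ip>\d+\.\d+\.\d+\.\d+):(?P<port>\d+) osd\.(?P<osd>\d+)"
)


def failed_heartbeat_peers(log_text):
    """Count heartbeat failures per (peer IP, OSD id) found in log_text."""
    counts = Counter()
    for match in HEARTBEAT_RE.finditer(log_text):
        counts[(match.group("ip"), int(match.group("osd")))] += 1
    return counts


def osds_reported_at(counts, ip):
    """Return the sorted OSD ids whose failed heartbeats targeted the given IP."""
    return sorted({osd for (peer_ip, osd) in counts if peer_ip == ip})
```

Feed it the contents of the other OSD's log (by default something like
/var/log/ceph/ceph-osd.<id>.log, assuming the stock log location) and call
osds_reported_at(counts, "10.1.1.101") with the old address of the host you
moved. A non-empty result would suggest the peer is still pinging the old IP.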

Jake


On Wed, Aug 30, 2017 at 3:55 PM Jeremy Hanmer <jeremy.han...@dreamhost.com>
wrote:

> This is simply not true. We run quite a few ceph clusters with
> rack-level layer2 domains (thus routing between racks) and everything
> works great.
>
> On Wed, Aug 30, 2017 at 10:52 AM, David Turner <drakonst...@gmail.com>
> wrote:
> > ALL OSDs need to be running the same private network at the same time.
> > ALL clients, RGW, OSD, MON, MGR, MDS, etc, etc need to be running on the same
> > public network at the same time.  You cannot do this as a one at a time
> > migration to the new IP space.  Even if all of the servers can still
> > communicate via routing, it just won't work.  Changing the public/private
> > network addresses for a cluster requires full cluster down time.
> >
> > On Wed, Aug 30, 2017 at 11:09 AM Ben Morrice <ben.morr...@epfl.ch>
> wrote:
> >>
> >> Hello
> >>
> >> We have a small cluster that we need to move to a different network in
> >> the same datacentre.
> >>
> >> My workflow was the following (for a single OSD host), but I failed
> >> (further details below)
> >>
> >> 1) ceph osd set noout
> >> 2) stop ceph-osd processes
> >> 3) change IP, gateway, domain (short hostname is the same), VLAN
> >> 4) change references of OLD IP (cluster and public network) in
> >> /etc/ceph/ceph.conf with NEW IP (see [1])
> >> 5) start a single OSD process
> >>
> >> This seems to work: the NEW IP can communicate with mon hosts and osd
> >> hosts on the OLD network, and the OSD boots and is visible via 'ceph -w'.
> >> However, after a few seconds the OSD drops with messages such as the
> >> one below in its log file:
> >>
> >> heartbeat_check: no reply from 10.1.1.100:6818 osd.14 ever on either
> >> front or back, first ping sent 2017-08-30 16:42:14.692210 (cutoff
> >> 2017-08-30 16:42:24.962245)
> >>
> >> There are logs like the above for every OSD server/process
> >>
> >> and then eventually a
> >>
> >> 2017-08-30 16:42:14.486275 7f6d2c966700  0 log_channel(cluster) log
> >> [WRN] : map e85351 wrongly marked me down
> >>
> >>
> >> Am I missing something obvious to reconfigure the network on an OSD host?
> >>
> >>
> >>
> >> [1]
> >>
> >> OLD
> >> [osd.0]
> >>     host = sn01
> >>     devs = /dev/sdi
> >>     cluster addr = 10.1.1.101
> >>     public addr = 10.1.1.101
> >> NEW
> >> [osd.0]
> >>     host = sn01
> >>     devs = /dev/sdi
> >>     cluster addr = 10.1.2.101
> >>     public addr = 10.1.2.101
> >>
> >> --
> >> Kind regards,
> >>
> >> Ben Morrice
> >>
> >> ______________________________________________________________________
> >> Ben Morrice | e: ben.morr...@epfl.ch | t: +41-21-693-9670
> >> EPFL / BBP
> >> Biotech Campus
> >> Chemin des Mines 9
> >> 1202 Geneva
> >> Switzerland
> >>
> >> _______________________________________________
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com