Hey Ben,

Take a look at the OSD log for another OSD whose IP you did not change.
What errors does it show related to the re-IP'd OSD? Is the other OSD trying to communicate with the re-IP'd OSD's old IP address?

Jake

On Wed, Aug 30, 2017 at 3:55 PM Jeremy Hanmer <jeremy.han...@dreamhost.com> wrote:

> This is simply not true. We run quite a few ceph clusters with
> rack-level layer2 domains (thus routing between racks) and everything
> works great.
>
> On Wed, Aug 30, 2017 at 10:52 AM, David Turner <drakonst...@gmail.com>
> wrote:
> > ALL OSDs need to be running the same private network at the same time.
> > ALL clients, RGW, OSD, MON, MGR, MDS, etc, etc need to be running on the
> > same public network at the same time. You cannot do this as a one at a time
> > migration to the new IP space. Even if all of the servers can still
> > communicate via routing, it just won't work. Changing the public/private
> > network addresses for a cluster requires full cluster down time.
> >
> > On Wed, Aug 30, 2017 at 11:09 AM Ben Morrice <ben.morr...@epfl.ch> wrote:
> >>
> >> Hello
> >>
> >> We have a small cluster that we need to move to a different network in
> >> the same datacentre.
> >>
> >> My workflow was the following (for a single OSD host), but I failed
> >> (further details below)
> >>
> >> 1) ceph osd set noout
> >> 2) stop ceph-osd processes
> >> 3) change IP, gateway, domain (short hostname is the same), VLAN
> >> 4) change references of OLD IP (cluster and public network) in
> >> /etc/ceph/ceph.conf with NEW IP (see [1])
> >> 5) start a single OSD process
> >>
> >> This seems to work as the NEW IP can communicate with mon hosts and osd
> >> hosts on the OLD network, the OSD is booted and is visible via 'ceph -w'
> >> however after a few seconds the OSD drops with messages such as the
> >> below in its log file
> >>
> >> heartbeat_check: no reply from 10.1.1.100:6818 osd.14 ever on either
> >> front or back, first ping sent 2017-08-30 16:42:14.692210 (cutoff
> >> 2017-08-30 16:42:24.962245)
> >>
> >> There are logs like the above for every OSD server/process
> >>
> >> and then eventually a
> >>
> >> 2017-08-30 16:42:14.486275 7f6d2c966700 0 log_channel(cluster) log
> >> [WRN] : map e85351 wrongly marked me down
> >>
> >>
> >> Am I missing something obvious to reconfigure the network on an OSD host?
> >>
> >>
> >>
> >> [1]
> >>
> >> OLD
> >> [osd.0]
> >> host = sn01
> >> devs = /dev/sdi
> >> cluster addr = 10.1.1.101
> >> public addr = 10.1.1.101
> >> NEW
> >> [osd.0]
> >> host = sn01
> >> devs = /dev/sdi
> >> cluster addr = 10.1.2.101
> >> public addr = 10.1.2.101
> >>
> >> --
> >> Kind regards,
> >>
> >> Ben Morrice
> >>
> >> ______________________________________________________________________
> >> Ben Morrice | e: ben.morr...@epfl.ch | t: +41-21-693-9670
> >> EPFL / BBP
> >> Biotech Campus
> >> Chemin des Mines 9
> >> 1202 Geneva
> >> Switzerland
> >>
> >> _______________________________________________
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
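
The log check suggested at the top of this message (see which address an unchanged OSD uses when it fails to reach the re-IP'd one) can be scripted. Below is a minimal Python sketch, not an official Ceph tool. It assumes the log lines follow the same shape as the heartbeat_check message quoted above; the function names and the regular expression are illustrative assumptions, and the sample line is the one from Ben's mail.

```python
import re
from collections import Counter

# Assumed line shape, taken from the heartbeat_check message quoted in this thread:
#   heartbeat_check: no reply from 10.1.1.100:6818 osd.14 ever on either front or back, ...
HEARTBEAT_RE = re.compile(
    r"heartbeat_check: no reply from "
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3}):(?P<port>\d+) (?P<osd>osd\.\d+)"
)


def failed_heartbeat_peers(log_lines):
    """Count 'no reply' heartbeat messages per (osd, ip) pair."""
    counts = Counter()
    for line in log_lines:
        match = HEARTBEAT_RE.search(line)
        if match:
            counts[(match.group("osd"), match.group("ip"))] += 1
    return counts


def peers_at_address(counts, ip):
    """Return the sorted OSD names whose failed heartbeats targeted the given IP."""
    return sorted(osd for (osd, addr) in counts if addr == ip)


if __name__ == "__main__":
    # Sample input: the single log line quoted in the original mail.
    sample = [
        "heartbeat_check: no reply from 10.1.1.100:6818 osd.14 ever on either "
        "front or back, first ping sent 2017-08-30 16:42:14.692210 "
        "(cutoff 2017-08-30 16:42:24.962245)",
    ]
    for (osd, ip), n in failed_heartbeat_peers(sample).items():
        print(f"{osd} at {ip}: {n} missed heartbeat message(s)")
```

To use it on a real log, pass the open log file of an OSD whose IP was not changed to `failed_heartbeat_peers()` and then call `peers_at_address()` with the old address (10.1.1.101 in the example config). If the re-IP'd OSD shows up there, the peers are still pinging the old IP.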