CentOS 7.2... and I think I just figured it out. One node had directories left over from former OSDs in /var/lib/ceph/osd. When restarting other OSDs on this host, Ceph apparently added those to the crush map, too.
[root@sm-cld-mtl-013 osd]# ls -la /var/lib/ceph/osd/
total 128
drwxr-x--- 8 ceph ceph  90 Feb 24 14:44 .
drwxr-x--- 9 ceph ceph 106 Feb 24 14:44 ..
drwxr-xr-x 2 root root   6 Jul  2  2015 ceph-42
drwxr-xr-x 2 root root   6 Jul  2  2015 ceph-43
drwxr-xr-x 1 root root 278 May  4 22:21 ceph-44
drwxr-xr-x 1 root root 278 May  4 22:21 ceph-45
drwxr-xr-x 1 root root 278 May  4 22:25 ceph-67
drwxr-xr-x 1 root root 304 May  4 22:25 ceph-86

42 and 43 are on a different host, yet when 'systemctl start ceph.target' is used, the osd preflight adds them to the crush map anyway:

May 4 22:13:26 sm-cld-mtl-013 ceph-osd: starting osd.67 at :/0 osd_data /var/lib/ceph/osd/ceph-67 /var/lib/ceph/osd/ceph-67/journal
May 4 22:13:26 sm-cld-mtl-013 ceph-osd: starting osd.45 at :/0 osd_data /var/lib/ceph/osd/ceph-45 /var/lib/ceph/osd/ceph-45/journal
May 4 22:13:26 sm-cld-mtl-013 ceph-osd: WARNING: will not setuid/gid: /var/lib/ceph/osd/ceph-42 owned by 0:0 and not requested 167:167
May 4 22:13:26 sm-cld-mtl-013 ceph-osd: 2016-05-04 22:13:26.529176 7f00cca7c900 -1 #033[0;31m ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-43: (2) No such file or directory#033[0m
May 4 22:13:26 sm-cld-mtl-013 ceph-osd: 2016-05-04 22:13:26.534657 7fb55c17e900 -1 #033[0;31m ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-42: (2) No such file or directory#033[0m
May 4 22:13:26 sm-cld-mtl-013 systemd: ceph-osd@43.service: main process exited, code=exited, status=1/FAILURE
May 4 22:13:26 sm-cld-mtl-013 systemd: Unit ceph-osd@43.service entered failed state.
May 4 22:13:26 sm-cld-mtl-013 systemd: ceph-osd@43.service failed.
May 4 22:13:26 sm-cld-mtl-013 systemd: ceph-osd@42.service: main process exited, code=exited, status=1/FAILURE
May 4 22:13:26 sm-cld-mtl-013 systemd: Unit ceph-osd@42.service entered failed state.
May 4 22:13:26 sm-cld-mtl-013 systemd: ceph-osd@42.service failed.
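
In case it helps anyone hitting the same thing, here is a rough sketch of how one might scan a host for leftover OSD directories like ceph-42 and ceph-43 above. It is only an illustration, not a Ceph tool: the function name find_stale_osd_dirs is made up, and it assumes that a healthy OSD data directory contains a file named "superblock" (inferred from the "unable to open OSD superblock" errors in the log). It only reports candidates and deletes nothing.

```python
import os
import re

# Matches OSD data directory names such as "ceph-42" and captures the OSD id.
OSD_DIR_PATTERN = re.compile(r"^ceph-(\d+)$")


def find_stale_osd_dirs(base="/var/lib/ceph/osd"):
    """Return the ids of OSD directories under `base` that look like leftovers.

    Assumption: a directory with no "superblock" file is a leftover,
    since ceph-osd fails to start on it with "unable to open OSD superblock".
    """
    if not os.path.isdir(base):
        return []

    stale = []
    for name in os.listdir(base):
        match = OSD_DIR_PATTERN.match(name)
        path = os.path.join(base, name)
        if match is None or not os.path.isdir(path):
            continue  # not an OSD data directory
        if not os.path.exists(os.path.join(path, "superblock")):
            stale.append(int(match.group(1)))
    return sorted(stale)


if __name__ == "__main__":
    for osd_id in find_stale_osd_dirs():
        print("possible leftover OSD directory: ceph-%d" % osd_id)
```

Anything it lists should still be checked by hand (for example against 'ceph osd tree') before removing the directory, since an unmounted but valid OSD would look the same to this check.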

-Ben

On Tue, May 3, 2016 at 7:16 PM, Wade Holler <wade.hol...@gmail.com> wrote:

> Hi Ben,
>
> What OS+Version?
>
> Best Regards,
> Wade
>
>
> On Tue, May 3, 2016 at 2:44 PM Ben Hines <bhi...@gmail.com> wrote:
>
>> My crush map keeps putting some OSDs on the wrong node. Restarting them
>> fixes it temporarily, but they eventually hop back to the other node that
>> they aren't really on.
>>
>> Is there anything I should look for that could cause this?
>>
>> Ceph 9.2.1
>>
>> -Ben
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>