Thanks Eugen! I was looking into running all the commands manually, following the docs for adding/removing an OSD, but tried ceph-disk first.
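
For the record, the manual route would have been roughly the remove/re-add sequence from the docs, something like the following with osd.21 as the example ID (the crush and auth removal steps are exactly what trigger the extra rebalance I wanted to avoid):

# ceph osd out 21
# systemctl stop ceph-osd@21    (on the OSD host)
# ceph osd crush remove osd.21
# ceph auth del osd.21
# ceph osd rm 21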
I actually made it work by changing the id part in ceph-disk (it was checking the wrong journal device, which was owned by root:root). The next issue was that I had tried to re-use the old journal, so I had to create a new one (parted/sgdisk to set the ceph journal parttype). Could I have just zapped the previous journal instead?

After that it prepared successfully and started peering. Unsetting nobackfill let it recover a 4 TB HDD in approx. 9 hours. The best part was that by reusing the OSD UUID I didn't have to backfill twice.

I'll see if I can add this to the docs after we have updated to Luminous or Mimic and started using ceph-volume.
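
For anyone hitting the same thing: something like the following recreates the journal partition with the right parttype (partition number and size are just examples from my layout, and it's worth double-checking the journal typecode GUID against the ceph-disk version in use). Ownership has to end up ceph:ceph rather than root:root, which the ceph udev rules normally handle once the typecode is correct:

# sgdisk --new=8:0:+10G --change-name=8:'ceph journal' --typecode=8:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sda
# partprobe /dev/sda
# chown ceph:ceph /dev/sda8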
Kind Regards,
David Majchrzak

On Aug 3 2018, at 4:16 PM, Eugen Block <ebl...@nde.ag> wrote:
>
> Hi,
>
> we have a full bluestore cluster and had to deal with read errors on
> the SSD for the block.db. Something like this helped us to recreate a
> pre-existing OSD without rebalancing, just refilling the PGs. I would
> zap the journal device and let it recreate. It's very similar to your
> ceph-deploy output, but maybe you get more of it if you run it manually:
>
> ceph-osd [--cluster-uuid <CLUSTER_UUID>] [--osd-objectstore filestore]
> --mkfs -i <OSD_ID> --osd-journal <PATH_TO_SSD> --osd-data
> /var/lib/ceph/osd/ceph-<OSD_ID>/ --mkjournal --setuser ceph --setgroup
> ceph --osd-uuid <OSD_UUID>
>
> Maybe after zapping the journal this will work. At least it would rule
> out the old journal as the show-stopper.
>
> Regards,
> Eugen
>
> Zitat von David Majchrzak <da...@oderland.se>:
> > Hi!
> >
> > Trying to replace an OSD on a Jewel cluster (filestore data on HDD +
> > journal device on SSD).
> > I've set noout and removed the flapping drive (read errors) and
> > replaced it with a new one.
> >
> > I've taken down the osd UUID to be able to prepare the new disk with
> > the same osd.ID. The journal device is the same as the previous one
> > (should I delete the partition and recreate it?)
> >
> > However, running ceph-disk prepare returns:
> >
> > # ceph-disk -v prepare --cluster-uuid c51a2683-55dc-4634-9d9d-f0fec9a6f389 --osd-uuid dc49691a-2950-4028-91ea-742ffc9ed63f --journal-dev --data-dev --fs-type xfs /dev/sdo /dev/sda8
> > command: Running command: /usr/bin/ceph-osd --check-allows-journal -i 0 --log-file $run_dir/$cluster-osd-check.log --cluster ceph --setuser ceph --setgroup ceph
> > command: Running command: /usr/bin/ceph-osd --check-wants-journal -i 0 --log-file $run_dir/$cluster-osd-check.log --cluster ceph --setuser ceph --setgroup ceph
> > command: Running command: /usr/bin/ceph-osd --check-needs-journal -i 0 --log-file $run_dir/$cluster-osd-check.log --cluster ceph --setuser ceph --setgroup ceph
> > Traceback (most recent call last):
> >   File "/usr/sbin/ceph-disk", line 9, in <module>
> >     load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
> >   File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 5371, in run
> >     main(sys.argv[1:])
> >   File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 5322, in main
> >     args.func(args)
> >   File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 1900, in main
> >     Prepare.factory(args).prepare()
> >   File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 1896, in factory
> >     return PrepareFilestore(args)
> >   File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 1909, in __init__
> >     self.journal = PrepareJournal(args)
> >   File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 2221, in __init__
> >     raise Error('journal specified but not allowed by osd backend')
> > ceph_disk.main.Error: Error: journal specified but not allowed by osd backend
> >
> > I tried googling first of course. It COULD be that we have set
> > setuser_match_path globally in ceph.conf (like this bug report:
> > https://tracker.ceph.com/issues/19642) since the cluster was created
> > as dumpling a long time ago.
> > Best practice to fix it? Create [osd.X] configs and set
> > setuser_match_path in there instead for the old OSDs?
> > Should I do any other steps preceding this if I want to use the same
> > osd UUID? I've only stopped ceph-osd@21, removed the physical disk,
> > inserted new one and tried running prepare.
> >
> > Kind Regards,
> > David
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
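
As a side note on the setuser_match_path question quoted above: the per-OSD override I had in mind would look something like this in ceph.conf (just a sketch, not tested here), kept only for legacy OSDs whose data directories are still owned by root, so that newly prepared OSDs drop to ceph:ceph as usual:

[osd.21]
setuser_match_path = /var/lib/ceph/osd/$cluster-$id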
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com