I was able to export the PGs using ceph-objectstore-tool and import them
into the new OSDs.
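For the archives, the commands were along these lines -- a minimal
sketch, where the PG id (1.2a), OSD ids, and file name are illustrative
rather than my exact values, and the OSD in question must be stopped
while the tool runs:

  # on the host with the old drive: export the PG
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 \
      --pgid 1.2a --op export --file /tmp/pg.1.2a.export

  # on the host with the new OSD: import it
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-20 \
      --op import --file /tmp/pg.1.2a.export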
I moved some other OSDs from bare metal on a node into a virtual machine
on the same node and was surprised at how easy it was: install Ceph in
the VM (using ceph-deploy), stop the OSD, unmount the OSD drive from the
physical machine, and mount it in the VM. The OSD was auto-detected, the
ceph-osd process started automatically, and it was up within a few
seconds. Rough sketches of those disk-move steps, of the pg query
mentioned below, and of Greg's pool-removal suggestion are at the end of
this message.

I'm now hitting a different problem, which I'll raise in a separate
message. Thanks!

On Mon, Jul 24, 2017 at 12:52 PM, Gregory Farnum <gfar...@redhat.com> wrote:
>
> On Fri, Jul 21, 2017 at 10:23 PM Daniel K <satha...@gmail.com> wrote:
>
>> Luminous 12.1.0 (RC)
>>
>> I replaced two OSD drives (the old ones were still good, just too
>> small), using:
>>
>> ceph osd out osd.12
>> ceph osd crush remove osd.12
>> ceph auth del osd.12
>> systemctl stop ceph-osd@osd.12
>> ceph osd rm osd.12
>>
>> I later found that I also should have unmounted it from
>> /var/lib/ceph/osd-12
>>
>> (remove old disk, insert new disk)
>>
>> I added the new disk/OSD with:
>>
>> ceph-deploy osd prepare stor-vm3:sdg --bluestore
>>
>> This automatically activated the OSD (I'm not sure why; I thought it
>> needed a ceph-deploy osd activate as well).
>>
>> Then, working on an unrelated issue, I upgraded one node (out of 4
>> total) to 12.1.1 using apt and rebooted.
>>
>> The mon daemon would not form a quorum with the others on 12.1.0, so
>> instead of troubleshooting that, I went ahead and upgraded the other
>> 3 nodes and rebooted.
>>
>> Lots of recovery IO went on afterwards, but now things have stopped
>> at:
>>
>>   pools:   10 pools, 6804 pgs
>>   objects: 1784k objects, 7132 GB
>>   usage:   11915 GB used, 19754 GB / 31669 GB avail
>>   pgs:     0.353% pgs not active
>>            70894/2988573 objects degraded (2.372%)
>>            422090/2988573 objects misplaced (14.123%)
>>            6626 active+clean
>>            129  active+remapped+backfill_wait
>>            23   incomplete
>>            14   active+undersized+degraded+remapped+backfill_wait
>>            4    active+undersized+degraded+remapped+backfilling
>>            4    active+remapped+backfilling
>>            2    active+clean+scrubbing+deep
>>            1    peering
>>            1    active+recovery_wait+degraded+remapped
>>
>> When I run ceph pg query on the incomplete PGs, they all list at
>> least one of the two removed OSDs (12, 17) in
>> "down_osds_we_would_probe".
>>
>> Most pools are size 2, min_size 1 (trusting bluestore to tell me
>> which copy is valid). One pool is size 1, min_size 1, and I'm okay
>> with losing it, except that I had it mounted in a directory on
>> CephFS. I rm'd the directory, but I can't delete the pool because
>> it's "in use by CephFS".
>>
>> I still have the old drives -- can I stick them into another host
>> and re-add them somehow?
>
> Yes, that'll probably be your easiest solution. You may have some
> trouble because you already deleted them, but I'm not sure.
>
> Alternatively, you ought to be able to remove the pool from CephFS
> using some of the monitor commands and then delete it.
>
>> This data isn't super important, but I'd like to learn a bit about
>> how to recover when bad things happen, as we are planning a
>> production deployment in a couple of weeks.
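For anyone who finds this thread later, here are the sketches mentioned
above. First, the bare-metal-to-VM disk move -- a minimal sketch, where
the OSD id (5), device (/dev/sdf), VM name (stor-vm3), and the libvirt
attach step are illustrative assumptions, not my exact setup:

  # on the physical host: stop the OSD and release its disk
  systemctl stop ceph-osd@5
  umount /var/lib/ceph/osd/ceph-5

  # hand the raw disk to the VM (assumes libvirt/KVM)
  virsh attach-disk stor-vm3 /dev/sdf vdb --persistent

  # inside the VM, udev/ceph-disk should detect the OSD partitions and
  # start ceph-osd on its own; if it doesn't, activate it by hand:
  ceph-disk activate /dev/vdb1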
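Second, the pg query used to find the blocking OSDs. The PG id here
(23.1f) is made up for illustration; down_osds_we_would_probe appears
under recovery_state in the JSON output:

  # dump the full peering state of one incomplete PG
  ceph pg 23.1f query

  # or pull out just the field in question
  ceph pg 23.1f query | grep -A 3 down_osds_we_would_probe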
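Finally, Greg's suggestion to detach the pool from CephFS before
deleting it. A sketch, assuming the filesystem is named cephfs and the
pool is mydata (both illustrative), and that the pool is not the
filesystem's default data pool:

  # detach the data pool from the filesystem
  ceph fs rm_data_pool cephfs mydata

  # then delete it (the mons must have mon_allow_pool_delete = true)
  ceph osd pool rm mydata mydata --yes-i-really-really-mean-it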