I am in the process of doing exactly what you are -- this worked for me:

1. Mount the first partition of the bluestore drive that holds the missing PGs (if it's not already mounted):

> mkdir /mnt/tmp
> mount /dev/sdb1 /mnt/tmp
2. Export the PG to a suitable temporary storage location:

> ceph-objectstore-tool --data-path /mnt/tmp --pgid 1.24 --op export --file /mnt/sdd1/recover.1.24

3. Find the acting OSD:

> ceph health detail | grep incomplete
PG_DEGRADED Degraded data redundancy: 23 pgs unclean, 23 pgs incomplete
    pg 1.24 is incomplete, acting [18,13]
    pg 4.1f is incomplete, acting [11]
    ...

4. Set noout:

> ceph osd set noout

5. Find the acting OSD and log into its host -- I used 18 here:

> ceph osd find 18
{
    "osd": 18,
    "ip": "10.0.15.54:6801/9263",
    "crush_location": {
        "building": "building-dc",
        "chassis": "chassis-dc400f5-10",
        "city": "city",
        "floor": "floor-dc4",
        "host": "stor-vm4",
        "rack": "rack-dc400f5",
        "region": "cfl",
        "room": "room-dc400",
        "root": "default",
        "row": "row-dc400f"
    }
}
> ssh user@10.0.15.54

6. Copy the file to somewhere accessible by the new (acting) OSD:

> scp user@10.0.14.51:/mnt/sdd1/recover.1.24 /tmp/recover.1.24

7. Stop the OSD:

> service ceph-osd@18 stop

8. Import the file using ceph-objectstore-tool:

> ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-18 --op import --file /tmp/recover.1.24

9. Start the OSD:

> service ceph-osd@18 start

This worked for me -- I'm not sure it's the best way, and I may have taken some unnecessary steps; I also have yet to validate that the data is good. I based this partially off your original email and the guide here:
http://ceph.com/geen-categorie/incomplete-pgs-oh-my/
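A couple of follow-up commands that may help. These are standard ceph / ceph-objectstore-tool invocations, but I haven't run this exact sequence end to end, so treat it as a sketch and substitute your own pgid and OSD id. Before step 9, while the OSD is still stopped, you can confirm the PG actually landed in it:

> ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-18 --op list-pgs | grep '^1\.24$'

And after step 9, remember to unset noout and check that the PG recovers:

> ceph osd unset noout
> ceph pg 1.24 query    # state should move away from "incomplete"
> ceph -w               # watch recovery progress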
On Sat, Jul 22, 2017 at 4:46 PM, mofta7y <moft...@gmail.com> wrote:
> Hi All,
>
> I have a situation here.
>
> I have an EC pool that has a cache tier pool on it (the cache tier is
> replicated with size 2).
>
> There was an issue on the pool and the crush map got changed after
> rebooting some OSDs; in any case, I lost 4 cache tier OSDs.
>
> Those lost OSDs are not really lost -- they look fine to me, but
> bluestore gives me an exception when starting them that I can't get
> past. (I will open a separate question about that exception as well.)
>
> So now I have 14 incomplete PGs on the caching tier.
>
> I am trying to recover them using ceph-objectstore-tool.
>
> The export and import work with no issues, but the OSD fails to start
> afterwards with the same issue as the original OSD: after importing
> the PG on the acting OSD, I get the exact same exception I was getting
> while trying to start the failed OSD. Removing that import resolves
> the issue.
>
> So the question is: how can I use ceph-objectstore-tool to import into
> bluestore? I think I am missing something here.
>
> Here is the procedure and the steps I used:
>
> 1- Stop the old OSD (it cannot start anyway).
>
> 2- Use this command to extract the PG I need:
>
> ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-116 --pgid 15.371 --op export --file /tmp/recover.15.371
>
> That command works.
>
> 3- Check which OSD is the acting OSD for the PG.
>
> 4- Stop the acting OSD.
>
> 5- Delete the current folder with the same pg name.
>
> 6- Use this command:
>
> ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-78 --op import --file /tmp/recover.15.371
>
> The error I got in both cases is this bluestore error:
>
> Jul 22 16:35:20 alm9 ceph-osd[3799171]: -257> 2017-07-22 16:20:19.544195 7f7157036a40 -1 osd.116 119691 log_to_monitors {default=true}
> Jul 22 16:35:20 alm9 ceph-osd[3799171]: 0> 2017-07-22 16:35:20.142143 7f713c597700 -1 /tmp/buildd/ceph-11.2.0/src/os/bluestore/BitMapAllocator.cc: In function 'virtual int BitMapAllocator::reserve(uint64_t)' thread 7f713c597700 time 2017-07-22 16:35:20.139309
> Jul 22 16:35:20 alm9 ceph-osd[3799171]: /tmp/buildd/ceph-11.2.0/src/os/bluestore/BitMapAllocator.cc: 82: FAILED assert(!(need % m_block_size))
> Jul 22 16:35:20 alm9 ceph-osd[3799171]: ceph version 11.2.0 (f223e27eeb35991352ebc1f67423d4ebc252adb7)
> Jul 22 16:35:20 alm9 ceph-osd[3799171]: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x80) [0x562b84558380]
> Jul 22 16:35:20 alm9 ceph-osd[3799171]: 2: (BitMapAllocator::reserve(unsigned long)+0x2ab) [0x562b8437c5cb]
> Jul 22 16:35:20 alm9 ceph-osd[3799171]: 3: (BlueFS::reclaim_blocks(unsigned int, unsigned long, std::vector<AllocExtent, mempool::pool_allocator<(mempool::pool_index_t)7, AllocExtent> >*)+0x22a) [0x562b8435109a]
> Jul 22 16:35:20 alm9 ceph-osd[3799171]: 4: (BlueStore::_balance_bluefs_freespace(std::vector<bluestore_pextent_t, std::allocator<bluestore_pextent_t> >*)+0x28e) [0x562b84270dae]
> Jul 22 16:35:20 alm9 ceph-osd[3799171]: 5: (BlueStore::_kv_sync_thread()+0x164a) [0x562b84273eea]
> Jul 22 16:35:20 alm9 ceph-osd[3799171]: 6: (BlueStore::KVSyncThread::entry()+0xd) [0x562b842ad9dd]
> Jul 22 16:35:20 alm9 ceph-osd[3799171]: 7: (()+0x76ba) [0x7f71560c76ba]
> Jul 22 16:35:20 alm9 ceph-osd[3799171]: 8: (clone()+0x6d) [0x7f71547953dd]
> Jul 22 16:35:20 alm9 ceph-osd[3799171]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>
> If anyone has any idea how to restore those PGs, please point me in the right direction.
>
> By the way, restoring the folder that I deleted in step 5 manually makes the OSD go up again.
>
> Thanks
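PS: one thing I noticed in your step 5 -- rather than deleting the PG folder by hand, ceph-objectstore-tool has a remove op that does the same job, and as far as I know a bluestore OSD has no plain on-disk folder per PG anyway, so the tool is the only route there. Untested on my side, so treat this as a sketch and substitute your own data path and pgid:

    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-78 --pgid 15.371 --op remove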
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com