[ceph-users] ceph inconsistent pg missing ec object

2017-10-18 Thread Stijn De Weirdt
hi all, we have a ceph 10.2.7 cluster with an 8+3 EC pool. in that pool, there is a pg in inconsistent state. we followed http://ceph.com/geen-categorie/ceph-manually-repair-object/, however, we are unable to solve our issue. from the primary osd logs, the reported pg had a missing object. we fo

Re: [ceph-users] ceph inconsistent pg missing ec object

2017-10-18 Thread Gregory Farnum
It would help if you can provide the exact output of "ceph -s", "pg query", and any other relevant data. You shouldn't need to do manual repair of erasure-coded pools, since it has checksums and can tell which bits are bad. Following that article may not have done you any good (though I wouldn't ex

Re: [ceph-users] ceph inconsistent pg missing ec object

2017-10-18 Thread Denes Dolhay
Hi All, The linked document is for filestore, which in your case is correct as I understand it, but I wonder if a similar document exists for bluestore? Thanks, Denes. On 10/18/2017 02:56 PM, Stijn De Weirdt wrote: hi all, we have a ceph 10.2.7 cluster with a 8+3 EC pool. in that pool, the

Re: [ceph-users] ceph inconsistent pg missing ec object

2017-10-19 Thread Stijn De Weirdt
hi greg, i attached the gzip output of the query and some more info below. if you need more, let me know. stijn
> [root@mds01 ~]# ceph -s
> cluster 92beef0a-1239-4000-bacf-4453ab630e47
> health HEALTH_ERR
> 1 pgs inconsistent
> 40 requests are blocked > 512 sec
>

Re: [ceph-users] ceph inconsistent pg missing ec object

2017-10-19 Thread Gregory Farnum
Okay, you're going to need to explain in very clear terms exactly what happened to your cluster, and *exactly* what operations you performed manually. The PG shards seem to have different views of the PG in question. The primary has a different log_tail, last_user_version, and last_epoch_clean fro

Re: [ceph-users] ceph inconsistent pg missing ec object

2017-10-20 Thread Stijn De Weirdt
hi gregory, we more or less followed the instructions on the site (famous last words, i know ;) grepping for the error in the osd logs of the osds of the pg, the primary logs had "5.5e3s0 shard 59(5) missing 5:c7ae919b:::10014d3184b.:head" we looked for the object using the find command,

Re: [ceph-users] ceph inconsistent pg missing ec object

2017-11-02 Thread Gregory Farnum
Okay, after consulting with a colleague this appears to be an instance of http://tracker.ceph.com/issues/21382. Assuming the object is one that doesn't have snapshots, your easiest resolution is to use rados get to retrieve the object (which, unlike recovery, should work) and then "rados put" it ba
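
The get/put round trip described in this message could be sketched roughly as below. This is only an illustration under stated assumptions: the pool name, object name and temporary file are placeholders passed in by the caller (none of them come from this thread), the `run` and `rewrite_object` helpers are hypothetical, and, as the message says, it only applies to an object without snapshots. Setting DRY_RUN=1 prints the commands instead of executing them, which is probably the safer way to start.

```shell
#!/bin/sh
# Sketch only: pool, object and temp file are supplied by the caller.
# DRY_RUN=1 prints each command instead of running it.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: read an object and write the same bytes back.
rewrite_object() {
    pool="$1"
    obj="$2"
    tmp="$3"
    # Read the object through the normal client read path.
    run rados -p "$pool" get "$obj" "$tmp" || return 1
    # Write the same data back so the object is rewritten on all shards.
    run rados -p "$pool" put "$obj" "$tmp"
}
```

A dry run would look like `DRY_RUN=1; rewrite_object <pool> <object> /tmp/obj.bin`, with the real pool and object name filled in by hand. A scrub of the pg afterwards shows whether the inconsistency is gone.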

Re: [ceph-users] ceph inconsistent pg missing ec object

2017-11-09 Thread Kenneth Waegeman
Hi Greg, Thanks! This seems to have worked for at least 1 of 2 inconsistent pgs: The inconsistency disappeared after a new scrub. Still waiting for the result of the second pg. I tried to force deep-scrub with `ceph pg deep-scrub ` yesterday, but today the last deep scrub is still from a week
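
For checking whether a requested deep scrub has actually run, something like the following sketch might help. The pg id is a caller-supplied placeholder (the id from this message is not preserved in the archive), and the `run`, `request_deep_scrub` and `show_scrub_state` helpers are hypothetical names. The deep-scrub request is only queued, so the query is meant to be run again later, not immediately afterwards.

```shell
#!/bin/sh
# Sketch only: the pg id is supplied by the caller.
# DRY_RUN=1 prints each command instead of running it.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: ask for a deep scrub of one pg.
# The request is queued; the scrub itself runs later.
request_deep_scrub() {
    run ceph pg deep-scrub "$1"
}

# Hypothetical helper: show the pg state after a scrub.
show_scrub_state() {
    # The query output contains last_deep_scrub_stamp for the pg.
    run ceph pg "$1" query || return 1
    # List the objects the most recent scrub flagged as inconsistent.
    run rados list-inconsistent-obj "$1" --format=json-pretty
}
```

If last_deep_scrub_stamp in the query output does not change over time, the scrub has not run yet.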