Over the weekend, two inconsistent PGs popped up in my cluster. This comes 
after having scrubs disabled for close to 6 weeks during a very long rebalance 
(adding 33% more OSDs, an OSD failing, increasing PG counts, and so on).

It appears we came out the other end with 2 inconsistent PGs, and I'm trying to 
resolve them without much luck so far.
For reference: Ubuntu 16.04, Jewel 10.2.5, 3x replicated pool.

> $ ceph health detail
> HEALTH_ERR 2 pgs inconsistent; 3 scrub errors; 
> noout,sortbitwise,require_jewel_osds flag(s) set
> pg 10.7bd is active+clean+inconsistent, acting [8,23,17]
> pg 10.2d8 is active+clean+inconsistent, acting [18,17,22]
> 3 scrub errors

> $ rados list-inconsistent-pg objects
> ["10.2d8","10.7bd"]

Pretty straightforward: 2 PGs with inconsistent copies. Let's dig deeper.

> $ rados list-inconsistent-obj 10.2d8 --format=json-pretty
> {
>     "epoch": 21094,
>     "inconsistents": [
>         {
>             "object": {
>                 "name": "object.name",
>                 "nspace": "namespace.name",
>                 "locator": "",
>                 "snap": "head"
>             },
>             "errors": [],
>             "shards": [
>                 {
>                     "osd": 17,
>                     "size": 15913,
>                     "omap_digest": "0xffffffff",
>                     "data_digest": "0xa6798e03",
>                     "errors": []
>                 },
>                 {
>                     "osd": 18,
>                     "size": 15913,
>                     "omap_digest": "0xffffffff",
>                     "data_digest": "0xa6798e03",
>                     "errors": []
>                 },
>                 {
>                     "osd": 22,
>                     "size": 15913,
>                     "omap_digest": "0xffffffff",
>                     "data_digest": "0xa6798e03",
>                     "errors": [
>                         "data_digest_mismatch_oi"
>                     ]
>                 }
>             ]
>         }
>     ]
> }

> $ rados list-inconsistent-obj 10.7bd --format=json-pretty
> {
>     "epoch": 21070,
>     "inconsistents": [
>         {
>             "object": {
>                 "name": "object2.name",
>                 "nspace": "namespace.name",
>                 "locator": "",
>                 "snap": "head"
>             },
>             "errors": [
>                 "read_error"
>             ],
>             "shards": [
>                 {
>                     "osd": 8,
>                     "size": 27691,
>                     "omap_digest": "0xffffffff",
>                     "data_digest": "0x9ce36903",
>                     "errors": []
>                 },
>                 {
>                     "osd": 17,
>                     "size": 27691,
>                     "omap_digest": "0xffffffff",
>                     "data_digest": "0x9ce36903",
>                     "errors": []
>                 },
>                 {
>                     "osd": 23,
>                     "size": 27691,
>                     "errors": [
>                         "read_error"
>                     ]
>                 }
>             ]
>         }
>     ]
> }
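
In case it helps anyone reproduce this, the shards reporting errors can be 
pulled out of that JSON with something like the following (assuming jq is 
installed; this is just a convenience one-liner, not part of any official 
tooling):

$ rados list-inconsistent-obj 10.2d8 --format=json | \
    jq '.inconsistents[] | {object: .object.name,
                            errors: .errors,
                            bad_osds: [.shards[] | select((.errors | length) > 0) | .osd]}'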


So we have one PG (10.7bd) with a read error on osd.23, which is known and 
already scheduled for replacement.
We also have a data digest mismatch on PG 10.2d8 on osd.22, which I have been 
attempting to repair without any tangible results.

> $ ceph pg repair 10.2d8
> instructing pg 10.2d8 on osd.18 to repair

I've run the ceph pg repair command multiple times, and each time it instructs 
osd.18 to repair the PG.
Is it safe to assume that osd.18 is the primary of the acting set, and that it's 
being told to backfill the known-good copy over the agreed-upon wrong version 
on osd.22?
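
If I understand it right, repair is always dispatched to the primary of the 
acting set, which would explain why it always names osd.18. A quick way to 
double-check which OSD is primary for the PG:

# The first OSD listed in the up/acting sets is the primary,
# and "ceph pg repair" is sent to it.
$ ceph pg map 10.2d8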

> $ zgrep 'ERR' /var/log/ceph/*
> /var/log/ceph/ceph-osd.18.log.7.gz:2017-02-23 20:45:21.561164 7fc8dfeb8700 -1 
> log_channel(cluster) log [ERR] : 10.2d8 recorded data digest 0x7fa9879c != on 
> disk 0xa6798e03 on 10:1b42251f:{object.name}:head
> /var/log/ceph/ceph-osd.18.log.7.gz:2017-02-23 20:45:21.561225 7fc8dfeb8700 -1 
> log_channel(cluster) log [ERR] : deep-scrub 10.2d8 
> 10:1b42251f:{object.name}:head on disk size (15913) does not match object 
> info size (10280) adjusted for ondisk to (10280)
> /var/log/ceph/ceph-osd.18.log.7.gz:2017-02-23 21:05:59.935815 7fc8dfeb8700 -1 
> log_channel(cluster) log [ERR] : 10.2d8 deep-scrub 2 errors
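
The part that confuses me is the object-info size (10280) disagreeing with the 
on-disk size (15913). I assume the cluster's recorded view of the object can be 
checked with rados stat, something like the following (pool, namespace, and 
object names here are the same redacted placeholders as in the output above):

# "objects" is the pool name used with list-inconsistent-pg above;
# stat should report the size the cluster has on record for the object.
$ rados -p objects --namespace=namespace.name stat object.name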


> $ ceph pg 10.2d8 query
> {
>     "state": "active+clean+inconsistent",
>     "snap_trimq": "[]",
>     "epoch": 21746,
>     "up": [
>         18,
>         17,
>         22
>     ],
>     "acting": [
>         18,
>         17,
>         22
>     ],
>     "actingbackfill": [
>         "17",
>         "18",
>         "22"
>     ],

However, no recovery I/O ever occurs, and the PG never goes back to 
active+clean. I'm not seeing anything exciting in the OSD or mon logs either.

I've found a few articles and mailing list entries that mention downing the 
OSD, flushing the journal, moving the object off the disk, starting the OSD, 
and running the repair command again.
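
My reading of those posts is that the procedure for the bad shard (osd.22, the 
one flagged with data_digest_mismatch_oi) would go roughly like this; this is 
only a sketch of what I think they're describing, not something I've run yet:

$ ceph osd set noout                  # already set in my case
$ systemctl stop ceph-osd@22
$ ceph-osd -i 22 --flush-journal
# move the suspect copy aside (destination is just an arbitrary safe spot)
$ mv /var/lib/ceph/osd/ceph-22/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name} /root/
$ systemctl start ceph-osd@22
$ ceph pg repair 10.2d8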

However, after finding the object on disk and eyeballing the size and md5sum, 
all three copies appear to be identical.
> $ ls -la 
> /var/lib/ceph/osd/ceph-*/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
> -rw-r--r-- 1 ceph ceph 15913 Jan 27 02:31 
> /var/lib/ceph/osd/ceph-17/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
> -rw-r--r-- 1 ceph ceph 15913 Jan 27 02:31 
> /var/lib/ceph/osd/ceph-18/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
> -rw-r--r-- 1 ceph ceph 15913 Jan 27 02:31 
> /var/lib/ceph/osd/ceph-22/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}

> $ md5sum 
> /var/lib/ceph/osd/ceph-*/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
> 55a76349b758d68945e5028784c59f24  
> /var/lib/ceph/osd/ceph-17/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
> 55a76349b758d68945e5028784c59f24  
> /var/lib/ceph/osd/ceph-18/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}
> 55a76349b758d68945e5028784c59f24  
> /var/lib/ceph/osd/ceph-22/current/10.2d8_head/DIR_8/DIR_D/DIR_2/DIR_4/DIR_4/DIR_A/{object.name}

Should I schedule another scrub? Or should I do the whole down-the-OSD, 
flush-the-journal, move-the-object song and dance?
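
If another scrub is the way to go, I assume it's just a matter of kicking off a 
deep scrub on the PG and watching the primary's log again:

$ ceph pg deep-scrub 10.2d8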

Hoping the user list will provide some insight into the proper steps to move 
forward with. And I'm assuming the other inconsistent PG (10.7bd) will fix 
itself once the failing osd.23 is replaced.

Thanks,

Reed