Re: [ceph-users] Significant uptick in inconsistent pgs in Jewel 10.2.9

2017-09-08 Thread David Zafman
Robin, would you generate the values and keys for the various versions of at least one of the objects? .dir.default.292886573.13181.12 is a good example because there are 3 variations of the same object. If there isn't much activity to .dir.default.64449186.344176, you could do one osd at a time…
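For a bucket-index object like these, one way to capture the per-replica keys and values is to read the omap directly on each OSD with ceph-objectstore-tool while that OSD is stopped. A minimal sketch; the OSD id, data path, and PG id below are illustrative placeholders, not taken from the thread:

  # Run on the host of each OSD in the acting set, one at a time, with
  # that OSD stopped. osd.1322, its data path, and pg 5.f1c0 are placeholders;
  # add --journal-path if the filestore journal is not in the default location.
  $ sudo ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1322 \
      --pgid 5.f1c0 '.dir.default.292886573.13181.12' list-omap
  $ sudo ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1322 \
      --pgid 5.f1c0 '.dir.default.292886573.13181.12' get-omap <key>

Comparing the dumps across the three OSDs should show exactly which omap keys or values differ between the variations.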

Re: [ceph-users] Significant uptick in inconsistent pgs in Jewel 10.2.9

2017-09-08 Thread David Zafman
Robin, the only two changesets I can spot in Jewel that I think might be related are these:
1. http://tracker.ceph.com/issues/20089 (https://github.com/ceph/ceph/pull/15416): this should improve the repair functionality.
2. http://tracker.ceph.com/issues/19404 (https://github.com/ceph/ceph/pul…)
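If the improved repair path from issue 20089 applies here, the usual sequence is to re-run the deep scrub and then ask the primary to repair the PG. A minimal sketch; 5.f1c0 stands in for any of the affected PGs:

  $ sudo ceph pg deep-scrub 5.f1c0   # re-verify the inconsistency
  $ sudo ceph pg repair 5.f1c0       # trigger repair on the primary
  $ sudo ceph -w                     # watch for the scrub/repair result

Note that repair on replicated pools generally favors the primary's copy, so it is worth confirming the primary holds the correct data before repairing.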

Re: [ceph-users] Significant uptick in inconsistent pgs in Jewel 10.2.9

2017-09-08 Thread Robin H. Johnson
On Thu, Sep 07, 2017 at 08:24:04PM +0000, Robin H. Johnson wrote:
> pg 5.3d40 is active+clean+inconsistent, acting [1322,990,655]
> pg 5.f1c0 is active+clean+inconsistent, acting [631,1327,91]
Here is the output of 'rados list-inconsistent-obj' for the PGs:
$ sudo rados list-inconsistent-obj 5.f1c0…
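For readers following along: the JSON this command emits can be pretty-printed, and with jq the per-object errors are easy to summarize. A sketch assuming jq is available; the field names follow the Jewel-era output format:

  $ sudo rados list-inconsistent-obj 5.f1c0 --format=json-pretty
  # Reduce the output to object names and their error flags:
  $ sudo rados list-inconsistent-obj 5.f1c0 --format=json | \
      jq '.inconsistents[] | {object: .object.name, errors: .errors}'

Typical error flags include omap_digest_mismatch and data_digest_mismatch, which indicate whether the omap (bucket index) or the object data diverged.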

[ceph-users] Significant uptick in inconsistent pgs in Jewel 10.2.9

2017-09-07 Thread Robin H. Johnson
Hi, our clusters were upgraded to v10.2.9 from ~v10.2.7 (actually a local git snapshot that was not quite 10.2.7), and since then we're seeing a LOT more scrub errors than previously. The digest logging on the scrub errors, in some cases, is also now maddeningly short: it doesn't contain ANY in…
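To enumerate what the cluster currently flags, the standard starting points are ceph health detail and rados list-inconsistent-pg, plus the primary OSD's log for the scrub digest lines. A sketch; the pool name and OSD id are placeholders:

  $ sudo ceph health detail | grep -i inconsistent
  $ sudo rados list-inconsistent-pg <pool-name>
  # The scrub error digests are logged by the PG's primary OSD:
  $ sudo grep -i scrub /var/log/ceph/ceph-osd.1322.log | grep -i error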