Re: [ceph-users] ceph pg repair fails...?

2019-10-03 Thread Jake Grimmett
Dear All, Many thanks to Brad and Mattia for the good advice. I was away for two days; in the meantime the pg has fixed itself. I'm not complaining, but it's strange... Looking at the OSD logs, we see the previous repair fail. Then a routine scrub appears to fix the issue. The same thing happened

Re: [ceph-users] ceph pg repair fails...?

2019-10-01 Thread Brad Hubbard
On Wed, Oct 2, 2019 at 1:15 AM Mattia Belluco wrote: > > Hi Jake, > > I am curious to see if your problem is similar to ours (despite the fact > we are still on Luminous). > > Could you post the output of: > > rados list-inconsistent-obj > > and > > rados list-inconsistent-snapset Make sure

Re: [ceph-users] ceph pg repair fails...?

2019-10-01 Thread Mattia Belluco
Hi Jake, I am curious to see if your problem is similar to ours (despite the fact we are still on Luminous). Could you post the output of: rados list-inconsistent-obj and rados list-inconsistent-snapset Thanks, Mattia On 10/1/19 1:08 PM, Jake Grimmett wrote: > Dear All, > > I've just

[ceph-users] ceph pg repair fails...?

2019-10-01 Thread Jake Grimmett
Dear All, I've just found two inconsistent pgs that fail to repair. This might be the same bug as shown here: Cluster is running Nautilus 14.2.2. OS is Scientific Linux 7.6. DB/WAL on NVMe, data on 12TB HDD. Logs below can also be