Re: [ceph-users] Inconsistent PG's, repair ineffective

2013-05-22 Thread David Zafman
You need to find out where the third copy is, corrupt it, and then let repair copy the data from a good copy.

$ ceph pg map 19.1b

You should see something like this:

osdmap e158 pg 19.1b (19.1b) -> up [13, 22, xx] acting [13, 22, xx]

The osd xx that is NOT 13 or 22 has the corrupted copy.
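Concretely, the manual fix amounts to setting the bad copy aside and re-running repair. A minimal sketch, assuming the filestore on-disk layout of that release, osd.7 as a hypothetical bad third replica, and a placeholder object file name (substitute your own):

$ ceph pg map 19.1b                                 # confirm which OSDs hold the PG
$ service ceph stop osd.7                           # on the host of the bad replica (hypothetical osd.7)
$ cd /var/lib/ceph/osd/ceph-7/current/19.1b_head/   # filestore PG directory; path may vary
$ mv <corrupted-object-file> ~/pg19.1b-backup/      # move the bad copy aside rather than deleting it
$ service ceph start osd.7
$ ceph pg repair 19.1b                              # repair now rewrites the object from a good copy

Whether you truncate, overwrite, or move the file aside, the point is the same: make the bad replica obviously invalid so that repair replaces it from the surviving good copies.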

Re: [ceph-users] Inconsistent PG's, repair ineffective

2013-05-21 Thread John Nielsen
I've checked: all the disks are fine and the cluster is healthy except for the inconsistent objects. How would I go about manually repairing?

On May 21, 2013, at 3:26 PM, David Zafman wrote:
>
> I can't reproduce this on v0.61-2. Could the disks for osd.13 & osd.22 be
> unwritable?
>
> In your case it looks like the 3rd replica is probably the bad one, since
> osd.13 and osd.22 are the same.

Re: [ceph-users] Inconsistent PG's, repair ineffective

2013-05-21 Thread David Zafman
I can't reproduce this on v0.61-2. Could the disks for osd.13 & osd.22 be unwritable?

In your case it looks like the 3rd replica is probably the bad one, since osd.13 and osd.22 are the same. You probably want to manually repair the 3rd replica.

David Zafman
Senior Developer
http://www.inktank.com
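One way to confirm which replica is actually the bad one is to checksum the object file on each OSD host and see which copy disagrees. A sketch, again assuming the filestore layout; the pool name "rbd", object name, and file name are placeholders:

$ ceph osd map rbd <objectname>                                     # shows which OSDs hold the object
$ md5sum /var/lib/ceph/osd/ceph-13/current/19.1b_head/<objectfile>  # run on osd.13's host
$ md5sum /var/lib/ceph/osd/ceph-22/current/19.1b_head/<objectfile>  # run on osd.22's host
$ md5sum /var/lib/ceph/osd/ceph-xx/current/19.1b_head/<objectfile>  # run on the third OSD's host

The copy whose checksum disagrees with the other two is the one to set aside before repairing.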

Re: [ceph-users] Inconsistent PG's, repair ineffective

2013-05-21 Thread John Nielsen
Cuttlefish on CentOS 6, ceph-0.61.2-0.el6.x86_64.

On May 21, 2013, at 12:13 AM, David Zafman wrote:
>
> What version of ceph are you running?
>
> David Zafman
> Senior Developer
> http://www.inktank.com
>
> On May 20, 2013, at 9:14 AM, John Nielsen wrote:
>
>> Some scrub errors showed up on our cluster last week.
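For anyone following along, the running version can be read from either the package or the binary itself; a quick check, assuming CentOS 6 as above:

$ rpm -q ceph   # installed package, e.g. ceph-0.61.2-0.el6.x86_64
$ ceph -v       # version reported by the ceph binary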

Re: [ceph-users] Inconsistent PG's, repair ineffective

2013-05-20 Thread David Zafman
What version of ceph are you running?

David Zafman
Senior Developer
http://www.inktank.com

On May 20, 2013, at 9:14 AM, John Nielsen wrote:
> Some scrub errors showed up on our cluster last week. We had some issues with
> host stability a couple weeks ago; my guess is that errors were introduced at
> that point and a recent background scrub detected them.

[ceph-users] Inconsistent PG's, repair ineffective

2013-05-20 Thread John Nielsen
Some scrub errors showed up on our cluster last week. We had some issues with host stability a couple weeks ago; my guess is that errors were introduced at that point and a recent background scrub detected them. I was able to clear most of them via "ceph pg repair", but several remain. Based on
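For context, the usual loop for clearing scrub errors looks something like this (a sketch; the pg id 19.1b is taken from later in the thread, substitute the ids your cluster reports):

$ ceph health detail | grep inconsistent   # lists PGs flagged active+clean+inconsistent
$ ceph pg repair 19.1b                     # ask the primary OSD to repair that PG
$ ceph -w                                  # watch the cluster log for the repair/scrub result

Repair copies objects from the primary's replica, which is why it can be ineffective when the primary itself holds the bad copy and manual intervention, as discussed above, becomes necessary.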