Re: [ceph-users] Repair inconsistent pgs..

Voloshanenko Igor Thu, 20 Aug 2015 15:43:23 -0700

Sam, I tried to work out which rbd images contain these chunks, but no luck.
No rbd image has a block name prefix that starts with these...
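
One way to do this lookup is to compare the rbd_data prefix of each object
named in the scrub log with the block_name_prefix that `rbd info` reports for
each image. The following is only a minimal sketch under assumptions: the
image names are hypothetical, the image-to-prefix table is assumed to have
been collected by hand from `rbd info` output, and the helper names
(`block_prefix`, `owning_images`) are made up for illustration.

```python
import re

# Scrub-log object names look like <hash>/<prefix>.<object number>/<snap>//<pool>.
OBJECT_RE = re.compile(r"^[0-9a-f]+/(rbd_data\.[0-9a-f]+)\.[0-9a-f]+/[^/]+//\d+$")


def block_prefix(object_name):
    """Return the rbd_data.<id> prefix of a scrub-log object name, or None."""
    match = OBJECT_RE.match(object_name)
    return match.group(1) if match else None


def owning_images(object_names, image_prefixes):
    """Map each object name to the image whose block_name_prefix matches.

    image_prefixes is a dict of image name -> block_name_prefix (assumed to be
    copied from `rbd info`). Objects with no matching image map to None.
    """
    by_prefix = {prefix: image for image, prefix in image_prefixes.items()}
    return {name: by_prefix.get(block_prefix(name)) for name in object_names}


# Hypothetical image names; the prefixes and object names come from the thread.
images = {
    "vm-disk-a": "rbd_data.eb5f22eb141f2",
    "vm-disk-b": "rbd_data.16796a3d1b58ba",
}
objects = [
    "3fac9490/rbd_data.eb5f22eb141f2.00000000000004ba/snapdir//2",
    "22ca30c4/rbd_data.e846e25a70bf7.0000000000000307/snapdir//2",
]
result = owning_images(objects, images)
for name, image in result.items():
    print(name, "->", image if image else "no matching image")
```

An object that maps to None here has a prefix that none of the listed images
reports, which is the situation described above.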


> Actually, now that I think about it, you probably didn't remove the
> images for 3fac9490/rbd_data.eb5f22eb141f2.00000000000004ba/snapdir//2
> and 22ca30c4/rbd_data.e846e25a70bf7.0000000000000307/snapdir//2




2015-08-21 1:36 GMT+03:00 Samuel Just <sj...@redhat.com>:

> Actually, now that I think about it, you probably didn't remove the
> images for 3fac9490/rbd_data.eb5f22eb141f2.00000000000004ba/snapdir//2
> and 22ca30c4/rbd_data.e846e25a70bf7.0000000000000307/snapdir//2, but
> other images (that's why the scrub errors went down briefly, those
> objects -- which were fine -- went away).  You might want to export
> and reimport those two images into new images, but leave the old ones
> alone until you can clean up the on disk state (image and snapshots)
> and clear the scrub errors.  You probably don't want to read the
> snapshots for those images either.  Everything else is, I think,
> harmless.
>
> The ceph-objectstore-tool feature would probably not be too hard,
> actually.  Each head/snapdir image has two attrs (possibly stored in
> leveldb -- that's why you want to modify the ceph-objectstore-tool and
> use its interfaces rather than mucking about with the files directly)
> '_' and 'snapset' which contain encoded representations of
> object_info_t and SnapSet (both can be found in src/osd/osd_types.h).
> SnapSet has a set of clones and related metadata -- you want to read
> the SnapSet attr off disk and commit a transaction writing out a new
> version with that clone removed.  I'd start by cloning the repo,
> starting a vstart cluster locally, and reproducing the issue.  Next,
> get familiar with using ceph-objectstore-tool on the osds in that
> vstart cluster.  A good first change would be creating a
> ceph-objectstore-tool op that lets you dump json for the object_info_t
> and SnapSet (both types have format() methods which make that easy) on
> an object to stdout so you can confirm what's actually there.  oftc
> #ceph-devel or the ceph-devel mailing list would be the right place to
> ask questions.
>
> Otherwise, it'll probably get done in the next few weeks.
> -Sam
>
> On Thu, Aug 20, 2015 at 3:10 PM, Voloshanenko Igor
> <igor.voloshane...@gmail.com> wrote:
> > Thank you, Sam!
> > I also noticed these linked errors during scrub...
> >
> > Now it all looks reasonable!
> >
> > So we will wait for the bug to be closed.
> >
> > Do you need any help with it?
> >
> > I mean I can help with coding/testing/etc...
> >
> > 2015-08-21 0:52 GMT+03:00 Samuel Just <sj...@redhat.com>:
> >>
> >> Ah, this is kind of silly.  I think you don't have 37 errors, but 2
> >> errors.  pg 2.490 object
> >> 3fac9490/rbd_data.eb5f22eb141f2.00000000000004ba/snapdir//2 is missing
> >> snap 141.  If you look at the objects after that in the log:
> >>
> >> 2015-08-20 20:15:44.865670 osd.19 10.12.2.6:6838/1861727 298 : cluster
> >> [ERR] repair 2.490
> >> 68c89490/rbd_data.16796a3d1b58ba.0000000000000047/head//2 expected
> >> clone 2d7b9490/rbd_data.18f92c3d1b58ba.0000000000006167/141//2
> >> 2015-08-20 20:15:44.865817 osd.19 10.12.2.6:6838/1861727 299 : cluster
> >> [ERR] repair 2.490
> >> ded49490/rbd_data.11a25c7934d3d4.0000000000008a8a/head//2 expected
> >> clone 68c89490/rbd_data.16796a3d1b58ba.0000000000000047/141//2
> >>
> >> The clone from the second line matches the head object from the
> >> previous line, and they have the same clone id.  I *think* that the
> >> first error is real, and the subsequent ones are just scrub being
> >> dumb.  Same deal with pg 2.c4.  I just opened
> >> http://tracker.ceph.com/issues/12738.
> >>
> >> The original problem is that
> >> 3fac9490/rbd_data.eb5f22eb141f2.00000000000004ba/snapdir//2 and
> >> 22ca30c4/rbd_data.e846e25a70bf7.0000000000000307/snapdir//2 are both
> >> missing a clone.  Not sure how that happened, my money is on a
> >> cache/tiering evict racing with a snap trim.  If you have any logging
> >> or relevant information from when that happened, you should open a
> >> bug.  The 'snapdir' in the two object names indicates that the head
> >> object has actually been deleted (which makes sense if you moved the
> >> image to a new image and deleted the old one) and is only being kept
> >> around since there are live snapshots.  I suggest you leave the
> >> snapshots for those images alone for the time being -- removing them
> >> might cause the osd to crash trying to clean up the weird on disk
> >> state.  Other than the leaked space from those two image snapshots and
> >> the annoying spurious scrub errors, I think no actual corruption is
> >> going on though.  I created a tracker ticket for a feature that would
> >> let ceph-objectstore-tool remove the spurious clone from the
> >> head/snapdir metadata.
> >>
> >> Am I right that you haven't actually seen any osd crashes or user
> >> visible corruption (except possibly on snapshots of those two images)?
> >> -Sam
> >>
> >> On Thu, Aug 20, 2015 at 10:07 AM, Voloshanenko Igor
> >> <igor.voloshane...@gmail.com> wrote:
> >> > Inktank:
> >> >
> >> > https://download.inktank.com/docs/ICE%201.2%20-%20Cache%20and%20Erasure%20Coding%20FAQ.pdf
> >> >
> >> > Mail-list:
> >> > https://www.mail-archive.com/ceph-users@lists.ceph.com/msg18338.html
> >> >
> >> > 2015-08-20 20:06 GMT+03:00 Samuel Just <sj...@redhat.com>:
> >> >>
> >> >> Which docs?
> >> >> -Sam
> >> >>
> >> >> On Thu, Aug 20, 2015 at 9:57 AM, Voloshanenko Igor
> >> >> <igor.voloshane...@gmail.com> wrote:
> >> >> > Not yet. I will create one.
> >> >> > But according to the mailing lists and Inktank docs, it's expected
> >> >> > behaviour when the cache is enabled.
> >> >> >
> >> >> > 2015-08-20 19:56 GMT+03:00 Samuel Just <sj...@redhat.com>:
> >> >> >>
> >> >> >> Is there a bug for this in the tracker?
> >> >> >> -Sam
> >> >> >>
> >> >> >> On Thu, Aug 20, 2015 at 9:54 AM, Voloshanenko Igor
> >> >> >> <igor.voloshane...@gmail.com> wrote:
> >> >> >> > The issue is that in forward mode fstrim doesn't work properly, and
> >> >> >> > when we take a snapshot the data is not properly updated in the
> >> >> >> > cache layer, so the client (ceph) sees a damaged snap, because the
> >> >> >> > headers are requested from the cache layer.
> >> >> >> >
> >> >> >> > 2015-08-20 19:53 GMT+03:00 Samuel Just <sj...@redhat.com>:
> >> >> >> >>
> >> >> >> >> What was the issue?
> >> >> >> >> -Sam
> >> >> >> >>
> >> >> >> >> On Thu, Aug 20, 2015 at 9:41 AM, Voloshanenko Igor
> >> >> >> >> <igor.voloshane...@gmail.com> wrote:
> >> >> >> >> > Samuel, we turned off the cache layer a few hours ago...
> >> >> >> >> > I will post ceph.log in a few minutes.
> >> >> >> >> >
> >> >> >> >> > As for the snap, we found the issue; it was connected with the cache tier.
> >> >> >> >> >
> >> >> >> >> > 2015-08-20 19:23 GMT+03:00 Samuel Just <sj...@redhat.com>:
> >> >> >> >> >>
> >> >> >> >> >> Ok, you appear to be using a replicated cache tier in front of a
> >> >> >> >> >> replicated base tier.  Please scrub both inconsistent pgs and post
> >> >> >> >> >> the ceph.log from before when you started the scrub until after.
> >> >> >> >> >> Also, what command are you using to take snapshots?
> >> >> >> >> >> -Sam
> >> >> >> >> >>
> >> >> >> >> >> On Thu, Aug 20, 2015 at 3:59 AM, Voloshanenko Igor
> >> >> >> >> >> <igor.voloshane...@gmail.com> wrote:
> >> >> >> >> >> > Hi Samuel, we tried to fix it in a tricky way.
> >> >> >> >> >> >
> >> >> >> >> >> > We checked all the affected rbd_data chunks from the OSD logs, then
> >> >> >> >> >> > queried rbd info to find which rbd contains the bad rbd_data. After
> >> >> >> >> >> > that we mounted this rbd as rbd0, created an empty rbd, and DD'd all
> >> >> >> >> >> > the data from the bad volume to the new one.
> >> >> >> >> >> >
> >> >> >> >> >> > But after that the scrub errors kept growing... There were 15
> >> >> >> >> >> > errors, now there are 35... We also tried to out the OSD which was
> >> >> >> >> >> > the lead, but after rebalancing these 2 pgs still have 35 scrub
> >> >> >> >> >> > errors...
> >> >> >> >> >> >
> >> >> >> >> >> > ceph osd getmap -o <outfile> - attached
> >> >> >> >> >> >
> >> >> >> >> >> >
> >> >> >> >> >> > 2015-08-18 18:48 GMT+03:00 Samuel Just <sj...@redhat.com>:
> >> >> >> >> >> >>
> >> >> >> >> >> >> Is the number of inconsistent objects growing?  Can you attach the
> >> >> >> >> >> >> whole ceph.log from the 6 hours before and after the snippet you
> >> >> >> >> >> >> linked above?  Are you using cache/tiering?  Can you attach the
> >> >> >> >> >> >> osdmap (ceph osd getmap -o <outfile>)?
> >> >> >> >> >> >> -Sam
> >> >> >> >> >> >>
> >> >> >> >> >> >> On Tue, Aug 18, 2015 at 4:15 AM, Voloshanenko Igor
> >> >> >> >> >> >> <igor.voloshane...@gmail.com> wrote:
> >> >> >> >> >> >> > ceph - 0.94.2
> >> >> >> >> >> >> > It happened during rebalancing.
> >> >> >> >> >> >> >
> >> >> >> >> >> >> > I also thought that some OSDs were missing a copy, but it looks
> >> >> >> >> >> >> > like all of them are missing it...
> >> >> >> >> >> >> > So any advice on which direction I need to go?
> >> >> >> >> >> >> >
> >> >> >> >> >> >> > 2015-08-18 14:14 GMT+03:00 Gregory Farnum <gfar...@redhat.com>:
> >> >> >> >> >> >> >>
> >> >> >> >> >> >> >> From a quick peek it looks like some of the OSDs are missing clones
> >> >> >> >> >> >> >> of objects. I'm not sure how that could happen and I'd expect the pg
> >> >> >> >> >> >> >> repair to handle that but if it's not there's probably something
> >> >> >> >> >> >> >> wrong; what version of Ceph are you running? Sam, is this something
> >> >> >> >> >> >> >> you've seen, a new bug, or some kind of config issue?
> >> >> >> >> >> >> >> -Greg
> >> >> >> >> >> >> >>
> >> >> >> >> >> >> >> On Tue, Aug 18, 2015 at 6:27 AM, Voloshanenko Igor
> >> >> >> >> >> >> >> <igor.voloshane...@gmail.com> wrote:
> >> >> >> >> >> >> >> > Hi all, on our production cluster, due to heavy rebalancing (((
> >> >> >> >> >> >> >> > we have 2 pgs in inconsistent state...
> >> >> >> >> >> >> >> >
> >> >> >> >> >> >> >> > root@temp:~# ceph health detail | grep inc
> >> >> >> >> >> >> >> > HEALTH_ERR 2 pgs inconsistent; 18 scrub errors
> >> >> >> >> >> >> >> > pg 2.490 is active+clean+inconsistent, acting
> >> >> >> >> >> >> >> > [56,15,29]
> >> >> >> >> >> >> >> > pg 2.c4 is active+clean+inconsistent, acting
> >> >> >> >> >> >> >> > [56,10,42]
> >> >> >> >> >> >> >> >
> >> >> >> >> >> >> >> > From OSD logs, after recovery attempt:
> >> >> >> >> >> >> >> >
> >> >> >> >> >> >> >> > root@test:~# ceph pg dump | grep -i incons | cut -f 1 | while read i; do ceph pg repair ${i} ; done
> >> >> >> >> >> >> >> > dumped all in format plain
> >> >> >> >> >> >> >> > instructing pg 2.490 on osd.56 to repair
> >> >> >> >> >> >> >> > instructing pg 2.c4 on osd.56 to repair
> >> >> >> >> >> >> >> >
> >> >> >> >> >> >> >> > /var/log/ceph/ceph-osd.56.log:51:2015-08-18 07:26:37.035910 7f94663b3700 -1 log_channel(cluster) log [ERR] : deep-scrub 2.490 f5759490/rbd_data.1631755377d7e.00000000000004da/head//2 expected clone 90c59490/rbd_data.eb486436f2beb.0000000000007a65/141//2
> >> >> >> >> >> >> >> > /var/log/ceph/ceph-osd.56.log:52:2015-08-18 07:26:37.035960 7f94663b3700 -1 log_channel(cluster) log [ERR] : deep-scrub 2.490 fee49490/rbd_data.12483d3ba0794b.000000000000522f/head//2 expected clone f5759490/rbd_data.1631755377d7e.00000000000004da/141//2
> >> >> >> >> >> >> >> > /var/log/ceph/ceph-osd.56.log:53:2015-08-18 07:26:37.036133 7f94663b3700 -1 log_channel(cluster) log [ERR] : deep-scrub 2.490 a9b39490/rbd_data.12483d3ba0794b.00000000000037b3/head//2 expected clone fee49490/rbd_data.12483d3ba0794b.000000000000522f/141//2
> >> >> >> >> >> >> >> > /var/log/ceph/ceph-osd.56.log:54:2015-08-18 07:26:37.036243 7f94663b3700 -1 log_channel(cluster) log [ERR] : deep-scrub 2.490 bac19490/rbd_data.1238e82ae8944a.000000000000032e/head//2 expected clone a9b39490/rbd_data.12483d3ba0794b.00000000000037b3/141//2
> >> >> >> >> >> >> >> > /var/log/ceph/ceph-osd.56.log:55:2015-08-18 07:26:37.036289 7f94663b3700 -1 log_channel(cluster) log [ERR] : deep-scrub 2.490 98519490/rbd_data.123e9c2ae8944a.0000000000000807/head//2 expected clone bac19490/rbd_data.1238e82ae8944a.000000000000032e/141//2
> >> >> >> >> >> >> >> > /var/log/ceph/ceph-osd.56.log:56:2015-08-18 07:26:37.036314 7f94663b3700 -1 log_channel(cluster) log [ERR] : deep-scrub 2.490 c3c09490/rbd_data.1238e82ae8944a.0000000000000c2b/head//2 expected clone 98519490/rbd_data.123e9c2ae8944a.0000000000000807/141//2
> >> >> >> >> >> >> >> > /var/log/ceph/ceph-osd.56.log:57:2015-08-18 07:26:37.036363 7f94663b3700 -1 log_channel(cluster) log [ERR] : deep-scrub 2.490 28809490/rbd_data.edea7460fe42b.00000000000001d9/head//2 expected clone c3c09490/rbd_data.1238e82ae8944a.0000000000000c2b/141//2
> >> >> >> >> >> >> >> > /var/log/ceph/ceph-osd.56.log:58:2015-08-18 07:26:37.036432 7f94663b3700 -1 log_channel(cluster) log [ERR] : deep-scrub 2.490 e1509490/rbd_data.1423897545e146.00000000000009a6/head//2 expected clone 28809490/rbd_data.edea7460fe42b.00000000000001d9/141//2
> >> >> >> >> >> >> >> > /var/log/ceph/ceph-osd.56.log:59:2015-08-18 07:26:38.548765 7f94663b3700 -1 log_channel(cluster) log [ERR] : 2.490 deep-scrub 17 errors
> >> >> >> >> >> >> >> >
> >> >> >> >> >> >> >> > So, how can I resolve the "expected clone" situation by hand?
> >> >> >> >> >> >> >> > Thanks in advance!
> >> >> >> >> >> >> >> >
> >> >> >> >> >> >> >> >
> >> >> >> >> >> >> >> >
> >> >> >> >> >> >> >> > _______________________________________________
> >> >> >> >> >> >> >> > ceph-users mailing list
> >> >> >> >> >> >> >> > ceph-users@lists.ceph.com
> >> >> >> >> >> >> >> >
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >> >> >> >> >> >> >> >
> >> >> >> >> >> >> >
> >> >> >> >> >> >> >
> >> >> >> >> >> >
> >> >> >> >> >> >
> >> >> >> >> >
> >> >> >> >> >
> >> >> >> >
> >> >> >> >
> >> >> >
> >> >> >
> >> >
> >> >
> >
> >
>
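
Sam's observation in the quoted thread, that each "expected clone" error names
the head object of the error just before it, can be checked mechanically. The
following is a minimal sketch, assuming the log format shown in the thread; the
function names (`parse_errors`, `classify`) are hypothetical, and treating the
start of a chain as the only real error is Sam's tentative reading, not an
established rule.

```python
import re

# Matches "<object>/head//<pool> expected clone <object>/<snap>//<pool>".
# \s+ lets the pattern span entries that a mail client wrapped across lines.
EXPECTED_CLONE_RE = re.compile(
    r"(\S+)/(?:head|snapdir)//\d+\s+expected\s+clone\s+(\S+)/[0-9a-f]+//\d+"
)


def parse_errors(log_text):
    """Return (reported_object, expected_clone_object) pairs in log order."""
    return [(m.group(1), m.group(2)) for m in EXPECTED_CLONE_RE.finditer(log_text)]


def classify(pairs):
    """Split errors into chain starts and likely follow-on errors.

    An error counts as a follow-on when its expected clone belongs to the
    object reported by the error immediately before it.
    """
    starts, follow_ons = [], []
    previous = None
    for reported, clone in pairs:
        if clone == previous:
            follow_ons.append(reported)
        else:
            starts.append(reported)
        previous = reported
    return starts, follow_ons


# The two repair errors quoted in the thread, one entry per line.
LOG = """
2015-08-20 20:15:44.865670 osd.19 10.12.2.6:6838/1861727 298 : cluster [ERR] repair 2.490 68c89490/rbd_data.16796a3d1b58ba.0000000000000047/head//2 expected clone 2d7b9490/rbd_data.18f92c3d1b58ba.0000000000006167/141//2
2015-08-20 20:15:44.865817 osd.19 10.12.2.6:6838/1861727 299 : cluster [ERR] repair 2.490 ded49490/rbd_data.11a25c7934d3d4.0000000000008a8a/head//2 expected clone 68c89490/rbd_data.16796a3d1b58ba.0000000000000047/141//2
"""

starts, follow_ons = classify(parse_errors(LOG))
print("chain starts:", starts)
print("likely follow-ons:", follow_ons)
```

Running this over a full ceph.log should show whether the error count
collapses to a handful of chain starts per pg, as suggested above.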

