Re: [ceph-users] Continuing placement group problems

kevin horan Thu, 26 Jun 2014 15:36:12 -0700




On 06/26/2014 01:08 PM, Gregory Farnum wrote:

On Thu, Jun 26, 2014 at 12:52 PM, Kevin Horan
<kho...@globalrecordings.net> wrote:

I am also getting inconsistent object errors on a regular basis, about 1-2
every week or so for about 300GB of data. All OSDs are using XFS
filesystems. Some OSDs are individual 3TB internal hard drives and some are
external FC attached raid6 arrays. I am using this cluster to store kvm
images and I've noticed that the inconsistent objects always occur on my two
most recently created VM images, even though one of them is hardly ever used
(just a bare VM not put into production yet). This all started about 4
months ago on 0.72 and is still occurring on version 0.80. I also
changed the number of replicas from 2 to 3 for the pool containing these
images and that had no effect.

Here is an example log entry:

2014-06-24 18:11:51.683310 7faf44297700  0 log [ERR] : 4.b6 shard 0: soid
c539a8b6/rbd_data.9fdea2ae8944a.00000000000004e2/head//4 digest 2541762784
!= known digest 3305022936
2014-06-24 18:11:52.107321 7faf50f60700  0
xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: (22)
Invalid argument
2014-06-24 18:11:52.215752 7faf5075f700  0
xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: (22)
Invalid argument
2014-06-24 18:11:52.365798 7faf50f60700  0
xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: (22)
Invalid argument
2014-06-24 18:11:52.674643 7faf5075f700  0
xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: (22)
Invalid argument
2014-06-24 18:11:52.749641 7faf50f60700  0
xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: (22)
Invalid argument
2014-06-24 18:11:55.194967 7faf5075f700  0
xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: (22)
Invalid argument
2014-06-24 18:11:55.259322 7faf50f60700  0
xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: (22)
Invalid argument
2014-06-24 18:11:55.526157 7faf5075f700  0
xfsfilestorebackend(/var/lib/ceph/osd/ceph-2) set_extsize: FSSETXATTR: (22)
Invalid argument
2014-06-24 18:11:55.547270 7faf44297700  0 log [ERR] : 4.b6 deep-scrub 0
missing, 1 inconsistent objects
2014-06-24 18:11:55.547282 7faf44297700  0 log [ERR] : 4.b6 deep-scrub 1
errors
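
For anyone tallying these errors across OSD logs, here is a minimal Python sketch that pulls the PG, object name, and the two digests out of the scrub error above. The regular expression was written against this one log line only, so treating it as valid for other Ceph releases is an assumption. The group names (`hash`, `snap`, `pool`) are my reading of the soid fields, not something the log states.

```python
import re

# Scrub error copied from the log excerpt above, with the wrapped lines joined.
LOG_LINE = (
    "2014-06-24 18:11:51.683310 7faf44297700  0 log [ERR] : 4.b6 shard 0: soid "
    "c539a8b6/rbd_data.9fdea2ae8944a.00000000000004e2/head//4 digest 2541762784 "
    "!= known digest 3305022936"
)

# Assumption: the message layout is "<pg> shard <n>: soid <a>/<name>/<b>//<c>
# digest <x> != known digest <y>", as seen in this single example.
SCRUB_ERR = re.compile(
    r"(?P<pg>[0-9a-f]+\.[0-9a-f]+) shard (?P<shard>\d+): soid "
    r"(?P<hash>[0-9a-f]+)/(?P<object>[^/]+)/(?P<snap>[^/]*)//(?P<pool>\d+) "
    r"digest (?P<digest>\d+) != known digest (?P<known>\d+)"
)


def parse_scrub_error(line):
    """Return the fields of a digest-mismatch line as a dict, or None."""
    match = SCRUB_ERR.search(line)
    if match is None:
        return None
    info = match.groupdict()
    # The object name ends in a hexadecimal object index after the last dot.
    prefix, _, index_hex = info["object"].rpartition(".")
    info["prefix"] = prefix
    info["object_index"] = int(index_hex, 16)
    return info


info = parse_scrub_error(LOG_LINE)
print(info["pg"], info["prefix"], info["object_index"])
```

Grouping the parsed results by `prefix` would show which images the inconsistent objects belong to.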

Can you go find out what about those files is different? Are they
different sizes, with the overlapping pieces being the same? Are they
completely different?

  Here is the info block on three images:

root@vashti:~/t1# rbd info libvirt-pool/radosgw
rbd image 'radosgw':
        size 10000 MB in 2500 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.6aad02ae8944a
        format: 2
        features: layering

root@vashti:~/t1# rbd info libvirt-pool/auth-data
rbd image 'auth-data':
        size 10000 MB in 2500 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.9fdea2ae8944a
        format: 2
        features: layering

root@vashti:~/t1# rbd info libvirt-pool/auth
rbd image 'auth':
        size 10240 MB in 2560 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.51a3b2ae8944a
        format: 2
        features: layering
root@vashti:~/t1#
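
The numbers in the `rbd info` output can be cross-checked against the scrub error with a little arithmetic. The sketch below is hedged: it assumes the images use the plain layout in which object N covers bytes N * object_size (no custom striping), and that "MB" in the output means 1024 * 1024 bytes, which is what makes 10000 MB come out to 2500 objects. The helper names are made up for illustration.

```python
MIB = 1024 * 1024

# Values copied from the `rbd info` output above.
ORDER = 22
OBJECT_BYTES = 1 << ORDER  # 4194304 bytes, i.e. 4096 kB

IMAGES = {
    "radosgw": ("rbd_data.6aad02ae8944a", 10000),
    "auth-data": ("rbd_data.9fdea2ae8944a", 10000),
    "auth": ("rbd_data.51a3b2ae8944a", 10240),
}


def object_count(size_mb):
    """Number of objects needed to cover an image of size_mb (rounded up)."""
    return (size_mb * MIB + OBJECT_BYTES - 1) // OBJECT_BYTES


def image_for_object(object_name):
    """Map an object name such as 'rbd_data.<id>.<hex index>' to its image."""
    prefix = object_name.rpartition(".")[0]
    for name, (image_prefix, _size_mb) in IMAGES.items():
        if image_prefix == prefix:
            return name
    return None


def object_offset_mb(object_name):
    """Offset of the object within its image, in MB (sequential layout assumed)."""
    index = int(object_name.rpartition(".")[2], 16)
    return index * OBJECT_BYTES // MIB


# Object name taken from the scrub error in the log excerpt.
bad = "rbd_data.9fdea2ae8944a.00000000000004e2"
print(image_for_object(bad), object_offset_mb(bad))  # auth-data 5000
```

Under those assumptions, the object from the log belongs to 'auth-data' and sits 5000 MB into the 10000 MB image.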

The first two incur the inconsistent objects, while the third one (which
was created a year ago) does not (nor do my other older images). All of
them are 10G in size, including the non-problematic one. I'm not sure
what you mean by "overlapping pieces"?
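
One possible reading of "overlapping pieces" is the byte range that both copies of an object have in common: if one replica is shorter, do the bytes up to the shorter length match? A minimal Python sketch of that check follows. It takes two file paths supplied by the caller, for example copies of the same object file taken from two OSD data directories. It does not locate the files, and the function name and return fields are hypothetical.

```python
import os


def compare_copies(path_a, path_b, chunk=1 << 20):
    """Compare two copies of an object file over their common length."""
    size_a = os.path.getsize(path_a)
    size_b = os.path.getsize(path_b)
    overlap = min(size_a, size_b)  # bytes present in both copies

    first_diff = None
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while offset < overlap and first_diff is None:
            want = min(chunk, overlap - offset)
            a = fa.read(want)
            b = fb.read(want)
            if a != b:
                # Find the first differing byte inside this chunk.
                common = min(len(a), len(b))
                first_diff = offset + next(
                    (i for i in range(common) if a[i] != b[i]), common
                )
            offset += want

    return {
        "size_a": size_a,
        "size_b": size_b,
        "same_size": size_a == size_b,
        "overlap_bytes": overlap,
        "overlap_identical": first_diff is None,
        "first_diff": first_diff,
    }
```

Different sizes with `overlap_identical` true would suggest a truncated or extended copy, while a `first_diff` inside the overlap would point at changed content. If one copy has size 0, the overlap is empty and the check says nothing about the content.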

Are your systems losing power or otherwise doing
mean things to the local filesystem?

I have not seen any kernel errors about file systems, nor have I had
any file system level problems.

  Have you noticed a pattern of
distribution in terms of the underlying storage system on the
inconsistent OSDs?

I have found the bad objects on PGs whose primary OSD was on a single
internal drive, and in other cases the primary OSD was on an external drive.

About 3 months ago I had an event where 3 out of only 6 OSDs were
down while noout was set (pool was set to size=2, min_size=1). About 2
minutes after these 3 OSDs came back up, another OSD, not one of these
three, suffered a physical error and was lost. This resulted in about 10
or so lost objects. I soon got this all cleaned up and got the cluster back
to the clean state (see here
<https://www.mail-archive.com/ceph-users@lists.ceph.com/msg09377.html>
for the full story). But it was soon after that event that I started
getting these inconsistent objects. Prior to that event I had gone over a
year without any inconsistent objects.

There has also been a lot of restructuring going on, with new OSDs being
added and/or moved (still getting it ready for production). But I always
take one step and let it return to clean before taking the next step.

When I got the first inconsistent object, a simple repair didn't work,
so I started trying some online suggestions of truncating objects
to the correct size and/or removing objects. Some of these things caused
some of the OSDs to crash and then not start again. I finally had to
completely delete the image containing the bad objects, and then the OSDs
started to stay up all the time again. Since that one, though, all the
inconsistent objects have been fixable with a simple repair.


Thanks for your help.

Kevin

-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

Sometimes one of the objects has 0 size. I've also started getting the
FSSETXATTR errors recently, though I think that started after this problem
started. I've read elsewhere that these are harmless and will go away in a
future version.  I also looked in the monitor logs but didn't see any
reference to inconsistent or scrubbed objects.

Kevin
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
