On Mon, Jan 22, 2018 at 12:33 AM, Samuli Heinonen <samp...@neutraali.net> wrote:
> Hi again,
>
> Here is some more information regarding the issue described earlier.
>
> It looks like self-healing is stuck. According to "heal statistics", the
> crawl began at Sat Jan 20 12:56:19 2018 and it's still going on (it's
> around Sun Jan 21 20:30 as I write this). However, glustershd.log says
> that the last heal was completed at "2018-01-20 11:00:13.090697" (which
> is 13:00 UTC+2). Also, "heal info" has now been running for over 16
> hours without returning any information. In the statedump I can see that
> the storage nodes hold locks on files, and some of those locks are
> blocked. I.e. here again it says that ovirt8z2 holds an active lock even
> though ovirt8z2 crashed after the lock was granted:
>
> [xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]
> path=/.shard/3d55f8cc-cda9-489a-b0a3-fd0f43d67876.27
> mandatory=0
> inodelk-count=3
> lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal
> inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid = 18446744073709551610, owner=d0c6d857a87f0000, client=0x7f885845efa0, connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-zone2-ssd1-vmstor1-client-0-0-0, granted at 2018-01-20 10:59:52
> lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata
> lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0
> inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid = 3420, owner=d8b9372c397f0000, client=0x7f8858410be0, connection-id=ovirt8z2.xxx.com-5652-2017/12/27-09:49:02:946825-zone2-ssd1-vmstor1-client-0-7-0, granted at 2018-01-20 08:57:23
> inodelk.inodelk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=0, pid = 18446744073709551610, owner=d0c6d857a87f0000, client=0x7f885845efa0, connection-id=sto2z2.xxx-10975-2018/01/20-10:56:14:649541-zone2-ssd1-vmstor1-client-0-0-0, blocked at 2018-01-20 10:59:52
>
> I'd also like to add that the volume had an arbiter brick before the
> crash happened. We decided to remove it because we thought it was
> causing issues. However, I now think that this was unnecessary. After
> the crash, the arbiter logs had lots of messages like this:
>
> [2018-01-20 10:19:36.515717] I [MSGID: 115072] [server-rpc-fops.c:1640:server_setattr_cbk] 0-zone2-ssd1-vmstor1-server: 37374187: SETATTR <gfid:a52055bd-e2e9-42dd-92a3-e96b693bcafe> (a52055bd-e2e9-42dd-92a3-e96b693bcafe) ==> (Operation not permitted) [Operation not permitted]
>
> Is there any way to force self-heal to stop? Any help would be very much
> appreciated :)

Exposing .shard to a normal mount is opening a can of worms. You should
probably look at mounting the volume with the gfid aux-mount, where you
can access a file as <path-to-mount>/.gfid/<gfid-string> to clear locks
on it.

Mount command: mount -t glusterfs -o aux-gfid-mount vm1:test /mnt/testvol

A gfid string will have some hyphens, like: 11118443-1894-4273-9340-4b212fa1c0e4
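To make that concrete for your volume, here is a rough sketch. The mount
point is made up, and the /.gfid/ path form for clear-locks is my
assumption from the above, so please verify it on a test file first:

  # 1. Read the shard's gfid directly from a brick on a storage node;
  #    the hex value maps to the hyphenated gfid string.
  getfattr -n trusted.gfid -e hex \
      /ssd1/zone2-vmstor1/export/.shard/3d55f8cc-cda9-489a-b0a3-fd0f43d67876.27

  # 2. Mount the volume with gfid access enabled (since your volume has
  #    config.transport rdma, you may also need -o transport=rdma).
  mount -t glusterfs -o aux-gfid-mount sto1z2.xxx:/zone2-ssd1-vmstor1 /mnt/vmstor1

  # 3. The shard is now reachable by gfid instead of its /.shard path.
  stat /mnt/vmstor1/.gfid/<gfid-string-from-step-1>

  # 4. Clear the stale locks via the gfid path (assumed form, see above).
  gluster volume clear-locks zone2-ssd1-vmstor1 \
      /.gfid/<gfid-string-from-step-1> kind all inode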
That said, note that the next disconnect on the brick where you
successfully ran clear-locks will crash the brick. There was a bug in the
3.8.x series with clear-locks which was fixed in 3.9.0 with a feature.
The self-heal deadlocks that you witnessed are also fixed in the 3.10
release. 3.8.x is EOL, so I recommend you upgrade to a supported version
soon.

> Best regards,
> Samuli Heinonen
>
> Samuli Heinonen <samp...@neutraali.net>
> 20 January 2018 at 21.57
>
> Hi all!
>
> One hypervisor in our virtualization environment crashed and now some
> of the VM images cannot be accessed. After investigation we found out
> that there were lots of images that still had an active lock on the
> crashed hypervisor. We were able to remove the locks from "regular
> files", but it doesn't seem possible to remove locks from shards.
>
> We are running GlusterFS 3.8.15 on all nodes.
>
> Here is the part of the statedump that shows a shard having an active
> lock on the crashed node:
>
> [xlator.features.locks.zone2-ssd1-vmstor1-locks.inode]
> path=/.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21
> mandatory=0
> inodelk-count=1
> lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:metadata
> lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0:self-heal
> lock-dump.domain.domain=zone2-ssd1-vmstor1-replicate-0
> inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid = 3568, owner=14ce372c397f0000, client=0x7f3198388770, connection-id=ovirt8z2.xxx-5652-2017/12/27-09:49:02:946825-zone2-ssd1-vmstor1-client-1-7-0, granted at 2018-01-20 08:57:24
>
> If we try to run clear-locks we get the following error message:
>
> # gluster volume clear-locks zone2-ssd1-vmstor1 /.shard/75353c17-d6b8-485d-9baf-fd6c700e39a1.21 kind all inode
> Volume clear-locks unsuccessful
> clear-locks getxattr command failed. Reason: Operation not permitted
>
> Gluster vol info, if needed:
>
> Volume Name: zone2-ssd1-vmstor1
> Type: Replicate
> Volume ID: b6319968-690b-4060-8fff-b212d2295208
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 2 = 2
> Transport-type: rdma
> Bricks:
> Brick1: sto1z2.xxx:/ssd1/zone2-vmstor1/export
> Brick2: sto2z2.xxx:/ssd1/zone2-vmstor1/export
> Options Reconfigured:
> cluster.shd-wait-qlength: 10000
> cluster.shd-max-threads: 8
> cluster.locking-scheme: granular
> performance.low-prio-threads: 32
> cluster.data-self-heal-algorithm: full
> performance.client-io-threads: off
> storage.linux-aio: off
> performance.readdir-ahead: on
> client.event-threads: 16
> server.event-threads: 16
> performance.strict-write-ordering: off
> performance.quick-read: off
> performance.read-ahead: on
> performance.io-cache: off
> performance.stat-prefetch: off
> cluster.eager-lock: enable
> network.remote-dio: on
> cluster.quorum-type: none
> network.ping-timeout: 22
> performance.write-behind: off
> nfs.disable: on
> features.shard: on
> features.shard-block-size: 512MB
> storage.owner-uid: 36
> storage.owner-gid: 36
> performance.io-thread-count: 64
> performance.cache-size: 2048MB
> performance.write-behind-window-size: 256MB
> server.allow-insecure: on
> cluster.ensure-durability: off
> config.transport: rdma
> server.outstanding-rpc-limit: 512
> diagnostics.brick-log-level: INFO
>
> Any recommendations on how to proceed from here?
>
> Best regards,
> Samuli Heinonen
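One more thing: once you have cleared the locks, you can confirm the
stale entries are gone by taking a fresh statedump and searching it for
the shard's path. A sketch, assuming the default dump location (it is
configurable via server.statedump-path):

  # Trigger a statedump of the brick processes.
  gluster volume statedump zone2-ssd1-vmstor1

  # On each storage node the dump files land under /var/run/gluster by
  # default; search them for the shard to see its remaining locks.
  grep -A 10 '75353c17-d6b8-485d-9baf-fd6c700e39a1.21' /var/run/gluster/*dump*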
--
Pranith

_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users