[Gluster-users] Dispersed Volume Errors after failed expansion
Hello, I've run into an issue with Gluster 11.1 and need some assistance. I have a 4+1 dispersed Gluster setup consisting of 20 nodes and 200 bricks. Until last week this setup was 15 nodes and 150 bricks and was working flawlessly. We needed more space, so we expanded the volume by adding 5 more nodes and 50 bricks. We added the nodes and triggered a fix-layout. Unknown to us at the time, one of the five new nodes had a hardware issue: the CPU cooling fan was bad. This caused the node to throttle down to 500 MHz on all cores and eventually shut itself down mid fix-layout. Due to how our ISP works, we could only replace the entire node, so we did, and then executed a replace-brick command.

That is the state we are in now, and I'm not sure how best to proceed to fix the errors and behavior I'm seeing. I don't know whether running another fix-layout should be the next step, given that hundreds of objects are stuck in a persistent heal state, and that doing just about any command other than status, info, or heal info causes all client mounts to hang for ~5 minutes or bricks to start dropping. The client logs show numerous anomalies as well, such as:

[2023-11-10 17:41:52.153423 +] W [MSGID: 122040] [ec-common.c:1262:ec_prepare_update_cbk] 0-media-disperse-30: Failed to get size and version : FOP : 'XATTROP' failed on '/path/to/folder' with gfid 0d295c94-5577-4445-9e57-6258f24d22c5. Parent FOP: OPENDIR [Input/output error]

[2023-11-10 17:48:46.965415 +] E [MSGID: 122038] [ec-dir-read.c:398:ec_manager_readdir] 0-media-disperse-36: EC is not winding readdir: FOP : 'READDIRP' failed on gfid f8ad28d0-05b4-4df3-91ea-73fabf27712c. Parent FOP: No Parent [File descriptor in bad state]

[2023-11-10 17:39:46.076149 +] I [MSGID: 109018] [dht-common.c:1840:dht_revalidate_cbk] 0-media-dht: Mismatching layouts for /path/to/folder2, gfid = f04124e5-63e6-4ddf-9b6b-aa47770f90f2

[2023-11-10 17:39:18.463421 +] E [MSGID: 122034] [ec-common.c:662:ec_log_insufficient_vol] 0-media-disperse-4: Insufficient available children for this request: Have : 0, Need : 4 : Child UP : 1 Mask: 0, Healing : 0 : FOP : 'XATTROP' failed on '/path/to/another/folder with gfid f04124e5-63e6-4ddf-9b6b-aa47770f90f2. Parent FOP: SETXATTR

[2023-11-10 17:36:21.565681 +] W [MSGID: 122006] [ec-combine.c:188:ec_iatt_combine] 0-media-disperse-39: Failed to combine iatt (inode: 13324146332441721129-13324146332441721129, links: 2-2, uid: 1000-1000, gid: 1000-1001, rdev: 0-0, size: 10-10, mode: 40775-40775), FOP : 'LOOKUP' failed on '/path/to/yet/another/folder'. Parent FOP: No Parent

[2023-11-10 17:39:46.147299 +] W [MSGID: 114031] [client-rpc-fops_v2.c:2563:client4_0_lookup_cbk] 0-media-client-1: remote operation failed. [{path=/path/to/folder3}, {gfid=----}, {errno=13}, {error=Permission denied}]

[2023-11-10 17:39:46.093069 +] W [MSGID: 114061] [client-common.c:1232:client_pre_readdirp_v2] 0-media-client-14: remote_fd is -1. EBADFD [{gfid=f04124e5-63e6-4ddf-9b6b-aa47770f90f2}, {errno=77}, {error=File descriptor in bad state}]

[2023-11-10 17:55:11.407630 +] E [MSGID: 122038] [ec-dir-read.c:398:ec_manager_readdir] 0-media-disperse-30: EC is not winding readdir: FOP : 'READDIRP' failed on gfid 2bba7b7e-7a4b-416a-80f0-dd50caffd2c2. Parent FOP: No Parent [File descriptor in bad state]

[2023-11-10 17:39:46.076179 +] W [MSGID: 109221] [dht-selfheal.c:2023:dht_selfheal_directory] 0-media-dht: Directory selfheal failed [{path=/path/to/folder7}, {misc=2}, {unrecoverable-errors}, {gfid=f04124e5-63e6-4ddf-9b6b-aa47770f90f2}]
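(For reference: each of these messages identifies the affected object by gfid. A minimal sketch for mapping a gfid from the logs back to the on-brick object and its erasure-coding xattrs; the brick path is illustrative and the gfid is simply the one from the first log line, substitute your own.)

# Run on a brick server as root. Brick path is illustrative.
GFID=0d295c94-5577-4445-9e57-6258f24d22c5
BRICK=/data/brick1/media
# Gluster keeps a per-gfid entry under .glusterfs/<aa>/<bb>/<gfid>
# (a hard link for files, a symlink for directories).
ls -l "$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID"
# Dump the EC metadata (trusted.ec.version/size/dirty); comparing these
# across the bricks of one disperse subvolume shows where heal still disagrees.
getfattr -d -m . -e hex "$BRICK/.glusterfs/${GFID:0:2}/${GFID:2:2}/$GFID"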
Something about this failed expansion has caused these errors, and I'm not sure how to proceed. Right now doing just about anything causes the client mounts to hang for up to 5 minutes: restarting a node, running a volume set command, etc. I tried increasing a cache timeout value and ~153 of the 200 bricks dropped offline. Restarting a node seems to cause the mounts to hang as well.

I've tried:
* Running gluster volume heal volumename full - causes mounts to hang for 3-5 minutes but seems to proceed
* Running ls -alhR against the volume to trigger heals
* Removing the new bricks, which triggers a rebalance that fails almost immediately, with most of the self-heal daemons going offline as well
* Turning off bit-rot to reduce load on the system
* Replacing a brick with a new brick (same drive, new directory); attempted force as well
* Changing the heal mode from diff to full
* Lowering the parallel heal count to 4

When I replaced the one brick, the heal count on that brick dropped from ~100 to ~6; however, those 6 are folders in the root of the volume rather than subfolders many layers deep. I suspect this is causing a lot of the issues I'm seeing, and I don't know how to resolve it without damaging the existing data. I'm hoping it's just due to the fix-layout failing and that it simply needs to run again, but I wanted to seek advice here first.
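(For reference: before re-running fix-layout, a minimal sketch of commands for capturing the current layout/heal/rebalance state; the volume name "media" is only inferred from the log prefixes above, and the brick/directory paths are illustrative.)

# Brick and self-heal daemon status
gluster volume status media
# Per-brick count of entries still pending heal (cheaper than full heal info)
gluster volume heal media info summary
# State of the interrupted fix-layout / rebalance
gluster volume rebalance media status
# Compare the DHT layout xattr of one problem directory across bricks;
# missing or mismatched layout ranges are what fix-layout would rewrite
getfattr -n trusted.glusterfs.dht -e hex /data/brick1/media/path/to/folder2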
Re: [Gluster-users] dispersed volume + cifs export does not work (replicated + cifs works fine)
To answer my own question, in case it is helpful for the community: due to a different (minor) issue with POSIX permissions, we had set stat-prefetch to off. That setting caused the issue described below; it had nothing to do with the smb.conf settings.

gluster volume set volname performance.stat-prefetch on

solved the issue. Reference: https://access.redhat.com/solutions/4558341

On 20/10/2019 18:26, Felix Kölzow wrote:
> Dear Gluster-Users,
>
> Short story: Two volumes are exported via smb/cifs with (almost) the same configuration with respect to smb.conf. The replicated volume is easily accessible via cifs and fuse. The dispersed volume is accessible via fuse, but not via cifs. Error message from the Windows client: "The parameter is incorrect."
> [...]
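(A minimal sketch for checking the option before and after the change; "volname" is a placeholder, as in the message above.)

# Show the current value of the md-cache option
gluster volume get volname performance.stat-prefetch
# Re-enable it and confirm
gluster volume set volname performance.stat-prefetch on
gluster volume get volname performance.stat-prefetch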
[Gluster-users] dispersed volume + cifs export does not work (replicated + cifs works fine)
Dear Gluster-Users,

Short story: Two volumes are exported via smb/cifs with (almost) the same configuration with respect to smb.conf. The replicated volume is easily accessible via cifs and fuse. The dispersed volume is accessible via fuse, but not via cifs. Error message from the Windows client: "The parameter is incorrect." Maybe the error is somehow related to this: https://gluster-users.gluster.narkive.com/g35gmGj6/vfs-gluster-broken

More information: We have created a gluster setup that consists of three servers, and each server provides two bricks. Two volumes are created on these bricks and are going to be exported via smb/cifs:
* replicated distributed
* dispersed

The volume settings are given here:

[root@node1 ~]# gluster volume info replicated_cifs

Volume Name: replicated_cifs
Type: Distributed-Replicate
Volume ID: 51bb4440-3b8e-48be-a84c-5ea9e1ddd38e
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: node1:/gluster/vg00/replicated_cifs/brick
Brick2: node2:/gluster/vg00/replicated_cifs/brick
Brick3: node3:/gluster/vg00/replicated_cifs/brick
Brick4: node1:/gluster/vg01/replicated_cifs/brick
Brick5: node2:/gluster/vg01/replicated_cifs/brick
Brick6: node3:/gluster/vg01/replicated_cifs/brick
Options Reconfigured:
features.show-snapshot-directory: on
features.uss: enable
features.barrier: disable
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
diagnostics.latency-measurement: on
diagnostics.count-fop-hits: on
user.cifs: enable
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.cache-samba-metadata: on
performance.stat-prefetch: disable
performance.cache-invalidation: on
performance.md-cache-timeout: 600
network.inode-lru-limit: 20
performance.nl-cache: on
performance.nl-cache-timeout: 600
performance.readdir-ahead: on
performance.parallel-readdir: on
client.event-threads: 4
server.event-threads: 4
server.root-squash: off
cluster.lookup-optimize: on
features.quota: on
features.inode-quota: on
features.quota-deem-statfs: on
performance.cache-size: 10GB
cluster.server-quorum-ratio: 51%
cluster.enable-shared-storage: enable

[root@node1 ~]# gluster volume info dispersed_cifs

Volume Name: dispersed_cifs
Type: Disperse
Volume ID: 0a291429-1875-41c8-96ff-bce0054ed309
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (4 + 2) = 6
Transport-type: tcp
Bricks:
Brick1: node1:/gluster/vg00/dispersed_cifs/brick
Brick2: node2:/gluster/vg00/dispersed_cifs/brick
Brick3: node3:/gluster/vg00/dispersed_cifs/brick
Brick4: node1:/gluster/vg01/dispersed_cifs/brick
Brick5: node2:/gluster/vg01/dispersed_cifs/brick
Brick6: node3:/gluster/vg01/dispersed_cifs/brick
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
transport.address-family: inet
nfs.disable: on
diagnostics.latency-measurement: on
diagnostics.count-fop-hits: on
user.cifs: enable
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.cache-samba-metadata: on
performance.stat-prefetch: disable
performance.cache-invalidation: on
performance.md-cache-timeout: 600
network.inode-lru-limit: 20
performance.nl-cache: on
performance.nl-cache-timeout: 600
performance.readdir-ahead: on
performance.parallel-readdir: on
server.event-threads: 4
client.event-threads: 4
server.root-squash: off
cluster.lookup-optimize: on
features.quota: on
features.inode-quota: on
features.quota-deem-statfs: on
performance.cache-size: 10GB
cluster.server-quorum-ratio: 51%
cluster.enable-shared-storage: enable

The cifs exports look like this:

Distributed replicated:

[gluster-replicated_cifs]
vfs objects = fruit acl_xattr glusterfs
acl_xattr:ignore system acls = yes
acl_xattr:default acl style = windows
glusterfs:volume = replicated_cifs
glusterfs:logfile = /var/log/samba/glusterfs-replicated_cifs.%M.log
glusterfs:loglevel = 7
kernel share modes = no
path = /
read only = no
guest ok = no
browseable = no

[replicated_data]
vfs objects = fruit acl_xattr shadow_copy2 glusterfs
acl_xattr:ignore system acls = yes
acl_xattr:default acl style = windows
glusterfs:volume = replicated_cifs
glusterfs:logfile = /var/log/samba/glusterfs-replicated_data.%M.log
glusterfs:loglevel = 7
kernel share modes = no
path = /replicated_data
read only = no
guest ok = no
create mask = 0660
directory mask = 0770
map acl inherit = yes
inherit permissions = yes
inherit acls = true
store dos attributes = yes
shadow:snapdir = /.snaps
shadow:basedir = /
shadow:sort = desc
shadow:snapprefix = snap_replicated_cifs
shadow:format = _GMT-%Y.%m.%d-%H.%M.%S

Dispersed volume:

[gluster-dispersed_cifs]
vfs objects = fruit acl_xattr glusterfs
acl_xattr:ignore system acls = yes
acl_xattr:default acl style = windows
glusterfs:volume = dispersed_cifs
glusterfs:logfile =
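(For reference: a minimal sketch for checking whether the failure is specific to the Windows client or already reproducible through vfs_glusterfs from a Linux host. The share names follow the smb.conf above; the user name is a placeholder, and the dispersed share's log path is an assumption, mirroring the naming used for the replicated shares.)

# Does the dispersed share open at all outside Windows?
smbclient //node1/gluster-dispersed_cifs -U someuser -c 'ls'
# Compare with the working replicated share
smbclient //node1/gluster-replicated_cifs -U someuser -c 'ls'
# Watch the per-share vfs_glusterfs log (path assumed) while reproducing
tail -f /var/log/samba/glusterfs-dispersed_cifs.*.log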
Re: [Gluster-users] Dispersed volume and auto-heal
No, you should replace the brick.

On Wed, Dec 7, 2016 at 1:02 PM, Cedric Lemarchand wrote:
> Hello,
>
> Is gluster able to auto-heal when some bricks are lost? By auto-heal I mean that lost parity is re-generated on bricks that are still available, in order to recover the level of redundancy without replacing the failed bricks.
>
> [...]
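(For reference: a minimal sketch of the replace-brick flow on a dispersed volume; the volume name and brick paths are placeholders. On current Gluster releases replace-brick is a single "commit force" operation, after which the self-heal daemon rebuilds the missing fragments onto the new brick.)

# Swap the failed brick for a new, empty brick
gluster volume replace-brick myvol srv3:/data/brick1/myvol srv3:/data/brick2/myvol commit force
# Watch the fragments being reconstructed onto the replacement
gluster volume heal myvol info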
[Gluster-users] Dispersed volume and auto-heal
Hello,

Is gluster able to auto-heal when some bricks are lost? By auto-heal I mean that lost parity is re-generated on bricks that are still available, in order to recover the level of redundancy without replacing the failed bricks.

I am on the learning curve, apologies if the question is trivial.

Cheers,

Cédric
Re: [Gluster-users] DISPERSED VOLUME
I think you should try with a bigger file: 1, 10, 100, 1000 KB? Small files might just be getting replicated to the bricks... (just a guess).

On Fri, Nov 25, 2016 at 12:41 PM, Alexandre Blanca wrote:
> Hi,
>
> I am a beginner in distributed file systems and I currently work on Glusterfs.
> [...]
> Why?! The output of server 3 displays "hello world !" two times. Parity? Redundancy? I don't know...
>
> Best regards
>
> Alex
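(A minimal sketch of the suggested test, reusing the mount point and brick paths from the original message below; the file names are made up.)

# On the client mount: create test files of increasing size
for size in 1 10 100 1000; do
    dd if=/dev/urandom of=/home/cli1/gv7_dispersed_directory/test_${size}K bs=1K count=${size}
done
# On each server: with disperse-data 3 / redundancy 1, every brick should
# hold a fragment of roughly one third of the file size (padded to the EC block size)
ls -l /data/brick1/gv7/test_*K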
[Gluster-users] DISPERSED VOLUME
Hi,

I am a beginner in distributed file systems and I currently work on Glusterfs. I work with 4 VM: srv1, srv2, srv3 and cli1. I tested several types of volume (distributed, replicated, striped...), which to me correspond to JBOD, RAID 1 and RAID 0. When I try to make a dispersed volume (RAID 5/6) there is something I don't understand...

gluster volume create gv7 disperse-data 3 redundancy 1 ipserver1:/data/brick1/gv7 ipserver2:/data/brick1/gv7 ipserver3:/data/brick1/gv7 ipserver4:/data/brick1/gv7

gluster volume info

Volume Name: gv7
Type: Disperse
Status: Created
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: ipserver1:/data/brick1/gv7
Brick2: ipserver2:/data/brick1/gv7
Brick3: ipserver3:/data/brick1/gv7
Brick4: ipserver4:/data/brick1/gv7

gluster volume start gv7

mkdir /home/cli1/gv7_dispersed_directory

mount -t glusterfs ipserver1:/gv7 /home/cli1/gv7_dispersed_directory

Now, when I create a file on my mount point (gv7_dispersed_directory):

cd /home/cli1/gv7_dispersed_directory
echo 'hello world !' >> test_file

I can see in my srv1:

cd /data/brick1/gv7
cat test
hello world !

in my srv2:

cd /data/brick1/gv7
cat test
hello world !

in my srv4:

cd /data/brick1/gv7
cat test
hello world !

but in my srv3:

cd /data/brick1/gv7
cat test
hello world !
hello world !

Why?! The output of server 3 displays "hello world !" two times. Parity? Redundancy? I don't know...

Best regards

Alex
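(For reference: in a dispersed volume each brick stores an encoded fragment rather than a plain copy of the file, so what cat shows on a brick is not the file as the volume presents it; the authoritative contents are only visible through the client mount. A minimal sketch for inspecting the fragments instead, using the same paths as above; the trusted.ec.* xattr names are the standard ones for disperse volumes.)

# On each server: fragment size and EC metadata
ls -l /data/brick1/gv7/test_file
getfattr -d -m trusted.ec -e hex /data/brick1/gv7/test_file
# Read the actual file through the client mount
cat /home/cli1/gv7_dispersed_directory/test_file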