[Gluster-users] is 10.4 released?
I see packages for 10.4, but no release announcement or release notes. Does anyone know what the status is?

Community Meeting Calendar:
Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] New Gluster volume (10.3) not healing symlinks after brick offline
I've seen issues with symlinks failing to heal as well. I never found a good solution on the glusterfs side of things; the most reliable fix I found is just to rm and recreate the symlink in the fuse volume itself. Also, I'd strongly suggest heavy load testing before upgrading to 10.3 in production. After upgrading from 9.5 -> 10.3 I've seen frequent brick process (glusterfsd) crashes, whereas 9.5 was quite stable.

On Mon, Jan 23, 2023 at 3:58 PM Matt Rubright wrote:
>
> Hi friends,
>
> I have recently built a new replica 3 arbiter 1 volume on 10.3 servers and
> have been putting it through its paces before getting it ready for production
> use. The volume will ultimately contain about 200G of web content files
> shared among multiple frontends. Each will use the gluster fuse client to
> connect.
>
> What I am experiencing sounds very much like this post from 9 years ago:
> https://lists.gnu.org/archive/html/gluster-devel/2013-12/msg00103.html
>
> In short, if I perform these steps I can reliably end up with symlinks on the
> volume which will not heal either by initiating a 'full heal' from the
> cluster or using a fuse client to read each file:
>
> 1) Verify that all nodes are healthy, the volume is healthy, and there are no
> items needing to be healed
> 2) Cleanly shut down one server hosting a brick
> 3) Copy data, including some symlinks, from a fuse client to the volume
> 4) Bring the brick back online and observe the number and type of items
> needing to be healed
> 5) Initiate a full heal from one of the nodes
> 6) Confirm that while files and directories are healed, symlinks are not
>
> Please help me determine if I have improper expectations here. I have some
> basic knowledge of managing gluster volumes, but I may be misunderstanding
> intended behavior.
> Here is the volume info and heal data at each step of the way:
>
> *** Verify that all nodes are healthy, the volume is healthy, and there are
> no items needing to be healed ***
>
> # gluster vol info cwsvol01
>
> Volume Name: cwsvol01
> Type: Replicate
> Volume ID: 7b28e6e6-4a73-41b7-83fe-863a45fd27fc
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: glfs02-172-20-1:/data/brick01/cwsvol01
> Brick2: glfs01-172-20-1:/data/brick01/cwsvol01
> Brick3: glfsarb01-172-20-1:/data/arb01/cwsvol01 (arbiter)
> Options Reconfigured:
> performance.client-io-threads: off
> nfs.disable: on
> transport.address-family: inet
> storage.fips-mode-rchecksum: on
> cluster.granular-entry-heal: on
>
> # gluster vol status
> Status of volume: cwsvol01
> Gluster process                             TCP Port  RDMA Port  Online  Pid
> --
> Brick glfs02-172-20-1:/data/brick01/cwsvol0
> 1                                           50253     0          Y       1397
> Brick glfs01-172-20-1:/data/brick01/cwsvol0
> 1                                           56111     0          Y       1089
> Brick glfsarb01-172-20-1:/data/arb01/cwsvol
> 01                                          54517     0          Y       118704
> Self-heal Daemon on localhost               N/A       N/A        Y       1413
> Self-heal Daemon on glfs01-172-20-1         N/A       N/A        Y       3490
> Self-heal Daemon on glfsarb01-172-20-1      N/A       N/A        Y       118720
>
> Task Status of Volume cwsvol01
> --
> There are no active volume tasks
>
> # gluster vol heal cwsvol01 info summary
> Brick glfs02-172-20-1:/data/brick01/cwsvol01
> Status: Connected
> Total Number of entries: 0
> Number of entries in heal pending: 0
> Number of entries in split-brain: 0
> Number of entries possibly healing: 0
>
> Brick glfs01-172-20-1:/data/brick01/cwsvol01
> Status: Connected
> Total Number of entries: 0
> Number of entries in heal pending: 0
> Number of entries in split-brain: 0
> Number of entries possibly healing: 0
>
> Brick glfsarb01-172-20-1:/data/arb01/cwsvol01
> Status: Connected
> Total Number of entries: 0
> Number of entries in heal pending: 0
> Number of entries in split-brain: 0
> Number of entries possibly healing: 0
>
> *** Cleanly shut down one server hosting a brick ***
>
> *** Copy data, including some symlinks, from a fuse client to the volume ***
>
> # gluster vol heal cwsvol01 info summary
> Brick glfs02-172-20-1:/data/brick01/cwsvol01
> Status: Transport endpoint is not connected
> Total Number of entries: -
> Number of entries in heal pending: -
> Number of entries in split-brain: -
> Number of entries possibly healing: -
>
> Brick glfs01-172-20-1:/data/brick01/cwsvol01
> Status: Connected
> Total Number of entries: 810
> Number of entries in heal pending: 810
> Number of entries in split-brain: 0
> Number of entries possibly healing: 0
>
> Brick glfsarb01-172-20-1:/data/arb01/cwsvol01
> Status: Connected
> Total Number of entries: 810
> Number of entries
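The rm-and-recreate workaround for stuck symlinks mentioned in the reply can be sketched as below. This is a hypothetical helper, not anything gluster ships; it would be run against a path on the fuse mount so the volume records a fresh entry instead of trying to heal the stuck one.

```python
# Sketch of the rm-and-recreate symlink workaround described above.
# Run against the fuse mount point, not a brick.
import os

def recreate_symlink(link_path):
    """Remove a symlink and recreate it with the same target."""
    target = os.readlink(link_path)   # remember where it pointed
    os.remove(link_path)              # drop the unhealable entry
    os.symlink(target, link_path)     # recreate it on the volume
    return target
```

After recreating, a `gluster vol heal <volname> info summary` should show the entry gone from the pending list once replication catches up.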
[Gluster-users] Gluster 11.0 upgrade report
Just upgraded my test 3-node distributed-replica 9x2 glusterfs to 11.0, and it was a bit rough. After upgrading the 1st node, gluster volume status showed only the bricks on node 1, and gluster peer status showed node 1 rejecting nodes 2 & 3. After upgrading node 2, and then node 3, node 3 remained rejected. I followed the docs for resolving a rejected peer, i.e. clean out /var/lib/glusterd other than the .info file, and was able to peer probe and get node 3 back into the cluster. However, the fuse glusterfs client is now oddly reporting the volume as only 1.1TB, versus the 2.5TB before (9 x 280GB disks). Also, glusterfsd's seem to crash under load testing just as much as on 10, and it created unhealable files, which I'd never seen on 10; I only resolved that with rm -rf on the whole testing directory tree.
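The rejected-peer recovery referred to above clears glusterd's state directory while keeping its identity file. A minimal sketch of the selection step (a hypothetical helper, not an official tool; it only computes what the documented procedure would delete, with glusterd stopped, and does not touch the system):

```python
# Sketch of the "resolve rejected peer" cleanup described above: everything
# under /var/lib/glusterd is removed EXCEPT glusterd.info, then glusterd is
# restarted and the peer re-probed from a good node. This helper only
# computes the deletion list.
import os

KEEP = {"glusterd.info"}

def cleanup_candidates(state_dir):
    """Return paths under state_dir that the procedure would delete."""
    return sorted(
        os.path.join(state_dir, name)
        for name in os.listdir(state_dir)
        if name not in KEEP
    )
```

The remaining steps (stop glusterd, delete the candidates, start glusterd, `gluster peer probe <good-node>`, restart once more) follow the upstream rejected-peer documentation.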
[Gluster-users] glusterfs v10.3 crashing
I've started seeing lots of crashes recently after having a stable gluster for a couple of years. I upgraded from 9.6 to 10.3 in hopes of more stability and got 3 crashes yesterday. All 3 seemed to have the rpc_transport_unref line near the top. Not sure what's going on, but it's making the filesystem unusable:

Crash1
Program terminated with signal SIGBUS, Bus error.
#0  0x7f0d66b4e7aa in __gf_free (free_ptr=0x7f0d58153678) at mem-pool.c:363
363  mem-pool.c: No such file or directory.
[Current thread is 1 (Thread 0x7f0d53fff700 (LWP 1664975))]
(gdb) where
#0  0x7f0d66b4e7aa in __gf_free (free_ptr=0x7f0d58153678) at mem-pool.c:363
#1  __gf_free (free_ptr=0x7f0d58153678) at mem-pool.c:332
#2  0x7f0d66ad04fa in rpc_transport_unref () from /usr/lib/x86_64-linux-gnu/libgfrpc.so.0
#3  0x7f0d66b7a71d in event_dispatch_epoll_handler (event=0x7f0d53ffe054, event_pool=0x556c97385518) at event-epoll.c:638
#4  event_dispatch_epoll_worker (data=0x7f0d54006928) at event-epoll.c:749
#5  0x7f0d66a90ea7 in start_thread (arg=) at pthread_create.c:477
#6  0x7f0d669b0a2f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Crash2:
Core was generated by /usr/sbin/glusterfsd
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x in ?? ()
[Current thread is 1 (Thread 0x7f26fc1bc700 (LWP 1731799))]
(gdb) where
#0  0x in ?? ()
#1  0x7f270957f4a5 in rpc_transport_unref (this=this@entry=0x7f26f40a0ad8) at rpc-transport.c:501
#2  0x7f26feec3f4b in server_process_event_upcall (this=this@entry=0x7f26f002b0f8, data=data@entry=0x7f26fc1b9a70) at server.c:1499
#3  0x7f26feec4964 in server_notify (this=0x7f26f002b0f8, event=19, data=0x7f26fc1b9a70) at server.c:1620
#4  0x7f27095cc244 in xlator_notify (xl=0x7f26f002b0f8, event=19, data=0x7f26fc1b9a70) at xlator.c:711
#5  0x7f270965df2b in default_notify (this=0x7f26f0028fd8, event=, data=0x7f26fc1b9a70) at defaults.c:3414
#6  0x7f26fef95b09 in ?? () from /usr/lib/x86_64-linux-gnu/glusterfs/10.3/xlator/debug/io-stats.so
#7  0x7f27095cc244 in xlator_notify (xl=0x7f26f0028fd8, event=19, data=0x7f26fc1b9a70) at xlator.c:711
#8  0x7f270965df2b in default_notify (this=this@entry=0x7f26f00272c8, event=event@entry=19, data=data@entry=0x7f26fc1b9a70) at defaults.c:3414
#9  0x7f26fefc27ce in notify (this=0x7f26f00272c8, event=19, data=0x7f26fc1b9a70) at quota.c:5017
#10 0x7f27095cc244 in xlator_notify (xl=0x7f26f00272c8, event=19, data=0x7f26fc1b9a70) at xlator.c:711
#11 0x7f270965df2b in default_notify (this=this@entry=0x7f26f0025548, event=event@entry=19, data=data@entry=0x7f26fc1b9a70) at defaults.c:3414
#12 0x7f26fefe47b1 in notify (this=0x7f26f0025548, event=19, data=0x7f26fc1b9a70) at index.c:2664
#13 0x7f27095cc244 in xlator_notify (xl=0x7f26f0025548, event=19, data=0x7f26fc1b9a70) at xlator.c:711
#14 0x7f270965df2b in default_notify (this=this@entry=0x7f26f00239e8, event=event@entry=19, data=data@entry=0x7f26fc1b9a70) at defaults.c:3414
#15 0x7f26feff68c0 in notify (this=0x7f26f00239e8, event=19, data=0x7f26fc1b9a70) at barrier.c:516
#16 0x7f27095cc244 in xlator_notify (xl=0x7f26f00239e8, event=19, data=0x7f26fc1b9a70) at xlator.c:711
#17 0x7f270965df2b in default_notify (this=0x7f26f0021878, event=, data=0x7f26fc1b9a70) at defaults.c:3414
#18 0x7f27095cc244 in xlator_notify (xl=0x7f26f0021878, event=19, data=0x7f26fc1b9a70) at xlator.c:711
#19 0x7f270965df2b in default_notify (this=0x7f26f001fe28, event=, data=0x7f26fc1b9a70) at defaults.c:3414
#20 0x7f27095cc244 in xlator_notify (xl=0x7f26f001fe28, event=19, data=0x7f26fc1b9a70) at xlator.c:711
#21 0x7f270965df2b in default_notify (this=this@entry=0x7f26f001e428, event=event@entry=19, data=data@entry=0x7f26fc1b9a70) at defaults.c:3414
#22 0x7f2704033efd in notify (this=0x7f26f001e428, event=19, data=0x7f26fc1b9a70) at io-threads.c:1339
#23 0x7f27095cc244 in xlator_notify (xl=0x7f26f001e428, event=19, data=0x7f26fc1b9a70) at xlator.c:711
#24 0x7f270965df2b in default_notify (this=this@entry=0x7f26f001c8c8, event=event@entry=19, data=data@entry=0x7f26fc1b9a70) at defaults.c:3414
#25 0x7f270404ccc1 in notify (event=, data=0x7f26fc1b9a70, this=0x7f26f001c8c8) at upcall.c:2368
#26 notify (this=0x7f26f001c8c8, event=, data=0x7f26fc1b9a70) at upcall.c:2355
#27 0x7f270404daa4 in upcall_client_cache_invalidate (this=this@entry=0x7f26f001c8c8, gfid=gfid@entry=0x7f26f82dad34 "y-\300b\261;H6\221\306\003\031\307\306", , up_client_entry=up_client_entry@entry=0x7f2680327d48, flags=flags@entry=24, stbuf=stbuf@entry=0x7f26fc1ba0d0, p_stbuf=p_stbuf@entry=0x0, oldp_stbuf=0x0, xattr=0x0, now=1670297728) at upcall-internal.c:632
#28 0x7f270405233c in upcall_cache_invalidate (frame=0x7f2684be2738, this=0x7f26f001c8c8, client=0x7f26f022e078, inode=, flags=24, stbuf=0x7f26fc1ba0d0, p_stbuf=0x0, oldp_stbuf=0x0, xattr=0x0) at upcall-internal.c:566
#29 0x7f2704041835 in upc
Re: [Gluster-users] gluster volume not healing - remote operation failed
On Wed, Sep 14, 2022 at 7:08 AM wrote:
>
> Hi folks,
>
> my gluster volume isn't fully healing. We had an outage a couple of days ago,
> and all other files got healed successfully. Now - days later - I can
> see there are still two gfids per node remaining in the healing list.
>
> root@storage-001~# for i in `gluster volume list`; do gluster volume heal $i info; done
> Brick storage-003.mydomain.com:/mnt/bricks/g-volume-myvolume
> Status: Connected
> Number of entries: 2
>
> Brick storage-002.mydomain.com:/mnt/bricks/g-volume-myvolume
> Status: Connected
> Number of entries: 2
>
> Brick storage-001.mydomain.com:/mnt/bricks/g-volume-myvolume
> Status: Connected
> Number of entries: 2
>
> In the log I can see that the glustershd process is invoked to heal the
> remaining files but fails with "remote operation failed":
>
> [2022-09-14 10:56:50.007978 +] I [MSGID: 108026] [afr-self-heal-entry.c:1053:afr_selfheal_entry_do] 0-g-volume-myvolume-replicate-0: performing entry selfheal on 48791313-e5e7-44df-bf99-3ebc8d4cf5d5
> [2022-09-14 10:56:50.008428 +] I [MSGID: 108026] [afr-self-heal-entry.c:1053:afr_selfheal_entry_do] 0-g-volume-myvolume-replicate-0: performing entry selfheal on a4babc5a-bd5a-4429-b65e-758651d5727c
> [2022-09-14 10:56:50.015005 +] E [MSGID: 114031] [client-rpc-fops_v2.c:214:client4_0_mkdir_cbk] 0-g-volume-myvolume-client-2: remote operation failed. [{path=(null)}, {errno=22}, {error=Invalid argument}]
> [2022-09-14 10:56:50.015007 +] E [MSGID: 114031] [client-rpc-fops_v2.c:214:client4_0_mkdir_cbk] 0-g-volume-myvolume-client-3: remote operation failed. [{path=(null)}, {errno=22}, {error=Invalid argument}]
> [2022-09-14 10:56:50.015138 +] E [MSGID: 114031] [client-rpc-fops_v2.c:214:client4_0_mkdir_cbk] 0-g-volume-myvolume-client-4: remote operation failed. [{path=(null)}, {errno=22}, {error=Invalid argument}]
> [2022-09-14 10:56:50.614082 +] E [MSGID: 114031] [client-rpc-fops_v2.c:214:client4_0_mkdir_cbk] 0-g-volume-myvolume-client-2: remote operation failed. [{path=(null)}, {errno=22}, {error=Invalid argument}]
> [2022-09-14 10:56:50.614108 +] E [MSGID: 114031] [client-rpc-fops_v2.c:214:client4_0_mkdir_cbk] 0-g-volume-myvolume-client-3: remote operation failed. [{path=(null)}, {errno=22}, {error=Invalid argument}]
> [2022-09-14 10:56:50.614099 +] E [MSGID: 114031] [client-rpc-fops_v2.c:214:client4_0_mkdir_cbk] 0-g-volume-myvolume-client-4: remote operation failed. [{path=(null)}, {errno=22}, {error=Invalid argument}]
> [2022-09-14 10:56:51.619623 +] E [MSGID: 114031] [client-rpc-fops_v2.c:214:client4_0_mkdir_cbk] 0-g-volume-myvolume-client-2: remote operation failed. [{path=(null)}, {errno=22}, {error=Invalid argument}]
> [2022-09-14 10:56:51.619630 +] E [MSGID: 114031] [client-rpc-fops_v2.c:214:client4_0_mkdir_cbk] 0-g-volume-myvolume-client-3: remote operation failed. [{path=(null)}, {errno=22}, {error=Invalid argument}]
> [2022-09-14 10:56:51.619632 +] E [MSGID: 114031] [client-rpc-fops_v2.c:214:client4_0_mkdir_cbk] 0-g-volume-myvolume-client-4: remote operation failed. [{path=(null)}, {errno=22}, {error=Invalid argument}]
>
> The gluster is running with op-version 9 on CentOS. There are no
> entries in split brain.
>
> How can I get these files finally healed?
>
> Thanks in advance.

I've seen this too. The only way I've found to fix it is to run a find under each of my bricks and run getfattr -n trusted.gfid -e hex on all the files, saving the output to a text file and then grepping for the problematic gfids to identify which file it is.
Accessing the files through the gluster fuse mount can sometimes heal them, but I've had symlinks I just had to rm and recreate, and other files that were just failed removals, existing on only one brick, that had to be removed by hand. This happens often enough that I wrote a script that traverses all files under a brick and recursively removes the file in the brick and its gfid version under .glusterfs. I can dig it up if you're still interested; don't have it handy atm.
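The gfid hunting described above relies on how a brick stores gfids: each entry's trusted.gfid maps to a backing entry under .glusterfs/<first two hex chars>/<next two>/<full gfid>. A minimal sketch of that mapping (a hypothetical helper, not part of gluster):

```python
# Sketch: map a gfid (as printed by heal info, or decoded from the
# trusted.gfid xattr) to its backing path under a brick's .glusterfs
# directory. On the brick, regular files are hardlinked there and
# directories are represented by symlinks.
import os

def gfid_backing_path(brick_root, gfid):
    """E.g. 48791313-e5e7-... -> <brick>/.glusterfs/48/79/48791313-e5e7-..."""
    gfid = gfid.lower()
    return os.path.join(brick_root, ".glusterfs", gfid[0:2], gfid[2:4], gfid)

def gfid_from_hex_xattr(hex_value):
    """Turn a getfattr hex value like '0x9c751f5a...' into canonical gfid form."""
    h = hex_value[2:] if hex_value.startswith("0x") else hex_value
    return "-".join([h[0:8], h[8:12], h[12:16], h[16:20], h[20:32]])
```

This is why removing a file by hand on a brick also means removing its backing entry under .glusterfs, as the script mentioned above does.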
[Gluster-users] heal failure after bricks go down
Sequence of events which ended up with 2 bricks down and a heal failure. What should I do about the heal failure, and should I do it before or after replacing the bad disk?

First, gluster 10.2 info:

Volume Name: glust-distr-rep
Type: Distributed-Replicate
Volume ID: fe0ea6f6-2d1b-4b5c-8af5-0c11ea546270
Status: Started
Snapshot Count: 0
Number of Bricks: 9 x 2 = 18
Transport-type: tcp
Bricks:
Brick1: md1cfsd01:/bricks/b0/br
Brick2: md1cfsd02:/bricks/b0/br
Brick3: md1cfsd03:/bricks/b0/br
Brick4: md1cfsd01:/bricks/b3/br
Brick5: md1cfsd02:/bricks/b3/br
Brick6: md1cfsd03:/bricks/b3/br
Brick7: md1cfsd01:/bricks/b1/br
Brick8: md1cfsd02:/bricks/b1/br
Brick9: md1cfsd03:/bricks/b1/br
Brick10: md1cfsd01:/bricks/b4/br
Brick11: md1cfsd02:/bricks/b4/br
Brick12: md1cfsd03:/bricks/b4/br
Brick13: md1cfsd01:/bricks/b2/br
Brick14: md1cfsd02:/bricks/b2/br
Brick15: md1cfsd03:/bricks/b2/br
Brick16: md1cfsd01:/bricks/b5/br
Brick17: md1cfsd02:/bricks/b5/br
Brick18: md1cfsd03:/bricks/b5/br
Options Reconfigured:
performance.md-cache-statfs: on
cluster.server-quorum-type: server
cluster.min-free-disk: 15
storage.batch-fsync-delay-usec: 0
user.smb: enable
features.cache-invalidation: on
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet

Fun started with a brick (d02:b5) crashing:

[2022-08-02 18:59:29.417147 +] W [rpcsvc.c:1323:rpcsvc_callback_submit] 0-rpcsvc: transmission of rpc-request failed
pending frames:
frame : type(1) op(WRITE)
frame : type(1) op(WRITE)
frame : type(1) op(WRITE)
patchset: git://git.gluster.org/glusterfs.git
signal received: 7
time of crash: 2022-08-02 18:59:29 +
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 10.2
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x28a54)[0x7fefb20f7a54]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_print_trace+0x700)[0x7fefb20fffc0]
/lib/x86_64-linux-gnu/libc.so.6(+0x3bd60)[0x7fefb1ecdd60]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(__gf_free+0x5a)[0x7fefb211c7aa]
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_transport_unref+0x9a)[0x7fefb209e4fa]
/usr/lib/x86_64-linux-gnu/glusterfs/10.2/xlator/protocol/server.so(+0xaf4b)[0x7fefac1fff4b]
/usr/lib/x86_64-linux-gnu/glusterfs/10.2/xlator/protocol/server.so(+0xb964)[0x7fefac200964]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(xlator_notify+0x34)[0x7fefb20eb244]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_notify+0x1ab)[0x7fefb217cf2b]
...

Then a few hours later a read error on a different brick (b2) on the same host:

[2022-08-02 22:04:17.808970 +] E [MSGID: 113040] [posix-inode-fd-ops.c:1758:posix_readv] 0-glust-distr-rep-posix: read failed on gfid=16b51498-966e-4546-b561-24b0062f4324, fd=0x7ff9f00d6b08, offset=663314432 size=16384, buf=0x7ff9fc0f7000 [Input/output error]
[2022-08-02 22:04:17.809057 +] E [MSGID: 115068] [server-rpc-fops_v2.c:1369:server4_readv_cbk] 0-glust-distr-rep-server: READ info [{frame=1334746}, {READV_fd_no=4}, {uuid_utoa=16b51498-966e-4546-b561-24b0062f4324}, {client=CTX_ID:6d7535af-769c-4223-aad0-79acffa836ed-GRAPH_ID:0-PID:1414-HOST:r4-16-PC_NAME:glust-distr-rep-client-13-RECON_NO:-1}, {error-xlator=glust-distr-rep-posix}, {errno=5}, {error=Input/output error}]

This looks like a real hardware error:

[Tue Aug 2 18:03:48 2022] megaraid_sas :03:00.0: 6293 (712778647s/0x0002/FATAL) - Unrecoverable medium error during recovery on PD 04(e0x20/s4) at 1d267163
[Tue Aug 2 18:03:49 2022] sd 0:2:3:0: [sdd] tag#435 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK cmd_age=3s
[Tue Aug 2 18:03:49 2022] sd 0:2:3:0: [sdd] tag#435 CDB: Read(10) 28 00 1d 26 70 78 00 01 00 00
[Tue Aug 2 18:03:49 2022] blk_update_request: I/O error, dev sdd, sector 489058424 op 0x0:(READ) flags 0x80700 phys_seg 9 prio class 0

This morning, noticing both b2 & b5 were offline, I systemctl stopped and started glusterd to restart the bricks. All bricks are now up:

Status of volume: glust-distr-rep
Gluster process                             TCP Port  RDMA Port  Online  Pid
--
Brick md1cfsd01:/bricks/b0/br               55386     0          Y       2047
Brick md1cfsd02:/bricks/b0/br               59983     0          Y       3036416
Brick md1cfsd03:/bricks/b0/br               58028     0          Y       2014
Brick md1cfsd01:/bricks/b3/br               59454     0          Y       2041
Brick md1cfsd02:/bricks/b3/br               52352     0          Y       3036421
Brick md1cfsd03:/bricks/b3/br               56786     0          Y       2017
Brick md1cfsd01:/bricks/b1/br               59885     0          Y       2040
Brick md1cfsd02:/bricks/b1/br               55148     0          Y       3036434
Brick md1cfsd03:/bricks/b1/br               52422     0          Y       2068
Brick md1cfsd01:/bricks/b4/br               56378     0          Y       2099
Brick md1cfsd02:/bricks/b4/br               60152     0          Y       303647
Re: [Gluster-users] GlusterFS 9 and Debian 11
It's built for Debian. Can't speak to the docs, but an apt repo is available: https://download.nfs-ganesha.org/3/3.5/Debian/

On Mon, Sep 27, 2021 at 3:53 AM Eliyahu Rosenberg wrote:
>
> Since it seems there are after all some Debian (/debian-based) users on this
> list, can I hijack this thread just a bit and ask about ganesha and glusterfs?
> Is that not built for Debian, or is it included in the main package?
>
> I ask because, as far as I can tell, docs on doing gluster+ganesha refer to
> rpms that don't seem to have deb equivalents, and commands referred to in the
> docs also don't seem to exist for me.
>
> Thanks!
> Eli
>
> On Wed, Sep 22, 2021 at 3:40 PM Kaleb Keithley wrote:
>>
>> On Wed, Sep 22, 2021 at 7:51 AM Taste-Of-IT wrote:
>>>
>>> Hi,
>>>
>>> i installed fresh Debian 11 stable and use the latest GlusterFS sources. At
>>> installing glusterfs-server i got an error about the missing libreadline7 package, which
>>> is not in Debian 11.
>>>
>>> Is GF 9 not Debian 11 ready?
>>
>> Our Debian 11 box has readline-common 8.1-1 and libreadline8 8.1-1, and
>> glusterfs 9 builds fine for us.
>>
>> What "latest sources" are you using?
>>
>> --
>>
>> Kaleb
Re: [Gluster-users] Duplicate files after 8.2 -> 8.3 upgrade
Googling around, I see this has been happening for years, with apparently no one ever understanding why. The files are now on 2 bricks of each of the 3 replica nodes in my cluster, and the gfids all appear identical, per below. I guess I'll plan on deleting a copy directly under one of the bricks after the heal finishes. I'll be happy to provide more info or logs if someone is interested in looking further.

pdsh -w glust0[5-7] getfattr -m . -d -e hex /bricks/b*/br/ | dshbak | grep gfid
trusted.gfid=0x9c751f5a7f16490dacda32cb7403ccc1
trusted.gfid2path.f2a6b5d7c793e21e=0x39336663336262342d613161332d343039662d613061312d3661646361316232326231662f323139323332335f4252434156322d312d4845524544494356365f53696d62615f3835353233615f4c616e65305f4944543331342e746172
trusted.gfid=0x9c751f5a7f16490dacda32cb7403ccc1
trusted.gfid2path.f2a6b5d7c793e21e=0x39336663336262342d613161332d343039662d613061312d3661646361316232326231662f323139323332335f4252434156322d312d4845524544494356365f53696d62615f3835353233615f4c616e65305f4944543331342e746172
trusted.gfid=0x9c751f5a7f16490dacda32cb7403ccc1
trusted.gfid2path.f2a6b5d7c793e21e=0x39336663336262342d613161332d343039662d613061312d3661646361316232326231662f323139323332335f4252434156322d312d4845524544494356365f53696d62615f3835353233615f4c616e65305f4944543331342e746172
trusted.gfid=0x9c751f5a7f16490dacda32cb7403ccc1
trusted.gfid2path.f2a6b5d7c793e21e=0x39336663336262342d613161332d343039662d613061312d3661646361316232326231662f323139323332335f4252434156322d312d4845524544494356365f53696d62615f3835353233615f4c616e65305f4944543331342e746172
trusted.gfid=0x9c751f5a7f16490dacda32cb7403ccc1
trusted.gfid2path.f2a6b5d7c793e21e=0x39336663336262342d613161332d343039662d613061312d3661646361316232326231662f323139323332335f4252434156322d312d4845524544494356365f53696d62615f3835353233615f4c616e65305f4944543331342e746172
trusted.gfid=0x9c751f5a7f16490dacda32cb7403ccc1
trusted.gfid2path.f2a6b5d7c793e21e=0x39336663336262342d613161332d343039662d613061312d3661646361316232326231662f323139323332335f4252434156322d312d4845524544494356365f53696d62615f3835353233615f4c616e65305f4944543331342e746172

On Fri, Jan 8, 2021 at 1:29 PM Eli V wrote:
>
> I just upgraded a replica 3 gluster from 8.2 -> 8.3, rebooting
> the nodes one by one. The cluster was completely idle during this
> time. After each reboot things seemed fine and a gluster volume heal
> vol & info reported no issues. However, about an hour later I started
> seeing duplicate files in ls listings of some of the directories, and
> doing a heal info now shows files undergoing healing. The list of
> healing files does not contain all the duplicates, however. Any ideas
> what's going on, or pointers to debug further?
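The trusted.gfid2path.* values in output like the above are just hex-encoded "<parent-gfid>/<basename>" strings, so they can be decoded to see which file an entry refers to without walking the brick. A small sketch (hypothetical helper, not part of gluster):

```python
# Sketch: decode a trusted.gfid2path.* xattr value, which is the
# hex-encoded string "<parent-directory-gfid>/<basename>".
def decode_gfid2path(hex_value):
    """Return (parent_gfid, basename) from a '0x...'-style gfid2path value."""
    h = hex_value[2:] if hex_value.startswith("0x") else hex_value
    parent_gfid, _, name = bytes.fromhex(h).decode("utf-8").partition("/")
    return parent_gfid, name
```

Decoding the value above yields the gfid of the parent directory plus the duplicated file's name, which narrows down which directory to inspect on each brick.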
[Gluster-users] Duplicate files after 8.2 -> 8.3 upgrade
I just upgraded a replica 3 gluster from 8.2 -> 8.3, rebooting the nodes one by one. The cluster was completely idle during this time. After each reboot things seemed fine, and a gluster volume heal vol & info reported no issues. However, about an hour later I started seeing duplicate files in ls listings of some of the directories, and doing a heal info now shows files undergoing healing. The list of healing files does not contain all the duplicates, however. Any ideas what's going on, or pointers to debug further?
[Gluster-users] Docs on gluster parameters
I think docs.gluster.org needs a section on the available parameters, especially considering how important some of them can be. For example, a google for performance.parallel-readdir or features.cache-invalidation only seems to turn up some hits in the release notes on docs.gluster.org. I wouldn't expect a new user to have to go read the release notes for all previous releases to understand the importance of these parameters, or what parameters even exist.
Re: [Gluster-users] missing files on FUSE mount
On Tue, Oct 20, 2020 at 8:41 AM Martín Lorenzo wrote:
>
> Hi, I have the following problem: I have a distributed replicated cluster set
> up with samba and CTDB, over fuse mount points.
> I am having inconsistencies across the FUSE mounts; users report that files
> are disappearing after being copied/moved. I take a look at the mount points
> on each node, and they don't display the same data.
>
> Faulty mount point:
> [root@gluster6 ARRIBA GENTE martes 20 de octubre]# ll
> ls: cannot access PANEO VUELTA A CLASES CON TAPABOCAS.mpg: No such file or directory
> ls: cannot access PANEO NIÑOS ESCUELAS CON TAPABOCAS.mpg: No such file or directory
> total 633723
> drwxr-xr-x. 5 arribagente PN       4096 Oct 19 10:52 COMERCIAL AG martes 20 de octubre
> -rw-r--r--. 1 arribagente PN  648927236 Jun  3 07:16 PANEO FACHADA PALACIO LEGISLATIVO DRONE DIA Y NOCHE.mpg
> -?? ? ? ? ?? PANEO NIÑOS ESCUELAS CON TAPABOCAS.mpg
> -?? ? ? ? ?? PANEO VUELTA A CLASES CON TAPABOCAS.mpg
>
> ### healthy mount point ###
> [root@gluster7 ARRIBA GENTE martes 20 de octubre]# ll
> total 3435596
> drwxr-xr-x. 5 arribagente PN        4096 Oct 19 10:52 COMERCIAL AG martes 20 de octubre
> -rw-r--r--. 1 arribagente PN   648927236 Jun  3 07:16 PANEO FACHADA PALACIO LEGISLATIVO DRONE DIA Y NOCHE.mpg
> -rw-r--r--. 1 arribagente PN  2084415492 Aug 18 09:14 PANEO NIÑOS ESCUELAS CON TAPABOCAS.mpg
> -rw-r--r--. 1 arribagente PN   784701444 Sep  4 07:23 PANEO VUELTA A CLASES CON TAPABOCAS.mpg
>
> - So far the only way to solve this is to create a directory in the healthy
> mount point, on the same path:
> [root@gluster7 ARRIBA GENTE martes 20 de octubre]# mkdir hola
>
> - Then you refresh the other mount point, and the issue is resolved:
> [root@gluster6 ARRIBA GENTE martes 20 de octubre]# ll
> total 3435600
> drwxr-xr-x. 5 arribagente PN        4096 Oct 19 10:52 COMERCIAL AG martes 20 de octubre
> drwxr-xr-x. 2 root        root      4096 Oct 20 08:45 hola
> -rw-r--r--. 1 arribagente PN   648927236 Jun  3 07:16 PANEO FACHADA PALACIO LEGISLATIVO DRONE DIA Y NOCHE.mpg
> -rw-r--r--. 1 arribagente PN  2084415492 Aug 18 09:14 PANEO NIÑOS ESCUELAS CON TAPABOCAS.mpg
> -rw-r--r--. 1 arribagente PN   784701444 Sep  4 07:23 PANEO VUELTA A CLASES CON TAPABOCAS.mpg
>
> Interestingly, the error occurs on the mount point where the files were
> copied. They don't show up as pending heal entries. I have around 15 people
> using them over samba; I have this issue reported about every two days.
>
> I have an older cluster with similar issues, different gluster version, but a
> very similar topology (4 bricks, initially two bricks then expanded).
> Please note, the bricks aren't the same size (but their replicas are), so my
> other suspicion is that rebalancing has something to do with it.
>
> I'm trying to reproduce it over a small virtualized cluster, so far no results.
>
> Here are the cluster details:
> four nodes, replica 2, plus one arbiter hosting 2 bricks.
> I have 2 bricks with ~20 TB capacity and the other pair is ~48 TB.
> Volume Name: tapeless
> Type: Distributed-Replicate
> Volume ID: 53bfa86d-b390-496b-bbd7-c4bba625c956
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 2 x (2 + 1) = 6
> Transport-type: tcp
> Bricks:
> Brick1: gluster6.glustersaeta.net:/data/glusterfs/tapeless/brick_6/brick
> Brick2: gluster7.glustersaeta.net:/data/glusterfs/tapeless/brick_7/brick
> Brick3: kitchen-store.glustersaeta.net:/data/glusterfs/tapeless/brick_1a/brick (arbiter)
> Brick4: gluster12.glustersaeta.net:/data/glusterfs/tapeless/brick_12/brick
> Brick5: gluster13.glustersaeta.net:/data/glusterfs/tapeless/brick_13/brick
> Brick6: kitchen-store.glustersaeta.net:/data/glusterfs/tapeless/brick_2a/brick (arbiter)
> Options Reconfigured:
> features.quota-deem-statfs: on
> performance.client-io-threads: on
> nfs.disable: on
> transport.address-family: inet
> features.quota: on
> features.inode-quota: on
> features.cache-invalidation: on
> features.cache-invalidation-timeout: 600
> performance.cache-samba-metadata: on
> performance.stat-prefetch: on
> performance.cache-invalidation: on
> performance.md-cache-timeout: 600
> network.inode-lru-limit: 20
> performance.nl-cache: on
> performance.nl-cache-timeout: 600
> performance.readdir-ahead: on
> performance.parallel-readdir: on
> performance.cache-size: 1GB
> client.event-threads: 4
> server.event-threads: 4
> performance.normal-prio-threads: 16
> performance.io-thread-count: 32
> performance.write-behind-window-size: 8MB
> storage.batch-fsync-delay-usec: 0
> cluster.data-self-heal: on
> cluster.metadata-self-heal: on
> cluster.entry-self-heal: on
> cluster.self-heal-daemon: on
> performance.write-behind: on
> performance.open-behind: on
>
> Log section from the faulty mount point. I think the [file exists] entries are
> from people trying to copy the missing files over an
[Gluster-users] fuse Stale file handle error
Have a directory in a weird state on a Distributed-Replicate volume; the server is Gluster 7.3, the client is the fuse client 6.6. A script did a mkdir, then tried to mv a file into the new dir, which failed. An ls -l of it from the fuse client gives the stale file handle error and the weird listing:

d? ? ? ? ?? orig

From the bricks themselves the directory exists and looks normal. So what's the proper way to remove this bad directory? Just rmdir on all the bricks directly?

Community Meeting Calendar:
Schedule - Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users
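If the answer ends up being manual removal on the bricks, the directory's entry under each brick's .glusterfs tree (for directories, gluster keeps a gfid-named symlink there) would also need cleaning up. A sketch of computing the paths involved, given the gfid from the directory's trusted.gfid xattr; this is a hypothetical helper under those assumptions, not gluster tooling, and it only lists paths rather than deleting anything:

```python
# Sketch (hypothetical helper): given brick roots and the directory's
# path relative to the volume root, list what a manual cleanup would
# touch on each brick: the directory itself plus, if its trusted.gfid
# is known, the gfid entry under .glusterfs.
import os

def stale_dir_cleanup_paths(brick_roots, rel_path, gfid=None):
    """Return candidate paths to remove, per brick; deletion is left to the admin."""
    targets = []
    for brick in brick_roots:
        targets.append(os.path.join(brick, rel_path))
        if gfid:
            targets.append(
                os.path.join(brick, ".glusterfs", gfid[0:2], gfid[2:4], gfid)
            )
    return targets
```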