Hello, I'm running Gluster 6.5 on Amazon Linux 2 (a CentOS 7 variant). I have a distributed-replicated volume with sharding enabled for files over 512 MB.
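For reference, the sharding setup is along these lines (volume name `scratch` taken from the log prefix below; the exact commands are a sketch of a typical configuration, not a dump of mine):

```shell
# Enable the shard translator on the volume; files larger than the
# shard block size are split into 512 MB chunks under /.shard.
gluster volume set scratch features.shard on
gluster volume set scratch features.shard-block-size 512MB
```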
I tried issuing an `rm` for a large number of files, and the client seems to crash consistently on a specific file set. I see the following errors in the logs:

[2019-10-21 17:43:19.875880] I [fuse-bridge.c:5142:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.26
[2019-10-21 17:43:19.875896] I [fuse-bridge.c:5753:fuse_graph_sync] 0-fuse: switched to graph 0
[2019-10-21 17:44:16.372054] W [MSGID: 109009] [dht-common.c:2807:dht_lookup_linkfile_cbk] 0-scratch-dht: /.shard/4b6d0aab-aa33-44dd-8d3f-2054712702dd.1: gfid different on data file on scratch-replicate-3, gfid local = 00000000-0000-0000-0000-000000000000, gfid node = 793507db-42e1-4b9e-9ce0-b2c2451f78dd
[2019-10-21 17:44:16.373429] W [MSGID: 109009] [dht-common.c:2562:dht_lookup_everywhere_cbk] 0-scratch-dht: /.shard/4b6d0aab-aa33-44dd-8d3f-2054712702dd.1: gfid differs on subvolume scratch-replicate-3, gfid local = 5c52fe2a-c580-42ae-b2cb-ce3cae39ffeb, gfid node = 793507db-42e1-4b9e-9ce0-b2c2451f78dd
[2019-10-21 17:44:16.373730] E [MSGID: 133010] [shard.c:2326:shard_common_lookup_shards_cbk] 0-scratch-shard: Lookup on shard 1 failed. Base file gfid = 4b6d0aab-aa33-44dd-8d3f-2054712702dd [Stale file handle]

pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(1) op(UNLINK)
frame : type(1) op(UNLINK)
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash:
2019-10-21 17:44:16
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 6.5
/lib64/libglusterfs.so.0(+0x267cc)[0x7f5cc26f47cc]
/lib64/libglusterfs.so.0(gf_print_trace+0x306)[0x7f5cc26ff1e6]
/lib64/libc.so.6(+0x347e0)[0x7f5cc0d037e0]
/lib64/libuuid.so.1(+0x25f0)[0x7f5cc1e5a5f0]
/lib64/libuuid.so.1(+0x2674)[0x7f5cc1e5a674]
/lib64/libglusterfs.so.0(uuid_utoa+0x1c)[0x7f5cc26fe38c]
/usr/lib64/glusterfs/6.5/xlator/features/shard.so(+0xa7f9)[0x7f5cb2af67f9]
/usr/lib64/glusterfs/6.5/xlator/features/shard.so(+0x9758)[0x7f5cb2af5758]
/usr/lib64/glusterfs/6.5/xlator/features/shard.so(+0xa869)[0x7f5cb2af6869]
/usr/lib64/glusterfs/6.5/xlator/features/shard.so(+0x700f)[0x7f5cb2af300f]
/usr/lib64/glusterfs/6.5/xlator/features/shard.so(+0xa95e)[0x7f5cb2af695e]
/usr/lib64/glusterfs/6.5/xlator/features/shard.so(+0xaf8d)[0x7f5cb2af6f8d]
/usr/lib64/glusterfs/6.5/xlator/features/shard.so(+0xb3b3)[0x7f5cb2af73b3]
/usr/lib64/glusterfs/6.5/xlator/features/shard.so(+0xba42)[0x7f5cb2af7a42]
/lib64/libglusterfs.so.0(+0x61170)[0x7f5cc272f170]
/lib64/libc.so.6(+0x49e00)[0x7f5cc0d18e00]
---------

The error is consistent and reproducible. Every time I remount and try to delete, the client crashes again. I'm assuming something is wrong with the shards. How do I correct this, or is this a bug with sharding?

Thanks!
Tim
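For anyone triaging: as I understand the shard translator, shard 0 is the base file at its normal path, and each subsequent 512 MB chunk lives under the hidden /.shard directory as <base-gfid>.<index>, where the index is the byte offset divided by the shard block size. A minimal sketch of that naming (my reading of the layout, not code from the Gluster source):

```python
SHARD_BLOCK_SIZE = 512 * 1024 * 1024  # features.shard-block-size on this volume


def shard_path(base_gfid: str, offset: int) -> str:
    """Return the brick-relative path of the shard holding byte `offset`.

    Shard 0 is the regular file at its own path; shards 1..n are stored
    under /.shard, named <base-gfid>.<index>.
    """
    index = offset // SHARD_BLOCK_SIZE
    if index == 0:
        return "<base file path>"  # shard 0 is the base file itself
    return f"/.shard/{base_gfid}.{index}"


# The failing lookup in the log above is shard 1 of the base file,
# i.e. the chunk starting at byte 512 MB:
print(shard_path("4b6d0aab-aa33-44dd-8d3f-2054712702dd", 512 * 1024 * 1024))
# /.shard/4b6d0aab-aa33-44dd-8d3f-2054712702dd.1
```

So the "gfid differs on subvolume scratch-replicate-3" warnings mean two bricks disagree about the gfid of that one shard file, and the crash happens while shard lookup handles that mismatch.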
________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/118564314

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/118564314

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users