Hi,
yesterday I got a strange crash on almost all bricks, with the same type of
crash repeated each time:

[2015-06-09 18:23:56.407520] I [login.c:81:gf_auth] 0-auth/login: allowed user 
names: c3deedb5-893f-41fb-8c33-9ae23a0e9d27
[2015-06-09 18:23:56.407580] I [server-handshake.c:585:server_setvolume] 
0-atlas-data-01-server: accepted client from 
atlas-storage-10.roma1.infn.it-7546-2015/06/09-18:23:55:618600-atlas-data-01-client-0-0-0
 (version: 3.7.1)
[2015-06-09 18:23:56.407707] I [login.c:81:gf_auth] 0-auth/login: allowed user 
names: c3deedb5-893f-41fb-8c33-9ae23a0e9d27
[2015-06-09 18:23:56.407772] I [server-handshake.c:585:server_setvolume] 
0-atlas-data-01-server: accepted client from 
atlas-storage-09.roma1.infn.it-25429-2015/06/09-18:18:57:328935-atlas-data-01-client-0-0-0
 (version: 3.7.1)
[2015-06-09 18:23:56.415905] I [login.c:81:gf_auth] 0-auth/login: allowed user 
names: c3deedb5-893f-41fb-8c33-9ae23a0e9d27
[2015-06-09 18:23:56.415947] I [server-handshake.c:585:server_setvolume] 
0-atlas-data-01-server: accepted client from 
atlas-storage-10.roma1.infn.it-7530-2015/06/09-18:23:54:608880-atlas-data-01-client-0-0-0
 (version: 3.7.1)
[2015-06-09 18:23:56.433956] E [posix-handle.c:157:posix_make_ancestryfromgfid] 
0-atlas-data-01-posix: could not read the link from the gfid handle 
/bricks/atlas/data01/data/.glusterfs/74/4b/744b7cf0-258f-4dea-b4d9-7001bb21ca56 
(No such file or directory)
[2015-06-09 18:23:56.433954] E [posix-handle.c:157:posix_make_ancestryfromgfid] 
0-atlas-data-01-posix: could not read the link from the gfid handle 
/bricks/atlas/data01/data/.glusterfs/74/4b/744b7cf0-258f-4dea-b4d9-7001bb21ca56 
(No such file or directory)
pending frames:
frame : type(0) op(11)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 
2015-06-09 18:23:56
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7.1
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb2)[0x7f0f6446ed92]
/lib64/libglusterfs.so.0(gf_print_trace+0x32d)[0x7f0f644899ed]
/lib64/libc.so.6(+0x35650)[0x7f0f62e60650]
/usr/lib64/glusterfs/3.7.1/xlator/features/upcall.so(upcall_cache_invalidate+0xb5)[0x7f0f5537cab5]
/usr/lib64/glusterfs/3.7.1/xlator/features/upcall.so(up_readdir_cbk+0x1a2)[0x7f0f55376292]
/usr/lib64/glusterfs/3.7.1/xlator/features/locks.so(pl_readdirp_cbk+0x164)[0x7f0f5558dc94]
/usr/lib64/glusterfs/3.7.1/xlator/features/access-control.so(posix_acl_readdirp_cbk+0x299)[0x7f0f557a6829]
/usr/lib64/glusterfs/3.7.1/xlator/features/bitrot-stub.so(br_stub_readdirp_cbk+0x181)[0x7f0f559b5fb1]
/usr/lib64/glusterfs/3.7.1/xlator/storage/posix.so(posix_readdirp+0x143)[0x7f0f56f0cfc3]
/lib64/libglusterfs.so.0(default_readdirp+0x75)[0x7f0f644736a5]
/lib64/libglusterfs.so.0(default_readdirp+0x75)[0x7f0f644736a5]
/lib64/libglusterfs.so.0(default_readdirp+0x75)[0x7f0f644736a5]
/usr/lib64/glusterfs/3.7.1/xlator/features/bitrot-stub.so(br_stub_readdirp+0x246)[0x7f0f559b0d46]
/usr/lib64/glusterfs/3.7.1/xlator/features/access-control.so(posix_acl_readdirp+0x18d)[0x7f0f557a45cd]
/usr/lib64/glusterfs/3.7.1/xlator/features/locks.so(pl_readdirp+0x14e)[0x7f0f5558c7ee]
/usr/lib64/glusterfs/3.7.1/xlator/features/upcall.so(up_readdirp+0x17a)[0x7f0f5537abfa]
/lib64/libglusterfs.so.0(default_readdirp_resume+0x134)[0x7f0f644809e4]
/lib64/libglusterfs.so.0(call_resume+0x7d)[0x7f0f64498c7d]
/usr/lib64/glusterfs/3.7.1/xlator/performance/io-threads.so(iot_worker+0x123)[0x7f0f5516b353]
/lib64/libpthread.so.0(+0x7df5)[0x7f0f635dadf5]
/lib64/libc.so.6(clone+0x6d)[0x7f0f62f211ad]
---------


I’m not sure whether the missing file is the culprit, but if it is, how can I
fix it? For the moment I’ve recreated the bricks from a backup, so I’m fine,
but it would be nice to know what to do in case it happens again. I still have
the contents of the old crashed brick.
The crash happened the same way every time I restarted glusterd.
I’m using Gluster 3.7.1 on CentOS 7.1, with the following kind of configuration:

# gluster volume info atlas-data-01
 
Volume Name: atlas-data-01
Type: Replicate
Volume ID: 854620a1-3e88-4e76-91ce-486996bf6a12
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: node1:/bricks/atlas/data01/data
Brick2: node2:/bricks/atlas/data01/data
Brick3: node3:/bricks/atlas/data02/data
Options Reconfigured:
features.inode-quota: on
features.quota: on
performance.readdir-ahead: on
nfs.disable: true
server.allow-insecure: on
ganesha.enable: off
nfs-ganesha: disable

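In case it helps anyone look into this: the handle path in the error message
is derived from the gfid itself, with the first two hex byte-pairs of the gfid
becoming subdirectories under the brick’s .glusterfs directory. A quick sketch
of that mapping, assuming a bash shell (the gfid and brick path are the ones
from my log above):

```shell
#!/usr/bin/env bash
# Derive the .glusterfs handle path for a given gfid on a given brick.
# Layout: <brick>/.glusterfs/<first 2 hex chars>/<next 2 hex chars>/<gfid>
gfid="744b7cf0-258f-4dea-b4d9-7001bb21ca56"
brick="/bricks/atlas/data01/data"
handle="$brick/.glusterfs/${gfid:0:2}/${gfid:2:2}/$gfid"

echo "$handle"
# On a healthy brick this should exist: a symlink for a directory gfid,
# or a hard link for a regular file. Here it was missing entirely.
ls -l "$handle" 2>/dev/null || echo "handle missing"
```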

I was playing with NFS-Ganesha and tried to enable it on the volumes (but
failed, as you can see from my other messages). I’m not sure whether that is
related, but all the crashed bricks belonged to the volumes where I had tried
to enable Ganesha.
Thanks,

        Alessandro

_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users