Hi,
The NFS server just crashed again; here is the latest backtrace:
(gdb) bt
#0  0x00007f0b71a9f210 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x00007f0b72c6fcd5 in fd_anonymous (inode=0x0) at fd.c:804
#2  0x00007f0b64ca5787 in shard_common_inode_write_do (frame=0x7f0b707c062c, this=0x7f0b6002ac10) at shard.c:3716
#3  0x00007f0b64ca5a53 in shard_common_inode_write_post_lookup_shards_handler (frame=<optimized out>, this=<optimized out>) at shard.c:3769
#4  0x00007f0b64c9eff5 in shard_common_lookup_shards_cbk (frame=0x7f0b707c062c, cookie=<optimized out>, this=0x7f0b6002ac10, op_ret=0, op_errno=<optimized out>, inode=<optimized out>, buf=0x7f0b51407640, xdata=0x7f0b72f57648, postparent=0x7f0b514076b0) at shard.c:1601
#5  0x00007f0b64efe141 in dht_lookup_cbk (frame=0x7f0b7075fcdc, cookie=<optimized out>, this=<optimized out>, op_ret=0, op_errno=0, inode=0x7f0b5f1d1f58, stbuf=0x7f0b51407640, xattr=0x7f0b72f57648, postparent=0x7f0b514076b0) at dht-common.c:2174
#6  0x00007f0b651871f3 in afr_lookup_done (frame=frame@entry=0x7f0b7079a4c8, this=this@entry=0x7f0b60023ba0) at afr-common.c:1825
#7  0x00007f0b65187b84 in afr_lookup_metadata_heal_check (frame=frame@entry=0x7f0b7079a4c8, this=0x7f0b60023ba0, this@entry=0xca0bd88259f5a800) at afr-common.c:2068
#8  0x00007f0b6518834f in afr_lookup_entry_heal (frame=frame@entry=0x7f0b7079a4c8, this=0xca0bd88259f5a800, this@entry=0x7f0b60023ba0) at afr-common.c:2157
#9  0x00007f0b6518867d in afr_lookup_cbk (frame=0x7f0b7079a4c8, cookie=<optimized out>, this=0x7f0b60023ba0, op_ret=<optimized out>, op_errno=<optimized out>, inode=<optimized out>, buf=0x7f0b564e9940, xdata=0x7f0b72f708c8, postparent=0x7f0b564e99b0) at afr-common.c:2205
#10 0x00007f0b653d6e42 in client3_3_lookup_cbk (req=<optimized out>, iov=<optimized out>, count=<optimized out>, myframe=0x7f0b7076354c) at client-rpc-fops.c:2981
#11 0x00007f0b72a00a30 in rpc_clnt_handle_reply (clnt=clnt@entry=0x7f0b603393c0, pollin=pollin@entry=0x7f0b50c1c2d0) at rpc-clnt.c:764
#12 0x00007f0b72a00cef in rpc_clnt_notify (trans=<optimized out>, mydata=0x7f0b603393f0, event=<optimized out>, data=0x7f0b50c1c2d0) at rpc-clnt.c:925
#13 0x00007f0b729fc7c3 in rpc_transport_notify (this=this@entry=0x7f0b60349040, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry=0x7f0b50c1c2d0) at rpc-transport.c:546
#14 0x00007f0b678c39a4 in socket_event_poll_in (this=this@entry=0x7f0b60349040) at socket.c:2353
#15 0x00007f0b678c65e4 in socket_event_handler (fd=fd@entry=29, idx=idx@entry=17, data=0x7f0b60349040, poll_in=1, poll_out=0, poll_err=0) at socket.c:2466
#16 0x00007f0b72ca0f7a in event_dispatch_epoll_handler (event=0x7f0b564e9e80, event_pool=0x7f0b7349bf20) at event-epoll.c:575
#17 event_dispatch_epoll_worker (data=0x7f0b60152d40) at event-epoll.c:678
#18 0x00007f0b71a9adc5 in start_thread () from /lib64/libpthread.so.0
#19 0x00007f0b713dfced in clone () from /lib64/libc.so.6
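For what it's worth, frames #0 and #1 point at a NULL inode reaching fd_anonymous(): a helper like this typically takes the inode's spinlock before doing anything else, and with inode == 0x0 that lock sits at a near-NULL address, so the fault surfaces inside pthread_spin_lock(). A minimal illustrative sketch of that failure mode (not the actual GlusterFS source; the structure and helper name below are simplified stand-ins):

    #include <pthread.h>
    #include <stddef.h>

    /* Simplified stand-in for the real gluster inode structure. */
    typedef struct sketch_inode {
        pthread_spinlock_t lock;   /* &inode->lock is (almost) NULL when inode is NULL */
        /* ... */
    } sketch_inode_t;

    /* Sketch of an fd_anonymous()-like helper that locks the inode it is
     * given. If the caller hands it inode == NULL, pthread_spin_lock()
     * dereferences a near-NULL address and the process dies with SIGSEGV
     * inside libpthread -- the same shape as frames #0/#1 above. */
    static void *anonymous_fd_sketch(sketch_inode_t *inode)
    {
        pthread_spin_lock(&inode->lock);    /* crashes here when inode == NULL */
        /* ... allocate and initialise an anonymous fd ... */
        pthread_spin_unlock(&inode->lock);
        return NULL;                        /* details omitted in this sketch */
    }

Which suggests the follow-up question is why shard_common_inode_write_do (frame #2) ends up with a NULL inode for one of the shards it just looked up.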


-- 



Respectfully

    Mahdi A. Mahdi

From: mahdi.ad...@outlook.com
To: kdhan...@redhat.com
Date: Mon, 1 Aug 2016 16:31:50 +0300
CC: gluster-users@gluster.org
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash




Many thanks,
here are the results:

(gdb) p cur_block
$15 = 4088
(gdb) p last_block
$16 = 4088
(gdb) p local->first_block
$17 = 4087
(gdb) p odirect
$18 = _gf_false
(gdb) p fd->flags
$19 = 2
(gdb) p local->call_count
$20 = 2
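Some context on those numbers (my interpretation, using the usual shard index arithmetic rather than the exact shard.c code, and taking the 16MB shard-block-size as 16 MiB): first_block = 4087 with last_block = cur_block = 4088 means the write straddles a 16 MiB shard boundary roughly 64 GB into the file, which is consistent with local->call_count = 2, i.e. two shards touched by the same write; fd->flags = 2 is O_RDWR on Linux with no O_DIRECT, matching odirect = _gf_false. A small self-contained check with a hypothetical offset/size pair:

    #include <inttypes.h>
    #include <stdio.h>

    int main(void)
    {
        const uint64_t block_size = 16ULL * 1024 * 1024;   /* features.shard-block-size: 16MB */

        /* Hypothetical write that crosses the block 4087 / 4088 boundary. */
        uint64_t offset = 4088ULL * block_size - 4096;     /* 4 KiB before the boundary */
        uint64_t size   = 8192;                            /* 8 KiB write */

        uint64_t first_block = offset / block_size;              /* -> 4087 */
        uint64_t last_block  = (offset + size - 1) / block_size; /* -> 4088 */
        uint64_t shards      = last_block - first_block + 1;     /* -> 2   */

        printf("first_block=%" PRIu64 " last_block=%" PRIu64 " shards=%" PRIu64 "\n",
               first_block, last_block, shards);
        return 0;
    }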

If you need more core dumps, I have several files I can upload.

-- 



Respectfully

    Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Mon, 1 Aug 2016 18:39:27 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Sorry, I didn't make myself clear. The reason I asked you to do it is that I tried it on my system and I'm not getting the backtrace (it's all question marks).

Attach the core to gdb.
At the gdb prompt, go to frame 2 by typing
(gdb) f 2

There, for each of the variables I asked you to get the values of, type p followed by the variable name.
For instance, to get the value of the variable 'odirect', do this:

(gdb) p odirect

and gdb will print its value for you in response.
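Putting those steps together, a sample session would look roughly like this (the glusterfs binary path and the core file path are placeholders; adjust them to where the gNFS core was dumped on your node):

    $ gdb /usr/sbin/glusterfs /path/to/core.<pid>
    (gdb) bt                    # sanity-check you are on the crashing thread's stack
    (gdb) f 2                   # frame 2 = shard_common_inode_write_do
    (gdb) p cur_block
    (gdb) p last_block
    (gdb) p local->first_block
    (gdb) p odirect
    (gdb) p fd->flags
    (gdb) p local->call_count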

-Krutika

On Mon, Aug 1, 2016 at 4:55 PM, Mahdi Adnan <mahdi.ad...@outlook.com> wrote:



Hi,
How do I get the values of the variables below? I can't get the results from gdb.



-- 



Respectfully

    Mahdi A. Mahdi



From: kdhan...@redhat.com
Date: Mon, 1 Aug 2016 15:51:38 +0530
Subject: Re: [Gluster-users] Gluster 3.7.13 NFS Crash
To: mahdi.ad...@outlook.com
CC: gluster-users@gluster.org

Could you also print and share the values of the following variables from the 
backtrace please:

i. cur_block
ii. last_block
iii. local->first_block
iv. odirect
v. fd->flags
vi. local->call_count
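If it is easier to capture non-interactively, the same values can be dumped in one shot with gdb's batch mode; the binary and core paths below are placeholders, and this is just a sketch of an alternative to the interactive steps, not a required procedure:

    gdb -batch \
        -ex 'frame 2' \
        -ex 'print cur_block'          -ex 'print last_block' \
        -ex 'print local->first_block' -ex 'print odirect' \
        -ex 'print fd->flags'          -ex 'print local->call_count' \
        /usr/sbin/glusterfs /path/to/core.<pid>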

-Krutika

On Sat, Jul 30, 2016 at 5:04 PM, Mahdi Adnan <mahdi.ad...@outlook.com> wrote:



Hi,
I would really appreciate it if someone could help me fix my NFS crash; it's happening a lot and causing many issues for my VMs. The problem is that every few hours the native NFS server crashes and the volume becomes unavailable from the affected node unless I restart glusterd. The volume is used by VMware ESXi as a datastore for its VMs, with the following configuration:

OS: CentOS 7.2
Gluster: 3.7.13

Volume Name: vlm01
Type: Distributed-Replicate
Volume ID: eacd8248-dca3-4530-9aed-7714a5a114f2
Status: Started
Number of Bricks: 7 x 3 = 21
Transport-type: tcp
Bricks:
Brick1: gfs01:/bricks/b01/vlm01
Brick2: gfs02:/bricks/b01/vlm01
Brick3: gfs03:/bricks/b01/vlm01
Brick4: gfs01:/bricks/b02/vlm01
Brick5: gfs02:/bricks/b02/vlm01
Brick6: gfs03:/bricks/b02/vlm01
Brick7: gfs01:/bricks/b03/vlm01
Brick8: gfs02:/bricks/b03/vlm01
Brick9: gfs03:/bricks/b03/vlm01
Brick10: gfs01:/bricks/b04/vlm01
Brick11: gfs02:/bricks/b04/vlm01
Brick12: gfs03:/bricks/b04/vlm01
Brick13: gfs01:/bricks/b05/vlm01
Brick14: gfs02:/bricks/b05/vlm01
Brick15: gfs03:/bricks/b05/vlm01
Brick16: gfs01:/bricks/b06/vlm01
Brick17: gfs02:/bricks/b06/vlm01
Brick18: gfs03:/bricks/b06/vlm01
Brick19: gfs01:/bricks/b07/vlm01
Brick20: gfs02:/bricks/b07/vlm01
Brick21: gfs03:/bricks/b07/vlm01
Options Reconfigured:
performance.readdir-ahead: off
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
performance.strict-write-ordering: on
performance.write-behind: off
cluster.data-self-heal-algorithm: full
cluster.self-heal-window-size: 128
features.shard-block-size: 16MB
features.shard: on
auth.allow: 192.168.221.50,192.168.221.51,192.168.221.52,192.168.221.56,192.168.208.130,192.168.208.131,192.168.208.132,192.168.208.89,192.168.208.85,192.168.208.208.86
network.ping-timeout: 10

Latest backtrace:

(gdb) bt
#0  0x00007f196acab210 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x00007f196be7bcd5 in fd_anonymous (inode=0x0) at fd.c:804
#2  0x00007f195deb1787 in shard_common_inode_write_do (frame=0x7f19699f1164, this=0x7f195802ac10) at shard.c:3716
#3  0x00007f195deb1a53 in shard_common_inode_write_post_lookup_shards_handler (frame=<optimized out>, this=<optimized out>) at shard.c:3769
#4  0x00007f195deaaff5 in shard_common_lookup_shards_cbk (frame=0x7f19699f1164, cookie=<optimized out>, this=0x7f195802ac10, op_ret=0, op_errno=<optimized out>, inode=<optimized out>, buf=0x7f194970bc40, xdata=0x7f196c15451c, postparent=0x7f194970bcb0) at shard.c:1601
#5  0x00007f195e10a141 in dht_lookup_cbk (frame=0x7f196998e7d4, cookie=<optimized out>, this=<optimized out>, op_ret=0, op_errno=0, inode=0x7f195c532b18, stbuf=0x7f194970bc40, xattr=0x7f196c15451c, postparent=0x7f194970bcb0) at dht-common.c:2174
#6  0x00007f195e3931f3 in afr_lookup_done (frame=frame@entry=0x7f196997f8a4, this=this@entry=0x7f1958022a20) at afr-common.c:1825
#7  0x00007f195e393b84 in afr_lookup_metadata_heal_check (frame=frame@entry=0x7f196997f8a4, this=0x7f1958022a20, this@entry=0xe3a929e0b67fa500) at afr-common.c:2068
#8  0x00007f195e39434f in afr_lookup_entry_heal (frame=frame@entry=0x7f196997f8a4, this=0xe3a929e0b67fa500, this@entry=0x7f1958022a20) at afr-common.c:2157
#9  0x00007f195e39467d in afr_lookup_cbk (frame=0x7f196997f8a4, cookie=<optimized out>, this=0x7f1958022a20, op_ret=<optimized out>, op_errno=<optimized out>, inode=<optimized out>, buf=0x7f195effa940, xdata=0x7f196c1853b0, postparent=0x7f195effa9b0) at afr-common.c:2205
#10 0x00007f195e5e2e42 in client3_3_lookup_cbk (req=<optimized out>, iov=<optimized out>, count=<optimized out>, myframe=0x7f196999952c) at client-rpc-fops.c:2981
#11 0x00007f196bc0ca30 in rpc_clnt_handle_reply (clnt=clnt@entry=0x7f19583adaf0, pollin=pollin@entry=0x7f195907f930) at rpc-clnt.c:764
#12 0x00007f196bc0ccef in rpc_clnt_notify (trans=<optimized out>, mydata=0x7f19583adb20, event=<optimized out>, data=0x7f195907f930) at rpc-clnt.c:925
#13 0x00007f196bc087c3 in rpc_transport_notify (this=this@entry=0x7f19583bd770, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry=0x7f195907f930) at rpc-transport.c:546
#14 0x00007f1960acf9a4 in socket_event_poll_in (this=this@entry=0x7f19583bd770) at socket.c:2353
#15 0x00007f1960ad25e4 in socket_event_handler (fd=fd@entry=25, idx=idx@entry=14, data=0x7f19583bd770, poll_in=1, poll_out=0, poll_err=0) at socket.c:2466
#16 0x00007f196beacf7a in event_dispatch_epoll_handler (event=0x7f195effae80, event_pool=0x7f196dbf5f20) at event-epoll.c:575
#17 event_dispatch_epoll_worker (data=0x7f196dc41e10) at event-epoll.c:678
#18 0x00007f196aca6dc5 in start_thread () from /lib64/libpthread.so.0
#19 0x00007f196a5ebced in clone () from /lib64/libc.so.6



The NFS logs and the core dump can be found in the Dropbox link below:
https://db.tt/rZrC9d7f


Thanks in advance.

Respectfully
Mahdi A. Mahdi


_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users
