On Fri, Oct 21, 2016 at 6:36 PM, Soumya Koduri <skod...@redhat.com> wrote:
> On 10/21/2016 02:03 PM, Xavier Hernandez wrote:
>> Hi Niels,
>>
>> On 21/10/16 10:03, Niels de Vos wrote:
>>> On Fri, Oct 21, 2016 at 09:03:30AM +0200, Xavier Hernandez wrote:
>>>> Hi,
>>>>
>>>> I've just tried Gluster 3.8.5 with Proxmox using gfapi, and I
>>>> consistently see a crash each time an attempt to connect to the
>>>> volume is made.
>>>
>>> Thanks, that is likely the same bug as
>>> https://bugzilla.redhat.com/1379241 .
>>
>> I'm not sure it's the same problem. The crash in my case happens always
>> and immediately. When creating an image, the file is created but its
>> size is 0. The stack trace is also quite different.
>
> Right. The issue reported in bug 1379241 looks like the one we hit with
> client-io-threads enabled (already discussed on gluster-devel). Disabling
> that option may prevent the crash.
> Pranith has sent a fix for it: http://review.gluster.org/#/c/15620/
>
> Thanks,
> Soumya
>
>> Xavi
>>
>>> Satheesaran, could you revert commit 7a50690 from the build that you
>>> were testing, and see if that causes the problem to go away again? Let
>>> me know if you want me to provide RPMs for testing.
>>>
>>> Niels
>>>
>>>> The backtrace of the crash shows this:
>>>>
>>>> #0  pthread_spin_lock () at ../nptl/sysdeps/x86_64/pthread_spin_lock.S:24
>>>> #1  0x00007fe5345776a5 in fd_unref (fd=0x7fe523f7205c) at fd.c:553
>>>> #2  0x00007fe53482ba18 in glfs_io_async_cbk (op_ret=<optimized out>,
>>>>     op_errno=0, frame=<optimized out>, cookie=0x7fe526c67040,
>>>>     iovec=iovec@entry=0x0, count=count@entry=0) at glfs-fops.c:839
>>>> #3  0x00007fe53482beed in glfs_fsync_async_cbk (frame=<optimized out>,
>>>>     cookie=<optimized out>, this=<optimized out>, op_ret=<optimized out>,
>>>>     op_errno=<optimized out>, prebuf=<optimized out>,
>>>>     postbuf=0x7fe5217fe890, xdata=0x0) at glfs-fops.c:1382
>>>> #4  0x00007fe520be2eb7 in ?? ()
>>>>     from /usr/lib/x86_64-linux-gnu/glusterfs/3.8.5/xlator/debug/io-stats.so
>>>> #5  0x00007fe5345d118a in default_fsync_cbk (frame=0x7fe52ceef3ac,
>>>>     cookie=0x560ef95398e8, this=0x8, op_ret=0, op_errno=0, prebuf=0x1,
>>>>     postbuf=0x7fe5217fe890, xdata=0x0) at defaults.c:1508
>>>> #6  0x00007fe5345d118a in default_fsync_cbk (frame=0x7fe52ceef204,
>>>>     cookie=0x560ef95398e8, this=0x8, op_ret=0, op_errno=0, prebuf=0x1,
>>>>     postbuf=0x7fe5217fe890, xdata=0x0) at defaults.c:1508
>>>> #7  0x00007fe525f78219 in dht_fsync_cbk (frame=0x7fe52ceef2d8,
>>>>     cookie=0x560ef95398e8, this=0x0, op_ret=0, op_errno=0,
>>>>     prebuf=0x7fe5217fe820, postbuf=0x7fe5217fe890, xdata=0x0)
>>>>     at dht-inode-read.c:873
>>>> #8  0x00007fe5261bbc7f in client3_3_fsync_cbk (req=0x7fe525f78030
>>>>     <dht_fsync_cbk>, iov=0x7fe526c61040, count=8, myframe=0x7fe52ceef130)
>>>>     at client-rpc-fops.c:975
>>>> #9  0x00007fe5343201f0 in rpc_clnt_handle_reply (clnt=0x18,
>>>>     clnt@entry=0x7fe526fafac0, pollin=0x7fe526c3a1c0) at rpc-clnt.c:791
>>>> #10 0x00007fe53432056c in rpc_clnt_notify (trans=<optimized out>,
>>>>     mydata=0x7fe526fafaf0, event=<optimized out>, data=0x7fe526c3a1c0)
>>>>     at rpc-clnt.c:962
>>>> #11 0x00007fe53431c8a3 in rpc_transport_notify (this=<optimized out>,
>>>>     event=<optimized out>, data=<optimized out>) at rpc-transport.c:541
>>>> #12 0x00007fe5283e8d96 in socket_event_poll_in (this=0x7fe526c69440)
>>>>     at socket.c:2267
>>>> #13 0x00007fe5283eaf37 in socket_event_handler (fd=<optimized out>, idx=5,
>>>>     data=0x7fe526c69440, poll_in=1, poll_out=0, poll_err=0) at socket.c:2397
>>>> #14 0x00007fe5345ab3f6 in event_dispatch_epoll_handler
>>>>     (event=0x7fe5217fecc0, event_pool=0x7fe526ca2040) at event-epoll.c:571
>>>> #15 event_dispatch_epoll_worker (data=0x7fe527c0f0c0) at event-epoll.c:674
>>>> #16 0x00007fe5324140a4 in start_thread (arg=0x7fe5217ff700)
>>>>     at pthread_create.c:309
>>>> #17 0x00007fe53214962d in clone ()
>>>>     at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
>>>>
>>>> The fd being unreferenced contains this:
>>>>
>>>> (gdb) print *fd
>>>> $6 = {
>>>>   pid = 97649,
>>>>   flags = 2,
>>>>   refcount = 0,
>>>>   inode_list = {
>>>>     next = 0x7fe523f7206c,
>>>>     prev = 0x7fe523f7206c
>>>>   },
>>>>   inode = 0x0,
>>>>   lock = {
>>>>     spinlock = 1,
>>>>     mutex = {
>>>>       __data = {
>>>>         __lock = 1,
>>>>         __count = 0,
>>>>         __owner = 0,
>>>>         __nusers = 0,
>>>>         __kind = 0,
>>>>         __spins = 0,
>>>>         __elision = 0,
>>>>         __list = {
>>>>           __prev = 0x0,
>>>>           __next = 0x0
>>>>         }
>>>>       },
>>>>       __size = "\001", '\000' <repeats 38 times>,
>>>>       __align = 1
>>>>     }
>>>>   },
>>>>   _ctx = 0x7fe52ec31c40,
>>>>   xl_count = 11,
>>>>   lk_ctx = 0x7fe526c126a0,
>>>>   anonymous = _gf_false
>>>> }
>>>>
>>>> fd->inode is NULL, explaining the cause of the crash. We also see that
>>>> fd->refcount is already 0. So I'm wondering if this couldn't be an
>>>> extra fd_unref() introduced by that patch.
>>>>
>>>> The crash seems to happen immediately after a graph switch.
>>>>
>>>> Xavi
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel

--
~ Atin (atinm)
_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel