Thank you. In the meantime, turning off parallel readdir should prevent the first crash.
On 20 June 2018 at 21:42, mohammad kashif <kashif.a...@gmail.com> wrote: > Hi Nithya > > Thanks for the bug report. This new crash happened only once and only at > one client in the last 6 days. I will let you know if it happened again or > more frequently. > > Cheers > > Kashif > > On Wed, Jun 20, 2018 at 12:28 PM, Nithya Balachandran <nbala...@redhat.com > > wrote: > >> Hi Mohammad, >> >> This is a different crash. How often does it happen? >> >> >> We have managed to reproduce the first crash you reported and a bug has >> been filed at [1]. >> We will work on a fix for this. >> >> >> Regards, >> Nithya >> >> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1593199 >> >> >> On 18 June 2018 at 14:09, mohammad kashif <kashif.a...@gmail.com> wrote: >> >>> Hi >>> >>> Problem appeared again after few days. This time, the client >>> is glusterfs-3.10.12-1.el6.x86_64 and performance.parallel-readdir is >>> off. The log level was set to ERROR and I got this log at the time of crash >>> >>> [2018-06-14 08:45:43.551384] E [rpc-clnt.c:365:saved_frames_unwind] >>> (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x153)[0x7fac2e66ce03] >>> (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1e7)[0x7fac2e434867] >>> (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fac2e43497e] >>> (--> >>> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xa5)[0x7fac2e434a45] >>> (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x278)[0x7fac2e434d68] >>> ))))) 0-atlasglust-client-4: forced unwinding frame type(GlusterFS 3.3) >>> op(READDIRP(40)) called at 2018-06-14 08:45:43.483303 (xid=0x7553c7 >>> >>> Core dump was enabled on client so it created a dump. It is here >>> >>> http://www-pnp.physics.ox.ac.uk/~mohammad >>> <http://www-pnp.physics.ox.ac.uk/~mohammad/backtrace.log>/core.1002074 >>> >>> I used a gdb trace using this command >>> >>> gdb /usr/sbin/glusterfs core.1002074 -ex bt -ex quit |& tee >>> backtrace.log_18_16_1 >>> >>> >>> http://www-pnp.physics.ox.ac.uk/~mohammad >>> <http://www-pnp.physics.ox.ac.uk/~mohammad/backtrace.log> >>> /backtrace.log_18_16_1 >>> >>> I haven't used gdb much so let me know if you want me to run gdb in >>> different manner. >>> >>> Thanks >>> >>> Kashif >>> >>> >>> On Mon, Jun 18, 2018 at 6:27 AM, Raghavendra Gowdappa < >>> rgowd...@redhat.com> wrote: >>> >>>> >>>> >>>> On Mon, Jun 18, 2018 at 9:39 AM, Raghavendra Gowdappa < >>>> rgowd...@redhat.com> wrote: >>>> >>>>> >>>>> >>>>> On Mon, Jun 18, 2018 at 8:11 AM, Raghavendra Gowdappa < >>>>> rgowd...@redhat.com> wrote: >>>>> >>>>>> From the bt: >>>>>> >>>>>> #8 0x00007f6ef977e6de in rda_readdirp (frame=0x7f6eec862320, >>>>>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=357, off=2, >>>>>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266 >>>>>> #9 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized >>>>>> out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0, >>>>>> orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at >>>>>> dht-common.c:5388 >>>>>> #10 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec862210, >>>>>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2, >>>>>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266 >>>>>> #11 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized >>>>>> out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0, >>>>>> orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at >>>>>> dht-common.c:5388 >>>>>> #12 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec862100, >>>>>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2, >>>>>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266 >>>>>> #13 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized >>>>>> out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0, >>>>>> orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at >>>>>> dht-common.c:5388 >>>>>> #14 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861ff0, >>>>>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2, >>>>>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266 >>>>>> #15 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized >>>>>> out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0, >>>>>> orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at >>>>>> dht-common.c:5388 >>>>>> #16 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861ee0, >>>>>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2, >>>>>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266 >>>>>> #17 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized >>>>>> out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0, >>>>>> orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at >>>>>> dht-common.c:5388 >>>>>> #18 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861dd0, >>>>>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2, >>>>>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266 >>>>>> #19 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized >>>>>> out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0, >>>>>> orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at >>>>>> dht-common.c:5388 >>>>>> #20 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861cc0, >>>>>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2, >>>>>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266 >>>>>> #21 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized >>>>>> out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0, >>>>>> orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at >>>>>> dht-common.c:5388 >>>>>> #22 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861bb0, >>>>>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2, >>>>>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266 >>>>>> #23 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized >>>>>> out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0, >>>>>> orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at >>>>>> dht-common.c:5388 >>>>>> >>>>>> It looks like an infinite recursion. Note that readdirp is wound to >>>>>> the same subvol (value of "this" is same in all calls to rda_readdirp) at >>>>>> the same offset (of value 2). This may be a bug in DHT (winding down >>>>>> readdirp with wrong offset) or in readdir-ahead (populating incorrect >>>>>> offset values in dentries it returns as readdirp response). >>>>>> >>>>> >>>>> It looks to be a corruption. Value of size argument in rda_readdirp is >>>>> too big (around 127 TB) to be sane. If you've a reproducer, please run it >>>>> in valgrind or ASAN. >>>>> >>>> >>>> >>>> I spoke too early. It could be a negative value and hence it may not be >>>> a corruption. Is it possible to upload the core somewhere? Or better still >>>> access to gdb session with this core would be more helpful. >>>> >>>> >>>>> To make it explicit, ATM its not clear that there is bug in >>>>> readdir-ahead or DHT as it looks to be a memory corruption. Till I get a >>>>> reproducer or valgrind/ASAN output of client process when the issue >>>>> occcurs, I won't be working on this problem. >>>>> >>>>> >>>>>> >>>>>> On Wed, Jun 13, 2018 at 4:29 PM, mohammad kashif < >>>>>> kashif.a...@gmail.com> wrote: >>>>>> >>>>>>> Hi Milind >>>>>>> >>>>>>> Thanks a lot, I manage to run gdb and produced traceback as well. >>>>>>> Its here >>>>>>> >>>>>>> http://www-pnp.physics.ox.ac.uk/~mohammad/backtrace.log >>>>>>> >>>>>>> >>>>>>> I am trying to understand but still not able to make sense out of it. >>>>>>> >>>>>>> Thanks >>>>>>> >>>>>>> Kashif >>>>>>> >>>>>>> On Wed, Jun 13, 2018 at 11:34 AM, Milind Changire < >>>>>>> mchan...@redhat.com> wrote: >>>>>>> >>>>>>>> Kashif, >>>>>>>> FYI: http://debuginfo.centos.org/centos/6/storage/x86_64/ >>>>>>>> >>>>>>>> >>>>>>>> On Wed, Jun 13, 2018 at 3:21 PM, mohammad kashif < >>>>>>>> kashif.a...@gmail.com> wrote: >>>>>>>> >>>>>>>>> Hi Milind >>>>>>>>> >>>>>>>>> There is no glusterfs-debuginfo available for gluster-3.12 from >>>>>>>>> http://mirror.centos.org/centos/6/storage/x86_64/gluster-3.12/ >>>>>>>>> repo. Do you know from where I can get it? >>>>>>>>> Also when I run gdb, it says >>>>>>>>> >>>>>>>>> Missing separate debuginfos, use: debuginfo-install >>>>>>>>> glusterfs-fuse-3.12.9-1.el6.x86_64 >>>>>>>>> >>>>>>>>> I can't find debug package for glusterfs-fuse either >>>>>>>>> >>>>>>>>> Thanks from the pit of despair ;) >>>>>>>>> >>>>>>>>> Kashif >>>>>>>>> >>>>>>>>> >>>>>>>>> On Tue, Jun 12, 2018 at 5:01 PM, mohammad kashif < >>>>>>>>> kashif.a...@gmail.com> wrote: >>>>>>>>> >>>>>>>>>> Hi Milind >>>>>>>>>> >>>>>>>>>> I will send you links for logs. >>>>>>>>>> >>>>>>>>>> I collected these core dumps at client and there is no glusterd >>>>>>>>>> process running on client. >>>>>>>>>> >>>>>>>>>> Kashif >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Tue, Jun 12, 2018 at 4:14 PM, Milind Changire < >>>>>>>>>> mchan...@redhat.com> wrote: >>>>>>>>>> >>>>>>>>>>> Kashif, >>>>>>>>>>> Could you also send over the client/mount log file as Vijay >>>>>>>>>>> suggested ? >>>>>>>>>>> Or maybe the lines with the crash backtrace lines >>>>>>>>>>> >>>>>>>>>>> Also, you've mentioned that you straced glusterd, but when you >>>>>>>>>>> ran gdb, you ran it over /usr/sbin/glusterfs >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Tue, Jun 12, 2018 at 8:19 PM, Vijay Bellur < >>>>>>>>>>> vbel...@redhat.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Tue, Jun 12, 2018 at 7:40 AM, mohammad kashif < >>>>>>>>>>>> kashif.a...@gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi Milind >>>>>>>>>>>>> >>>>>>>>>>>>> The operating system is Scientific Linux 6 which is based on >>>>>>>>>>>>> RHEL6. The cpu arch is Intel x86_64. >>>>>>>>>>>>> >>>>>>>>>>>>> I will send you a separate email with link to core dump. >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> You could also grep for crash in the client log file and the >>>>>>>>>>>> lines following crash would have a backtrace in most cases. >>>>>>>>>>>> >>>>>>>>>>>> HTH, >>>>>>>>>>>> Vijay >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks for your help. >>>>>>>>>>>>> >>>>>>>>>>>>> Kashif >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Tue, Jun 12, 2018 at 3:16 PM, Milind Changire < >>>>>>>>>>>>> mchan...@redhat.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Kashif, >>>>>>>>>>>>>> Could you share the core dump via Google Drive or something >>>>>>>>>>>>>> similar >>>>>>>>>>>>>> >>>>>>>>>>>>>> Also, let me know the CPU arch and OS Distribution on which >>>>>>>>>>>>>> you are running gluster. >>>>>>>>>>>>>> >>>>>>>>>>>>>> If you've installed the glusterfs-debuginfo package, you'll >>>>>>>>>>>>>> also get the source lines in the backtrace via gdb >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Tue, Jun 12, 2018 at 5:59 PM, mohammad kashif < >>>>>>>>>>>>>> kashif.a...@gmail.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Milind, Vijay >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, I have some more information now as I straced >>>>>>>>>>>>>>> glusterd on client >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 138544 0.000131 mprotect(0x7f2f70785000, 4096, >>>>>>>>>>>>>>> PROT_READ|PROT_WRITE) = 0 <0.000026> >>>>>>>>>>>>>>> 138544 0.000128 mprotect(0x7f2f70786000, 4096, >>>>>>>>>>>>>>> PROT_READ|PROT_WRITE) = 0 <0.000027> >>>>>>>>>>>>>>> 138544 0.000126 mprotect(0x7f2f70787000, 4096, >>>>>>>>>>>>>>> PROT_READ|PROT_WRITE) = 0 <0.000027> >>>>>>>>>>>>>>> 138544 0.000124 --- SIGSEGV {si_signo=SIGSEGV, >>>>>>>>>>>>>>> si_code=SEGV_ACCERR, si_addr=0x7f2f7c60ef88} --- >>>>>>>>>>>>>>> 138544 0.000051 --- SIGSEGV {si_signo=SIGSEGV, >>>>>>>>>>>>>>> si_code=SI_KERNEL, si_addr=0} --- >>>>>>>>>>>>>>> 138551 0.105048 +++ killed by SIGSEGV (core dumped) +++ >>>>>>>>>>>>>>> 138550 0.000041 +++ killed by SIGSEGV (core dumped) +++ >>>>>>>>>>>>>>> 138547 0.000008 +++ killed by SIGSEGV (core dumped) +++ >>>>>>>>>>>>>>> 138546 0.000007 +++ killed by SIGSEGV (core dumped) +++ >>>>>>>>>>>>>>> 138545 0.000007 +++ killed by SIGSEGV (core dumped) +++ >>>>>>>>>>>>>>> 138544 0.000008 +++ killed by SIGSEGV (core dumped) +++ >>>>>>>>>>>>>>> 138543 0.000007 +++ killed by SIGSEGV (core dumped) +++ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> As for I understand that somehow gluster is trying to access >>>>>>>>>>>>>>> memory in appropriate manner and kernel sends SIGSEGV >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I also got the core dump. I am trying gdb first time so I am >>>>>>>>>>>>>>> not sure whether I am using it correctly >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> gdb /usr/sbin/glusterfs core.138536 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> It just tell me that program terminated with signal 11, >>>>>>>>>>>>>>> segmentation fault . >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The problem is not limited to one client but happening to >>>>>>>>>>>>>>> many clients. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I will really appreciate any help as whole file system has >>>>>>>>>>>>>>> become unusable >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Kashif >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Tue, Jun 12, 2018 at 12:26 PM, Milind Changire < >>>>>>>>>>>>>>> mchan...@redhat.com> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Kashif, >>>>>>>>>>>>>>>> You can change the log level by: >>>>>>>>>>>>>>>> $ gluster volume set <vol> diagnostics.brick-log-level TRACE >>>>>>>>>>>>>>>> $ gluster volume set <vol> diagnostics.client-log-level >>>>>>>>>>>>>>>> TRACE >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> and see how things fare >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> If you want fewer logs you can change the log-level to >>>>>>>>>>>>>>>> DEBUG instead of TRACE. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Tue, Jun 12, 2018 at 3:37 PM, mohammad kashif < >>>>>>>>>>>>>>>> kashif.a...@gmail.com> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hi Vijay >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Now it is unmounting every 30 mins ! >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The server log at >>>>>>>>>>>>>>>>> /var/log/glusterfs/bricks/glusteratlas-brics001-gv0.log >>>>>>>>>>>>>>>>> have this line only >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 2018-06-12 09:53:19.303102] I [MSGID: 115013] >>>>>>>>>>>>>>>>> [server-helpers.c:289:do_fd_cleanup] 0-atlasglust-server: >>>>>>>>>>>>>>>>> fd cleanup on /atlas/atlasdata/zgubic/hmumu/ >>>>>>>>>>>>>>>>> histograms/v14.3/Signal >>>>>>>>>>>>>>>>> [2018-06-12 09:53:19.306190] I [MSGID: 101055] >>>>>>>>>>>>>>>>> [client_t.c:443:gf_client_unref] 0-atlasglust-server: >>>>>>>>>>>>>>>>> Shutting down connection <server-name> >>>>>>>>>>>>>>>>> -2224879-2018/06/12-09:51:01:4 >>>>>>>>>>>>>>>>> 60889-atlasglust-client-0-0-0 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> There is no other information. Is there any way to >>>>>>>>>>>>>>>>> increase log verbosity? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> on the client >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 2018-06-12 09:51:01.744980] I [MSGID: 114057] >>>>>>>>>>>>>>>>> [client-handshake.c:1478:select_server_supported_programs] >>>>>>>>>>>>>>>>> 0-atlasglust-client-5: Using Program GlusterFS 3.3, Num >>>>>>>>>>>>>>>>> (1298437), Version >>>>>>>>>>>>>>>>> (330) >>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.746508] I [MSGID: 114046] >>>>>>>>>>>>>>>>> [client-handshake.c:1231:client_setvolume_cbk] >>>>>>>>>>>>>>>>> 0-atlasglust-client-5: Connected to atlasglust-client-5, >>>>>>>>>>>>>>>>> attached to remote >>>>>>>>>>>>>>>>> volume '/glusteratlas/brick006/gv0'. >>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.746543] I [MSGID: 114047] >>>>>>>>>>>>>>>>> [client-handshake.c:1242:client_setvolume_cbk] >>>>>>>>>>>>>>>>> 0-atlasglust-client-5: Server and Client lk-version numbers >>>>>>>>>>>>>>>>> are not same, >>>>>>>>>>>>>>>>> reopening the fds >>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.746814] I [MSGID: 114035] >>>>>>>>>>>>>>>>> [client-handshake.c:202:client_set_lk_version_cbk] >>>>>>>>>>>>>>>>> 0-atlasglust-client-5: Server lk version = 1 >>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.748449] I [MSGID: 114057] >>>>>>>>>>>>>>>>> [client-handshake.c:1478:select_server_supported_programs] >>>>>>>>>>>>>>>>> 0-atlasglust-client-6: Using Program GlusterFS 3.3, Num >>>>>>>>>>>>>>>>> (1298437), Version >>>>>>>>>>>>>>>>> (330) >>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.750219] I [MSGID: 114046] >>>>>>>>>>>>>>>>> [client-handshake.c:1231:client_setvolume_cbk] >>>>>>>>>>>>>>>>> 0-atlasglust-client-6: Connected to atlasglust-client-6, >>>>>>>>>>>>>>>>> attached to remote >>>>>>>>>>>>>>>>> volume '/glusteratlas/brick007/gv0'. >>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.750261] I [MSGID: 114047] >>>>>>>>>>>>>>>>> [client-handshake.c:1242:client_setvolume_cbk] >>>>>>>>>>>>>>>>> 0-atlasglust-client-6: Server and Client lk-version numbers >>>>>>>>>>>>>>>>> are not same, >>>>>>>>>>>>>>>>> reopening the fds >>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.750503] I [MSGID: 114035] >>>>>>>>>>>>>>>>> [client-handshake.c:202:client_set_lk_version_cbk] >>>>>>>>>>>>>>>>> 0-atlasglust-client-6: Server lk version = 1 >>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.752207] I >>>>>>>>>>>>>>>>> [fuse-bridge.c:4205:fuse_init] 0-glusterfs-fuse: FUSE inited >>>>>>>>>>>>>>>>> with protocol >>>>>>>>>>>>>>>>> versions: glusterfs 7.24 kernel 7.14 >>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.752261] I >>>>>>>>>>>>>>>>> [fuse-bridge.c:4835:fuse_graph_sync] 0-fuse: switched to >>>>>>>>>>>>>>>>> graph 0 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> is there a problem with server and client 1k version? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks for your help. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Kashif >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Mon, Jun 11, 2018 at 11:52 PM, Vijay Bellur < >>>>>>>>>>>>>>>>> vbel...@redhat.com> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Mon, Jun 11, 2018 at 8:50 AM, mohammad kashif < >>>>>>>>>>>>>>>>>> kashif.a...@gmail.com> wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Hi >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Since I have updated our gluster server and client to >>>>>>>>>>>>>>>>>>> latest version 3.12.9-1, I am having this issue of gluster >>>>>>>>>>>>>>>>>>> getting >>>>>>>>>>>>>>>>>>> unmounted from client very regularly. It was not a problem >>>>>>>>>>>>>>>>>>> before update. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Its a distributed file system with no replication. We >>>>>>>>>>>>>>>>>>> have seven servers totaling around 480TB data. Its 97% full. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I am using following config on server >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> gluster volume set atlasglust >>>>>>>>>>>>>>>>>>> features.cache-invalidation on >>>>>>>>>>>>>>>>>>> gluster volume set atlasglust >>>>>>>>>>>>>>>>>>> features.cache-invalidation-timeout 600 >>>>>>>>>>>>>>>>>>> gluster volume set atlasglust performance.stat-prefetch >>>>>>>>>>>>>>>>>>> on >>>>>>>>>>>>>>>>>>> gluster volume set atlasglust >>>>>>>>>>>>>>>>>>> performance.cache-invalidation on >>>>>>>>>>>>>>>>>>> gluster volume set atlasglust >>>>>>>>>>>>>>>>>>> performance.md-cache-timeout 600 >>>>>>>>>>>>>>>>>>> gluster volume set atlasglust >>>>>>>>>>>>>>>>>>> performance.parallel-readdir on >>>>>>>>>>>>>>>>>>> gluster volume set atlasglust performance.cache-size 1GB >>>>>>>>>>>>>>>>>>> gluster volume set atlasglust >>>>>>>>>>>>>>>>>>> performance.client-io-threads on >>>>>>>>>>>>>>>>>>> gluster volume set atlasglust cluster.lookup-optimize on >>>>>>>>>>>>>>>>>>> gluster volume set atlasglust performance.stat-prefetch >>>>>>>>>>>>>>>>>>> on >>>>>>>>>>>>>>>>>>> gluster volume set atlasglust client.event-threads 4 >>>>>>>>>>>>>>>>>>> gluster volume set atlasglust server.event-threads 4 >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> clients are mounted with this option >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> defaults,direct-io-mode=disabl >>>>>>>>>>>>>>>>>>> e,attribute-timeout=600,entry- >>>>>>>>>>>>>>>>>>> timeout=600,negative-timeout=600,fopen-keep-cache,rw,_netdev >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I can't see anything in the log file. Can someone >>>>>>>>>>>>>>>>>>> suggest that how to troubleshoot this issue? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Can you please share the log file? Checking for messages >>>>>>>>>>>>>>>>>> related to disconnections/crashes in the log file would be a >>>>>>>>>>>>>>>>>> good way to >>>>>>>>>>>>>>>>>> start troubleshooting the problem. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> Vijay >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>>>> Gluster-users mailing list >>>>>>>>>>>>>>>>> Gluster-users@gluster.org >>>>>>>>>>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>> Milind >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> Milind >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Milind >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Milind >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> _______________________________________________ >>>>>>> Gluster-users mailing list >>>>>>> Gluster-users@gluster.org >>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users >>>>>>> >>>>>> >>>>>> >>>>> >>>> >>> >>> _______________________________________________ >>> Gluster-users mailing list >>> Gluster-users@gluster.org >>> http://lists.gluster.org/mailman/listinfo/gluster-users >>> >> >> >
_______________________________________________ Gluster-users mailing list Gluster-users@gluster.org http://lists.gluster.org/mailman/listinfo/gluster-users