Re: [Gluster-users] Client un-mounting since upgrade to 3.12.9-1 version

Nithya Balachandran Wed, 20 Jun 2018 20:23:40 -0700

Thank you. In the meantime, turning off parallel readdir
(performance.parallel-readdir) should prevent the first crash.



On 20 June 2018 at 21:42, mohammad kashif <kashif.a...@gmail.com> wrote:

> Hi Nithya
>
> Thanks for the bug report. This new crash happened only once, on a single
> client, in the last 6 days. I will let you know if it happens again or
> becomes more frequent.
>
> Cheers
>
> Kashif
>
> On Wed, Jun 20, 2018 at 12:28 PM, Nithya Balachandran <nbala...@redhat.com
> > wrote:
>
>> Hi Mohammad,
>>
>> This is a different crash. How often does it happen?
>>
>>
>> We have managed to reproduce the first crash you reported and a bug has
>> been filed at [1].
>> We will work on a fix for this.
>>
>>
>> Regards,
>> Nithya
>>
>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1593199
>>
>>
>> On 18 June 2018 at 14:09, mohammad kashif <kashif.a...@gmail.com> wrote:
>>
>>> Hi
>>>
>>> The problem appeared again after a few days. This time, the client
>>> is glusterfs-3.10.12-1.el6.x86_64 and performance.parallel-readdir is
>>> off. The log level was set to ERROR and I got this log at the time of
>>> the crash:
>>>
>>> [2018-06-14 08:45:43.551384] E [rpc-clnt.c:365:saved_frames_unwind]
>>> (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x153)[0x7fac2e66ce03]
>>> (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1e7)[0x7fac2e434867]
>>> (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fac2e43497e]
>>> (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xa5)[0x7fac2e434a45]
>>> (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x278)[0x7fac2e434d68]
>>> ))))) 0-atlasglust-client-4: forced unwinding frame type(GlusterFS 3.3)
>>> op(READDIRP(40)) called at 2018-06-14 08:45:43.483303 (xid=0x7553c7
>>>
>>> Core dumps were enabled on the client, so it created one. It is here:
>>>
>>> http://www-pnp.physics.ox.ac.uk/~mohammad/core.1002074
>>>
>>> I generated a gdb backtrace with this command:
>>>
>>> gdb /usr/sbin/glusterfs core.1002074 -ex bt -ex quit |& tee backtrace.log_18_16_1
>>>
>>> http://www-pnp.physics.ox.ac.uk/~mohammad/backtrace.log_18_16_1
>>>
>>> I haven't used gdb much, so let me know if you want me to run gdb in a
>>> different way.
>>>
>>> Thanks
>>>
>>> Kashif
>>>
>>>
>>> On Mon, Jun 18, 2018 at 6:27 AM, Raghavendra Gowdappa <
>>> rgowd...@redhat.com> wrote:
>>>
>>>>
>>>>
>>>> On Mon, Jun 18, 2018 at 9:39 AM, Raghavendra Gowdappa <
>>>> rgowd...@redhat.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Mon, Jun 18, 2018 at 8:11 AM, Raghavendra Gowdappa <
>>>>> rgowd...@redhat.com> wrote:
>>>>>
>>>>>> From the bt:
>>>>>>
>>>>>> #8  0x00007f6ef977e6de in rda_readdirp (frame=0x7f6eec862320,
>>>>>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=357, off=2,
>>>>>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
>>>>>> #9  0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized
>>>>>> out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
>>>>>> orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at
>>>>>> dht-common.c:5388
>>>>>> #10 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec862210,
>>>>>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
>>>>>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
>>>>>> #11 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized
>>>>>> out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
>>>>>> orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at
>>>>>> dht-common.c:5388
>>>>>> #12 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec862100,
>>>>>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
>>>>>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
>>>>>> #13 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized
>>>>>> out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
>>>>>> orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at
>>>>>> dht-common.c:5388
>>>>>> #14 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861ff0,
>>>>>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
>>>>>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
>>>>>> #15 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized
>>>>>> out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
>>>>>> orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at
>>>>>> dht-common.c:5388
>>>>>> #16 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861ee0,
>>>>>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
>>>>>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
>>>>>> #17 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized
>>>>>> out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
>>>>>> orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at
>>>>>> dht-common.c:5388
>>>>>> #18 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861dd0,
>>>>>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
>>>>>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
>>>>>> #19 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized
>>>>>> out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
>>>>>> orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at
>>>>>> dht-common.c:5388
>>>>>> #20 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861cc0,
>>>>>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
>>>>>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
>>>>>> #21 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized
>>>>>> out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
>>>>>> orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at
>>>>>> dht-common.c:5388
>>>>>> #22 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec861bb0,
>>>>>> this=0x7f6ef4019f20, fd=0x7f6ed40077b0, size=140114606084288, off=2,
>>>>>> xdata=0x7f6eec0085a0) at readdir-ahead.c:266
>>>>>> #23 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized
>>>>>> out>, cookie=0x7f6ef4019f20, this=0x7f6ef40218a0, op_ret=2, op_errno=0,
>>>>>> orig_entries=<value optimized out>, xdata=0x7f6eec0085a0) at
>>>>>> dht-common.c:5388
>>>>>>
>>>>>> It looks like an infinite recursion. Note that readdirp is wound to
>>>>>> the same subvol (value of "this" is same in all calls to rda_readdirp) at
>>>>>> the same offset (of value 2). This may be a bug in DHT (winding down
>>>>>> readdirp with wrong offset) or in readdir-ahead (populating incorrect
>>>>>> offset values in dentries it returns as readdirp response).
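
The repetition described above can also be spotted mechanically. The following is a minimal Python sketch, not part of the original analysis: it pulls the function name out of each gdb frame line and looks for a short sequence of calls that repeats back to back. The helper names (frame_functions, repeated_cycle) and the min_repeats threshold are illustrative assumptions, and SAMPLE holds only the first line of frames #8 to #13 from the backtrace quoted above.

```python
import re

# Matches the start of a gdb frame line such as
# "#8  0x00007f6ef977e6de in rda_readdirp (frame=..."
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\w+)\s*\(")


def frame_functions(backtrace_text):
    """Return the function name of every frame line, in backtrace order."""
    names = []
    for line in backtrace_text.splitlines():
        # Drop e-mail quote markers (">") and spaces before matching.
        match = FRAME_RE.match(line.lstrip("> "))
        if match:
            names.append(match.group(2))
    return names


def repeated_cycle(names, min_repeats=3):
    """Return the shortest run of calls that repeats back to back at
    least min_repeats times, or None if there is no such run."""
    total = len(names)
    for period in range(1, total // min_repeats + 1):
        span = period * min_repeats
        for start in range(total - span + 1):
            cycle = names[start:start + period]
            if cycle * min_repeats == names[start:start + span]:
                return cycle
    return None


# First line of frames #8 to #13, copied from the backtrace in this thread.
SAMPLE = """\
#8  0x00007f6ef977e6de in rda_readdirp (frame=0x7f6eec862320,
#9  0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized
#10 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec862210,
#11 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized
#12 0x00007f6ef977e7d7 in rda_readdirp (frame=0x7f6eec862100,
#13 0x00007f6ef952db4c in dht_readdirp_cbk (frame=<value optimized
"""

print(repeated_cycle(frame_functions(SAMPLE)))
```

A repeating cycle only shows that the same functions keep calling each other. Whether that is true unbounded recursion still has to be judged from the arguments, as done above with "this" and the offset.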
>>>>>>
>>>>>
>>>>> This looks like memory corruption. The value of the size argument in
>>>>> rda_readdirp is too big (around 127 TB) to be sane. If you have a
>>>>> reproducer, please run it under valgrind or ASAN.
>>>>>
>>>>
>>>>
>>>> I spoke too soon. It could be a negative value, in which case it may not
>>>> be corruption. Is it possible to upload the core somewhere? Better still,
>>>> access to a gdb session with this core would be more helpful.
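
The numbers in this exchange can be checked with a little arithmetic. This is a minimal Python sketch, not something from the thread: SIZE is the size value gdb printed for rda_readdirp from frame #10 onwards, and as_signed64 is a hypothetical helper that reinterprets a 64-bit pattern as a two's-complement signed integer.

```python
SIZE = 140114606084288  # "size" argument printed by gdb in frames #10 onwards


def as_signed64(value):
    """Reinterpret a 64-bit pattern as a two's-complement signed integer."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


tib = SIZE / 2**40          # bytes to TiB
print(f"{tib:.2f} TiB")     # magnitude of the value if read as a byte count
print(hex(SIZE))            # same value in hex, for comparison with addresses
print(as_signed64(SIZE))    # same bit pattern read as a signed 64-bit integer
```

Under these assumptions the value is roughly 127 TiB, in line with the estimate above, and it stays positive when read as a signed 64-bit integer. Its hex form may be worth comparing with the frame addresses in the backtrace, although that alone does not show whether the argument is corrupt or simply reported inaccurately by gdb for an optimized build.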
>>>>
>>>>
>>>>> To make it explicit: at the moment it's not clear that there is a bug in
>>>>> readdir-ahead or DHT, as this looks like memory corruption. Until I get a
>>>>> reproducer or valgrind/ASAN output of the client process when the issue
>>>>> occurs, I won't be working on this problem.
>>>>>
>>>>>
>>>>>>
>>>>>> On Wed, Jun 13, 2018 at 4:29 PM, mohammad kashif <
>>>>>> kashif.a...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Milind
>>>>>>>
>>>>>>> Thanks a lot. I managed to run gdb and produce a backtrace as well.
>>>>>>> It's here:
>>>>>>>
>>>>>>> http://www-pnp.physics.ox.ac.uk/~mohammad/backtrace.log
>>>>>>>
>>>>>>>
>>>>>>> I am trying to understand it but still can't make sense of it.
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> Kashif
>>>>>>>
>>>>>>> On Wed, Jun 13, 2018 at 11:34 AM, Milind Changire <
>>>>>>> mchan...@redhat.com> wrote:
>>>>>>>
>>>>>>>> Kashif,
>>>>>>>> FYI: http://debuginfo.centos.org/centos/6/storage/x86_64/
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Jun 13, 2018 at 3:21 PM, mohammad kashif <
>>>>>>>> kashif.a...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Milind
>>>>>>>>>
>>>>>>>>> There is no glusterfs-debuginfo package available for gluster-3.12 in the
>>>>>>>>> http://mirror.centos.org/centos/6/storage/x86_64/gluster-3.12/
>>>>>>>>> repo. Do you know where I can get it?
>>>>>>>>> Also, when I run gdb, it says:
>>>>>>>>>
>>>>>>>>> Missing separate debuginfos, use: debuginfo-install glusterfs-fuse-3.12.9-1.el6.x86_64
>>>>>>>>>
>>>>>>>>> I can't find a debug package for glusterfs-fuse either.
>>>>>>>>>
>>>>>>>>> Thanks from the pit of despair ;)
>>>>>>>>>
>>>>>>>>> Kashif
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Jun 12, 2018 at 5:01 PM, mohammad kashif <
>>>>>>>>> kashif.a...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Milind
>>>>>>>>>>
>>>>>>>>>> I will send you links for logs.
>>>>>>>>>>
>>>>>>>>>> I collected these core dumps at client and there is no glusterd
>>>>>>>>>> process running on client.
>>>>>>>>>>
>>>>>>>>>> Kashif
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Jun 12, 2018 at 4:14 PM, Milind Changire <
>>>>>>>>>> mchan...@redhat.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Kashif,
>>>>>>>>>>> Could you also send over the client/mount log file, as Vijay
>>>>>>>>>>> suggested? Or at least the lines with the crash backtrace.
>>>>>>>>>>>
>>>>>>>>>>> Also, you've mentioned that you straced glusterd, but when you
>>>>>>>>>>> ran gdb, you ran it over /usr/sbin/glusterfs
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jun 12, 2018 at 8:19 PM, Vijay Bellur <
>>>>>>>>>>> vbel...@redhat.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Jun 12, 2018 at 7:40 AM, mohammad kashif <
>>>>>>>>>>>> kashif.a...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Milind
>>>>>>>>>>>>>
>>>>>>>>>>>>> The operating system is Scientific Linux 6 which is based on
>>>>>>>>>>>>> RHEL6. The cpu arch is Intel x86_64.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I will send you a separate email with link to core dump.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> You could also grep for crash in the client log file and the
>>>>>>>>>>>> lines following crash would have a backtrace in most cases.
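
The same search can be scripted. This is a minimal Python sketch that assumes only what is suggested above: look for the word "crash" and keep the lines that follow it. The function name, the default of 20 context lines and the placeholder input are illustrative assumptions. The placeholder lines are not real GlusterFS output; in practice the input would be the opened client log file.

```python
def lines_after_crash(log_lines, keyword="crash", context=20):
    """Yield every line containing keyword, plus up to `context` lines
    that follow it. Matching is case-insensitive."""
    remaining = 0
    for line in log_lines:
        if keyword in line.lower():
            # (Re)start the window at each new match.
            remaining = context + 1
        if remaining > 0:
            yield line.rstrip("\n")
            remaining -= 1


# Placeholder input only; replace with open("<path to client log>") in real use.
placeholder_log = [
    "ordinary line 1\n",
    "placeholder line mentioning a crash\n",
    "line that follows 1\n",
    "line that follows 2\n",
    "line that follows 3\n",
]

for line in lines_after_crash(placeholder_log, context=2):
    print(line)
```

Because the function accepts any iterable of lines, a file object can be passed directly and the log is read lazily rather than loaded into memory.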
>>>>>>>>>>>>
>>>>>>>>>>>> HTH,
>>>>>>>>>>>> Vijay
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for your help.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Kashif
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Jun 12, 2018 at 3:16 PM, Milind Changire <
>>>>>>>>>>>>> mchan...@redhat.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Kashif,
>>>>>>>>>>>>>> Could you share the core dump via Google Drive or something
>>>>>>>>>>>>>> similar
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Also, let me know the CPU arch and OS Distribution on which
>>>>>>>>>>>>>> you are running gluster.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If you've installed the glusterfs-debuginfo package, you'll
>>>>>>>>>>>>>> also get the source lines in the backtrace via gdb
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Jun 12, 2018 at 5:59 PM, mohammad kashif <
>>>>>>>>>>>>>> kashif.a...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Milind, Vijay
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks, I have some more information now as I straced
>>>>>>>>>>>>>>> glusterd on client
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 138544      0.000131 mprotect(0x7f2f70785000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000026>
>>>>>>>>>>>>>>> 138544      0.000128 mprotect(0x7f2f70786000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000027>
>>>>>>>>>>>>>>> 138544      0.000126 mprotect(0x7f2f70787000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000027>
>>>>>>>>>>>>>>> 138544      0.000124 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x7f2f7c60ef88} ---
>>>>>>>>>>>>>>> 138544      0.000051 --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0} ---
>>>>>>>>>>>>>>> 138551      0.105048 +++ killed by SIGSEGV (core dumped) +++
>>>>>>>>>>>>>>> 138550      0.000041 +++ killed by SIGSEGV (core dumped) +++
>>>>>>>>>>>>>>> 138547      0.000008 +++ killed by SIGSEGV (core dumped) +++
>>>>>>>>>>>>>>> 138546      0.000007 +++ killed by SIGSEGV (core dumped) +++
>>>>>>>>>>>>>>> 138545      0.000007 +++ killed by SIGSEGV (core dumped) +++
>>>>>>>>>>>>>>> 138544      0.000008 +++ killed by SIGSEGV (core dumped) +++
>>>>>>>>>>>>>>> 138543      0.000007 +++ killed by SIGSEGV (core dumped) +++
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> As far as I understand, gluster is somehow trying to access
>>>>>>>>>>>>>>> memory in an inappropriate manner and the kernel sends SIGSEGV.
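
For longer traces, the signal lines can be pulled out of the strace output automatically. This is a minimal Python sketch and not a full strace parser: SIGNAL_RE and parse_signals are hypothetical names, the pattern assumes the "pid, relative time, --- SIGNAL {fields} ---" layout seen in the output above, and SAMPLE repeats a few of those lines with one event per line.

```python
import re

# One strace signal-delivery line: "<pid> <delta> --- SIGNAME {k=v, ...} ---"
SIGNAL_RE = re.compile(
    r"^(?P<pid>\d+)\s+(?P<delta>\d+\.\d+)\s+"
    r"--- (?P<signal>SIG\w+) \{(?P<info>[^}]*)\} ---$"
)


def parse_signals(strace_text):
    """Return one dict per signal-delivery line found in strace_text."""
    events = []
    for line in strace_text.splitlines():
        match = SIGNAL_RE.match(line.strip())
        if match is None:
            continue  # syscall lines and "+++ killed by ..." lines are skipped
        event = {"pid": int(match.group("pid")), "signal": match.group("signal")}
        for field in match.group("info").split(", "):
            key, _, value = field.partition("=")
            event[key] = value
        events.append(event)
    return events


# Lines copied from the strace output in this thread, one event per line.
SAMPLE = """\
138544      0.000126 mprotect(0x7f2f70787000, 4096, PROT_READ|PROT_WRITE) = 0 <0.000027>
138544      0.000124 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x7f2f7c60ef88} ---
138544      0.000051 --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0} ---
138551      0.105048 +++ killed by SIGSEGV (core dumped) +++
"""

for event in parse_signals(SAMPLE):
    print(event["pid"], event["signal"], event["si_code"], event["si_addr"])
```

The first event carries si_code=SEGV_ACCERR, which sigaction(2) describes as invalid permissions for a mapped object, so the faulting address was mapped but not accessible in the way the process tried to use it.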
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I also got the core dump. I am trying gdb for the first time,
>>>>>>>>>>>>>>> so I am not sure whether I am using it correctly:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> gdb /usr/sbin/glusterfs core.138536
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It just tells me that the program terminated with signal 11,
>>>>>>>>>>>>>>> segmentation fault.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The problem is not limited to one client; it is happening on
>>>>>>>>>>>>>>> many clients.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I would really appreciate any help, as the whole file system
>>>>>>>>>>>>>>> has become unusable.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Kashif
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Jun 12, 2018 at 12:26 PM, Milind Changire <
>>>>>>>>>>>>>>> mchan...@redhat.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Kashif,
>>>>>>>>>>>>>>>> You can change the log level by:
>>>>>>>>>>>>>>>> $ gluster volume set <vol> diagnostics.brick-log-level TRACE
>>>>>>>>>>>>>>>> $ gluster volume set <vol> diagnostics.client-log-level TRACE
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> and see how things fare
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If you want fewer logs you can change the log-level to
>>>>>>>>>>>>>>>> DEBUG instead of TRACE.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Jun 12, 2018 at 3:37 PM, mohammad kashif <
>>>>>>>>>>>>>>>> kashif.a...@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Vijay
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Now it is unmounting every 30 mins !
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The server log at
>>>>>>>>>>>>>>>>> /var/log/glusterfs/bricks/glusteratlas-brics001-gv0.log
>>>>>>>>>>>>>>>>> has only these lines:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [2018-06-12 09:53:19.303102] I [MSGID: 115013] [server-helpers.c:289:do_fd_cleanup] 0-atlasglust-server: fd cleanup on /atlas/atlasdata/zgubic/hmumu/histograms/v14.3/Signal
>>>>>>>>>>>>>>>>> [2018-06-12 09:53:19.306190] I [MSGID: 101055] [client_t.c:443:gf_client_unref] 0-atlasglust-server: Shutting down connection <server-name>-2224879-2018/06/12-09:51:01:460889-atlasglust-client-0-0-0
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> There is no other information. Is there any way to
>>>>>>>>>>>>>>>>> increase log verbosity?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> on the client
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.744980] I [MSGID: 114057] [client-handshake.c:1478:select_server_supported_programs] 0-atlasglust-client-5: Using Program GlusterFS 3.3, Num (1298437), Version (330)
>>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.746508] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0-atlasglust-client-5: Connected to atlasglust-client-5, attached to remote volume '/glusteratlas/brick006/gv0'.
>>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.746543] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0-atlasglust-client-5: Server and Client lk-version numbers are not same, reopening the fds
>>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.746814] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-atlasglust-client-5: Server lk version = 1
>>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.748449] I [MSGID: 114057] [client-handshake.c:1478:select_server_supported_programs] 0-atlasglust-client-6: Using Program GlusterFS 3.3, Num (1298437), Version (330)
>>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.750219] I [MSGID: 114046] [client-handshake.c:1231:client_setvolume_cbk] 0-atlasglust-client-6: Connected to atlasglust-client-6, attached to remote volume '/glusteratlas/brick007/gv0'.
>>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.750261] I [MSGID: 114047] [client-handshake.c:1242:client_setvolume_cbk] 0-atlasglust-client-6: Server and Client lk-version numbers are not same, reopening the fds
>>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.750503] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-atlasglust-client-6: Server lk version = 1
>>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.752207] I [fuse-bridge.c:4205:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.14
>>>>>>>>>>>>>>>>> [2018-06-12 09:51:01.752261] I [fuse-bridge.c:4835:fuse_graph_sync] 0-fuse: switched to graph 0
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Is there a problem with the server and client lk-version?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks for your help.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Kashif
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Jun 11, 2018 at 11:52 PM, Vijay Bellur <
>>>>>>>>>>>>>>>>> vbel...@redhat.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Jun 11, 2018 at 8:50 AM, mohammad kashif <
>>>>>>>>>>>>>>>>>> kashif.a...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Since I updated our gluster server and client to the
>>>>>>>>>>>>>>>>>>> latest version, 3.12.9-1, gluster has been getting
>>>>>>>>>>>>>>>>>>> unmounted from the client very regularly. This was not a
>>>>>>>>>>>>>>>>>>> problem before the update.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> It's a distributed file system with no replication. We
>>>>>>>>>>>>>>>>>>> have seven servers totaling around 480TB of data. It's 97% full.
>>>>>>>>>>>>>>>>>>> I am using the following config on the server:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> gluster volume set atlasglust features.cache-invalidation on
>>>>>>>>>>>>>>>>>>> gluster volume set atlasglust features.cache-invalidation-timeout 600
>>>>>>>>>>>>>>>>>>> gluster volume set atlasglust performance.stat-prefetch on
>>>>>>>>>>>>>>>>>>> gluster volume set atlasglust performance.cache-invalidation on
>>>>>>>>>>>>>>>>>>> gluster volume set atlasglust performance.md-cache-timeout 600
>>>>>>>>>>>>>>>>>>> gluster volume set atlasglust performance.parallel-readdir on
>>>>>>>>>>>>>>>>>>> gluster volume set atlasglust performance.cache-size 1GB
>>>>>>>>>>>>>>>>>>> gluster volume set atlasglust performance.client-io-threads on
>>>>>>>>>>>>>>>>>>> gluster volume set atlasglust cluster.lookup-optimize on
>>>>>>>>>>>>>>>>>>> gluster volume set atlasglust performance.stat-prefetch on
>>>>>>>>>>>>>>>>>>> gluster volume set atlasglust client.event-threads 4
>>>>>>>>>>>>>>>>>>> gluster volume set atlasglust server.event-threads 4
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Clients are mounted with these options:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> defaults,direct-io-mode=disable,attribute-timeout=600,entry-timeout=600,negative-timeout=600,fopen-keep-cache,rw,_netdev
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I can't see anything in the log file. Can someone
>>>>>>>>>>>>>>>>>>> suggest how to troubleshoot this issue?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Can you please share the log file? Checking for messages
>>>>>>>>>>>>>>>>>> related to disconnections/crashes in the log file would be
>>>>>>>>>>>>>>>>>> a good way to start troubleshooting the problem.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Vijay
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>> Gluster-users mailing list
>>>>>>>>>>>>>>>>> Gluster-users@gluster.org
>>>>>>>>>>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> Milind
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Milind
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Milind
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Milind
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>
