On 02.10.2018 12:59, Amar Tumballi wrote:
Recently, in one situation, we found that locks were not freed up because the TCP timeout never kicked in.

Can you try the option below and let us know?

`gluster volume set $volname tcp-user-timeout 42`

(ref: https://review.gluster.org/21170/ )
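A quick way to confirm the option took effect afterwards (a rough sketch; `$volname` is a placeholder, and `gluster volume get` assumes a CLI recent enough to support it):

    # read back the effective value of the option
    gluster volume get $volname tcp-user-timeout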

Regards,
Amar


Thank you, we'll try this.


On Tue, Oct 2, 2018 at 10:40 AM Dmitry Melekhov <d...@belkam.com> wrote:

    On 01.10.2018 23:09, Danny Lee wrote:
    Ran into this issue too on 4.1.5 with an arbiter setup. We also
    could not run a statedump because of a "Segmentation fault".

    Tried with 3.12.13 and had issues with locked files as well.  We
    were able to do a statedump and found that some of our files were
    "BLOCKED" (xlator.features.locks.vol-locks.inode).  Attached part
    of statedump.
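    In case it helps anyone reproduce, roughly how we found the
    "BLOCKED" entries (a sketch; the dump location assumes the
    default server.statedump-path):

        # trigger a statedump for the volume
        gluster volume statedump <volname>
        # dumps land under /var/run/gluster by default, one file per brick
        grep -B2 -A4 BLOCKED /var/run/gluster/*.dump.*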

    Also tried clearing the locks using clear-locks, which did remove
    the lock, but as soon as I tried to cat the file, it got locked
    again and the cat process hung.
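    The clear-locks step was along these lines (a sketch; the file
    path here is an example, not our actual file):

        # syntax: gluster volume clear-locks <volname> <path> kind {blocked|granted|all} {inode|entry|posix}
        gluster volume clear-locks <volname> /path/to/locked-file kind all inode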

    I created an issue in Bugzilla, but I can't find it now :-(
    It looks like there has been no activity since I sent all the logs...



    On Wed, Aug 29, 2018, 3:13 AM Dmitry Melekhov <d...@belkam.com> wrote:

        On 28.08.2018 10:43, Amar Tumballi wrote:


        On Tue, Aug 28, 2018 at 11:24 AM, Dmitry Melekhov <d...@belkam.com> wrote:

            Hello!


            Yesterday we hit something like this on 4.1.2 (CentOS 7.5).


            The volume is replicated - two bricks and one arbiter.


            We rebooted the arbiter, waited for the heal to finish, and
            tried to live-migrate a VM to another node (we run our VMs
            on the gluster nodes):


            [2018-08-27 09:56:22.085411] I [MSGID: 115029] [server-handshake.c:763:server_setvolume] 0-pool-server: accepted client from CTX_ID:b55f4a90-e241-48ce-bd4d-268c8a956f4a-GRAPH_ID:0-PID:8887-HOST:son-PC_NAME:pool-client-6-RECON_NO:-0 (version: 4.1.2)
            [2018-08-27 09:56:22.107609] I [MSGID: 115036] [server.c:483:server_rpc_notify] 0-pool-server: disconnecting connection from CTX_ID:b55f4a90-e241-48ce-bd4d-268c8a956f4a-GRAPH_ID:0-PID:8887-HOST:son-PC_NAME:pool-client-6-RECON_NO:-0
            [2018-08-27 09:56:22.107747] I [MSGID: 101055] [client_t.c:444:gf_client_unref] 0-pool-server: Shutting down connection CTX_ID:b55f4a90-e241-48ce-bd4d-268c8a956f4a-GRAPH_ID:0-PID:8887-HOST:son-PC_NAME:pool-client-6-RECON_NO:-0
            [2018-08-27 09:58:37.905829] I [MSGID: 115036] [server.c:483:server_rpc_notify] 0-pool-server: disconnecting connection from CTX_ID:c3eb6cfc-2ef9-470a-89d1-a87170d00da5-GRAPH_ID:0-PID:30292-HOST:father-PC_NAME:pool-client-6-RECON_NO:-0
            [2018-08-27 09:58:37.905926] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28c831d8bc550000}
            [2018-08-27 09:58:37.905959] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=2870a7d6bc550000}
            [2018-08-27 09:58:37.905979] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=2880a7d6bc550000}
            [2018-08-27 09:58:37.905997] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28f031d8bc550000}
            [2018-08-27 09:58:37.906016] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28b07dd5bc550000}
            [2018-08-27 09:58:37.906034] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28e0a7d6bc550000}
            [2018-08-27 09:58:37.906056] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28b845d8bc550000}
            [2018-08-27 09:58:37.906079] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=2858a7d8bc550000}
            [2018-08-27 09:58:37.906098] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=2868a8d7bc550000}
            [2018-08-27 09:58:37.906121] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28f80bd7bc550000}
            ...
            [2018-08-27 09:58:37.907375] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=28a8cdd6bc550000}
            [2018-08-27 09:58:37.907393] W [inodelk.c:610:pl_inodelk_log_cleanup] 0-pool-server: releasing lock on 12172afe-f0a4-4e10-bc0f-c5e4e0d9f318 held by {client=0x7ffb58035bc0, pid=30292 lk-owner=2880cdd6bc550000}
            [2018-08-27 09:58:37.907476] I [socket.c:3837:socket_submit_reply] 0-tcp.pool-server: not connected (priv->connected = -1)
            [2018-08-27 09:58:37.907520] E [rpcsvc.c:1378:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0xcb88cb, Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
            [2018-08-27 09:58:37.910727] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084) [0x7ffb64379084] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605ba) [0x7ffb5fddf5ba] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7ffb5fd89fce] ) 0-: Reply submission failed
            [2018-08-27 09:58:37.910814] E [rpcsvc.c:1378:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0xcb88ce, Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
            [2018-08-27 09:58:37.910861] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084) [0x7ffb64379084] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605ba) [0x7ffb5fddf5ba] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7ffb5fd89fce] ) 0-: Reply submission failed
            [2018-08-27 09:58:37.910904] E [rpcsvc.c:1378:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0xcb88cf, Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
            [2018-08-27 09:58:37.910940] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084) [0x7ffb64379084] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605ba) [0x7ffb5fddf5ba] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7ffb5fd89fce] ) 0-: Reply submission failed
            [2018-08-27 09:58:37.910979] E [rpcsvc.c:1378:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0xcb88d1, Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
            [2018-08-27 09:58:37.911012] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084) [0x7ffb64379084] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605ba) [0x7ffb5fddf5ba] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7ffb5fd89fce] ) 0-: Reply submission failed
            [2018-08-27 09:58:37.911050] E [rpcsvc.c:1378:rpcsvc_submit_generic] 0-rpc-service: failed to submit message (XID: 0xcb88d8, Program: GlusterFS 4.x v1, ProgVers: 400, Proc: 30) to rpc-transport (tcp.pool-server)
            [2018-08-27 09:58:37.911083] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084) [0x7ffb64379084] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605ba) [0x7ffb5fddf5ba] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7ffb5fd89fce] ) 0-: Reply submission failed
            [2018-08-27 09:58:37.916217] E [server.c:137:server_submit_reply] (-->/usr/lib64/glusterfs/4.1.2/xlator/debug/io-stats.so(+0x20084) [0x7ffb64379084] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0x605ba) [0x7ffb5fddf5ba] -->/usr/lib64/glusterfs/4.1.2/xlator/protocol/server.so(+0xafce) [0x7ffb5fd89fce] ) 0-: Reply submission failed
            [2018-08-27 09:58:37.916520] I [MSGID: 115013] [server-helpers.c:286:do_fd_cleanup] 0-pool-server: fd cleanup on /balamak.img


            After this, I/O on /balamak.img was blocked.


            The only solution we found was to reboot all 3 nodes.


            Is there a bug report in Bugzilla we can add our logs to?


        Not aware of such bugs!

            Is it possible to turn off these locks?


        Not sure, will get back on this one!


        By the way, we found this link:
        https://docs.gluster.org/en/v3/Troubleshooting/troubleshooting-filelocks/

        We tried it on another (test) cluster:

         [root@marduk ~]# gluster volume statedump pool
        Segmentation fault (core dumped)


        That cluster is on 4.1.2 too...

        Something is wrong here.
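        In case it helps with a bug report, one way to pull a backtrace
        out of that crash (a sketch; the core file location is a
        placeholder, it depends on how your system handles cores):

            # allow core files to be written, then reproduce the crash
            ulimit -c unlimited
            gluster volume statedump pool
            # print the backtrace from the resulting core
            gdb -batch -ex bt $(which gluster) /path/to/core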


            Thank you!









--
Amar Tumballi (amarts)


_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users
