Re: [Gluster-users] RE : Frequent connect and disconnect messages flooded in logs

Micha Ober Thu, 08 Dec 2016 03:08:11 -0800

Hi Rafi,

thank you for your support. It is greatly appreciated.


Just some more thoughts from my side:

There have been no reports from other users in *this* thread until now, but I have found at least one user with a very similar problem in an older thread:


https://www.gluster.org/pipermail/gluster-users/2014-November/019637.html

He is also reporting disconnects with no apparent reason, although his setup is a bit more complicated, also involving a firewall. In our setup, all servers/clients are connected via 1 GbE with no firewall or anything that might block/throttle traffic. Also, we are using exactly the same software versions on all nodes.

I can also find some reports in the bugtracker when searching for "rpc_client_ping_timer_expired" and "rpc_clnt_ping_timer_expired" (looks like the spelling changed between versions).


https://bugzilla.redhat.com/show_bug.cgi?id=1096729
https://bugzilla.redhat.com/show_bug.cgi?id=1370683

But both reports involve heavy traffic/load on the bricks/disks, which is not the case for our setup. To give a ballpark figure: Over three days, 30 GiB were written. And the data was not written at once, but continuously over the whole time.
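To put the ballpark figure into perspective, here is a small sketch of the arithmetic. It assumes the writes were spread evenly over the three days and compares the result with the nominal 1 Gbit/s line rate of the link (usable throughput is somewhat lower); the function names are made up for illustration.

```python
# Average write rate for "30 GiB over three days" versus a 1 GbE link.
# Assumption: writes are spread evenly; 1e9 bit/s is the nominal line rate.

GIB = 2**30  # bytes per GiB


def average_rate_bytes_per_s(total_bytes: float, days: float) -> float:
    """Mean data rate in bytes per second over the given number of days."""
    return total_bytes / (days * 24 * 60 * 60)


def link_utilisation(rate_bytes_per_s: float, link_bits_per_s: float = 1e9) -> float:
    """Fraction of the link's nominal capacity used by the given rate."""
    return rate_bytes_per_s * 8 / link_bits_per_s


rate = average_rate_bytes_per_s(30 * GIB, 3)
util = link_utilisation(rate)
print(f"{rate / 2**20:.2f} MiB/s average, {util:.2%} of 1 Gbit/s")
```

Under these assumptions the average is on the order of 0.1 MiB/s, roughly a tenth of a percent of the link. This is an average only and says nothing about short bursts.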

Just to be sure, I have checked the logfiles of one of the other clusters right now, which is sitting in the same building, in the same rack, even on the same switch, running the same jobs, but with glusterfs 3.4.2, and I can see no disconnects in the logfiles. So I can definitely rule out our infrastructure as the problem.


Regards,
Micha


On 07.12.2016 at 18:08, Mohammed Rafi K C wrote:

Hi Micha,
This is great. I will provide you with a debug build which has two fixes for what I suspect might be causing the frequent disconnect issue, though I don't have much data to validate my theory. So I will take one more day to dig into that.
Thanks for your support, and opensource++

Regards

Rafi KC

On 12/07/2016 05:02 AM, Micha Ober wrote:
Hi,

thank you for your answer and even more for the question!
Until now, I was using FUSE. Today I changed all mounts to NFS using the same 3.7.17 version.
But: The problem is still the same. Now, the NFS logfile contains lines like these:
[2016-12-06 15:12:29.006325] C [rpc-clnt-ping.c:165:rpc_clnt_ping_timer_expired] 0-gv0-client-7: server X.X.18.62:49153 has not responded in the last 42 seconds, disconnecting.
Interestingly enough, the IP address X.X.18.62 is the same machine! As I wrote earlier, each node serves both as a server and a client, as each node contributes bricks to the volume. Every server is connecting to itself via its hostname. For example, the fstab on the node "giant2" looks like:
#giant2:/gv0    /shared_data    glusterfs defaults,noauto 0       0
#giant2:/gv2    /shared_slurm   glusterfs defaults,noauto 0       0

giant2:/gv0     /shared_data    nfs defaults,_netdev,vers=3 0       0
giant2:/gv2     /shared_slurm   nfs defaults,_netdev,vers=3 0       0

So I understand the disconnects even less.
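For anyone wanting to tally these ping-timer messages per client translator and server, a minimal sketch follows. The regular expression is modelled only on the single log line quoted above, so treat the pattern as an assumption about the log format; the function name is hypothetical.

```python
import re
from collections import Counter

# Pattern modelled on the one log line quoted in this thread (an assumption,
# not an official description of the glusterfs log format). Whitespace is
# matched loosely so lines mangled by mail clients still parse.
PING_TIMEOUT = re.compile(
    r"^\[(?P<ts>[\d-]+ [\d:.]+)\]\s*(?P<level>[A-Z])\s*"
    r"\[[^\]]*rpc_clnt_ping_timer_expired\]\s*"
    r"(?P<xlator>\S+):\s*server (?P<server>\S+) has not responded "
    r"in the last (?P<secs>\d+) seconds"
)


def count_ping_timeouts(lines):
    """Count ping-timer expiries per (client translator, server address)."""
    counts = Counter()
    for line in lines:
        m = PING_TIMEOUT.match(line)
        if m:
            counts[(m["xlator"], m["server"])] += 1
    return counts


sample = [
    "[2016-12-06 15:12:29.006325] C "
    "[rpc-clnt-ping.c:165:rpc_clnt_ping_timer_expired] "
    "0-gv0-client-7: server X.X.18.62:49153 has not responded "
    "in the last 42 seconds, disconnecting.",
    "[2016-12-06 15:12:30.000000] I some unrelated message",
]
counts = count_ping_timeouts(sample)
for (xlator, server), n in counts.items():
    print(f"{xlator} -> {server}: {n} timeout(s)")
```

In practice the lines would come from iterating over the client or NFS logfile instead of the `sample` list.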
I don't know if it's possible to create a dummy cluster which exposes the same behaviour, because the disconnects only happen when there are compute jobs running on those nodes - and they are GPU compute jobs, so that's something which cannot be easily emulated in a VM.
As we have more clusters (which are running fine with an ancient 3.4 version :-)) and we are currently not dependent on this particular cluster (which may stay like this for this month, I think), I should be able to deploy the debug build on the "real" cluster, if you can provide one.
Regards and thanks,
Micha



On 06.12.2016 at 08:15, Mohammed Rafi K C wrote:
On 12/03/2016 12:56 AM, Micha Ober wrote:
** Update: ** I have downgraded from 3.8.6 to 3.7.17 now, but the problem still exists.
Client log: http://paste.ubuntu.com/23569065/
Brick log: http://paste.ubuntu.com/23569067/

Please note that each server has two bricks.
According to the logs, however, one brick loses the connection to all other hosts:
[2016-12-02 18:38:53.703301] W [socket.c:596:__socket_rwv] 0-tcp.gv0-server: writev on X.X.X.219:49121 failed (Broken pipe)
[2016-12-02 18:38:53.703381] W [socket.c:596:__socket_rwv] 0-tcp.gv0-server: writev on X.X.X.62:49118 failed (Broken pipe)
[2016-12-02 18:38:53.703380] W [socket.c:596:__socket_rwv] 0-tcp.gv0-server: writev on X.X.X.107:49121 failed (Broken pipe)
[2016-12-02 18:38:53.703424] W [socket.c:596:__socket_rwv] 0-tcp.gv0-server: writev on X.X.X.206:49120 failed (Broken pipe)
[2016-12-02 18:38:53.703359] W [socket.c:596:__socket_rwv] 0-tcp.gv0-server: writev on X.X.X.58:49121 failed (Broken pipe)

The SECOND brick on the SAME host is NOT affected, i.e. no disconnects!
As I said, the network connection is fine and the disks are idle.
The CPU always has 2 free cores.
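To check whether one brick really drops all peers at the same moment, the "Broken pipe" warnings can be grouped by timestamp. The following is only a sketch: the pattern is derived from the brick log excerpt in this mail, it assumes every log entry sits on a single line, and the function name is hypothetical.

```python
import re
from collections import defaultdict

# Pattern derived from the brick log excerpt in this thread (an assumption
# about the format). Assumes each log entry is on one line.
BROKEN_PIPE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+\]\s+W\s+"
    r"\[socket\.c:\d+:__socket_rwv\]\s+(?P<xlator>\S+):\s+"
    r"writev on (?P<peer>\S+) failed \(Broken pipe\)"
)


def peers_dropped_per_second(lines):
    """Map each timestamp (to the second) to the set of peers whose writev failed."""
    dropped = defaultdict(set)
    for line in lines:
        m = BROKEN_PIPE.match(line)
        if m:
            dropped[m["ts"]].add(m["peer"])
    return dict(dropped)


sample = [
    "[2016-12-02 18:38:53.703301] W [socket.c:596:__socket_rwv] "
    "0-tcp.gv0-server: writev on X.X.X.219:49121 failed (Broken pipe)",
    "[2016-12-02 18:38:53.703381] W [socket.c:596:__socket_rwv] "
    "0-tcp.gv0-server: writev on X.X.X.62:49118 failed (Broken pipe)",
]
result = peers_dropped_per_second(sample)
for ts, peers in sorted(result.items()):
    print(ts, len(peers), "peer(s):", ", ".join(sorted(peers)))
```

Many peers failing within the same second on one brick, with none on the neighbouring brick, would match the behaviour described here.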

It looks like I have to downgrade to 3.4 now in order for the disconnects to 
stop.
Hi Micha,
Thanks for the update and sorry for what happened with the higher gluster versions. I can understand the need for a downgrade as it is a production setup.
Can you tell me which clients are used here? Is it FUSE, NFS, NFS-Ganesha, SMB or libgfapi?
Since I'm not able to reproduce the issue (I have been trying for the last 3 days) and the logs are not very helpful here (we don't have many logs in the socket layer), could you please create a dummy cluster and try to reproduce the issue? If that works, we can play with that volume and I could provide a debug build which we can use for further debugging.
If you don't have bandwidth for this, please leave it ;).

Regards
Rafi KC
- Micha

On 30.11.2016 at 06:57, Mohammed Rafi K C wrote:
Hi Micha,
I have changed the thread and subject so that your original thread remains the same for your query. Let's try to fix the problem you observed with 3.8.4, so I have started a new thread to discuss the frequent disconnect problem.
*If anyone else has experienced the same problem, please respond to the mail.*
It would be very helpful if you could give us some more logs from clients and bricks. Also, any steps to reproduce will surely help to chase the problem further.
Regards

Rafi KC

On 11/30/2016 04:44 AM, Micha Ober wrote:
I had opened another thread on this mailing list (Subject: "After upgrade from 3.4.2 to 3.8.5 - High CPU usage resulting in disconnects and split-brain").
The title may be a bit misleading now, as I am no longer observing high CPU usage after upgrading to 3.8.6, but the disconnects are still happening and the number of files in split-brain is growing.
Setup: 6 compute nodes, each serving as a glusterfs server and client, Ubuntu 14.04, two bricks per node, distribute-replicate
I have two gluster volumes set up (one for scratch data, one for the slurm scheduler). Only the scratch data volume shows critical errors "[...] has not responded in the last 42 seconds, disconnecting.". So I can rule out network problems, the gigabit link between the nodes is not saturated at all. The disks are almost idle (<10%).
I have glusterfs 3.4.2 on Ubuntu 12.04 on another compute cluster, running fine since it was deployed.
I had glusterfs 3.4.2 on Ubuntu 14.04 on this cluster, running fine for almost a year.
After upgrading to 3.8.5, the problems (as described) started. I would like to use some of the new features of the newer versions (like bitrot), but the users can't run their compute jobs right now because the result files are garbled.
There also seems to be a bug report with a similar problem: (but no progress)
https://bugzilla.redhat.com/show_bug.cgi?id=1370683

For me, ALL servers are affected (not isolated to one or two servers)
I also see messages like "INFO: task gpu_graphene_bv:4476 blocked for more than 120 seconds." in the syslog.
For completeness (gv0 is the scratch volume, gv2 the slurm volume):

[root@giant2: ~]# gluster v info

Volume Name: gv0
Type: Distributed-Replicate
Volume ID: 993ec7c9-e4bc-44d0-b7c4-2d977e622e86
Status: Started
Snapshot Count: 0
Number of Bricks: 6 x 2 = 12
Transport-type: tcp
Bricks:
Brick1: giant1:/gluster/sdc/gv0
Brick2: giant2:/gluster/sdc/gv0
Brick3: giant3:/gluster/sdc/gv0
Brick4: giant4:/gluster/sdc/gv0
Brick5: giant5:/gluster/sdc/gv0
Brick6: giant6:/gluster/sdc/gv0
Brick7: giant1:/gluster/sdd/gv0
Brick8: giant2:/gluster/sdd/gv0
Brick9: giant3:/gluster/sdd/gv0
Brick10: giant4:/gluster/sdd/gv0
Brick11: giant5:/gluster/sdd/gv0
Brick12: giant6:/gluster/sdd/gv0
Options Reconfigured:
auth.allow: X.X.X.*,127.0.0.1
nfs.disable: on

Volume Name: gv2
Type: Replicate
Volume ID: 30c78928-5f2c-4671-becc-8deaee1a7a8d
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: giant1:/gluster/sdd/gv2
Brick2: giant2:/gluster/sdd/gv2
Options Reconfigured:
auth.allow: X.X.X.*,127.0.0.1
cluster.granular-entry-heal: on
cluster.locking-scheme: granular
nfs.disable: on
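As a reading aid for the brick list above: with a replica count of 2 ("6 x 2 = 12"), consecutive bricks in the listed order form one replica set, so Brick1 and Brick2 mirror each other, then Brick3 and Brick4, and so on. The sketch below groups the bricks accordingly; the parsing is based solely on the output shown here and the function name is hypothetical.

```python
import re

# Matches lines such as "Brick1: giant1:/gluster/sdc/gv0" as they appear in
# the "gluster v info" output quoted in this mail.
BRICK = re.compile(r"^\s*Brick(\d+):\s*(\S+):(\S+)\s*$")


def replica_sets(volume_info: str, replica_count: int):
    """Group bricks into replica sets: consecutive bricks belong together."""
    bricks = []
    for line in volume_info.splitlines():
        m = BRICK.match(line)
        if m:
            bricks.append((int(m.group(1)), m.group(2), m.group(3)))
    bricks.sort()  # order by brick number
    ordered = [(host, path) for _, host, path in bricks]
    return [ordered[i:i + replica_count]
            for i in range(0, len(ordered), replica_count)]


GV0_INFO = """\
Number of Bricks: 6 x 2 = 12
Bricks:
Brick1: giant1:/gluster/sdc/gv0
Brick2: giant2:/gluster/sdc/gv0
Brick3: giant3:/gluster/sdc/gv0
Brick4: giant4:/gluster/sdc/gv0
Brick5: giant5:/gluster/sdc/gv0
Brick6: giant6:/gluster/sdc/gv0
Brick7: giant1:/gluster/sdd/gv0
Brick8: giant2:/gluster/sdd/gv0
Brick9: giant3:/gluster/sdd/gv0
Brick10: giant4:/gluster/sdd/gv0
Brick11: giant5:/gluster/sdd/gv0
Brick12: giant6:/gluster/sdd/gv0
"""

sets = replica_sets(GV0_INFO, replica_count=2)
for number, members in enumerate(sets, start=1):
    print(number, " <-> ".join(f"{host}:{path}" for host, path in members))
```

Knowing which two nodes hold each replica pair helps when matching a disconnect on one brick to the files that later show up in split-brain.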

        2016-11-29 18:53 GMT+01:00 Atin Mukherjee
        <amukh...@redhat.com>:

            Would you be able to share what is not working for
            you in 3.8.x (mention the exact version)? 3.4 is
            quite old and falling back to an unsupported version
            doesn't look like a feasible option.

            On Tue, 29 Nov 2016 at 17:01, Micha Ober
            <mich...@gmail.com> wrote:

                Hi,

                I was using gluster 3.4 and upgraded to 3.8, but
                that version proved to be unusable for me. I now
                need to downgrade.

                I'm running Ubuntu 14.04. As upgrades of the op
                version are irreversible, I guess I have to
                delete all gluster volumes and re-create them
                with the downgraded version.

                0. Backup data
                1. Unmount all gluster volumes
                2. apt-get purge glusterfs-server glusterfs-client
                3. Remove PPA for 3.8
                4. Add PPA for older version
                5. apt-get install glusterfs-server glusterfs-client
                6. Create volumes

                Is "purge" enough to delete all configuration
                files of the currently installed version or do I
                need to manually clear some residues before
                installing an older version?

                Thanks.
                _______________________________________________
                Gluster-users mailing list
                Gluster-users@gluster.org
                http://www.gluster.org/mailman/listinfo/gluster-users
--- Atin (atinm)