On 08/06/2014 11:30 AM, Roman wrote:
Also, this time the files are not the same!
root@stor1:~# md5sum /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
32411360c53116b96a059f17306caeda  /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
root@stor2:~# md5sum /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
65b8a6031bcb6f5fb3a11cb1e8b1c9c9  /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
What is the getfattr output?
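I mean something like this, on both bricks (same path as in your md5sum output above):
getfattr -d -m. -e hex /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2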
Pranith
2014-08-05 16:33 GMT+03:00 Roman <rome...@gmail.com>:
Nope, it is not working. But this time it went a bit differently.
root@gluster-client:~# dmesg
Segmentation fault
I was not even able to start the VM after I had done the tests:
Could not read qcow2 header: Operation not permitted
And it seems it never starts to sync the files after the first disconnect. The VM survives the first disconnect, but not the second (I waited around 30 minutes). Also, I've got network.ping-timeout: 2 in the volume settings, but the logs reacted to the first disconnect in around 30 seconds; the second was faster, 2 seconds.
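(For reference, the timeout option in question is set with:
gluster volume set <volname> network.ping-timeout 2
with <volname> being HA-fast-150G-PVE1 here, going by the client names in the logs.)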
The reactions were also different.
The slow one:
[2014-08-05 13:26:19.558435] W [socket.c:514:__socket_rwv]
0-glusterfs: readv failed (Connection timed out)
[2014-08-05 13:26:19.558485] W
[socket.c:1962:__socket_proto_state_machine] 0-glusterfs: reading
from socket failed. Error (Connection timed out), peer
(10.250.0.1:24007)
[2014-08-05 13:26:21.281426] W [socket.c:514:__socket_rwv]
0-HA-fast-150G-PVE1-client-0: readv failed (Connection timed out)
[2014-08-05 13:26:21.281474] W
[socket.c:1962:__socket_proto_state_machine]
0-HA-fast-150G-PVE1-client-0: reading from socket failed. Error
(Connection timed out), peer (10.250.0.1:49153)
[2014-08-05 13:26:21.281507] I [client.c:2098:client_rpc_notify]
0-HA-fast-150G-PVE1-client-0: disconnected
The fast one:
[2014-08-05 12:52:44.607389] C
[client-handshake.c:127:rpc_client_ping_timer_expired]
0-HA-fast-150G-PVE1-client-1: server 10.250.0.2:49153 has not
responded in the last 2 seconds, disconnecting.
[2014-08-05 12:52:44.607491] W [socket.c:514:__socket_rwv]
0-HA-fast-150G-PVE1-client-1: readv failed (No data available)
[2014-08-05 12:52:44.607585] E
[rpc-clnt.c:368:saved_frames_unwind]
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0xf8)
[0x7fcb1b4b0558]
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3)
[0x7fcb1b4aea63]
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)
[0x7fcb1b4ae97e]))) 0-HA-fast-150G-PVE1-client-1: forced unwinding
frame type(GlusterFS 3.3) op(LOOKUP(27)) called at 2014-08-05
12:52:42.463881 (xid=0x381883x)
[2014-08-05 12:52:44.607604] W
[client-rpc-fops.c:2624:client3_3_lookup_cbk]
0-HA-fast-150G-PVE1-client-1: remote operation failed: Transport
endpoint is not connected. Path: /
(00000000-0000-0000-0000-000000000001)
[2014-08-05 12:52:44.607736] E
[rpc-clnt.c:368:saved_frames_unwind]
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0xf8)
[0x7fcb1b4b0558]
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xc3)
[0x7fcb1b4aea63]
(-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(saved_frames_destroy+0xe)
[0x7fcb1b4ae97e]))) 0-HA-fast-150G-PVE1-client-1: forced unwinding
frame type(GlusterFS Handshake) op(PING(3)) called at 2014-08-05
12:52:42.463891 (xid=0x381884x)
[2014-08-05 12:52:44.607753] W
[client-handshake.c:276:client_ping_cbk]
0-HA-fast-150G-PVE1-client-1: timer must have expired
[2014-08-05 12:52:44.607776] I [client.c:2098:client_rpc_notify]
0-HA-fast-150G-PVE1-client-1: disconnected
I've got SSD disks (just for info).
Should I give 3.5.2 a try?
2014-08-05 13:06 GMT+03:00 Pranith Kumar Karampuri <pkara...@redhat.com>:
Please reply along with gluster-users :-). Maybe you are hitting 'reply' instead of 'reply all'?
Pranith
On 08/05/2014 03:35 PM, Roman wrote:
To make sure and keep things clean, I've created another VM with raw format and am going to repeat those steps. So now I've got two VMs, one with qcow2 format and the other with raw format. I will send another e-mail shortly.
2014-08-05 13:01 GMT+03:00 Pranith Kumar Karampuri <pkara...@redhat.com>:
On 08/05/2014 03:07 PM, Roman wrote:
Really, it seems to be the same file:
stor1:
a951641c5230472929836f9fcede6b04
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
stor2:
a951641c5230472929836f9fcede6b04
/exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
One thing I've seen from the logs: somehow Proxmox VE is connecting to the servers with the wrong version?
[2014-08-05 09:23:45.218550] I
[client-handshake.c:1659:select_server_supported_programs]
0-HA-fast-150G-PVE1-client-0: Using Program GlusterFS
3.3, Num (1298437), Version (330)
It is the RPC (over-the-network data structures) version, which has not changed at all since 3.3, so that's not a problem. So what is the conclusion? Is your test case working now or not?
Pranith
But if I issue:
root@pve1:~# glusterfs -V
glusterfs 3.4.4 built on Jun 28 2014 03:44:57
Seems OK. The servers use 3.4.4 meanwhile:
[2014-08-05 09:23:45.117875] I
[server-handshake.c:567:server_setvolume]
0-HA-fast-150G-PVE1-server: accepted client from
stor1-9004-2014/08/05-09:23:45:93538-HA-fast-150G-PVE1-client-1-0
(version: 3.4.4)
[2014-08-05 09:23:49.103035] I
[server-handshake.c:567:server_setvolume]
0-HA-fast-150G-PVE1-server: accepted client from
stor1-8998-2014/08/05-09:23:45:89883-HA-fast-150G-PVE1-client-0-0
(version: 3.4.4)
If this could be the reason, of course. I did restart the Proxmox VE yesterday (just for information).
2014-08-05 12:30 GMT+03:00 Pranith Kumar Karampuri <pkara...@redhat.com>:
On 08/05/2014 02:33 PM, Roman wrote:
I've waited long enough for now; still different sizes and no logs about healing :(
stor1
# file:
exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000
trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000
trusted.gfid=0xf10ad81b58484bcd9b385a36a207f921
root@stor1:~# du -sh
/exports/fast-test/150G/images/127/
1.2G /exports/fast-test/150G/images/127/
stor2
# file:
exports/fast-test/150G/images/127/vm-127-disk-1.qcow2
trusted.afr.HA-fast-150G-PVE1-client-0=0x000000000000000000000000
trusted.afr.HA-fast-150G-PVE1-client-1=0x000000000000000000000000
trusted.gfid=0xf10ad81b58484bcd9b385a36a207f921
root@stor2:~# du -sh
/exports/fast-test/150G/images/127/
1.4G /exports/fast-test/150G/images/127/
According to the changelogs, the file doesn't need any healing. Could you stop the operations on the VMs and take md5sums on both these machines?
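Something like this, on both stor1 and stor2, with the VM stopped:
md5sum /exports/fast-test/150G/images/127/vm-127-disk-1.qcow2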
Pranith
2014-08-05 11:49 GMT+03:00 Pranith Kumar Karampuri <pkara...@redhat.com>:
On 08/05/2014 02:06 PM, Roman wrote:
Well, it seems like it doesn't see that changes were made to the volume? I created two files, 200 MB and 100 MB (from /dev/zero), after I disconnected the first brick. Then I connected it back and got these logs:
[2014-08-05 08:30:37.830150] I
[glusterfsd-mgmt.c:1584:mgmt_getspec_cbk]
0-glusterfs: No change in volfile, continuing
[2014-08-05 08:30:37.830207] I
[rpc-clnt.c:1676:rpc_clnt_reconfig]
0-HA-fast-150G-PVE1-client-0: changing port to
49153 (from 0)
[2014-08-05 08:30:37.830239] W
[socket.c:514:__socket_rwv]
0-HA-fast-150G-PVE1-client-0: readv failed (No
data available)
[2014-08-05 08:30:37.831024] I
[client-handshake.c:1659:select_server_supported_programs]
0-HA-fast-150G-PVE1-client-0: Using Program
GlusterFS 3.3, Num (1298437), Version (330)
[2014-08-05 08:30:37.831375] I
[client-handshake.c:1456:client_setvolume_cbk]
0-HA-fast-150G-PVE1-client-0: Connected to
10.250.0.1:49153 <http://10.250.0.1:49153>,
attached to remote volume
'/exports/fast-test/150G'.
[2014-08-05 08:30:37.831394] I
[client-handshake.c:1468:client_setvolume_cbk]
0-HA-fast-150G-PVE1-client-0: Server and
Client lk-version numbers are not same,
reopening the fds
[2014-08-05 08:30:37.831566] I
[client-handshake.c:450:client_set_lk_version_cbk]
0-HA-fast-150G-PVE1-client-0: Server lk
version = 1
[2014-08-05 08:30:37.830150] I
[glusterfsd-mgmt.c:1584:mgmt_getspec_cbk]
0-glusterfs: No change in volfile, continuing
This line seems weird to me, to be honest.
I do not see any traffic on the switch interfaces between the gluster servers, which means there is no syncing between them. I tried to ls -l the files on the client and servers to trigger the healing, but it seems there was no success. Should I wait more?
Yes, it should take around 10-15 minutes. Could you provide 'getfattr -d -m. -e hex <file-on-brick>' on both the bricks?
Pranith
2014-08-05 11:25 GMT+03:00 Pranith Kumar Karampuri <pkara...@redhat.com>:
On 08/05/2014 01:10 PM, Roman wrote:
Ahha! For some reason I was not able to start the VM anymore; Proxmox VE told me that it was not able to read the qcow2 header because permission was denied for some reason. So I just deleted that file and created a new VM. And the next message I've got was this:
Seems like these are the messages from when you took down the bricks before the self-heal. Could you restart the run, waiting for self-heals to complete before taking down the next brick?
Pranith
[2014-08-05 07:31:25.663412] E
[afr-self-heal-common.c:197:afr_sh_print_split_brain_log]
0-HA-fast-150G-PVE1-replicate-0: Unable
to self-heal contents of
'/images/124/vm-124-disk-1.qcow2'
(possible split-brain). Please delete the
file from all but the preferred
subvolume.- Pending matrix: [ [ 0 60 ] [
11 0 ] ]
[2014-08-05 07:31:25.663955] E
[afr-self-heal-common.c:2262:afr_self_heal_completion_cbk]
0-HA-fast-150G-PVE1-replicate-0:
background data self-heal failed on
/images/124/vm-124-disk-1.qcow2
2014-08-05 10:13 GMT+03:00 Pranith Kumar Karampuri <pkara...@redhat.com>:
I just responded to your earlier mail about how the log looks. The log appears in the mount's logfile.
On 08/05/2014 12:41 PM, Roman wrote:
Ok, so I've waited enough, I think. There was no traffic at all on the switch ports between the servers, and I could not find any suitable log message about a completed self-heal (I waited about 30 minutes). I plugged out the other server's UTP cable this time and got into the same situation:
root@gluster-test1:~# cat /var/log/dmesg
-bash: /bin/cat: Input/output error
brick logs:
[2014-08-05 07:09:03.005474] I
[server.c:762:server_rpc_notify]
0-HA-fast-150G-PVE1-server:
disconnecting connectionfrom
pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0
[2014-08-05 07:09:03.005530] I
[server-helpers.c:729:server_connection_put]
0-HA-fast-150G-PVE1-server: Shutting
down connection
pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0
[2014-08-05 07:09:03.005560] I
[server-helpers.c:463:do_fd_cleanup]
0-HA-fast-150G-PVE1-server: fd
cleanup on
/images/124/vm-124-disk-1.qcow2
[2014-08-05 07:09:03.005797] I
[server-helpers.c:617:server_connection_destroy]
0-HA-fast-150G-PVE1-server:
destroyed connection of
pve1-27649-2014/08/04-13:27:54:720789-HA-fast-150G-PVE1-client-0-0
2014-08-05 9:53 GMT+03:00 Pranith Kumar Karampuri <pkara...@redhat.com>:
Do you think it is possible for you to do these tests on the latest version, 3.5.2? 'gluster volume heal <volname> info' would give you that information in versions > 3.5.1. Otherwise you will have to check it either from the logs (there will be a self-heal completed message in the mount logs) or by observing 'getfattr -d -m. -e hex <image-file-on-bricks>'.
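For example (the volume name below is taken from the client translator names in your logs):
gluster volume heal HA-fast-150G-PVE1 info   # on 3.5.1 and later
grep -i self-heal /var/log/glusterfs/<your-mount-log>.log   # on 3.4.x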
Pranith
On 08/05/2014 12:09 PM, Roman wrote:
Ok, I understand. I will try this shortly. How can I be sure that the healing process is done if I am not able to see its status?
2014-08-05 9:30 GMT+03:00 Pranith Kumar Karampuri <pkara...@redhat.com>:
Mounts will do the healing, not the self-heal daemon. The point is that whichever process does the healing must have the latest information about the good bricks. Since for the VM use case the mounts have the latest information, we should let the mounts do the healing. If the mount accesses the VM image, either by someone doing operations inside the VM or by an explicit stat on the file, it should do the healing.
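For example, from the client (the mount point below is an assumption on my part):
stat /mnt/pve/<storage>/images/127/vm-127-disk-1.qcow2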
Pranith.
On 08/05/2014 10:39 AM, Roman wrote:
Hmmm, you told me to turn it off. Did I understand something wrong? After I issued the command you sent me, I was not able to watch the healing process; it said it won't be healed, because it's turned off.
2014-08-05 5:39 GMT+03:00 Pranith Kumar Karampuri <pkara...@redhat.com>:
You didn't mention anything about self-healing. Did you wait until the self-heal was complete?
Pranith
On 08/04/2014 05:49 PM, Roman wrote:
Hi!
The result is pretty much the same. I set the switch port down for the 1st server; it was ok. Then I set it back up and set the other server's port off, and it triggered an IO error on two virtual machines: one with a local root FS but network-mounted storage, and the other with a network root FS. The 1st gave an error on copying to or from the mounted network disk; the other just gave me an error for even reading log files:
cat: /var/log/alternatives.log: Input/output error
Then I reset the KVM VM and it told me there is no boot device. Next I virtually powered it off and then back on, and it booted.
By the way, did I have to start/stop the volume?
>> Could you do the following and test it again?
>> gluster volume set <volname> cluster.self-heal-daemon off
>> Pranith
2014-08-04 14:10 GMT+03:00 Pranith Kumar Karampuri <pkara...@redhat.com>:
On 08/04/2014 03:33 PM, Roman wrote:
Hello!
Facing the same problem as mentioned here:
http://supercolony.gluster.org/pipermail/gluster-users/2014-April/039959.html
My setup is up and running, so I'm ready to help you back with feedback.
Setup: a Proxmox server as the client and 2 physical gluster servers; server side and client side are both running glusterfs 3.4.4 atm, from the gluster repo.
The problem is:
1. created replica bricks
2. mounted in Proxmox (tried both Proxmox ways: via GUI and via fstab with a backup volume line; btw, while mounting via fstab I'm unable to launch a VM without cache, even though direct-io-mode is enabled in the fstab line; see the example line below)
3. installed a VM
4. brought one volume down - ok
5. brought it back up, waited for the sync to be done
6. brought the other volume down - got IO errors on the VM guest and was not able to restore the VM after I reset it via the host; it says "no bootable media". After I shut it down (forced) and brought it back up, it boots.
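For reference, the fstab line mentioned in step 2 looks roughly like this (the volume name and mount point are assumptions; hostnames as elsewhere in this thread):
stor1:/HA-fast-150G-PVE1 /mnt/pve/gluster glusterfs defaults,backupvolfile-server=stor2,direct-io-mode=enable 0 0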
Could you do the following and test it again?
gluster volume set <volname> cluster.self-heal-daemon off
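With your volume name filled in (assuming it is HA-fast-150G-PVE1) that would be:
gluster volume set HA-fast-150G-PVE1 cluster.self-heal-daemon off
You can re-enable it later the same way with 'on'.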
Pranith
Need help. Tried 3.4.3 and 3.4.4. Still missing packages for 3.4.5 for Debian and for 3.5.2 (3.5.1 always gives a healing error for some reason).
--
Best regards,
Roman.
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users