If you are trying this again, please run `gluster volume set $volname client-log-level DEBUG` before attempting the add-brick, and attach the gvol0-add-brick-mount.log here. After that, you can change the client-log-level back to INFO.
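As a dry-run sketch, that sequence would look like the following (commands are echoed rather than executed; the volume name and brick path are the ones used in this thread, so adjust for your setup and drop the `echo` to run for real):

```shell
#!/bin/sh
# Dry-run sketch of the debug-logging sequence suggested above.
# Remove "echo" to actually execute each command.
VOLNAME=gvol0
BRICK=gfs3:/nodirectwritedata/gluster/gvol0

echo gluster volume set "$VOLNAME" client-log-level DEBUG
echo gluster volume add-brick "$VOLNAME" replica 3 arbiter 1 "$BRICK"
# ...then attach /var/log/glusterfs/gvol0-add-brick-mount.log and revert:
echo gluster volume set "$VOLNAME" client-log-level INFO
```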

-Ravi

On 22/05/19 11:32 AM, Ravishankar N wrote:


On 22/05/19 11:23 AM, David Cunningham wrote:
Hi Ravi,

I'd already done exactly that before, where step 3 was a simple `rm -rf /nodirectwritedata/gluster/gvol0`. Do you have another suggestion for what the cleanup or reformat should be?
`rm -rf /nodirectwritedata/gluster/gvol0` does look okay to me, David. Basically, '/nodirectwritedata/gluster/gvol0' must be empty and must not have any extended attributes set on it. Why fuse_first_lookup() is failing is a bit of a mystery to me at this point. :-(
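A quick sanity check along those lines might look like this (the demo path below is hypothetical so it can run anywhere; the real path in this thread is /nodirectwritedata/gluster/gvol0, and getfattr comes from the attr package):

```shell
#!/bin/sh
# Sanity-check a brick directory before re-adding it: it must be empty
# and must carry no trusted.* extended attributes.
BRICK_DIR="${BRICK_DIR:-/tmp/brick-check-demo.$$}"   # hypothetical demo path
mkdir -p "$BRICK_DIR"

if [ -z "$(ls -A "$BRICK_DIR")" ]; then
    EMPTY=yes
else
    EMPTY=no
fi
echo "empty: $EMPTY"

# getfattr prints nothing when no extended attributes are set
if command -v getfattr >/dev/null 2>&1; then
    getfattr -d -m . -e hex "$BRICK_DIR" 2>/dev/null
else
    echo "getfattr not found; install the attr package to verify xattrs"
fi
```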
Regards,
Ravi

Thank you.


On Wed, 22 May 2019 at 13:56, Ravishankar N <ravishan...@redhat.com> wrote:

    Hmm, so the volume info seems to indicate that the add-brick was
    successful but the gfid xattr is missing on the new brick (as are
    the actual files, barring the .glusterfs folder, according to
    your previous mail).

    Do you want to try removing and adding it again?

    1. `gluster volume remove-brick gvol0 replica 2
    gfs3:/nodirectwritedata/gluster/gvol0 force` from gfs1

    2. Check that gluster volume info is now back to a 1x2 volume on
    all nodes and `gluster peer status` is connected on all nodes.

    3. Cleanup or reformat '/nodirectwritedata/gluster/gvol0' on gfs3.

    4. `gluster volume add-brick gvol0 replica 3 arbiter 1
    gfs3:/nodirectwritedata/gluster/gvol0` from gfs1.

    5. Check that the files are getting healed on to the new brick.
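    The five steps above, as a dry-run sketch (each command is echoed, not executed; remove the `echo` and run each command on the node noted in the comment):

```shell
#!/bin/sh
# Dry-run of the remove-brick/add-brick cycle described in steps 1-5.
VOLNAME=gvol0
ARBITER_BRICK=gfs3:/nodirectwritedata/gluster/gvol0

echo gluster volume remove-brick "$VOLNAME" replica 2 "$ARBITER_BRICK" force  # step 1, on gfs1
echo gluster volume info "$VOLNAME"      # step 2: expect a 1 x 2 volume again
echo gluster peer status                 # step 2: all peers connected
echo rm -rf /nodirectwritedata/gluster/gvol0                                  # step 3, on gfs3
echo gluster volume add-brick "$VOLNAME" replica 3 arbiter 1 "$ARBITER_BRICK" # step 4, on gfs1
echo gluster volume heal "$VOLNAME" info # step 5: watch pending entries drain
```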

    Thanks,
    Ravi
    On 22/05/19 6:50 AM, David Cunningham wrote:
    Hi Ravi,

    Certainly. On the existing two nodes:

    gfs1 # getfattr -d -m. -e hex /nodirectwritedata/gluster/gvol0
    getfattr: Removing leading '/' from absolute path names
    # file: nodirectwritedata/gluster/gvol0
    trusted.afr.dirty=0x000000000000000000000000
    trusted.afr.gvol0-client-2=0x000000000000000000000000
    trusted.gfid=0x00000000000000000000000000000001
    trusted.glusterfs.dht=0x000000010000000000000000ffffffff
    trusted.glusterfs.volume-id=0xfb5af69e1c3e41648b23c1d7bec9b1b6

    gfs2 # getfattr -d -m. -e hex /nodirectwritedata/gluster/gvol0
    getfattr: Removing leading '/' from absolute path names
    # file: nodirectwritedata/gluster/gvol0
    trusted.afr.dirty=0x000000000000000000000000
    trusted.afr.gvol0-client-0=0x000000000000000000000000
    trusted.afr.gvol0-client-2=0x000000000000000000000000
    trusted.gfid=0x00000000000000000000000000000001
    trusted.glusterfs.dht=0x000000010000000000000000ffffffff
    trusted.glusterfs.volume-id=0xfb5af69e1c3e41648b23c1d7bec9b1b6

    On the new node:

    gfs3 # getfattr -d -m. -e hex /nodirectwritedata/gluster/gvol0
    getfattr: Removing leading '/' from absolute path names
    # file: nodirectwritedata/gluster/gvol0
    trusted.afr.dirty=0x000000000000000000000001
    trusted.glusterfs.volume-id=0xfb5af69e1c3e41648b23c1d7bec9b1b6

    Output of "gluster volume info" is the same on all 3 nodes and is:

    # gluster volume info

    Volume Name: gvol0
    Type: Replicate
    Volume ID: fb5af69e-1c3e-4164-8b23-c1d7bec9b1b6
    Status: Started
    Snapshot Count: 0
    Number of Bricks: 1 x (2 + 1) = 3
    Transport-type: tcp
    Bricks:
    Brick1: gfs1:/nodirectwritedata/gluster/gvol0
    Brick2: gfs2:/nodirectwritedata/gluster/gvol0
    Brick3: gfs3:/nodirectwritedata/gluster/gvol0 (arbiter)
    Options Reconfigured:
    performance.client-io-threads: off
    nfs.disable: on
    transport.address-family: inet


    On Wed, 22 May 2019 at 12:43, Ravishankar N <ravishan...@redhat.com> wrote:

        Hi David,
        Could you provide the `getfattr -d -m. -e hex
        /nodirectwritedata/gluster/gvol0` output of all bricks and
        the output of `gluster volume info`?

        Thanks,
        Ravi
        On 22/05/19 4:57 AM, David Cunningham wrote:
        Hi Sanju,

        Here's what glusterd.log says on the new arbiter server
        when trying to add the node:

        [2019-05-22 00:15:05.963059] I [run.c:242:runner_log]
        (-->/usr/lib64/glusterfs/5.6/xlator/mgmt/glusterd.so(+0x3b2cd)
        [0x7fe4ca9102cd]
        -->/usr/lib64/glusterfs/5.6/xlator/mgmt/glusterd.so(+0xe6b85)
        [0x7fe4ca9bbb85]
        -->/lib64/libglusterfs.so.0(runner_log+0x115)
        [0x7fe4d5ecc955] ) 0-management: Ran script:
        
/var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh
        --volname=gvol0 --version=1 --volume-op=add-brick
        --gd-workdir=/var/lib/glusterd
        [2019-05-22 00:15:05.963177] I [MSGID: 106578]
        [glusterd-brick-ops.c:1355:glusterd_op_perform_add_bricks]
        0-management: replica-count is set 3
        [2019-05-22 00:15:05.963228] I [MSGID: 106578]
        [glusterd-brick-ops.c:1360:glusterd_op_perform_add_bricks]
        0-management: arbiter-count is set 1
        [2019-05-22 00:15:05.963257] I [MSGID: 106578]
        [glusterd-brick-ops.c:1364:glusterd_op_perform_add_bricks]
        0-management: type is set 0, need to change it
        [2019-05-22 00:15:17.015268] E [MSGID: 106053]
        [glusterd-utils.c:13942:glusterd_handle_replicate_brick_ops]
        0-management: Failed to set extended attribute
        trusted.add-brick : Transport endpoint is not connected
        [Transport endpoint is not connected]
        [2019-05-22 00:15:17.036479] E [MSGID: 106073]
        [glusterd-brick-ops.c:2595:glusterd_op_add_brick]
        0-glusterd: Unable to add bricks
        [2019-05-22 00:15:17.036595] E [MSGID: 106122]
        [glusterd-mgmt.c:299:gd_mgmt_v3_commit_fn] 0-management:
        Add-brick commit failed.
        [2019-05-22 00:15:17.036710] E [MSGID: 106122]
        [glusterd-mgmt-handler.c:594:glusterd_handle_commit_fn]
        0-management: commit failed on operation Add brick

        As before gvol0-add-brick-mount.log said:

        [2019-05-22 00:15:17.005695] I
        [fuse-bridge.c:4267:fuse_init] 0-glusterfs-fuse: FUSE
        inited with protocol versions: glusterfs 7.24 kernel 7.22
        [2019-05-22 00:15:17.005749] I
        [fuse-bridge.c:4878:fuse_graph_sync] 0-fuse: switched to
        graph 0
        [2019-05-22 00:15:17.010101] E
        [fuse-bridge.c:4336:fuse_first_lookup] 0-fuse: first lookup
        on root failed (Transport endpoint is not connected)
        [2019-05-22 00:15:17.014217] W
        [fuse-bridge.c:897:fuse_attr_cbk] 0-glusterfs-fuse: 2:
        LOOKUP() / => -1 (Transport endpoint is not connected)
        [2019-05-22 00:15:17.015097] W
        [fuse-resolve.c:127:fuse_resolve_gfid_cbk] 0-fuse:
        00000000-0000-0000-0000-000000000001: failed to resolve
        (Transport endpoint is not connected)
        [2019-05-22 00:15:17.015158] W
        [fuse-bridge.c:3294:fuse_setxattr_resume] 0-glusterfs-fuse:
        3: SETXATTR 00000000-0000-0000-0000-000000000001/1
        (trusted.add-brick) resolution failed
        [2019-05-22 00:15:17.035636] I
        [fuse-bridge.c:5144:fuse_thread_proc] 0-fuse: initating
        unmount of /tmp/mntYGNbj9
        [2019-05-22 00:15:17.035854] W
        [glusterfsd.c:1500:cleanup_and_exit]
        (-->/lib64/libpthread.so.0(+0x7dd5) [0x7f7745ccedd5]
        -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5)
        [0x55c81b63de75]
        -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b)
        [0x55c81b63dceb] ) 0-: received signum (15), shutting down
        [2019-05-22 00:15:17.035942] I [fuse-bridge.c:5914:fini]
        0-fuse: Unmounting '/tmp/mntYGNbj9'.
        [2019-05-22 00:15:17.035966] I [fuse-bridge.c:5919:fini]
        0-fuse: Closing fuse connection to '/tmp/mntYGNbj9'.

        Here are the processes running on the new arbiter server:
        # ps -ef | grep gluster
        root      3466     1  0 20:13 ?        00:00:00
        /usr/sbin/glusterfs -s localhost --volfile-id
        gluster/glustershd -p
        /var/run/gluster/glustershd/glustershd.pid -l
        /var/log/glusterfs/glustershd.log -S
        /var/run/gluster/24c12b09f93eec8e.socket --xlator-option
        *replicate*.node-uuid=2069cfb3-c798-47e3-8cf8-3c584cf7c412
        --process-name glustershd
        root      6832     1  0 May16 ?        00:02:10
        /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
        root     17841     1  0 May16 ?        00:00:58
        /usr/sbin/glusterfs --process-name fuse
        --volfile-server=gfs1 --volfile-id=/gvol0 /mnt/glusterfs

        Here are the files created on the new arbiter server:
        # find /nodirectwritedata/gluster/gvol0 | xargs ls -ald
        drwxr-xr-x 3 root root 4096 May 21 20:15
        /nodirectwritedata/gluster/gvol0
        drw------- 2 root root 4096 May 21 20:15
        /nodirectwritedata/gluster/gvol0/.glusterfs

        Thank you for your help!


        On Tue, 21 May 2019 at 00:10, Sanju Rakonde <srako...@redhat.com> wrote:

            David,

            can you please attach the glusterd logs? As the error
            message says the commit failed on the arbiter node, we
            might be able to find some issue on that node.

            On Mon, May 20, 2019 at 10:10 AM Nithya Balachandran <nbala...@redhat.com> wrote:



                On Fri, 17 May 2019 at 06:01, David Cunningham <dcunning...@voisonics.com> wrote:

                    Hello,

                    We're adding an arbiter node to an existing
                    volume and having an issue. Can anyone help?
                    The root cause error appears to be
                    "00000000-0000-0000-0000-000000000001: failed
                    to resolve (Transport endpoint is not
                    connected)", as below.

                    We are running glusterfs 5.6.1. Thanks in
                    advance for any assistance!

                    On existing node gfs1, trying to add new
                    arbiter node gfs3:

                    # gluster volume add-brick gvol0 replica 3
                    arbiter 1 gfs3:/nodirectwritedata/gluster/gvol0
                    volume add-brick: failed: Commit failed on
                    gfs3. Please check log file for details.


                This looks like a glusterd issue. Please check the
                glusterd logs for more info.
                Adding the glusterd dev to this thread. Sanju, can
                you take a look?
                Regards,
                Nithya


                    On new node gfs3 in gvol0-add-brick-mount.log:

                    [2019-05-17 01:20:22.689721] I
                    [fuse-bridge.c:4267:fuse_init]
                    0-glusterfs-fuse: FUSE inited with protocol
                    versions: glusterfs 7.24 kernel 7.22
                    [2019-05-17 01:20:22.689778] I
                    [fuse-bridge.c:4878:fuse_graph_sync] 0-fuse:
                    switched to graph 0
                    [2019-05-17 01:20:22.694897] E
                    [fuse-bridge.c:4336:fuse_first_lookup] 0-fuse:
                    first lookup on root failed (Transport endpoint
                    is not connected)
                    [2019-05-17 01:20:22.699770] W
                    [fuse-resolve.c:127:fuse_resolve_gfid_cbk]
                    0-fuse: 00000000-0000-0000-0000-000000000001:
                    failed to resolve (Transport endpoint is not
                    connected)
                    [2019-05-17 01:20:22.699834] W
                    [fuse-bridge.c:3294:fuse_setxattr_resume]
                    0-glusterfs-fuse: 2: SETXATTR
                    00000000-0000-0000-0000-000000000001/1
                    (trusted.add-brick) resolution failed
                    [2019-05-17 01:20:22.715656] I
                    [fuse-bridge.c:5144:fuse_thread_proc] 0-fuse:
                    initating unmount of /tmp/mntQAtu3f
                    [2019-05-17 01:20:22.715865] W
                    [glusterfsd.c:1500:cleanup_and_exit]
                    (-->/lib64/libpthread.so.0(+0x7dd5)
                    [0x7fb223bf6dd5]
                    -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5)
                    [0x560886581e75]
                    -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b)
                    [0x560886581ceb] ) 0-: received signum (15),
                    shutting down
                    [2019-05-17 01:20:22.715926] I
                    [fuse-bridge.c:5914:fini] 0-fuse: Unmounting
                    '/tmp/mntQAtu3f'.
                    [2019-05-17 01:20:22.715953] I
                    [fuse-bridge.c:5919:fini] 0-fuse: Closing fuse
                    connection to '/tmp/mntQAtu3f'.

                    Processes running on new node gfs3:

                    # ps -ef | grep gluster
                    root 6832     1  0 20:17 ? 00:00:00
                    /usr/sbin/glusterd -p /var/run/glusterd.pid
                    --log-level INFO
                    root 15799     1  0 20:17 ? 00:00:00
                    /usr/sbin/glusterfs -s localhost --volfile-id
                    gluster/glustershd -p
                    /var/run/gluster/glustershd/glustershd.pid -l
                    /var/log/glusterfs/glustershd.log -S
                    /var/run/gluster/24c12b09f93eec8e.socket
                    --xlator-option
                    *replicate*.node-uuid=2069cfb3-c798-47e3-8cf8-3c584cf7c412
                    --process-name glustershd
                    root     16856 16735  0 21:21 pts/0 00:00:00
                    grep --color=auto gluster

                    --
                    David Cunningham, Voisonics Limited
                    http://voisonics.com/
                    USA: +1 213 221 1092
                    New Zealand: +64 (0)28 2558 3782
                    _______________________________________________
                    Gluster-users mailing list
                    Gluster-users@gluster.org
                    https://lists.gluster.org/mailman/listinfo/gluster-users



            --
            Thanks,
            Sanju










