Re: [Gluster-users] ganesha.nfsd process dies when copying files

Pui Edylie Tue, 14 Aug 2018 22:51:01 -0700

Hi Karli,

I think Alex is right with regard to the NFS version and state.


I am only using NFSv3 and the failover is working as expected.

In my use case, I have 3 nodes running ESXi 6.7 and have set up one Gluster VM on each ESXi host using its local datastore.

Once I have formed the replica 3 volume, I use the CTDB VIP to present the NFSv3 export back to vCenter and use it as shared storage.

Everything works great, except that performance is not very good ... I am still looking for ways to improve it.


Cheers,
Edy


On 8/15/2018 12:25 AM, Alex Chekholko wrote:

Hi Karli,

I'm not 100% sure this is related, but when I set up my ZFS NFS HA per https://github.com/ewwhite/zfs-ha/wiki I was not able to get the failover to work with NFS v4, only with NFS v3.

From the client's point of view, it really looked like with NFS v4 there is an open file handle that just goes stale and hangs, or something like that, whereas with NFSv3 the client retries, recovers, and continues. I did not investigate further; I just use v3. I think it has something to do with NFSv4 being "stateful" and NFSv3 being "stateless".

Can you re-run your test but using NFSv3 on the client mount? Or do you need to use v4.x?
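
A minimal sketch of what such a re-run could look like on the client, assuming the same server, export, and mount point that appear in Karli's test quoted below (hv03v.localdomain:/data on /mnt). The helper name `nfs_mount_cmd` is made up for illustration; `vers=` is the standard Linux NFS mount option for selecting the protocol version. The script only prints the command, so nothing is mounted until you run the printed line as root.

```shell
#!/bin/sh
# Sketch only: server, export and mount point are copied from the test
# quoted in this thread; adjust them for your environment.
SERVER="hv03v.localdomain"
EXPORT="/data"
MOUNTPOINT="/mnt"

# Build the mount command for a given NFS version (for example 3 or 4.1).
# nfs_mount_cmd is a hypothetical helper, not part of any NFS tooling.
nfs_mount_cmd() {
    printf 'mount -t nfs -o vers=%s %s:%s %s\n' "$1" "$SERVER" "$EXPORT" "$MOUNTPOINT"
}

# Print the NFSv3 variant instead of executing it.
nfs_mount_cmd 3
```

After mounting, `nfsstat -m` on the client should show which version was actually negotiated, which helps confirm that the re-run really used v3.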


Regards,
Alex

On Tue, Aug 14, 2018 at 6:11 AM Karli Sjöberg <ka...@inparadise.se> wrote:


    On Fri, 2018-08-10 at 09:39 -0400, Kaleb S. KEITHLEY wrote:
    > On 08/10/2018 09:23 AM, Karli Sjöberg wrote:
    > > On Fri, 2018-08-10 at 21:23 +0800, Pui Edylie wrote:
    > > > Hi Karli,
    > > >
    > > > Storhaug works with glusterfs 4.1.2 and latest nfs-ganesha.
    > > >
    > > > I just installed them last weekend ... they are working very well
    > > > :)
    > >
    > > Okay, awesome!
    > >
    > > Is there any documentation on how to do that?
    > >
    >
    > https://github.com/gluster/storhaug/wiki
    >

    Thanks Kaleb and Edy!

    I have now redone the cluster using the latest and greatest following
    the above guide and repeated the same test I was doing before (the
    rsync while loop) with success. I let (forgot) it run for about a day
    and it was still chugging along nicely when I aborted it, so success
    there!

    On to the next test: the catastrophic failure test, where one of the
    servers dies. This one I'm having a more difficult time with.

    1) I start by mounting the share over NFS 4.1 and then write an 8 GiB
    file of random data with 'dd'. When I "hard-cut" the power to the
    server I'm writing to, the transfer just stops indefinitely, until the
    server comes back again. Is that supposed to happen? Like this:

    # dd if=/dev/urandom of=/var/tmp/test.bin bs=1M count=8192
    # mount -o vers=4.1 hv03v.localdomain:/data /mnt/
    # dd if=/var/tmp/test.bin of=/mnt/test.bin bs=1M status=progress
    2434793472 bytes (2,4 GB, 2,3 GiB) copied, 42 s, 57,9 MB/s

    (here I cut the power and let it be for almost two hours before
    turning it on again)

    dd: error writing '/mnt/test.bin': Remote I/O error
    2325+0 records in
    2324+0 records out
    2436890624 bytes (2,4 GB, 2,3 GiB) copied, 6944,84 s, 351 kB/s
    # umount /mnt

    Here the unmount command hung and I had to hard reset the client.

    2) Another question I have is why some files "change" as you copy them
    out to the Gluster storage? Is that the way it should be? This time, I
    deleted everything in the destination directory to start over:

    # mount -o vers=4.1 hv03v.localdomain:/data /mnt/
    # rm -f /mnt/test.bin
    # dd if=/var/tmp/test.bin of=/mnt/test.bin bs=1M status=progress
    8557428736 bytes (8,6 GB, 8,0 GiB) copied, 122 s, 70,1 MB/s
    8192+0 records in
    8192+0 records out
    8589934592 bytes (8,6 GB, 8,0 GiB) copied, 123,039 s, 69,8 MB/s
    # md5sum /var/tmp/test.bin
    073867b68fa8eaa382ffe05adb90b583  /var/tmp/test.bin
    # md5sum /mnt/test.bin
    634187d367f856f3f5fb31846f796397  /mnt/test.bin
    # umount /mnt
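
One way to narrow down a checksum mismatch like the one above is to ask where the two files first differ, rather than only whether they differ. The sketch below is an illustration, not something from the original test: `compare_copy` is a made-up helper name, and the demo runs on two small temporary files so it is safe to try anywhere. It relies only on the standard `cmp` utility.

```shell
#!/bin/sh
# Sketch: report whether two files are identical and, if not, let cmp
# print the position of the first difference.
# compare_copy is a hypothetical helper introduced for this example.
compare_copy() {
    src=$1
    dst=$2
    if cmp -s "$src" "$dst"; then
        echo "identical"
    else
        # Without -s, cmp reports the first differing byte and its line.
        cmp "$src" "$dst"
        echo "different"
    fi
}

# Self-contained demo on two small temporary files.
tmpdir=$(mktemp -d)
printf 'abcdef' > "$tmpdir/a"
printf 'abcXef' > "$tmpdir/b"
compare_copy "$tmpdir/a" "$tmpdir/a"
compare_copy "$tmpdir/a" "$tmpdir/b"
rm -rf "$tmpdir"
```

For the test above, the call would be `compare_copy /var/tmp/test.bin /mnt/test.bin` while the share is still mounted. If the files differ, `cmp -l` lists every differing byte. This only locates the difference; it does not explain what caused it.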

    Thanks in advance!

    /K
    _______________________________________________
    Gluster-users mailing list
    Gluster-users@gluster.org
    https://lists.gluster.org/mailman/listinfo/gluster-users

