Hi Pranith,

My question was about setting up a gluster volume on an ext4 partition. I thought we had the bricks mounted as xfs for compatibility with gluster?

Pat


On 05/11/2017 12:06 PM, Pranith Kumar Karampuri wrote:


On Thu, May 11, 2017 at 9:32 PM, Pat Haley <pha...@mit.edu> wrote:


    Hi Pranith,

    The /home partition is mounted as ext4
    /home              ext4 defaults,usrquota,grpquota      1 2

    The brick partitions are mounted as xfs
    /mnt/brick1  xfs defaults        0 0
    /mnt/brick2  xfs defaults        0 0

    Will this cause a problem with creating a volume under /home?


I don't think the disk is the bottleneck. Can you run the same tests on the new volume to confirm?


    Pat



    On 05/11/2017 11:32 AM, Pranith Kumar Karampuri wrote:


    On Thu, May 11, 2017 at 8:57 PM, Pat Haley <pha...@mit.edu> wrote:


        Hi Pranith,

        Unfortunately, we don't have similar hardware for a small
        scale test.  All we have is our production hardware.


    You mentioned that the /home partition has fewer disks. We can
    create a plain distribute volume inside one of its directories
    and remove the setup once we are done. What do you say?


        Pat




        On 05/11/2017 07:05 AM, Pranith Kumar Karampuri wrote:


        On Thu, May 11, 2017 at 2:48 AM, Pat Haley <pha...@mit.edu> wrote:


            Hi Pranith,

            Since we are mounting the partitions as the bricks, I
            tried the dd test writing to
            <brick-path>/.glusterfs/<file-to-be-removed-after-test>.
            The results without oflag=sync were 1.6 GB/s (faster
            than gluster but not as fast as I was expecting given
            the 1.2 GB/s to the no-gluster area w/ fewer disks).


        Okay, then 1.6 GB/s is what we need to target, considering
        your volume is just distribute. Is there any way you can run
        tests on similar hardware at a small scale, so we can run the
        workload and learn more about the bottlenecks in the system?
        We could probably try to reach 1.2 GB/s on the /home
        partition you were telling me about yesterday. Let me know
        if you are okay with that.


            Pat



            On 05/10/2017 01:27 PM, Pranith Kumar Karampuri wrote:


            On Wed, May 10, 2017 at 10:15 PM, Pat Haley <pha...@mit.edu> wrote:


                Hi Pranith,

                Not entirely sure (this isn't my area of
                expertise). I'll run your answer by some other
                people who are more familiar with this.

                I am also uncertain about how to interpret the
                results when we also add the dd tests writing to
                the /home area (no gluster, still on the same machine)

                  * dd test without oflag=sync (rough average of
                    multiple tests)
                      o gluster w/ fuse mount: 570 MB/s
                      o gluster w/ nfs mount: 390 MB/s
                      o nfs (no gluster): 1.2 GB/s
                  * dd test with oflag=sync (rough average of
                    multiple tests)
                      o gluster w/ fuse mount: 5 MB/s
                      o gluster w/ nfs mount: 200 MB/s
                      o nfs (no gluster): 20 MB/s

                Given that the non-gluster area is a RAID-6 of 4
                disks while each brick of the gluster area is a
                RAID-6 of 32 disks, I would naively expect the
                writes to the gluster area to be roughly 8x faster
                than to the non-gluster.
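For reference, a rough sketch in Python of the arithmetic behind that expectation. It assumes sequential write speed scales linearly with the number of disks, which real arrays rarely achieve (controller, stripe size, and parity computation all get in the way). The only hard fact used is that RAID-6 spends two disks' worth of capacity on parity; the names below are illustrative.

```python
# Naive scaling estimate only: real RAID throughput does not scale linearly.
RAID6_PARITY_DISKS = 2  # RAID-6 keeps two parity blocks per stripe


def data_disks(total_disks: int) -> int:
    """Disks left for data after RAID-6 parity is accounted for."""
    return total_disks - RAID6_PARITY_DISKS


home_disks = 4    # non-gluster /home array, from the thread
brick_disks = 32  # each gluster brick, from the thread

# Ratio of raw disk counts (the "8x" in the message above).
raw_ratio = brick_disks / home_disks

# Ratio of data-bearing disks, which is what a stripe actually writes to.
data_ratio = data_disks(brick_disks) / data_disks(home_disks)

print(f"raw disk ratio:  {raw_ratio:.0f}x")   # 8x
print(f"data disk ratio: {data_ratio:.0f}x")  # 15x
```

Either way the naive estimate is far above what the dd numbers show, so the disk count alone does not explain the results.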


            I think a better test is to write to a file using nfs
            without any gluster, at a location that is not inside
            the brick but is on the same disk(s). If you are
            mounting the partition as the brick, then we can write
            to a file inside the .glusterfs directory, something like
            <brick-path>/.glusterfs/<file-to-be-removed-after-test>.


                I still think we have a speed issue, but I can't
                tell whether fuse vs nfs is part of the problem.


            I got interested in the post because I read that fuse
            speed is lower than nfs speed, which is counter-intuitive
            to my understanding, so I wanted clarification. Now that
            I have it (fuse outperformed nfs without sync), we can
            resume testing as described above and try to find what
            the problem is. Based on your email address I am guessing
            you are in Boston, and I am in Bangalore, so if you are
            okay with this debugging taking multiple days because of
            the timezones, I will be happy to help. Please be a bit
            patient with me: I am under a release crunch, but I am
            very curious about the problem you posted.

                Was there anything useful in the profiles?


            Unfortunately the profiles didn't help me much. I think
            we are collecting them from an active volume, so they
            contain a lot of information unrelated to dd, which makes
            it difficult to isolate dd's contribution. So I went
            through your post again and found something I hadn't
            paid much attention to earlier, i.e. oflag=sync, ran my
            own tests on my setup with FUSE, and sent that reply.


                Pat



                On 05/10/2017 12:15 PM, Pranith Kumar Karampuri wrote:
                Okay good. At least this confirms my suspicion.
                Handling O_SYNC in gluster NFS and fuse is a bit
                different.
                When an application opens a file with O_SYNC on a
                fuse mount, each write syscall has to be written to
                disk as part of the syscall, whereas in the case of
                NFS there is no concept of open. NFS performs the
                write through a handle, with a flag saying it needs
                to be a synchronous write, so the write() syscall is
                performed first and then fsync(). So a write on an
                fd with O_SYNC becomes write+fsync. My guess is that
                when multiple threads do this write+fsync()
                operation on the same file, multiple writes are
                batched together to be written to disk, so the
                throughput on the disk increases.
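To make the two write patterns concrete, here is a minimal Python sketch of what they look like from the application side on a local Unix filesystem. It only shows the syscall sequences described above (write on an O_SYNC descriptor versus write followed by fsync); it does not reproduce how gluster's NFS server or fuse bridge handle them internally, and the function names and 1 MiB block size are illustrative assumptions.

```python
import os
import tempfile

BLOCK = b"\0" * (1024 * 1024)  # 1 MiB of zeros, like dd bs=1048576


def _write_all(fd: int, data: bytes) -> None:
    """Keep calling write() until every byte has been accepted."""
    view = memoryview(data)
    while len(view) > 0:
        written = os.write(fd, view)
        view = view[written:]


def write_with_o_sync(path: str, blocks: int) -> None:
    """Pattern 1: open with O_SYNC, so each write() returns only once
    the data has reached stable storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC, 0o644)
    try:
        for _ in range(blocks):
            _write_all(fd, BLOCK)
    finally:
        os.close(fd)


def write_then_fsync(path: str, blocks: int) -> None:
    """Pattern 2: plain write() followed by an explicit fsync(), the
    write+fsync sequence described for the NFS path."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for _ in range(blocks):
            _write_all(fd, BLOCK)
            os.fsync(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        write_with_o_sync(os.path.join(tmp, "o_sync.bin"), 4)
        write_then_fsync(os.path.join(tmp, "fsync.bin"), 4)
        print(sorted(os.listdir(tmp)))
```

Both functions leave the same bytes on disk; the difference is only in when the kernel is forced to flush, which is where batching across threads could change the throughput.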

                Does it answer your doubts?

                On Wed, May 10, 2017 at 9:35 PM, Pat Haley <pha...@mit.edu> wrote:


                    Without the oflag=sync and only a single test
                    of each, the FUSE is going faster than NFS:

                    FUSE:
                    mseas-data2(dri_nascar)% dd if=/dev/zero count=4096 bs=1048576 of=zeros.txt conv=sync
                    4096+0 records in
                    4096+0 records out
                    4294967296 bytes (4.3 GB) copied, 7.46961 s, 575 MB/s


                    NFS
                    mseas-data2(HYCOM)% dd if=/dev/zero count=4096 bs=1048576 of=zeros.txt conv=sync
                    4096+0 records in
                    4096+0 records out
                    4294967296 bytes (4.3 GB) copied, 11.4264 s, 376 MB/s
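A quick sanity check of the rates dd reports above, sketched in Python. It assumes dd's "MB" means 10^6 bytes, which matches the "4.3 GB" it prints for 4294967296 bytes; the function name is illustrative.

```python
def throughput_mb_per_s(num_bytes: int, seconds: float) -> float:
    """Throughput in decimal megabytes per second (1 MB = 10**6 bytes)."""
    return num_bytes / seconds / 1e6


# count=4096 blocks of bs=1048576 bytes, as in both dd runs.
total_bytes = 4096 * 1048576

fuse = throughput_mb_per_s(total_bytes, 7.46961)  # FUSE mount run
nfs = throughput_mb_per_s(total_bytes, 11.4264)   # NFS mount run

print(total_bytes)           # 4294967296
print(round(fuse))           # 575
print(round(nfs))            # 376
print(round(fuse / nfs, 2))  # 1.53
```

So for this single unsynced run, the FUSE mount was about 1.5 times faster than the NFS mount.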



                    On 05/10/2017 11:53 AM, Pranith Kumar
                    Karampuri wrote:
                    Could you let me know the speed without
                    oflag=sync on both the mounts? No need to
                    collect profiles.

                    On Wed, May 10, 2017 at 9:17 PM, Pat Haley <pha...@mit.edu> wrote:


                        Here is what I see now:

                        [root@mseas-data2 ~]# gluster volume info

                        Volume Name: data-volume
                        Type: Distribute
                        Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
                        Status: Started
                        Number of Bricks: 2
                        Transport-type: tcp
                        Bricks:
                        Brick1: mseas-data2:/mnt/brick1
                        Brick2: mseas-data2:/mnt/brick2
                        Options Reconfigured:
                        diagnostics.count-fop-hits: on
                        diagnostics.latency-measurement: on
                        nfs.exports-auth-enable: on
                        diagnostics.brick-sys-log-level: WARNING
                        performance.readdir-ahead: on
                        nfs.disable: on
                        nfs.export-volumes: off



                        On 05/10/2017 11:44 AM, Pranith Kumar
                        Karampuri wrote:
                        Is this the volume info you have?

                        > [root@mseas-data2 ~]# gluster volume info
                        >
                        > Volume Name: data-volume
                        > Type: Distribute
                        > Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
                        > Status: Started
                        > Number of Bricks: 2
                        > Transport-type: tcp
                        > Bricks:
                        > Brick1: mseas-data2:/mnt/brick1
                        > Brick2: mseas-data2:/mnt/brick2
                        > Options Reconfigured:
                        > performance.readdir-ahead: on
                        > nfs.disable: on
                        > nfs.export-volumes: off

                        I copied this from an old thread from
                        2016. This is a distribute volume. Did you
                        change any of the options in between?

                        --
                        -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
                        Pat Haley                          Email:  pha...@mit.edu
                        Center for Ocean Engineering       Phone:  (617) 253-6824
                        Dept. of Mechanical Engineering    Fax:    (617) 253-8125
                        MIT, Room 5-213                    http://web.mit.edu/phaley/www/
                        77 Massachusetts Avenue
                        Cambridge, MA  02139-4301

-- Pranith
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users
