I sent a fix (http://review.gluster.org/#/c/10478/) but abandoned it since Susant (CC'd) has already sent one: http://review.gluster.org/#/c/10459/
I think it needs re-submission, but more review eyes are welcome.
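
For anyone reading along: the crash is in the cleanup path of
gf_defrag_get_entry () at the "GF_FREE (tmp_container->parent_loc)" line,
which suggests tmp_container is NULL (or only partially set up) by the time
that path runs.  The defensive shape is roughly the sketch below -- only
tmp_container/parent_loc come from the backtrace, the rest is illustrative
and is not the code in either of the patches above:

    /* sketch only: guard the error/cleanup path so a NULL or
     * partially-initialised container is never dereferenced */
    if (tmp_container) {
            if (tmp_container->parent_loc) {
                    loc_wipe (tmp_container->parent_loc);
                    GF_FREE (tmp_container->parent_loc);
            }
            GF_FREE (tmp_container);
            tmp_container = NULL;
    }
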
-Ravi

On 05/01/2015 12:18 PM, Benjamin Turner wrote:
There was a segfault on gqas001, have a look when you get a sec:

Core was generated by `/usr/sbin/glusterfs -s localhost --volfile-id rebalance/testvol --xlator-option'.
Program terminated with signal 11, Segmentation fault.
#0 gf_defrag_get_entry (this=0x7f26f8011180, defrag=0x7f26f8031ef0, loc=0x7f26f4dbbfd0, migrate_data=0x7f2707874be8) at dht-rebalance.c:2032
2032           GF_FREE (tmp_container->parent_loc);
(gdb) bt
#0 gf_defrag_get_entry (this=0x7f26f8011180, defrag=0x7f26f8031ef0, loc=0x7f26f4dbbfd0, migrate_data=0x7f2707874be8) at dht-rebalance.c:2032
#1 gf_defrag_process_dir (this=0x7f26f8011180, defrag=0x7f26f8031ef0, loc=0x7f26f4dbbfd0, migrate_data=0x7f2707874be8) at dht-rebalance.c:2207
#2 0x00007f26fdae1eb8 in gf_defrag_fix_layout (this=0x7f26f8011180, defrag=0x7f26f8031ef0, loc=0x7f26f4dbbfd0, fix_layout=0x7f2707874b5c, migrate_data=0x7f2707874be8)
    at dht-rebalance.c:2299
#3 0x00007f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180, defrag=0x7f26f8031ef0, loc=0x7f26f4dbc200, fix_layout=0x7f2707874b5c, migrate_data=0x7f2707874be8)
    at dht-rebalance.c:2416
#4 0x00007f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180, defrag=0x7f26f8031ef0, loc=0x7f26f4dbc430, fix_layout=0x7f2707874b5c, migrate_data=0x7f2707874be8)
    at dht-rebalance.c:2416
#5 0x00007f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180, defrag=0x7f26f8031ef0, loc=0x7f26f4dbc660, fix_layout=0x7f2707874b5c, migrate_data=0x7f2707874be8)
    at dht-rebalance.c:2416
#6 0x00007f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180, defrag=0x7f26f8031ef0, loc=0x7f26f4dbc890, fix_layout=0x7f2707874b5c, migrate_data=0x7f2707874be8)
    at dht-rebalance.c:2416
#7 0x00007f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180, defrag=0x7f26f8031ef0, loc=0x7f26f4dbcac0, fix_layout=0x7f2707874b5c, migrate_data=0x7f2707874be8)
    at dht-rebalance.c:2416
#8 0x00007f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180, defrag=0x7f26f8031ef0, loc=0x7f26f4dbccf0, fix_layout=0x7f2707874b5c, migrate_data=0x7f2707874be8)
    at dht-rebalance.c:2416
#9 0x00007f26fdae1f4b in gf_defrag_fix_layout (this=0x7f26f8011180, defrag=0x7f26f8031ef0, loc=0x7f26f4dbcf60, fix_layout=0x7f2707874b5c, migrate_data=0x7f2707874be8)
    at dht-rebalance.c:2416
#10 0x00007f26fdae2524 in gf_defrag_start_crawl (data=0x7f26f8011180) at dht-rebalance.c:2599
#11 0x00007f2709024f62 in synctask_wrap (old_task=<value optimized out>) at syncop.c:375
#12 0x0000003648c438f0 in ?? () from /lib64/libc-2.12.so
#13 0x0000000000000000 in ?? ()


On Fri, May 1, 2015 at 12:53 AM, Benjamin Turner <bennytu...@gmail.com> wrote:

Ok, I have all my data created and I just started the rebalance. One thing to note: in the client log I see the following messages spamming:

    [root@gqac006 ~]# cat /var/log/glusterfs/gluster-mount-.log | wc -l
    394042

    [2015-05-01 00:47:55.591150] I [MSGID: 109036]
    [dht-common.c:6478:dht_log_new_layout_for_dir_selfheal]
    0-testvol-dht: Setting layout of
    /file_dstdir/gqac006.sbu.lab.eng.bos.redhat.com/thrd_05/d_001/d_000/d_004/d_006
    with [Subvol_name: testvol-replicate-0, Err: -1 , Start: 0 , Stop:
    2141429669 ], [Subvol_name: testvol-replicate-1, Err: -1 , Start:
    2141429670 , Stop: 4294967295 ],
    [2015-05-01 00:47:55.596147] I
    [dht-selfheal.c:1587:dht_selfheal_layout_new_directory]
    0-testvol-dht: chunk size = 0xffffffff / 19920276 = 0xd7
    [2015-05-01 00:47:55.596177] I
    [dht-selfheal.c:1626:dht_selfheal_layout_new_directory]
    0-testvol-dht: assigning range size 0x7fa39fa6 to testvol-replicate-1
    [2015-05-01 00:47:55.596189] I
    [dht-selfheal.c:1626:dht_selfheal_layout_new_directory]
    0-testvol-dht: assigning range size 0x7fa39fa6 to testvol-replicate-0
    [2015-05-01 00:47:55.597081] I [MSGID: 109036]
    [dht-common.c:6478:dht_log_new_layout_for_dir_selfheal]
    0-testvol-dht: Setting layout of
    /file_dstdir/gqac006.sbu.lab.eng.bos.redhat.com/thrd_05/d_001/d_000/d_004/d_005
    with [Subvol_name: testvol-replicate-0, Err: -1 , Start:
    2141429670 , Stop: 4294967295 ], [Subvol_name:
    testvol-replicate-1, Err: -1 , Start: 0 , Stop: 2141429669 ],
    [2015-05-01 00:47:55.601853] I
    [dht-selfheal.c:1587:dht_selfheal_layout_new_directory]
    0-testvol-dht: chunk size = 0xffffffff / 19920276 = 0xd7
    [2015-05-01 00:47:55.601882] I
    [dht-selfheal.c:1626:dht_selfheal_layout_new_directory]
    0-testvol-dht: assigning range size 0x7fa39fa6 to testvol-replicate-1
    [2015-05-01 00:47:55.601895] I
    [dht-selfheal.c:1626:dht_selfheal_layout_new_directory]
    0-testvol-dht: assigning range size 0x7fa39fa6 to testvol-replicate-0
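
    For what it's worth, the numbers in those messages are self-consistent
    (just re-doing the arithmetic, and assuming the 19920276 divisor is
    split evenly between the two replica pairs):

        chunk size = 0xffffffff / 19920276 = 215         (0xd7)
        range size = (19920276 / 2) * 215  = 2141429670  (0x7fa39fa6)

    and 2141429670 is exactly the Start/Stop boundary in the layout lines
    above, so the messages themselves look sane; the issue is just the
    sheer volume of them.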

    Just to confirm: the patch is in glusterfs-3.8dev-0.71.gita7f8482.el6.x86_64, correct?

    Here is the info on the data set:

    hosts in test : ['gqac006.sbu.lab.eng.bos.redhat.com', 'gqas003.sbu.lab.eng.bos.redhat.com']
    top test directory(s) : ['/gluster-mount']
    operation : create
    files/thread : 500000
    threads : 8
    record size (KB, 0 = maximum) : 0
    file size (KB) : 64
    file size distribution : fixed
    files per dir : 100
    dirs per dir : 10
    total threads = 16
    total files = 7222600
    total data =   440.833 GB
     90.28% of requested files processed, minimum is  70.00
    8107.852862 sec elapsed time
    890.815377 files/sec
    890.815377 IOPS
    55.675961 MB/sec
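
    (Sanity check on that summary: 16 threads x 500000 files = 8000000
    requested, and 7222600 / 8000000 = 90.28% processed; 7222600 x 64KB
    comes to ~440.8GB; 7222600 files / 8107.85s = ~890.8 files/sec, which
    at 64KB per file works out to the ~55.7 MB/sec reported.)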

    Here is the rebalance run after about 5 minutes:

    [root@gqas001 ~]# gluster v rebalance testvol status
                                      Node  Rebalanced-files      size   scanned  failures  skipped       status  run time in secs
                                 ---------       -----------  --------  --------  --------  -------  -----------  ----------------
                                 localhost             32203     2.0GB    120858         0     5184  in progress           1294.00
        gqas011.sbu.lab.eng.bos.redhat.com                 0    0Bytes         0         0        0       failed              0.00
        gqas016.sbu.lab.eng.bos.redhat.com              9364   585.2MB     53121         0        0  in progress           1294.00
        gqas013.sbu.lab.eng.bos.redhat.com                 0    0Bytes     14750         0        0  in progress           1294.00
        gqas014.sbu.lab.eng.bos.redhat.com                 0    0Bytes         0         0        0       failed              0.00
        gqas015.sbu.lab.eng.bos.redhat.com                 0    0Bytes    196382         0        0  in progress           1294.00
    volume rebalance: testvol: success:

    The hostnames are there if you want to poke around. I had a
    problem with one of the added systems being on a different version
    of glusterfs so I had to update everything to
    glusterfs-3.8dev-0.99.git7d7b80e.el6.x86_64, remove the bricks I
    just added, and add them back.  Something may have gone wrong in
    that process but I thought I did everything correctly.  I'll start
    fresh tomorrow.  I figured I'd let this run over night.

    -b




    On Wed, Apr 29, 2015 at 9:48 PM, Benjamin Turner
    <bennytu...@gmail.com> wrote:

        Sweet!  Here is the baseline:

        [root@gqas001 ~]# gluster v rebalance testvol status
                                          Node  Rebalanced-files    size   scanned  failures  skipped     status  run time in secs
                                     ---------       -----------  ------  --------  --------  -------  ---------  ----------------
                                     localhost           1328575  81.1GB   9402953         0        0  completed          98500.00
        gqas012.sbu.lab.eng.bos.redhat.com                     0  0Bytes   8000011         0        0  completed          51982.00
        gqas003.sbu.lab.eng.bos.redhat.com                     0  0Bytes   8000011         0        0  completed          51982.00
        gqas004.sbu.lab.eng.bos.redhat.com               1326290  81.0GB   9708625         0        0  completed          98500.00
        gqas013.sbu.lab.eng.bos.redhat.com                     0  0Bytes   8000011         0        0  completed          51982.00
        gqas014.sbu.lab.eng.bos.redhat.com                     0  0Bytes   8000011         0        0  completed          51982.00
        volume rebalance: testvol: success:
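
        (For the record, that baseline works out to roughly
        1328575 / 98500s = ~13.5 files/sec and 81.1GB / 98500s = ~0.8 MB/sec
        on each of the two migrating nodes.)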

        I'll have a run on the patch started tomorrow.

        -b

        On Wed, Apr 29, 2015 at 12:51 PM, Nithya Balachandran
        <nbala...@redhat.com> wrote:


            > Doh my mistake, I thought it was merged.  I was just running
            > with the upstream 3.7 daily.  Can I use this run as my baseline
            > and then I can run next time on the patch to show the %
            > improvement?  I'll wipe everything and try on the patch, any
            > idea when it will be merged?

            Yes, it would be very useful to have this run as the
            baseline. The patch has just been merged in master. It
            should be backported to 3.7 in a day or so.

            Regards,
            Nithya


            > > > >
            > > > > >
            > > > > > On Wed, Apr 22, 2015 at 1:10 AM, Nithya Balachandran
            > > > > > <nbala...@redhat.com> wrote:
            > > > > >
            > > > > > > That sounds great. Thanks.
            > > > > > >
            > > > > > > Regards,
            > > > > > > Nithya
            > > > > > >
            > > > > > > ----- Original Message -----
            > > > > > > From: "Benjamin Turner" <bennytu...@gmail.com
            <mailto:bennytu...@gmail.com>>
            > > > > > > To: "Nithya Balachandran" <nbala...@redhat.com
            <mailto:nbala...@redhat.com>>
            > > > > > > Cc: "Susant Palai" <spa...@redhat.com
            <mailto:spa...@redhat.com>>, "Gluster Devel" <
            > > > > > > gluster-devel@gluster.org
            <mailto:gluster-devel@gluster.org>>
            > > > > > > Sent: Wednesday, 22 April, 2015 12:14:14 AM
            > > > > > > Subject: Re: [Gluster-devel] Rebalance
            improvement design
            > > > > > >
            > > > > > > I am setting up a test env now, I'll have some
            feedback for you
            > this
            > > > > > > week.
            > > > > > >
            > > > > > > -b
            > > > > > >
            > > > > > > On Tue, Apr 21, 2015 at 11:36 AM, Nithya Balachandran
            > > > > > > <nbala...@redhat.com> wrote:
            > > > > > >
            > > > > > > > Hi Ben,
            > > > > > > >
            > > > > > > > Did you get a chance to try this out?
            > > > > > > >
            > > > > > > > Regards,
            > > > > > > > Nithya
            > > > > > > >
            > > > > > > > ----- Original Message -----
            > > > > > > > From: "Susant Palai" <spa...@redhat.com
            <mailto:spa...@redhat.com>>
            > > > > > > > To: "Benjamin Turner" <bennytu...@gmail.com
            <mailto:bennytu...@gmail.com>>
            > > > > > > > Cc: "Gluster Devel"
            <gluster-devel@gluster.org <mailto:gluster-devel@gluster.org>>
            > > > > > > > Sent: Monday, April 13, 2015 9:55:07 AM
            > > > > > > > Subject: Re: [Gluster-devel] Rebalance
            improvement design
            > > > > > > >
            > > > > > > > Hi Ben,
            > > > > > > >  Uploaded a new patch here:
            > http://review.gluster.org/#/c/9657/.
            > > > > > > >  We
            > > > > > > >  can
            > > > > > > > start perf test on it. :)
            > > > > > > >
            > > > > > > > Susant
            > > > > > > >
            > > > > > > > ----- Original Message -----
            > > > > > > > From: "Susant Palai" <spa...@redhat.com
            <mailto:spa...@redhat.com>>
            > > > > > > > To: "Benjamin Turner" <bennytu...@gmail.com
            <mailto:bennytu...@gmail.com>>
            > > > > > > > Cc: "Gluster Devel"
            <gluster-devel@gluster.org <mailto:gluster-devel@gluster.org>>
            > > > > > > > Sent: Thursday, 9 April, 2015 3:40:09 PM
            > > > > > > > Subject: Re: [Gluster-devel] Rebalance
            improvement design
            > > > > > > >
            > > > > > > > Thanks Ben. RPM is not available and I am
            planning to refresh
            > the
            > > > > > > > patch
            > > > > > > in
            > > > > > > > two days with some more regression fixes. I
            think we can run
            > the
            > > > > > > > tests
            > > > > > > post
            > > > > > > > that. Any larger data-set will be good(say 3
            to 5 TB).
            > > > > > > >
            > > > > > > > Thanks,
            > > > > > > > Susant
            > > > > > > >
            > > > > > > > ----- Original Message -----
            > > > > > > > From: "Benjamin Turner"
            <bennytu...@gmail.com <mailto:bennytu...@gmail.com>>
            > > > > > > > To: "Vijay Bellur" <vbel...@redhat.com
            <mailto:vbel...@redhat.com>>
            > > > > > > > Cc: "Susant Palai" <spa...@redhat.com
            <mailto:spa...@redhat.com>>, "Gluster Devel" <
            > > > > > > > gluster-devel@gluster.org
            <mailto:gluster-devel@gluster.org>>
            > > > > > > > Sent: Thursday, 9 April, 2015 2:10:30 AM
            > > > > > > > Subject: Re: [Gluster-devel] Rebalance
            improvement design
            > > > > > > >
            > > > > > > >
            > > > > > > > I have some rebalance perf regression stuff
            I have been
            > working on,
            > > > > > > > is
            > > > > > > > there an RPM with these patches anywhere so
            that I can try it
            > on my
            > > > > > > > systems? If not I'll just build from:
            > > > > > > >
            > > > > > > >
            > > > > > > > git fetch git://review.gluster.org/glusterfs refs/changes/57/9657/8 &&
            > > > > > > > git cherry-pick FETCH_HEAD
            > > > > > > >
            > > > > > > >
            > > > > > > >
            > > > > > > > I will have _at_least_ 10TB of storage, how
            many TBs of data
            > should
            > > > > > > > I
            > > > > > > > run
            > > > > > > > with?
            > > > > > > >
            > > > > > > >
            > > > > > > > -b
            > > > > > > >
            > > > > > > >
            > > > > > > > On Tue, Apr 7, 2015 at 9:07 AM, Vijay Bellur
            > > > > > > > <vbel...@redhat.com> wrote:
            > > > > > > >
            > > > > > > >
            > > > > > > >
            > > > > > > >
            > > > > > > > On 04/07/2015 03:08 PM, Susant Palai wrote:
            > > > > > > >
            > > > > > > >
            > > > > > > > Here is one test performed on a 300GB data set and
            > > > > > > > around 100% (1/2 the time) improvement was seen.
            > > > > > > >
            > > > > > > > [root@gprfs031 ~]# gluster v i
            > > > > > > >
            > > > > > > > Volume Name: rbperf
            > > > > > > > Type: Distribute
            > > > > > > > Volume ID: 35562662-337e-4923-b862-d0bbb0748003
            > > > > > > > Status: Started
            > > > > > > > Number of Bricks: 4
            > > > > > > > Transport-type: tcp
            > > > > > > > Bricks:
            > > > > > > > Brick1: gprfs029-10ge:/bricks/gprfs029/brick1
            > > > > > > > Brick2: gprfs030-10ge:/bricks/gprfs030/brick1
            > > > > > > > Brick3: gprfs031-10ge:/bricks/gprfs031/brick1
            > > > > > > > Brick4: gprfs032-10ge:/bricks/gprfs032/brick1
            > > > > > > >
            > > > > > > >
            > > > > > > > Added server 32 and started rebalance force.
            > > > > > > >
            > > > > > > > Rebalance stat for new changes:
            > > > > > > > [root@gprfs031 ~]# gluster v rebalance rbperf status
            > > > > > > > Node           Rebalanced-files    size   scanned  failures  skipped     status  run time in secs
            > > > > > > > ---------           -----------  ------  --------  --------  -------  ---------  ----------------
            > > > > > > > localhost                 74639  36.1GB    297319         0        0  completed           1743.00
            > > > > > > > 172.17.40.30              67512  33.5GB    269187         0        0  completed           1395.00
            > > > > > > > gprfs029-10ge             79095  38.8GB    284105         0        0  completed           1559.00
            > > > > > > > gprfs032-10ge                 0  0Bytes         0         0        0  completed            402.00
            > > > > > > > volume rebalance: rbperf: success:
            > > > > > > >
            > > > > > > > Rebalance stat for old model:
            > > > > > > > [root@gprfs031 ~]# gluster v rebalance rbperf status
            > > > > > > > Node           Rebalanced-files    size   scanned  failures  skipped     status  run time in secs
            > > > > > > > ---------           -----------  ------  --------  --------  -------  ---------  ----------------
            > > > > > > > localhost                 86493  42.0GB    634302         0        0  completed           3329.00
            > > > > > > > gprfs029-10ge             94115  46.2GB    687852         0        0  completed           3328.00
            > > > > > > > gprfs030-10ge             74314  35.9GB    651943         0        0  completed           3072.00
            > > > > > > > gprfs032-10ge                 0  0Bytes    594166         0        0  completed           1943.00
            > > > > > > > volume rebalance: rbperf: success:
            > > > > > > >
            > > > > > > >
            > > > > > > > This is interesting. Thanks for sharing &
            well done! Maybe we
            > > > > > > > should
            > > > > > > > attempt a much larger data set and see how
            we fare there :).
            > > > > > > >
            > > > > > > > Regards,
            > > > > > > >
            > > > > > > >
            > > > > > > > Vijay
            > > > > > > >
            > > > > > > >
            > > >
            > >
            >






_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
