Re: [Gluster-devel] Rebalance improvement design
Comments inline - Original Message - From: Benjamin Turner bennytu...@gmail.com To: Susant Palai spa...@redhat.com Cc: Vijay Bellur vbel...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Monday, May 4, 2015 8:58:13 PM Subject: Re: [Gluster-devel] Rebalance improvement design

I see:

#define GF_DECIDE_DEFRAG_THROTTLE_COUNT(throttle_count, conf) {         \
                                                                        \
        throttle_count = MAX ((get_nprocs() - 4), 4);                   \
                                                                        \
        if (!strcmp (conf->dthrottle, "lazy"))                          \
                conf->defrag->rthcount = 1;                             \
                                                                        \
        if (!strcmp (conf->dthrottle, "normal"))                        \
                conf->defrag->rthcount = (throttle_count / 2);          \
                                                                        \
        if (!strcmp (conf->dthrottle, "aggressive"))                    \
                conf->defrag->rthcount = throttle_count;                \

So aggressive will give us the default of (20 + 16), normal is that divided by 2, and lazy is 1, is that correct?

The 16 you mention here are sync threads that scale with the workload, independent of migration. The 20 is the number of dedicated threads for carrying out migration. Planning to make the maximum threads allowed the number of processing units available, or 4 [MAX (get_nprocs(), 4)].

If so that is what I was looking to see. The only other thing I can think of here is making the tunable a number like event threads, but I like this. I don't know if I saw it documented, but if it's not we should note this in help.

Sure, it will be documented.

Also to note, the old time was 98500.00 and the new one is 55088.00, that is a 44% improvement! -b

On Mon, May 4, 2015 at 9:06 AM, Susant Palai spa...@redhat.com wrote: Ben, On no. of threads: Sent throttle patch here: http://review.gluster.org/#/c/10526/ to limit thread numbers [not merged]. The rebalance process in the current model spawns 20 threads, and in addition to that there will be a maximum of 16 syncop threads. Crash: The crash should be fixed by this: http://review.gluster.org/#/c/10459/. Rebalance time taken is a factor of the number of files and their size. The higher the frequency of files getting added to the global queue [on which the migrator threads act], the faster the rebalance. I guess here we are mostly seeing the effect of the local crawl, as only 81GB is migrated out of 500GB. Thanks, Susant - Original Message - From: Benjamin Turner bennytu...@gmail.com To: Vijay Bellur vbel...@redhat.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Monday, May 4, 2015 5:18:13 PM Subject: Re: [Gluster-devel] Rebalance improvement design Thanks Vijay!
I forgot to upgrade the kernel (thinp 6.6 perf bug, gah) before I created this data set, so it's a bit smaller:

total threads = 16
total files = 7,060,700 (64 kb files, 100 files per dir)
total data = 430.951 GB
88.26% of requested files processed, minimum is 70.00
10101.355737 sec elapsed time
698.985382 files/sec
698.985382 IOPS
43.686586 MB/sec

I updated everything and ran the rebalance on glusterfs-3.8dev-0.107.git275f724.el6.x86_64:

[root@gqas001 ~]# gluster v rebalance testvol status
Node                                 Rebalanced-files    size    scanned  failures  skipped  status     run time in secs
localhost                                     1327346  81.0GB    3999140         0        0  completed          55088.00
gqas013.sbu.lab.eng.bos.redhat.com                  0  0Bytes          1         0        0  completed          26070.00
gqas011.sbu.lab.eng.bos.redhat.com                  0  0Bytes          0         0        0  failed                 0.00
gqas014.sbu.lab.eng.bos.redhat.com                  0  0Bytes          0         0        0  failed                 0.00
gqas016.sbu.lab.eng.bos.redhat.com            1325857  80.9GB    4000865         0        0  completed          55088.00
gqas015.sbu.lab.eng.bos.redhat.com                  0  0Bytes          0         0        0  failed                 0.00
volume rebalance: testvol: success:

A couple of observations: I am seeing lots of threads / processes running:

[root@gqas001 ~]# ps -eLf | grep glu | wc -l
96 - 96 gluster threads
[root@gqas001 ~]# ps -eLf | grep rebal | wc -l
36 - 36 rebal threads.

Is this tunable? Is there a use case where we would need to limit this? Just curious, how did we arrive at 36 rebal threads?

# cat /var/log/glusterfs/testvol-rebalance.log | wc -l
4,577,583
[root@gqas001 ~]# ll /var/log/glusterfs/testvol-rebalance.log -h
-rw--- 1 root root 1.6G May 3 12:29 /var/log/glusterfs/testvol-rebalance.log

:) How big is this going to get when I do the 10-20 TB? I'll keep tabs on this, my default test setup only has: [root@gqas001 ~]# df -h Filesystem Size Used Avail Use
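(Editorial aside: a quick way to see what each throttle setting works out to on a given box is the small standalone C sketch below. It only mirrors the macro quoted above and is not the actual GlusterFS code; it assumes glibc's get_nprocs() and prints the migrator-thread count each mode would select locally.)

/* Standalone sketch of the throttle -> migrator-thread mapping quoted
 * above. Illustration only; names mirror the macro, not the dht code. */
#include <stdio.h>
#include <string.h>
#include <sys/sysinfo.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

static int decide_rthcount (const char *dthrottle)
{
        int throttle_count = MAX ((get_nprocs() - 4), 4);
        int rthcount = 0;

        if (!strcmp (dthrottle, "lazy"))
                rthcount = 1;
        if (!strcmp (dthrottle, "normal"))
                rthcount = throttle_count / 2;
        if (!strcmp (dthrottle, "aggressive"))
                rthcount = throttle_count;

        return rthcount;
}

int main (void)
{
        const char *modes[] = { "lazy", "normal", "aggressive" };
        for (int i = 0; i < 3; i++)
                printf ("%-10s -> %d migrator threads\n",
                        modes[i], decide_rthcount (modes[i]));
        return 0;
}

(For example, on a 24-processor node this prints lazy -> 1, normal -> 10, aggressive -> 20, which lines up with the 20 dedicated migration threads mentioned above; the up-to-16 syncop threads are on top of that and scale with the workload.)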
Re: [Gluster-devel] Rebalance improvement design
Thanks Vijay! I forgot to upgrade the kernel (thinp 6.6 perf bug, gah) before I created this data set, so it's a bit smaller:

total threads = 16
total files = 7,060,700 (64 kb files, 100 files per dir)
total data = 430.951 GB
88.26% of requested files processed, minimum is 70.00
10101.355737 sec elapsed time
698.985382 files/sec
698.985382 IOPS
43.686586 MB/sec

I updated everything and ran the rebalance on glusterfs-3.8dev-0.107.git275f724.el6.x86_64:

[root@gqas001 ~]# gluster v rebalance testvol status
Node                                 Rebalanced-files    size    scanned  failures  skipped  status     run time in secs
localhost                                     1327346  81.0GB    3999140         0        0  completed          55088.00
gqas013.sbu.lab.eng.bos.redhat.com                  0  0Bytes          1         0        0  completed          26070.00
gqas011.sbu.lab.eng.bos.redhat.com                  0  0Bytes          0         0        0  failed                 0.00
gqas014.sbu.lab.eng.bos.redhat.com                  0  0Bytes          0         0        0  failed                 0.00
gqas016.sbu.lab.eng.bos.redhat.com            1325857  80.9GB    4000865         0        0  completed          55088.00
gqas015.sbu.lab.eng.bos.redhat.com                  0  0Bytes          0         0        0  failed                 0.00
volume rebalance: testvol: success:

A couple of observations: I am seeing lots of threads / processes running:

[root@gqas001 ~]# ps -eLf | grep glu | wc -l
96 - 96 gluster threads
[root@gqas001 ~]# ps -eLf | grep rebal | wc -l
36 - 36 rebal threads.

Is this tunable? Is there a use case where we would need to limit this? Just curious, how did we arrive at 36 rebal threads?

# cat /var/log/glusterfs/testvol-rebalance.log | wc -l
4,577,583
[root@gqas001 ~]# ll /var/log/glusterfs/testvol-rebalance.log -h
-rw--- 1 root root 1.6G May 3 12:29 /var/log/glusterfs/testvol-rebalance.log

:) How big is this going to get when I do the 10-20 TB? I'll keep tabs on this, my default test setup only has:

[root@gqas001 ~]# df -h
Filesystem                        Size  Used  Avail  Use%  Mounted on
/dev/mapper/vg_gqas001-lv_root     50G  4.8G    42G   11%  /
tmpfs                              24G     0    24G    0%  /dev/shm
/dev/sda1                         477M   65M   387M   15%  /boot
/dev/mapper/vg_gqas001-lv_home    385G   71M   366G    1%  /home
/dev/mapper/gluster_vg-lv_bricks  9.5T  219G   9.3T    3%  /bricks

Next run I want to fill up a 10TB cluster and double the # of bricks to simulate running out of space and doubling capacity. Any other fixes or changes that need to go in before I try a larger data set? Before that I may run my performance regression suite against a system while a rebal is in progress and check how it affects performance. I'll turn both these cases into perf regression tests that I run with iozone, smallfile and such; any other use cases I should add? Should I add hard / soft links / whatever else to the data set? -b

On Sun, May 3, 2015 at 11:48 AM, Vijay Bellur vbel...@redhat.com wrote: On 05/01/2015 10:23 AM, Benjamin Turner wrote: Ok I have all my data created and I just started the rebalance.
One thing to note: in the client log I see the following spamming:

[root@gqac006 ~]# cat /var/log/glusterfs/gluster-mount-.log | wc -l
394042

[2015-05-01 00:47:55.591150] I [MSGID: 109036] [dht-common.c:6478:dht_log_new_layout_for_dir_selfheal] 0-testvol-dht: Setting layout of /file_dstdir/gqac006.sbu.lab.eng.bos.redhat.com/thrd_05/d_001/d_000/d_004/d_006 with [Subvol_name: testvol-replicate-0, Err: -1 , Start: 0 , Stop: 2141429669 ], [Subvol_name: testvol-replicate-1, Err: -1 , Start: 2141429670 , Stop: 4294967295 ],
[2015-05-01 00:47:55.596147] I [dht-selfheal.c:1587:dht_selfheal_layout_new_directory] 0-testvol-dht: chunk size = 0x / 19920276 = 0xd7
[2015-05-01 00:47:55.596177] I [dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-testvol-dht: assigning range size 0x7fa39fa6 to testvol-replicate-1

I also noticed the same set of excessive logs in my tests. Have sent across a patch [1] to address this problem. -Vijay

[1] http://review.gluster.org/10281
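(Editorial aside, not part of the patch: since the messages above are logged at the "I" (INFO) level, they can also be muted until [1] lands by raising the client log level. Assuming the usual diagnostics options, something like the line below should do it, at the cost of hiding other INFO messages as well:)

gluster volume set testvol diagnostics.client-log-level WARNING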
Re: [Gluster-devel] Rebalance improvement design
I see:

#define GF_DECIDE_DEFRAG_THROTTLE_COUNT(throttle_count, conf) {         \
                                                                        \
        throttle_count = MAX ((get_nprocs() - 4), 4);                   \
                                                                        \
        if (!strcmp (conf->dthrottle, "lazy"))                          \
                conf->defrag->rthcount = 1;                             \
                                                                        \
        if (!strcmp (conf->dthrottle, "normal"))                        \
                conf->defrag->rthcount = (throttle_count / 2);          \
                                                                        \
        if (!strcmp (conf->dthrottle, "aggressive"))                    \
                conf->defrag->rthcount = throttle_count;                \

So aggressive will give us the default of (20 + 16), normal is that divided by 2, and lazy is 1, is that correct? If so that is what I was looking to see. The only other thing I can think of here is making the tunable a number like event threads, but I like this. I don't know if I saw it documented, but if it's not we should note this in help. Also to note, the old time was 98500.00 and the new one is 55088.00, that is a 44% improvement! -b

On Mon, May 4, 2015 at 9:06 AM, Susant Palai spa...@redhat.com wrote: Ben, On no. of threads: Sent throttle patch here: http://review.gluster.org/#/c/10526/ to limit thread numbers [not merged]. The rebalance process in the current model spawns 20 threads, and in addition to that there will be a maximum of 16 syncop threads. Crash: The crash should be fixed by this: http://review.gluster.org/#/c/10459/. Rebalance time taken is a factor of the number of files and their size. The higher the frequency of files getting added to the global queue [on which the migrator threads act], the faster the rebalance. I guess here we are mostly seeing the effect of the local crawl, as only 81GB is migrated out of 500GB. Thanks, Susant

- Original Message - From: Benjamin Turner bennytu...@gmail.com To: Vijay Bellur vbel...@redhat.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Monday, May 4, 2015 5:18:13 PM Subject: Re: [Gluster-devel] Rebalance improvement design

Thanks Vijay! I forgot to upgrade the kernel (thinp 6.6 perf bug, gah) before I created this data set, so it's a bit smaller:

total threads = 16
total files = 7,060,700 (64 kb files, 100 files per dir)
total data = 430.951 GB
88.26% of requested files processed, minimum is 70.00
10101.355737 sec elapsed time
698.985382 files/sec
698.985382 IOPS
43.686586 MB/sec

I updated everything and ran the rebalance on glusterfs-3.8dev-0.107.git275f724.el6.x86_64:

[root@gqas001 ~]# gluster v rebalance testvol status
Node                                 Rebalanced-files    size    scanned  failures  skipped  status     run time in secs
localhost                                     1327346  81.0GB    3999140         0        0  completed          55088.00
gqas013.sbu.lab.eng.bos.redhat.com                  0  0Bytes          1         0        0  completed          26070.00
gqas011.sbu.lab.eng.bos.redhat.com                  0  0Bytes          0         0        0  failed                 0.00
gqas014.sbu.lab.eng.bos.redhat.com                  0  0Bytes          0         0        0  failed                 0.00
gqas016.sbu.lab.eng.bos.redhat.com            1325857  80.9GB    4000865         0        0  completed          55088.00
gqas015.sbu.lab.eng.bos.redhat.com                  0  0Bytes          0         0        0  failed                 0.00
volume rebalance: testvol: success:

A couple of observations: I am seeing lots of threads / processes running:

[root@gqas001 ~]# ps -eLf | grep glu | wc -l
96 - 96 gluster threads
[root@gqas001 ~]# ps -eLf | grep rebal | wc -l
36 - 36 rebal threads.

Is this tunable? Is there a use case where we would need to limit this? Just curious, how did we arrive at 36 rebal threads?

# cat /var/log/glusterfs/testvol-rebalance.log | wc -l
4,577,583
[root@gqas001 ~]# ll /var/log/glusterfs/testvol-rebalance.log -h
-rw--- 1 root root 1.6G May 3 12:29 /var/log/glusterfs/testvol-rebalance.log

:) How big is this going to get when I do the 10-20 TB?
I'll keep tabs on this, my default test setup only has: [root@gqas001 ~]# df -h Filesystem Size Used Avail Use% Mounted on /dev/mapper/vg_gqas001-lv_root 50G 4.8G 42G 11% / tmpfs 24G 0 24G 0% /dev/shm /dev/sda1 477M 65M 387M 15% /boot /dev/mapper/vg_gqas001-lv_home 385G 71M 366G 1% /home /dev/mapper/gluster_vg-lv_bricks 9.5T 219G 9.3T 3% /bricks Next run I want to fill up a 10TB cluster and double the # of bricks to simulate running out of space doubling capacity. Any other fixes or changes that need to go in before I try a larger data set? Before that I may run my performance regression suite against a system while a rebal is in progress and check how it affects performance. I'll turn both these cases into perf regression tests that I run with iozone smallfile and such, any other use cases I should add? Should I add
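(Editorial aside: to make the "global queue plus migrator threads" model from Susant's mail concrete, here is a rough pthreads sketch of the shape of the design, not the dht code itself. One crawler thread queues candidate files and N migrator threads drain the queue, so overall throughput is capped by whichever side is slower; that is why a slow local crawl limits the run even with 20 migrators. Build with cc -std=c99 -pthread.)

/* Rough sketch of the crawler -> global queue -> migrator-thread model
 * discussed in this thread. Illustration only, not the dht code. */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define QSIZE      64
#define NFILES     200
#define NMIGRATORS 4            /* stands in for conf->defrag->rthcount */

static int  queue[QSIZE];
static int  head, tail, count, done;
static pthread_mutex_t lock     = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  notempty = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  notfull  = PTHREAD_COND_INITIALIZER;

/* crawler: walks the brick and queues files that need migration */
static void *crawler (void *arg)
{
        for (int i = 0; i < NFILES; i++) {
                pthread_mutex_lock (&lock);
                while (count == QSIZE)
                        pthread_cond_wait (&notfull, &lock);
                queue[tail] = i;
                tail = (tail + 1) % QSIZE;
                count++;
                pthread_cond_signal (&notempty);
                pthread_mutex_unlock (&lock);
        }
        pthread_mutex_lock (&lock);
        done = 1;
        pthread_cond_broadcast (&notempty);
        pthread_mutex_unlock (&lock);
        return NULL;
}

/* migrator: pops a file off the global queue and "migrates" it */
static void *migrator (void *arg)
{
        for (;;) {
                pthread_mutex_lock (&lock);
                while (count == 0 && !done)
                        pthread_cond_wait (&notempty, &lock);
                if (count == 0 && done) {
                        pthread_mutex_unlock (&lock);
                        return NULL;
                }
                int file = queue[head];
                head = (head + 1) % QSIZE;
                count--;
                pthread_cond_signal (&notfull);
                pthread_mutex_unlock (&lock);
                usleep (1000);                  /* pretend to move the file */
                printf ("migrated file %d\n", file);
        }
}

int main (void)
{
        pthread_t c, m[NMIGRATORS];
        pthread_create (&c, NULL, crawler, NULL);
        for (int i = 0; i < NMIGRATORS; i++)
                pthread_create (&m[i], NULL, migrator, NULL);
        pthread_join (c, NULL);
        for (int i = 0; i < NMIGRATORS; i++)
                pthread_join (m[i], NULL);
        return 0;
}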
Re: [Gluster-devel] Rebalance improvement design
0 0completed 51982.00 gqas004.sbu.lab.eng.bos.redhat.com 132629081.0GB 9708625 0 0completed 98500.00 gqas013.sbu.lab.eng.bos.redhat.com00Bytes 811 0 0completed 51982.00 gqas014.sbu.lab.eng.bos.redhat.com00Bytes 811 0 0completed 51982.00 volume rebalance: testvol: success: I'll have a run on the patch started tomorrow. -b On Wed, Apr 29, 2015 at 12:51 PM, Nithya Balachandran nbala...@redhat.com wrote: Doh my mistake, I thought it was merged. I was just running with the upstream 3.7 daily. Can I use this run as my baseline and then I can run next time on the patch to show the % improvement? I'll wipe everything and try on the patch, any idea when it will be merged? Yes, it would be very useful to have this run as the baseline. The patch has just been merged in master. It should be backported to 3.7 in a day or so. Regards, Nithya On Wed, Apr 22, 2015 at 1:10 AM, Nithya Balachandran nbala...@redhat.com wrote: That sounds great. Thanks. Regards, Nithya - Original Message - From: Benjamin Turner bennytu...@gmail.com To: Nithya Balachandran nbala...@redhat.com Cc: Susant Palai spa...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Wednesday, 22 April, 2015 12:14:14 AM Subject: Re: [Gluster-devel] Rebalance improvement design I am setting up a test env now, I'll have some feedback for you this week. -b On Tue, Apr 21, 2015 at 11:36 AM, Nithya Balachandran nbala...@redhat.com wrote: Hi Ben, Did you get a chance to try this out? Regards, Nithya - Original Message - From: Susant Palai spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Monday, April 13, 2015 9:55:07 AM Subject: Re: [Gluster-devel] Rebalance improvement design Hi Ben, Uploaded a new patch here: http://review.gluster.org/#/c/9657/. We can start perf test on it. :) Susant - Original Message - From: Susant Palai spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Thursday, 9 April, 2015 3:40:09 PM Subject: Re: [Gluster-devel] Rebalance improvement design Thanks Ben. RPM is not available and I am planning to refresh the patch in two days with some more regression fixes. I think we can run the tests post that. Any larger data-set will be good(say 3 to 5 TB). Thanks, Susant - Original Message - From: Benjamin Turner bennytu...@gmail.com To: Vijay Bellur vbel...@redhat.com Cc: Susant Palai spa...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Thursday, 9 April, 2015 2:10:30 AM Subject: Re: [Gluster-devel] Rebalance improvement design I have some rebalance perf regression stuff I have been working on, is there an RPM with these patches anywhere so that I can try it on my systems? If not I'll just build from: git fetch git:// review.gluster.org/glusterfs refs/changes/57/9657/8 git cherry-pick FETCH_HEAD I will have _at_least_ 10TB of storage, how many TBs of data should I run with? -b On Tue, Apr 7, 2015 at 9:07 AM, Vijay Bellur vbel...@redhat.com wrote: On 04/07/2015 03:08 PM, Susant Palai wrote: Here is one test performed on a 300GB data set and around 100%(1/2 the time) improvement was seen. [root@gprfs031 ~]# gluster v i Volume Name: rbperf Type: Distribute Volume ID: 35562662-337e-4923-b862- d0bbb0748003 Status: Started Number of Bricks: 4 Transport-type: tcp Bricks: Brick1: gprfs029-10ge:/bricks/ gprfs029/brick1 Brick2: gprfs030-10ge:/bricks/ gprfs030/brick1 Brick3: gprfs031-10ge:/bricks/ gprfs031/brick1 Brick4: gprfs032-10ge:/bricks/ gprfs032/brick1 Added server 32 and started rebalance force. 
Rebalance stat for new changes: [root@gprfs031 ~]# gluster v rebalance rbperf
Re: [Gluster-devel] Rebalance improvement design
On 05/01/2015 10:23 AM, Benjamin Turner wrote: Ok I have all my data created and I just started the rebalance. One thing to not in the client log I see the following spamming: [root@gqac006 ~]# cat /var/log/glusterfs/gluster-mount-.log | wc -l 394042 [2015-05-01 00:47:55.591150] I [MSGID: 109036] [dht-common.c:6478:dht_log_new_layout_for_dir_selfheal] 0-testvol-dht: Setting layout of /file_dstdir/gqac006.sbu.lab.eng.bos.redhat.com/thrd_05/d_001/d_000/d_004/d_006 http://gqac006.sbu.lab.eng.bos.redhat.com/thrd_05/d_001/d_000/d_004/d_006 with [Subvol_name: testvol-replicate-0, Err: -1 , Start: 0 , Stop: 2141429669 ], [Subvol_name: testvol-replicate-1, Err: -1 , Start: 2141429670 , Stop: 4294967295 ], [2015-05-01 00:47:55.596147] I [dht-selfheal.c:1587:dht_selfheal_layout_new_directory] 0-testvol-dht: chunk size = 0x / 19920276 = 0xd7 [2015-05-01 00:47:55.596177] I [dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-testvol-dht: assigning range size 0x7fa39fa6 to testvol-replicate-1 I also noticed the same set of excessive logs in my tests. Have sent across a patch [1] to address this problem. -Vijay [1] http://review.gluster.org/10281 ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
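(Editorial aside on the 1.6G rebalance log mentioned earlier: glusterfs packages normally ship a logrotate policy, but for multi-day rebalance runs on a 10-20 TB data set it may be worth tightening it. The snippet below is only a suggested shape, with paths and thresholds guessed:)

/var/log/glusterfs/*rebalance*.log {
        size 200M
        rotate 10
        compress
        missingok
        copytruncate
}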
Re: [Gluster-devel] Rebalance improvement design
dirs per dir : 10 total threads = 16 total files = 7222600 total data = 440.833 GB 90.28% of requested files processed, minimum is 70.00 8107.852862 sec elapsed time 890.815377 files/sec 890.815377 IOPS 55.675961 MB/sec Here is the rebalance run after about 5 or so minutes: [root@gqas001 ~]# gluster v rebalance testvol status Node Rebalanced-files size scanned failures skipped status run time in secs - --- --- --- --- --- -- localhost32203 2.0GB 120858 0 5184 in progress 1294.00 gqas011.sbu.lab.eng.bos.redhat.com00Bytes 0 0 0 failed 0.00 gqas016.sbu.lab.eng.bos.redhat.com 9364 585.2MB 53121 0 0 in progress 1294.00 gqas013.sbu.lab.eng.bos.redhat.com00Bytes 14750 0 0 in progress 1294.00 gqas014.sbu.lab.eng.bos.redhat.com00Bytes 0 0 0 failed 0.00 gqas015.sbu.lab.eng.bos.redhat.com00Bytes 196382 0 0 in progress 1294.00 volume rebalance: testvol: success: The hostnames are there if you want to poke around. I had a problem with one of the added systems being on a different version of glusterfs so I had to update everything to glusterfs-3.8dev-0.99.git7d7b80e.el6.x86_64, remove the bricks I just added, and add them back. Something may have went wrong in that process but I thought I did everything correctly. I'll start fresh tomorrow. I figured I'd let this run over night. -b On Wed, Apr 29, 2015 at 9:48 PM, Benjamin Turner bennytu...@gmail.com wrote: Sweet! Here is the baseline: [root@gqas001 ~]# gluster v rebalance testvol status Node Rebalanced-files size scanned failures skipped status run time in secs - --- --- --- --- --- -- localhost 132857581.1GB 9402953 0 0completed 98500.00 gqas012.sbu.lab.eng.bos.redhat.com00Bytes 811 0 0completed 51982.00 gqas003.sbu.lab.eng.bos.redhat.com00Bytes 811 0 0completed 51982.00 gqas004.sbu.lab.eng.bos.redhat.com 132629081.0GB 9708625 0 0completed 98500.00 gqas013.sbu.lab.eng.bos.redhat.com00Bytes 811 0 0completed 51982.00 gqas014.sbu.lab.eng.bos.redhat.com00Bytes 811 0 0completed 51982.00 volume rebalance: testvol: success: I'll have a run on the patch started tomorrow. -b On Wed, Apr 29, 2015 at 12:51 PM, Nithya Balachandran nbala...@redhat.com wrote: Doh my mistake, I thought it was merged. I was just running with the upstream 3.7 daily. Can I use this run as my baseline and then I can run next time on the patch to show the % improvement? I'll wipe everything and try on the patch, any idea when it will be merged? Yes, it would be very useful to have this run as the baseline. The patch has just been merged in master. It should be backported to 3.7 in a day or so. Regards, Nithya On Wed, Apr 22, 2015 at 1:10 AM, Nithya Balachandran nbala...@redhat.com wrote: That sounds great. Thanks. Regards, Nithya - Original Message - From: Benjamin Turner bennytu...@gmail.com To: Nithya Balachandran nbala...@redhat.com Cc: Susant Palai spa...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Wednesday, 22 April, 2015 12:14:14 AM Subject: Re: [Gluster-devel] Rebalance improvement design I am setting up a test env now, I'll have some feedback for you this week. -b On Tue, Apr 21, 2015 at 11:36 AM, Nithya Balachandran nbala...@redhat.com wrote: Hi Ben, Did you get a chance to try this out? Regards, Nithya - Original Message - From: Susant Palai spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com
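(Editorial aside: the per-thread / file-count summaries quoted in this thread look like smallfile output. For anyone wanting to recreate a comparable data set, an invocation roughly like the one below should be close; the option names are from memory and the mount path is made up, so treat it as a sketch:)

python smallfile_cli.py --operation create --top /gluster-mount/file_srcdir \
        --threads 16 --file-size 64 --files-per-dir 100 --files 450000

(If memory serves, --file-size is in KB and --files is per thread, so 16 threads x 450,000 files lands near the ~7.2 million 64 kb files used in these runs.)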
Re: [Gluster-devel] Rebalance improvement design
...@redhat.com wrote: Doh my mistake, I thought it was merged. I was just running with the upstream 3.7 daily. Can I use this run as my baseline and then I can run next time on the patch to show the % improvement? I'll wipe everything and try on the patch, any idea when it will be merged? Yes, it would be very useful to have this run as the baseline. The patch has just been merged in master. It should be backported to 3.7 in a day or so. Regards, Nithya On Wed, Apr 22, 2015 at 1:10 AM, Nithya Balachandran nbala...@redhat.com mailto:nbala...@redhat.com wrote: That sounds great. Thanks. Regards, Nithya - Original Message - From: Benjamin Turner bennytu...@gmail.com mailto:bennytu...@gmail.com To: Nithya Balachandran nbala...@redhat.com mailto:nbala...@redhat.com Cc: Susant Palai spa...@redhat.com mailto:spa...@redhat.com, Gluster Devel gluster-devel@gluster.org mailto:gluster-devel@gluster.org Sent: Wednesday, 22 April, 2015 12:14:14 AM Subject: Re: [Gluster-devel] Rebalance improvement design I am setting up a test env now, I'll have some feedback for you this week. -b On Tue, Apr 21, 2015 at 11:36 AM, Nithya Balachandran nbala...@redhat.com mailto:nbala...@redhat.com wrote: Hi Ben, Did you get a chance to try this out? Regards, Nithya - Original Message - From: Susant Palai spa...@redhat.com mailto:spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com mailto:bennytu...@gmail.com Cc: Gluster Devel gluster-devel@gluster.org mailto:gluster-devel@gluster.org Sent: Monday, April 13, 2015 9:55:07 AM Subject: Re: [Gluster-devel] Rebalance improvement design Hi Ben, Uploaded a new patch here: http://review.gluster.org/#/c/9657/. We can start perf test on it. :) Susant - Original Message - From: Susant Palai spa...@redhat.com mailto:spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com mailto:bennytu...@gmail.com Cc: Gluster Devel gluster-devel@gluster.org mailto:gluster-devel@gluster.org Sent: Thursday, 9 April, 2015 3:40:09 PM Subject: Re: [Gluster-devel] Rebalance improvement design Thanks Ben. RPM is not available and I am planning to refresh the patch in two days with some more regression fixes. I think we can run the tests post that. Any larger data-set will be good(say 3 to 5 TB). Thanks, Susant - Original Message - From: Benjamin Turner bennytu...@gmail.com mailto:bennytu...@gmail.com To: Vijay Bellur vbel...@redhat.com mailto:vbel...@redhat.com Cc: Susant Palai spa...@redhat.com mailto:spa...@redhat.com, Gluster Devel gluster-devel@gluster.org mailto:gluster-devel@gluster.org Sent: Thursday, 9 April, 2015 2:10:30 AM Subject: Re: [Gluster-devel] Rebalance improvement design I have some rebalance perf regression stuff I have been working on, is there an RPM with these patches anywhere so that I can try it on my systems? If not I'll just build from: git fetch git:// review.gluster.org
Re: [Gluster-devel] Rebalance improvement design
51982.00 gqas003.sbu.lab.eng.bos.redhat.com00Bytes 811 0 0completed 51982.00 gqas004.sbu.lab.eng.bos.redhat.com 132629081.0GB 9708625 0 0completed 98500.00 gqas013.sbu.lab.eng.bos.redhat.com00Bytes 811 0 0completed 51982.00 gqas014.sbu.lab.eng.bos.redhat.com00Bytes 811 0 0completed 51982.00 volume rebalance: testvol: success: I'll have a run on the patch started tomorrow. -b On Wed, Apr 29, 2015 at 12:51 PM, Nithya Balachandran nbala...@redhat.com wrote: Doh my mistake, I thought it was merged. I was just running with the upstream 3.7 daily. Can I use this run as my baseline and then I can run next time on the patch to show the % improvement? I'll wipe everything and try on the patch, any idea when it will be merged? Yes, it would be very useful to have this run as the baseline. The patch has just been merged in master. It should be backported to 3.7 in a day or so. Regards, Nithya On Wed, Apr 22, 2015 at 1:10 AM, Nithya Balachandran nbala...@redhat.com wrote: That sounds great. Thanks. Regards, Nithya - Original Message - From: Benjamin Turner bennytu...@gmail.com To: Nithya Balachandran nbala...@redhat.com Cc: Susant Palai spa...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Wednesday, 22 April, 2015 12:14:14 AM Subject: Re: [Gluster-devel] Rebalance improvement design I am setting up a test env now, I'll have some feedback for you this week. -b On Tue, Apr 21, 2015 at 11:36 AM, Nithya Balachandran nbala...@redhat.com wrote: Hi Ben, Did you get a chance to try this out? Regards, Nithya - Original Message - From: Susant Palai spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Monday, April 13, 2015 9:55:07 AM Subject: Re: [Gluster-devel] Rebalance improvement design Hi Ben, Uploaded a new patch here: http://review.gluster.org/#/c/9657/. We can start perf test on it. :) Susant - Original Message - From: Susant Palai spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Thursday, 9 April, 2015 3:40:09 PM Subject: Re: [Gluster-devel] Rebalance improvement design Thanks Ben. RPM is not available and I am planning to refresh the patch in two days with some more regression fixes. I think we can run the tests post that. Any larger data-set will be good(say 3 to 5 TB). Thanks, Susant - Original Message - From: Benjamin Turner bennytu...@gmail.com To: Vijay Bellur vbel...@redhat.com Cc: Susant Palai spa...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Thursday, 9 April, 2015 2:10:30 AM Subject: Re: [Gluster-devel] Rebalance improvement design I have some rebalance perf regression stuff I have been working on, is there an RPM with these patches anywhere so that I can try it on my systems? If not I'll just build from: git fetch git:// review.gluster.org/glusterfs refs/changes/57/9657/8 git cherry-pick FETCH_HEAD I will have _at_least_ 10TB of storage, how many TBs of data should I run with? -b On Tue, Apr 7, 2015 at 9:07 AM, Vijay Bellur vbel...@redhat.com wrote: On 04/07/2015 03:08 PM, Susant Palai wrote: Here is one test performed on a 300GB data set and around 100%(1/2 the time) improvement was seen. 
[root@gprfs031 ~]# gluster v i Volume Name: rbperf Type: Distribute Volume ID: 35562662-337e-4923-b862- d0bbb0748003 Status: Started Number of Bricks: 4 Transport-type: tcp Bricks: Brick1: gprfs029-10ge:/bricks/ gprfs029/brick1 Brick2: gprfs030-10ge:/bricks/ gprfs030/brick1 Brick3: gprfs031-10ge:/bricks/ gprfs031/brick1 Brick4: gprfs032-10ge:/bricks/ gprfs032/brick1 Added server 32 and started rebalance force
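(Editorial aside: "added server 32 and started rebalance force" on this volume corresponds to roughly the following sequence; the brick path is assumed to follow the same pattern as the existing bricks:)

gluster peer probe gprfs032-10ge
gluster volume add-brick rbperf gprfs032-10ge:/bricks/gprfs032/brick1
gluster volume rebalance rbperf start force
gluster volume rebalance rbperf status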
Re: [Gluster-devel] Rebalance improvement design
Hi Ben I checked out the glusterfs process attaching gdb and I could not find the newer code. Can you confirm whether you took the new patch ? patch i: http://review.gluster.org/#/c/9657/ Thanks, Susant - Original Message - From: Susant Palai spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com, Nithya Balachandran nbala...@redhat.com Cc: Shyamsundar Ranganathan srang...@redhat.com Sent: Wednesday, April 29, 2015 1:22:02 PM Subject: Re: [Gluster-devel] Rebalance improvement design This is how it looks for 2000 file. each 1MB. Done rebalance on 2*2 + 2. OLDER: [root@gprfs030 ~]# gluster v rebalance test1 status Node Rebalanced-files size scanned failures skipped status run time in secs - --- --- --- --- --- -- localhost 2000 1.9GB 3325 0 0 completed 63.00 gprfs032-10ge00Bytes 2158 0 0 completed 6.00 volume rebalance: test1: success: [root@gprfs030 ~]# NEW: [root@gprfs030 upstream_rebalance]# gluster v rebalance test1 status Node Rebalanced-files size scanned failures skipped status run time in secs - --- --- --- --- --- -- localhost 2000 1.9GB 2011 0 0 completed 12.00 gprfs032-10ge00Bytes 0 0 0 failed 0.00 [Failed because of a crash which I will address in next patch] volume rebalance: test1: success: Just trying out replica behaviour for rebalance. Here is the volume info. [root@gprfs030 ~]# gluster v i Volume Name: test1 Type: Distributed-Replicate Volume ID: e12ef289-86f2-454a-beaa-72ea763dbada Status: Started Number of Bricks: 3 x 2 = 6 Transport-type: tcp Bricks: Brick1: gprfs030-10ge:/bricks/gprfs030/brick1 Brick2: gprfs032-10ge:/bricks/gprfs032/brick1 Brick3: gprfs030-10ge:/bricks/gprfs030/brick2 Brick4: gprfs032-10ge:/bricks/gprfs032/brick2 Brick5: gprfs030-10ge:/bricks/gprfs030/brick3 Brick6: gprfs032-10ge:/bricks/gprfs032/brick3 - Original Message - From: Susant Palai spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Wednesday, April 29, 2015 1:13:04 PM Subject: Re: [Gluster-devel] Rebalance improvement design Ben, will you be able to give rebal stat for the same configuration and data set with older rebalance infra ? Thanks, Susant - Original Message - From: Susant Palai spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Wednesday, April 29, 2015 12:08:38 PM Subject: Re: [Gluster-devel] Rebalance improvement design Hi Ben, Yes we were using pure dist volume. Will check in to your systems for more info. Can you please update which patch set you used ? In the mean time I will do one set of test with the same configuration on a small data set. Thanks, Susant - Original Message - From: Benjamin Turner bennytu...@gmail.com To: Nithya Balachandran nbala...@redhat.com Cc: Susant Palai spa...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Wednesday, April 29, 2015 2:13:05 AM Subject: Re: [Gluster-devel] Rebalance improvement design I am not seeing the performance you were. I am running on 500GB of data: [root@gqas001 ~]# gluster v rebalance testvol status Node Rebalanced-files size scanned failures skipped status run time in secs
Re: [Gluster-devel] Rebalance improvement design
Hi Ben, Yes we were using pure dist volume. Will check in to your systems for more info. Can you please update which patch set you used ? In the mean time I will do one set of test with the same configuration on a small data set. Thanks, Susant - Original Message - From: Benjamin Turner bennytu...@gmail.com To: Nithya Balachandran nbala...@redhat.com Cc: Susant Palai spa...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Wednesday, April 29, 2015 2:13:05 AM Subject: Re: [Gluster-devel] Rebalance improvement design I am not seeing the performance you were. I am running on 500GB of data: [root@gqas001 ~]# gluster v rebalance testvol status Node Rebalanced-files size scanned failures skipped status run time in secs - --- --- --- --- --- -- localhost 129021 7.9GB912104 0 0 in progress 10100.00 gqas012.sbu.lab.eng.bos.redhat.com00Bytes 1930312 0 0 in progress 10100.00 gqas003.sbu.lab.eng.bos.redhat.com00Bytes 1930312 0 0 in progress 10100.00 gqas004.sbu.lab.eng.bos.redhat.com 128903 7.9GB 946730 0 0 in progress 10100.00 gqas013.sbu.lab.eng.bos.redhat.com00Bytes 1930312 0 0 in progress 10100.00 gqas014.sbu.lab.eng.bos.redhat.com00Bytes 1930312 0 0 in progress 10100.00 Based on what I am seeing I expect this to take 2 days. Was you rebal run on a pure dist volume? I am trying on 2x2 + 2 new bricks. Any idea why mine is taking so long? -b On Wed, Apr 22, 2015 at 1:10 AM, Nithya Balachandran nbala...@redhat.com wrote: That sounds great. Thanks. Regards, Nithya - Original Message - From: Benjamin Turner bennytu...@gmail.com To: Nithya Balachandran nbala...@redhat.com Cc: Susant Palai spa...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Wednesday, 22 April, 2015 12:14:14 AM Subject: Re: [Gluster-devel] Rebalance improvement design I am setting up a test env now, I'll have some feedback for you this week. -b On Tue, Apr 21, 2015 at 11:36 AM, Nithya Balachandran nbala...@redhat.com wrote: Hi Ben, Did you get a chance to try this out? Regards, Nithya - Original Message - From: Susant Palai spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Monday, April 13, 2015 9:55:07 AM Subject: Re: [Gluster-devel] Rebalance improvement design Hi Ben, Uploaded a new patch here: http://review.gluster.org/#/c/9657/. We can start perf test on it. :) Susant - Original Message - From: Susant Palai spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Thursday, 9 April, 2015 3:40:09 PM Subject: Re: [Gluster-devel] Rebalance improvement design Thanks Ben. RPM is not available and I am planning to refresh the patch in two days with some more regression fixes. I think we can run the tests post that. Any larger data-set will be good(say 3 to 5 TB). Thanks, Susant - Original Message - From: Benjamin Turner bennytu...@gmail.com To: Vijay Bellur vbel...@redhat.com Cc: Susant Palai spa...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Thursday, 9 April, 2015 2:10:30 AM Subject: Re: [Gluster-devel] Rebalance improvement design I have some rebalance perf regression stuff I have been working on, is there an RPM with these patches anywhere so that I can try it on my systems? If not I'll just build from: git fetch git:// review.gluster.org/glusterfs refs/changes/57/9657/8 git cherry-pick FETCH_HEAD I will have _at_least_ 10TB of storage, how many TBs of data should I run with? 
-b On Tue, Apr 7, 2015 at 9:07 AM, Vijay Bellur vbel...@redhat.com wrote: On 04/07/2015 03:08 PM, Susant Palai wrote: Here is one test performed on a 300GB data set and around 100%(1/2 the time) improvement was seen. [root@gprfs031 ~]# gluster v i Volume Name: rbperf Type: Distribute Volume ID: 35562662-337e-4923-b862- d0bbb0748003 Status: Started Number of Bricks: 4 Transport-type: tcp Bricks: Brick1: gprfs029-10ge:/bricks/ gprfs029/brick1 Brick2: gprfs030-10ge:/bricks/ gprfs030/brick1 Brick3: gprfs031
Re: [Gluster-devel] Rebalance improvement design
Doh my mistake, I thought it was merged. I was just running with the upstream 3.7 daily. Can I use this run as my baseline and then I can run next time on the patch to show the % improvement? I'll wipe everything and try on the patch, any idea when it will be merged? -b On Wed, Apr 29, 2015 at 5:34 AM, Susant Palai spa...@redhat.com wrote: Hi Ben I checked out the glusterfs process attaching gdb and I could not find the newer code. Can you confirm whether you took the new patch ? patch i: http://review.gluster.org/#/c/9657/ Thanks, Susant - Original Message - From: Susant Palai spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com, Nithya Balachandran nbala...@redhat.com Cc: Shyamsundar Ranganathan srang...@redhat.com Sent: Wednesday, April 29, 2015 1:22:02 PM Subject: Re: [Gluster-devel] Rebalance improvement design This is how it looks for 2000 file. each 1MB. Done rebalance on 2*2 + 2. OLDER: [root@gprfs030 ~]# gluster v rebalance test1 status Node Rebalanced-files size scanned failures skipped status run time in secs - --- --- --- --- --- -- localhost 2000 1.9GB 3325 0 0 completed 63.00 gprfs032-10ge00Bytes 2158 0 0 completed 6.00 volume rebalance: test1: success: [root@gprfs030 ~]# NEW: [root@gprfs030 upstream_rebalance]# gluster v rebalance test1 status Node Rebalanced-files size scanned failures skipped status run time in secs - --- --- --- --- --- -- localhost 2000 1.9GB 2011 0 0 completed 12.00 gprfs032-10ge00Bytes 0 0 0 failed 0.00 [Failed because of a crash which I will address in next patch] volume rebalance: test1: success: Just trying out replica behaviour for rebalance. Here is the volume info. [root@gprfs030 ~]# gluster v i Volume Name: test1 Type: Distributed-Replicate Volume ID: e12ef289-86f2-454a-beaa-72ea763dbada Status: Started Number of Bricks: 3 x 2 = 6 Transport-type: tcp Bricks: Brick1: gprfs030-10ge:/bricks/gprfs030/brick1 Brick2: gprfs032-10ge:/bricks/gprfs032/brick1 Brick3: gprfs030-10ge:/bricks/gprfs030/brick2 Brick4: gprfs032-10ge:/bricks/gprfs032/brick2 Brick5: gprfs030-10ge:/bricks/gprfs030/brick3 Brick6: gprfs032-10ge:/bricks/gprfs032/brick3 - Original Message - From: Susant Palai spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Wednesday, April 29, 2015 1:13:04 PM Subject: Re: [Gluster-devel] Rebalance improvement design Ben, will you be able to give rebal stat for the same configuration and data set with older rebalance infra ? Thanks, Susant - Original Message - From: Susant Palai spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Wednesday, April 29, 2015 12:08:38 PM Subject: Re: [Gluster-devel] Rebalance improvement design Hi Ben, Yes we were using pure dist volume. Will check in to your systems for more info. Can you please update which patch set you used ? In the mean time I will do one set of test with the same configuration on a small data set. Thanks, Susant - Original Message - From: Benjamin Turner bennytu...@gmail.com To: Nithya Balachandran nbala...@redhat.com Cc: Susant Palai spa...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Wednesday, April 29, 2015 2:13:05 AM Subject: Re: [Gluster-devel] Rebalance improvement design I am not seeing
Re: [Gluster-devel] Rebalance improvement design
Doh my mistake, I thought it was merged. I was just running with the upstream 3.7 daily. Can I use this run as my baseline and then I can run next time on the patch to show the % improvement? I'll wipe everything and try on the patch, any idea when it will be merged? Yes, it would be very useful to have this run as the baseline. The patch has just been merged in master. It should be backported to 3.7 in a day or so. Regards, Nithya On Wed, Apr 22, 2015 at 1:10 AM, Nithya Balachandran nbala...@redhat.com wrote: That sounds great. Thanks. Regards, Nithya - Original Message - From: Benjamin Turner bennytu...@gmail.com To: Nithya Balachandran nbala...@redhat.com Cc: Susant Palai spa...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Wednesday, 22 April, 2015 12:14:14 AM Subject: Re: [Gluster-devel] Rebalance improvement design I am setting up a test env now, I'll have some feedback for you this week. -b On Tue, Apr 21, 2015 at 11:36 AM, Nithya Balachandran nbala...@redhat.com wrote: Hi Ben, Did you get a chance to try this out? Regards, Nithya - Original Message - From: Susant Palai spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Monday, April 13, 2015 9:55:07 AM Subject: Re: [Gluster-devel] Rebalance improvement design Hi Ben, Uploaded a new patch here: http://review.gluster.org/#/c/9657/. We can start perf test on it. :) Susant - Original Message - From: Susant Palai spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Thursday, 9 April, 2015 3:40:09 PM Subject: Re: [Gluster-devel] Rebalance improvement design Thanks Ben. RPM is not available and I am planning to refresh the patch in two days with some more regression fixes. I think we can run the tests post that. Any larger data-set will be good(say 3 to 5 TB). Thanks, Susant - Original Message - From: Benjamin Turner bennytu...@gmail.com To: Vijay Bellur vbel...@redhat.com Cc: Susant Palai spa...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Thursday, 9 April, 2015 2:10:30 AM Subject: Re: [Gluster-devel] Rebalance improvement design I have some rebalance perf regression stuff I have been working on, is there an RPM with these patches anywhere so that I can try it on my systems? If not I'll just build from: git fetch git:// review.gluster.org/glusterfs refs/changes/57/9657/8 git cherry-pick FETCH_HEAD I will have _at_least_ 10TB of storage, how many TBs of data should I run with? -b On Tue, Apr 7, 2015 at 9:07 AM, Vijay Bellur vbel...@redhat.com wrote: On 04/07/2015 03:08 PM, Susant Palai wrote: Here is one test performed on a 300GB data set and around 100%(1/2 the time) improvement was seen. [root@gprfs031 ~]# gluster v i Volume Name: rbperf Type: Distribute Volume ID: 35562662-337e-4923-b862- d0bbb0748003 Status: Started Number of Bricks: 4 Transport-type: tcp Bricks: Brick1: gprfs029-10ge:/bricks/ gprfs029/brick1 Brick2: gprfs030-10ge:/bricks/ gprfs030/brick1 Brick3: gprfs031-10ge:/bricks/ gprfs031/brick1 Brick4: gprfs032-10ge:/bricks/ gprfs032/brick1 Added server 32 and started rebalance force. 
Rebalance stat for new changes: [root@gprfs031 ~]# gluster v rebalance rbperf status Node Rebalanced-files size scanned failures skipped status run time in secs - --- --- --- --- --- -- localhost 74639 36.1GB 297319 0 0 completed 1743.00 172.17.40.30 67512 33.5GB 269187 0 0 completed 1395.00 gprfs029-10ge 79095 38.8GB 284105 0 0 completed 1559.00 gprfs032-10ge 0 0Bytes 0 0 0 completed 402.00 volume rebalance: rbperf: success: Rebalance stat for old model: [root@gprfs031 ~]# gluster v rebalance rbperf status Node Rebalanced-files size scanned failures skipped status run time in secs
Re: [Gluster-devel] Rebalance improvement design
On 04/29/2015 08:57 AM, Benjamin Turner wrote: Doh my mistake, I thought it was merged. I was just running with the upstream 3.7 daily. Can I use this run as my baseline and then I can run next time on the patch to show the % improvement? I'll wipe everything and try on the patch, any idea when it will be merged? Patch should be merged today (master). It has passed regression and the one compile error is also cleared up. Doing last mile checks before merge. -b On Wed, Apr 29, 2015 at 5:34 AM, Susant Palai spa...@redhat.com mailto:spa...@redhat.com wrote: Hi Ben I checked out the glusterfs process attaching gdb and I could not find the newer code. Can you confirm whether you took the new patch ? patch i: http://review.gluster.org/#/c/9657/ Thanks, Susant - Original Message - From: Susant Palai spa...@redhat.com mailto:spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com mailto:bennytu...@gmail.com, Nithya Balachandran nbala...@redhat.com mailto:nbala...@redhat.com Cc: Shyamsundar Ranganathan srang...@redhat.com mailto:srang...@redhat.com Sent: Wednesday, April 29, 2015 1:22:02 PM Subject: Re: [Gluster-devel] Rebalance improvement design This is how it looks for 2000 file. each 1MB. Done rebalance on 2*2 + 2. OLDER: [root@gprfs030 ~]# gluster v rebalance test1 status Node Rebalanced-files size scanned failures skipped status run time in secs - --- --- --- --- --- -- localhost 2000 1.9GB 3325 0 0 completed 63.00 gprfs032-10ge0 0Bytes 2158 0 0 completed 6.00 volume rebalance: test1: success: [root@gprfs030 ~]# NEW: [root@gprfs030 upstream_rebalance]# gluster v rebalance test1 status Node Rebalanced-files size scanned failures skipped status run time in secs - --- --- --- --- --- -- localhost 2000 1.9GB 2011 0 0 completed 12.00 gprfs032-10ge0 0Bytes 0 0 0 failed 0.00 [Failed because of a crash which I will address in next patch] volume rebalance: test1: success: Just trying out replica behaviour for rebalance. Here is the volume info. [root@gprfs030 ~]# gluster v i Volume Name: test1 Type: Distributed-Replicate Volume ID: e12ef289-86f2-454a-beaa-72ea763dbada Status: Started Number of Bricks: 3 x 2 = 6 Transport-type: tcp Bricks: Brick1: gprfs030-10ge:/bricks/gprfs030/brick1 Brick2: gprfs032-10ge:/bricks/gprfs032/brick1 Brick3: gprfs030-10ge:/bricks/gprfs030/brick2 Brick4: gprfs032-10ge:/bricks/gprfs032/brick2 Brick5: gprfs030-10ge:/bricks/gprfs030/brick3 Brick6: gprfs032-10ge:/bricks/gprfs032/brick3 - Original Message - From: Susant Palai spa...@redhat.com mailto:spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com mailto:bennytu...@gmail.com Cc: Gluster Devel gluster-devel@gluster.org mailto:gluster-devel@gluster.org Sent: Wednesday, April 29, 2015 1:13:04 PM Subject: Re: [Gluster-devel] Rebalance improvement design Ben, will you be able to give rebal stat for the same configuration and data set with older rebalance infra ? Thanks, Susant - Original Message - From: Susant Palai spa...@redhat.com mailto:spa...@redhat.com To: Benjamin Turner
Re: [Gluster-devel] Rebalance improvement design
Sweet! Here is the baseline: [root@gqas001 ~]# gluster v rebalance testvol status Node Rebalanced-files size scanned failures skipped status run time in secs - --- --- --- --- --- -- localhost 132857581.1GB 9402953 0 0completed 98500.00 gqas012.sbu.lab.eng.bos.redhat.com00Bytes 811 0 0completed 51982.00 gqas003.sbu.lab.eng.bos.redhat.com00Bytes 811 0 0completed 51982.00 gqas004.sbu.lab.eng.bos.redhat.com 132629081.0GB 9708625 0 0completed 98500.00 gqas013.sbu.lab.eng.bos.redhat.com00Bytes 811 0 0completed 51982.00 gqas014.sbu.lab.eng.bos.redhat.com00Bytes 811 0 0completed 51982.00 volume rebalance: testvol: success: I'll have a run on the patch started tomorrow. -b On Wed, Apr 29, 2015 at 12:51 PM, Nithya Balachandran nbala...@redhat.com wrote: Doh my mistake, I thought it was merged. I was just running with the upstream 3.7 daily. Can I use this run as my baseline and then I can run next time on the patch to show the % improvement? I'll wipe everything and try on the patch, any idea when it will be merged? Yes, it would be very useful to have this run as the baseline. The patch has just been merged in master. It should be backported to 3.7 in a day or so. Regards, Nithya On Wed, Apr 22, 2015 at 1:10 AM, Nithya Balachandran nbala...@redhat.com wrote: That sounds great. Thanks. Regards, Nithya - Original Message - From: Benjamin Turner bennytu...@gmail.com To: Nithya Balachandran nbala...@redhat.com Cc: Susant Palai spa...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Wednesday, 22 April, 2015 12:14:14 AM Subject: Re: [Gluster-devel] Rebalance improvement design I am setting up a test env now, I'll have some feedback for you this week. -b On Tue, Apr 21, 2015 at 11:36 AM, Nithya Balachandran nbala...@redhat.com wrote: Hi Ben, Did you get a chance to try this out? Regards, Nithya - Original Message - From: Susant Palai spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Monday, April 13, 2015 9:55:07 AM Subject: Re: [Gluster-devel] Rebalance improvement design Hi Ben, Uploaded a new patch here: http://review.gluster.org/#/c/9657/. We can start perf test on it. :) Susant - Original Message - From: Susant Palai spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Thursday, 9 April, 2015 3:40:09 PM Subject: Re: [Gluster-devel] Rebalance improvement design Thanks Ben. RPM is not available and I am planning to refresh the patch in two days with some more regression fixes. I think we can run the tests post that. Any larger data-set will be good(say 3 to 5 TB). Thanks, Susant - Original Message - From: Benjamin Turner bennytu...@gmail.com To: Vijay Bellur vbel...@redhat.com Cc: Susant Palai spa...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Thursday, 9 April, 2015 2:10:30 AM Subject: Re: [Gluster-devel] Rebalance improvement design I have some rebalance perf regression stuff I have been working on, is there an RPM with these patches anywhere so that I can try it on my systems? If not I'll just build from: git fetch git:// review.gluster.org/glusterfs refs/changes/57/9657/8 git cherry-pick FETCH_HEAD I will have _at_least_ 10TB of storage, how many TBs of data should I run with? -b On Tue, Apr 7, 2015 at 9:07 AM, Vijay Bellur vbel...@redhat.com wrote: On 04/07/2015 03:08 PM, Susant Palai wrote: Here is one test performed on a 300GB data set and around
Re: [Gluster-devel] Rebalance improvement design
I am setting up a test env now, I'll have some feedback for you this week. -b On Tue, Apr 21, 2015 at 11:36 AM, Nithya Balachandran nbala...@redhat.com wrote: Hi Ben, Did you get a chance to try this out? Regards, Nithya - Original Message - From: Susant Palai spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Monday, April 13, 2015 9:55:07 AM Subject: Re: [Gluster-devel] Rebalance improvement design Hi Ben, Uploaded a new patch here: http://review.gluster.org/#/c/9657/. We can start perf test on it. :) Susant - Original Message - From: Susant Palai spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Thursday, 9 April, 2015 3:40:09 PM Subject: Re: [Gluster-devel] Rebalance improvement design Thanks Ben. RPM is not available and I am planning to refresh the patch in two days with some more regression fixes. I think we can run the tests post that. Any larger data-set will be good(say 3 to 5 TB). Thanks, Susant - Original Message - From: Benjamin Turner bennytu...@gmail.com To: Vijay Bellur vbel...@redhat.com Cc: Susant Palai spa...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Thursday, 9 April, 2015 2:10:30 AM Subject: Re: [Gluster-devel] Rebalance improvement design I have some rebalance perf regression stuff I have been working on, is there an RPM with these patches anywhere so that I can try it on my systems? If not I'll just build from: git fetch git:// review.gluster.org/glusterfs refs/changes/57/9657/8 git cherry-pick FETCH_HEAD I will have _at_least_ 10TB of storage, how many TBs of data should I run with? -b On Tue, Apr 7, 2015 at 9:07 AM, Vijay Bellur vbel...@redhat.com wrote: On 04/07/2015 03:08 PM, Susant Palai wrote: Here is one test performed on a 300GB data set and around 100%(1/2 the time) improvement was seen. [root@gprfs031 ~]# gluster v i Volume Name: rbperf Type: Distribute Volume ID: 35562662-337e-4923-b862- d0bbb0748003 Status: Started Number of Bricks: 4 Transport-type: tcp Bricks: Brick1: gprfs029-10ge:/bricks/ gprfs029/brick1 Brick2: gprfs030-10ge:/bricks/ gprfs030/brick1 Brick3: gprfs031-10ge:/bricks/ gprfs031/brick1 Brick4: gprfs032-10ge:/bricks/ gprfs032/brick1 Added server 32 and started rebalance force. Rebalance stat for new changes: [root@gprfs031 ~]# gluster v rebalance rbperf status Node Rebalanced-files size scanned failures skipped status run time in secs - --- --- --- --- --- -- localhost 74639 36.1GB 297319 0 0 completed 1743.00 172.17.40.30 67512 33.5GB 269187 0 0 completed 1395.00 gprfs029-10ge 79095 38.8GB 284105 0 0 completed 1559.00 gprfs032-10ge 0 0Bytes 0 0 0 completed 402.00 volume rebalance: rbperf: success: Rebalance stat for old model: [root@gprfs031 ~]# gluster v rebalance rbperf status Node Rebalanced-files size scanned failures skipped status run time in secs - --- --- --- --- --- -- localhost 86493 42.0GB 634302 0 0 completed 3329.00 gprfs029-10ge 94115 46.2GB 687852 0 0 completed 3328.00 gprfs030-10ge 74314 35.9GB 651943 0 0 completed 3072.00 gprfs032-10ge 0 0Bytes 594166 0 0 completed 1943.00 volume rebalance: rbperf: success: This is interesting. Thanks for sharing well done! Maybe we should attempt a much larger data set and see how we fare there :). 
Regards, Vijay
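(Editorial aside: since these runs take many hours, a cheap way to capture progress for the perf-regression write-ups, purely a suggestion, is to sample the status output periodically:)

while true; do
        date
        gluster volume rebalance testvol status
        sleep 300
done >> /var/tmp/testvol-rebalance-progress.log 2>&1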
Re: [Gluster-devel] Rebalance improvement design
That sounds great. Thanks. Regards, Nithya - Original Message - From: Benjamin Turner bennytu...@gmail.com To: Nithya Balachandran nbala...@redhat.com Cc: Susant Palai spa...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Wednesday, 22 April, 2015 12:14:14 AM Subject: Re: [Gluster-devel] Rebalance improvement design I am setting up a test env now, I'll have some feedback for you this week. -b On Tue, Apr 21, 2015 at 11:36 AM, Nithya Balachandran nbala...@redhat.com wrote: Hi Ben, Did you get a chance to try this out? Regards, Nithya - Original Message - From: Susant Palai spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Monday, April 13, 2015 9:55:07 AM Subject: Re: [Gluster-devel] Rebalance improvement design Hi Ben, Uploaded a new patch here: http://review.gluster.org/#/c/9657/. We can start perf test on it. :) Susant - Original Message - From: Susant Palai spa...@redhat.com To: Benjamin Turner bennytu...@gmail.com Cc: Gluster Devel gluster-devel@gluster.org Sent: Thursday, 9 April, 2015 3:40:09 PM Subject: Re: [Gluster-devel] Rebalance improvement design Thanks Ben. RPM is not available and I am planning to refresh the patch in two days with some more regression fixes. I think we can run the tests post that. Any larger data-set will be good(say 3 to 5 TB). Thanks, Susant - Original Message - From: Benjamin Turner bennytu...@gmail.com To: Vijay Bellur vbel...@redhat.com Cc: Susant Palai spa...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Thursday, 9 April, 2015 2:10:30 AM Subject: Re: [Gluster-devel] Rebalance improvement design I have some rebalance perf regression stuff I have been working on, is there an RPM with these patches anywhere so that I can try it on my systems? If not I'll just build from: git fetch git:// review.gluster.org/glusterfs refs/changes/57/9657/8 git cherry-pick FETCH_HEAD I will have _at_least_ 10TB of storage, how many TBs of data should I run with? -b On Tue, Apr 7, 2015 at 9:07 AM, Vijay Bellur vbel...@redhat.com wrote: On 04/07/2015 03:08 PM, Susant Palai wrote: Here is one test performed on a 300GB data set and around 100%(1/2 the time) improvement was seen. [root@gprfs031 ~]# gluster v i Volume Name: rbperf Type: Distribute Volume ID: 35562662-337e-4923-b862- d0bbb0748003 Status: Started Number of Bricks: 4 Transport-type: tcp Bricks: Brick1: gprfs029-10ge:/bricks/ gprfs029/brick1 Brick2: gprfs030-10ge:/bricks/ gprfs030/brick1 Brick3: gprfs031-10ge:/bricks/ gprfs031/brick1 Brick4: gprfs032-10ge:/bricks/ gprfs032/brick1 Added server 32 and started rebalance force. Rebalance stat for new changes: [root@gprfs031 ~]# gluster v rebalance rbperf status Node Rebalanced-files size scanned failures skipped status run time in secs - --- --- --- --- --- -- localhost 74639 36.1GB 297319 0 0 completed 1743.00 172.17.40.30 67512 33.5GB 269187 0 0 completed 1395.00 gprfs029-10ge 79095 38.8GB 284105 0 0 completed 1559.00 gprfs032-10ge 0 0Bytes 0 0 0 completed 402.00 volume rebalance: rbperf: success: Rebalance stat for old model: [root@gprfs031 ~]# gluster v rebalance rbperf status Node Rebalanced-files size scanned failures skipped status run time in secs - --- --- --- --- --- -- localhost 86493 42.0GB 634302 0 0 completed 3329.00 gprfs029-10ge 94115 46.2GB 687852 0 0 completed 3328.00 gprfs030-10ge 74314 35.9GB 651943 0 0 completed 3072.00 gprfs032-10ge 0 0Bytes 594166 0 0 completed 1943.00 volume rebalance: rbperf: success: This is interesting. Thanks for sharing well done! 
Maybe we should attempt a much larger data set and see how we fare there :). Regards, Vijay
Re: [Gluster-devel] Rebalance improvement design
Thanks Ben. RPM is not available and I am planning to refresh the patch in two days with some more regression fixes. I think we can run the tests post that. Any larger data-set will be good(say 3 to 5 TB). Thanks, Susant - Original Message - From: Benjamin Turner bennytu...@gmail.com To: Vijay Bellur vbel...@redhat.com Cc: Susant Palai spa...@redhat.com, Gluster Devel gluster-devel@gluster.org Sent: Thursday, 9 April, 2015 2:10:30 AM Subject: Re: [Gluster-devel] Rebalance improvement design I have some rebalance perf regression stuff I have been working on, is there an RPM with these patches anywhere so that I can try it on my systems? If not I'll just build from: git fetch git:// review.gluster.org/glusterfs refs/changes/57/9657/8 git cherry-pick FETCH_HEAD I will have _at_least_ 10TB of storage, how many TBs of data should I run with? -b On Tue, Apr 7, 2015 at 9:07 AM, Vijay Bellur vbel...@redhat.com wrote: On 04/07/2015 03:08 PM, Susant Palai wrote: Here is one test performed on a 300GB data set and around 100%(1/2 the time) improvement was seen. [root@gprfs031 ~]# gluster v i Volume Name: rbperf Type: Distribute Volume ID: 35562662-337e-4923-b862- d0bbb0748003 Status: Started Number of Bricks: 4 Transport-type: tcp Bricks: Brick1: gprfs029-10ge:/bricks/ gprfs029/brick1 Brick2: gprfs030-10ge:/bricks/ gprfs030/brick1 Brick3: gprfs031-10ge:/bricks/ gprfs031/brick1 Brick4: gprfs032-10ge:/bricks/ gprfs032/brick1 Added server 32 and started rebalance force. Rebalance stat for new changes: [root@gprfs031 ~]# gluster v rebalance rbperf status Node Rebalanced-files size scanned failures skipped status run time in secs - --- --- --- --- --- -- localhost 74639 36.1GB 297319 0 0 completed 1743.00 172.17.40.30 67512 33.5GB 269187 0 0 completed 1395.00 gprfs029-10ge 79095 38.8GB 284105 0 0 completed 1559.00 gprfs032-10ge 0 0Bytes 0 0 0 completed 402.00 volume rebalance: rbperf: success: Rebalance stat for old model: [root@gprfs031 ~]# gluster v rebalance rbperf status Node Rebalanced-files size scanned failures skipped status run time in secs - --- --- --- --- --- -- localhost 86493 42.0GB 634302 0 0 completed 3329.00 gprfs029-10ge 94115 46.2GB 687852 0 0 completed 3328.00 gprfs030-10ge 74314 35.9GB 651943 0 0 completed 3072.00 gprfs032-10ge 0 0Bytes 594166 0 0 completed 1943.00 volume rebalance: rbperf: success: This is interesting. Thanks for sharing well done! Maybe we should attempt a much larger data set and see how we fare there :). Regards, Vijay __ _ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/ mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
Re: [Gluster-devel] Rebalance improvement design
I have some rebalance perf regression stuff I have been working on, is there an RPM with these patches anywhere so that I can try it on my systems? If not I'll just build from: git fetch git://review.gluster.org/glusterfs refs/changes/57/9657/8 git cherry-pick FETCH_HEAD I will have _at_least_ 10TB of storage, how many TBs of data should I run with? -b On Tue, Apr 7, 2015 at 9:07 AM, Vijay Bellur vbel...@redhat.com wrote: On 04/07/2015 03:08 PM, Susant Palai wrote: Here is one test performed on a 300GB data set and around 100%(1/2 the time) improvement was seen. [root@gprfs031 ~]# gluster v i Volume Name: rbperf Type: Distribute Volume ID: 35562662-337e-4923-b862-d0bbb0748003 Status: Started Number of Bricks: 4 Transport-type: tcp Bricks: Brick1: gprfs029-10ge:/bricks/gprfs029/brick1 Brick2: gprfs030-10ge:/bricks/gprfs030/brick1 Brick3: gprfs031-10ge:/bricks/gprfs031/brick1 Brick4: gprfs032-10ge:/bricks/gprfs032/brick1 Added server 32 and started rebalance force. Rebalance stat for new changes: [root@gprfs031 ~]# gluster v rebalance rbperf status Node Rebalanced-files size scanned failures skipped status run time in secs - --- --- --- --- --- -- localhost7463936.1GB 297319 0 0completed 1743.00 172.17.40.306751233.5GB 269187 0 0completed 1395.00 gprfs029-10ge7909538.8GB 284105 0 0completed 1559.00 gprfs032-10ge00Bytes 0 0 0completed 402.00 volume rebalance: rbperf: success: Rebalance stat for old model: [root@gprfs031 ~]# gluster v rebalance rbperf status Node Rebalanced-files size scanned failures skipped status run time in secs - --- --- --- --- --- -- localhost8649342.0GB 634302 0 0completed 3329.00 gprfs029-10ge9411546.2GB 687852 0 0completed 3328.00 gprfs030-10ge7431435.9GB 651943 0 0completed 3072.00 gprfs032-10ge00Bytes 594166 0 0completed 1943.00 volume rebalance: rbperf: success: This is interesting. Thanks for sharing well done! Maybe we should attempt a much larger data set and see how we fare there :). Regards, Vijay ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel ___ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
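(Editorial aside: with no RPM available at the time, the cherry-pick above expands to roughly the standard GlusterFS source build; the exact configure flags are just an example and can be adjusted:)

git clone git://review.gluster.org/glusterfs
cd glusterfs
git fetch git://review.gluster.org/glusterfs refs/changes/57/9657/8
git cherry-pick FETCH_HEAD
./autogen.sh
./configure
make -j
make install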
Re: [Gluster-devel] Rebalance improvement design
On 04/07/2015 03:08 PM, Susant Palai wrote: Here is one test performed on a 300GB data set, and around 100% (1/2 the time) improvement was seen.

[root@gprfs031 ~]# gluster v i
Volume Name: rbperf
Type: Distribute
Volume ID: 35562662-337e-4923-b862-d0bbb0748003
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: gprfs029-10ge:/bricks/gprfs029/brick1
Brick2: gprfs030-10ge:/bricks/gprfs030/brick1
Brick3: gprfs031-10ge:/bricks/gprfs031/brick1
Brick4: gprfs032-10ge:/bricks/gprfs032/brick1

Added server 32 and started rebalance force.

Rebalance stat for new changes:
[root@gprfs031 ~]# gluster v rebalance rbperf status
Node             Rebalanced-files    size    scanned  failures  skipped  status     run time in secs
localhost                   74639  36.1GB     297319         0        0  completed           1743.00
172.17.40.30                67512  33.5GB     269187         0        0  completed           1395.00
gprfs029-10ge               79095  38.8GB     284105         0        0  completed           1559.00
gprfs032-10ge                   0  0Bytes          0         0        0  completed            402.00
volume rebalance: rbperf: success:

Rebalance stat for old model:
[root@gprfs031 ~]# gluster v rebalance rbperf status
Node             Rebalanced-files    size    scanned  failures  skipped  status     run time in secs
localhost                   86493  42.0GB     634302         0        0  completed           3329.00
gprfs029-10ge               94115  46.2GB     687852         0        0  completed           3328.00
gprfs030-10ge               74314  35.9GB     651943         0        0  completed           3072.00
gprfs032-10ge                   0  0Bytes     594166         0        0  completed           1943.00
volume rebalance: rbperf: success:

This is interesting. Thanks for sharing, well done! Maybe we should attempt a much larger data set and see how we fare there :).

Regards, Vijay