Re: [sheepdog] [PATCH 0/3] work: implement a dynamically changing thread pool
At Thu, 25 Oct 2012 13:55:20 +0200,
Bastian Scholz wrote:
> Master branch from today...
>
> # collie vdi create test 1G
> # ./vditest test -w -B 4096 -a
>
> First try: some sheep daemons died and the VDI had failed objects.
> After shutting down the remaining sheep, restarting all of them, and
> waiting for recovery to finish, the test VDI was gone... (no logs
> available, sorry)
>
> Second try: the gateway process hangs and writes a lot of this to
> the logfile...
>
> Oct 25 13:47:01 [main] listen_handler(839) failed to
> accept a new connection: Too many open files

The current Sheepdog code has a bug in its handling of too many
concurrent requests. I'll fix it soon.

> Without the -a parameter it works...
>
> First try:
> # ./vditest test -w -B 4096
> options: -B 4096:4096 -c writethrough -D 0:100 -o 0
> -p linear -s 1351165854 -S 0:1073741824 -T 10 -f -1
> Total write throughput: 256409.6B/s (250.4K/s), IOPS 62.6/s.
>
> After the 10th run of the same command:
> # ./vditest test -w -B 4096
> options: -B 4096:4096 -c writethrough -D 0:100 -o 0
> -p linear -s 1351166058 -S 0:1073741824 -T 10 -f -1
> Total write throughput: 1862041.6B/s (1.8M/s), IOPS 454.6/s.
>
> And the throughput and IOPS are still increasing with each
> additional test...
>
> Any additional hints on how to continue with testing?

Writing data to an unallocated area causes a metadata update, which is
why the early runs are slower. You can get a more stable result if you
preallocate the data when creating the VDI:

  $ collie vdi create test 100M -P

Thanks,

Kazutaka
--
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog
Re: [sheepdog] [PATCH 0/3] work: implement a dynamically changing thread pool
OKAY... call me sheep-ripper...

On 2012-10-25 10:08, MORITA Kazutaka wrote:
> So far, I've not encountered situations where my patch shows worse
> performance. In most cases, queue_work is called only from one
> thread, so serializing at queue_work is unlikely to be a problem.
> The other contention is between queue_work and worker_routine, but
> worker threads spend most of their time doing disk I/O, so that
> shouldn't be a problem either, at least in my environment.

Master branch from today...

# collie vdi create test 1G
# ./vditest test -w -B 4096 -a

First try: some sheep daemons died and the VDI had failed objects.
After shutting down the remaining sheep, restarting all of them, and
waiting for recovery to finish, the test VDI was gone... (no logs
available, sorry)

Second try: the gateway process hangs and writes a lot of this to the
logfile...

Oct 25 13:47:01 [main] listen_handler(839) failed to
accept a new connection: Too many open files

Without the -a parameter it works...

First try:
# ./vditest test -w -B 4096
options: -B 4096:4096 -c writethrough -D 0:100 -o 0
-p linear -s 1351165854 -S 0:1073741824 -T 10 -f -1
Total write throughput: 256409.6B/s (250.4K/s), IOPS 62.6/s.

After the 10th run of the same command:
# ./vditest test -w -B 4096
options: -B 4096:4096 -c writethrough -D 0:100 -o 0
-p linear -s 1351166058 -S 0:1073741824 -T 10 -f -1
Total write throughput: 1862041.6B/s (1.8M/s), IOPS 454.6/s.

And the throughput and IOPS are still increasing with each additional
test...

Any additional hints on how to continue with testing?

# collie node list
M   Id   Host:Port        V-Nodes       Zone
-    0   10.0.1.61:7000         0 1023475722
-    1   10.0.1.62:7000         0 1040252938
-    2   10.0.1.62:7001        64 1040252938
-    3   10.0.1.62:7002        64 1040252938
-    4   10.0.1.62:7003        64 1040252938
-    5   10.0.1.63:7000         0 1057030154
-    6   10.0.1.63:7001        64 1057030154
-    7   10.0.1.63:7002        64 1057030154
-    8   10.0.1.63:7003        64 1057030154

Thanks

Bastian
Re: [sheepdog] [PATCH 0/3] work: implement a dynamically changing thread pool
At Thu, 25 Oct 2012 09:14:32 +0200,
Bastian Scholz wrote:
>
> On 2012-10-22 08:43, MORITA Kazutaka wrote:
> > Yes, we need more numbers under various conditions before changing
> > the design. (I do like this patch's implementation, which uses the
> > same code as the ordered work queue, though.)
> >
> > I'm thinking of trying it myself, but I wish more users would test
> > it too.
>
> Hi Kazutaka,
>
> If I want to test it, can I use the thread branch from git?

Yes, thanks!

So far, I've not encountered situations where my patch shows worse
performance. In most cases, queue_work is called only from one
thread, so serializing at queue_work is unlikely to be a problem.
The other contention is between queue_work and worker_routine, but
worker threads spend most of their time doing disk I/O, so that
shouldn't be a problem either, at least in my environment.

Thanks,

Kazutaka
Re: [sheepdog] [PATCH 0/3] work: implement a dynamically changing thread pool
On 2012-10-22 08:43, MORITA Kazutaka wrote:
> Yes, we need more numbers under various conditions before changing
> the design. (I do like this patch's implementation, which uses the
> same code as the ordered work queue, though.)
>
> I'm thinking of trying it myself, but I wish more users would test
> it too.

Hi Kazutaka,

If I want to test it, can I use the thread branch from git?

Cheers

Bastian
Re: [sheepdog] [PATCH 0/3] work: implement a dynamically changing thread pool
At Mon, 22 Oct 2012 16:46:35 +0900,
MORITA Kazutaka wrote:
>
> At Mon, 22 Oct 2012 15:35:46 +0800,
> Liu Yuan wrote:
> >
> > On 10/22/2012 02:50 PM, Liu Yuan wrote:
> > > On 10/22/2012 02:43 PM, MORITA Kazutaka wrote:
> > > > I'm thinking of trying it myself, but I wish more users would
> > > > test it too.
> > >
> > > I have tested it on my laptop and got a similar result.
> > >
> > > My only concern is that if the pthread signal & wakeup path uses
> > > a single wakeup queue instead of multiple queues, the wakeup
> > > itself could become a big bottleneck. I'll try to figure out
> > > what the pthread signaling mechanism uses. pthread_create, by
> > > contrast, uses the clone() syscall, which scales well on SMP
> > > machines.
> >
> > pthread signaling uses futexes, so it might not be a scaling
> > problem. But the new queue_work() is serialized by a mutex
> > (cond_mutex), so your patch set probably won't perform as well
> > under heavy load from multiple I/O sources as it does with a
> > single I/O source, and multiple sources are the normal use case,
> > aren't they?
>
> Processing parallel I/O is indeed more common. I'll run the benchmark
> and improve this patch if it doesn't perform well.

I tested multiple I/Os on a single machine, and my patch still showed
better performance. I ran the following command, which issues 500 I/O
requests at the same time:

  $ ./script/vditest test -w -B 4096 -a

The current implementation achieves about 3600 IOPS, and my patch
about 3900 IOPS. I'll try more tests, but my patch looks better for
now.

Thanks,

Kazutaka
Re: [sheepdog] [PATCH 0/3] work: implement a dynamically changing thread pool
At Mon, 22 Oct 2012 15:35:46 +0800,
Liu Yuan wrote:
>
> On 10/22/2012 02:50 PM, Liu Yuan wrote:
> > On 10/22/2012 02:43 PM, MORITA Kazutaka wrote:
> > > I'm thinking of trying it myself, but I wish more users would
> > > test it too.
> >
> > I have tested it on my laptop and got a similar result.
> >
> > My only concern is that if the pthread signal & wakeup path uses a
> > single wakeup queue instead of multiple queues, the wakeup itself
> > could become a big bottleneck. I'll try to figure out what the
> > pthread signaling mechanism uses. pthread_create, by contrast,
> > uses the clone() syscall, which scales well on SMP machines.
>
> pthread signaling uses futexes, so it might not be a scaling problem.
> But the new queue_work() is serialized by a mutex (cond_mutex), so
> your patch set probably won't perform as well under heavy load from
> multiple I/O sources as it does with a single I/O source, and
> multiple sources are the normal use case, aren't they?

Processing parallel I/O is indeed more common. I'll run the benchmark
and improve this patch if it doesn't perform well.

Thanks,

Kazutaka
Re: [sheepdog] [PATCH 0/3] work: implement a dynamically changing thread pool
On 10/22/2012 02:50 PM, Liu Yuan wrote:
> On 10/22/2012 02:43 PM, MORITA Kazutaka wrote:
> > I'm thinking of trying it myself, but I wish more users would test
> > it too.
>
> I have tested it on my laptop and got a similar result.
>
> My only concern is that if the pthread signal & wakeup path uses a
> single wakeup queue instead of multiple queues, the wakeup itself
> could become a big bottleneck. I'll try to figure out what the
> pthread signaling mechanism uses. pthread_create, by contrast, uses
> the clone() syscall, which scales well on SMP machines.

pthread signaling uses futexes, so it might not be a scaling problem.
But the new queue_work() is serialized by a mutex (cond_mutex), so
your patch set probably won't perform as well under heavy load from
multiple I/O sources as it does with a single I/O source, and multiple
sources are the normal use case, aren't they?

Thanks,

Yuan
Re: [sheepdog] [PATCH 0/3] work: implement a dynamically changing thread pool
On 10/22/2012 02:43 PM, MORITA Kazutaka wrote:
> I'm thinking of trying it myself, but I wish more users would test
> it too.

I have tested it on my laptop and got a similar result.

My only concern is that if the pthread signal & wakeup path uses a
single wakeup queue instead of multiple queues, the wakeup itself
could become a big bottleneck. I'll try to figure out what the pthread
signaling mechanism uses. pthread_create, by contrast, uses the
clone() syscall, which scales well on SMP machines.

Thanks,

Yuan
Re: [sheepdog] [PATCH 0/3] work: implement a dynamically changing thread pool
At Mon, 22 Oct 2012 14:30:51 +0800,
Liu Yuan wrote:
>
> On 10/22/2012 12:31 PM, MORITA Kazutaka wrote:
> > Currently, sheep calls pthread_create for every I/O request, but
> > the overhead is not so cheap. On my environment, sheep takes 320
> > microseconds to process a 4 KB write, and I found that 30-40 of
> > those microseconds are spent in pthread_create.
> >
> > This series removes the short-lived thread and implements a
> > dynamic worker thread pool based on the previous work queue
> > implementation. With this series, 4 KB write performance increased
> > from 3100 IOPS to 3350 IOPS on my environment.
>
> It is indeed a booster, but I guess only under some conditions (I
> guess you measured it on a single local node to minimize the network

Yes.

> overhead). With the network added to the path, the boost might be
> smaller.
>
> Did you also measure how much time pthread_cond_signal() & wakeup
> take? I suspect the signal & wakeup mechanism will deteriorate under
> heavy workloads, so this boost might be neutralized. But I am not
> sure which will perform better under heavy workloads, the clone()
> syscall or pthread signaling (I suspect pthread signaling also uses
> system calls to signal & wake up, and if there is a single internal
> wakeup queue, that would be a bottleneck).

Yes, we need more numbers under various conditions before changing the
design. (I do like this patch's implementation, which uses the same
code as the ordered work queue, though.)

I'm thinking of trying it myself, but I wish more users would test it
too.

Thanks,

Kazutaka
Re: [sheepdog] [PATCH 0/3] work: implement a dynamically changing thread pool
On 10/22/2012 12:31 PM, MORITA Kazutaka wrote:
> Currently, sheep calls pthread_create for every I/O request, but the
> overhead is not so cheap. On my environment, sheep takes 320
> microseconds to process a 4 KB write, and I found that 30-40 of
> those microseconds are spent in pthread_create.
>
> This series removes the short-lived thread and implements a dynamic
> worker thread pool based on the previous work queue implementation.
> With this series, 4 KB write performance increased from 3100 IOPS to
> 3350 IOPS on my environment.

It is indeed a booster, but I guess only under some conditions (I
guess you measured it on a single local node to minimize the network
overhead). With the network added to the path, the boost might be
smaller.

Did you also measure how much time pthread_cond_signal() & wakeup
take? I suspect the signal & wakeup mechanism will deteriorate under
heavy workloads, so this boost might be neutralized. But I am not sure
which will perform better under heavy workloads, the clone() syscall
or pthread signaling (I suspect pthread signaling also uses system
calls to signal & wake up, and if there is a single internal wakeup
queue, that would be a bottleneck).

Thanks,

Yuan