Re: [PERFORM] Proposal of tunable fix for scalability of 8.4
decibel wrote:
> On Mar 13, 2009, at 3:02 PM, Jignesh K. Shah wrote:
>> vmstat seems similar to wakeup some
>>
>>  kthr       memory               page            disk            faults          cpu
>>  r b w     swap     free re mf pi po fr de sr s0 s1 s2 sd     in     sy     cs us sy id
>> 63 0 0 45535728 38689856  0 14  0  0  0  0  0  0  0  0  0 163318 334225 360179 47 17 36
>> 85 0 0 45436736 38690760  0  6  0  0  0  0  0  0  0  0  0 165536 347462 365987 47 17 36
>> 59 0 0 45405184 38681752  0 11  0  0  0  0  0  0  0  0  0 155153 326182 345527 47 16 37
>> 53 0 0 45393816 38673344  0  6  0  0  0  0  0  0  0  0  0 152752 317851 340737 47 16 37
>> 66 0 0 45378312 38651920  0 11  0  0  0  0  0  0  0  0  0 150979 304350 336915 47 16 38
>> 67 0 0 45489520 38639664  0  5  0  0  0  0  0  0  0  0  0 157188 318958 351905 47 16 37
>> 82 0 0 45483600 38633344  0 10  0  0  0  0  0  0  0  0  0 168797 348619 375827 47 17 36
>> 68 0 0 45463008 38614432  0  9  0  0  0  0  0  0  0  0  0 173020 376594 385370 47 18 35
>> 54 0 0 45451376 38603792  0 13  0  0  0  0  0  0  0  0  0 161891 342522 364286 48 17 35
>> 41 0 0 45356544 38605976  0  5  0  0  0  0  0  0  0  0  0 167250 358320 372469 47 17 36
>> 27 0 0 45323472 38596952  0 11  0  0  0  0  0  0  0  0  0 165099 344695 364256 48 17 35
>
> The good news is there's now at least enough runnable procs. What I find *extremely* odd is the CPU usage is almost dead constant...

Generally when there is dead constant.. signs of classic bottleneck ;-) We will be fixing one to get to another.. but knocking bottlenecks is the name of the game I think

-Jignesh

--
Jignesh Shah http://blogs.sun.com/jkshah
The New Sun Microsystems,Inc http://sun.com/postgresql

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Proposal of tunable fix for scalability of 8.4
decibel wrote:
> On Mar 11, 2009, at 10:48 PM, Jignesh K. Shah wrote:
>> Fair enough.. Well, I am now appealing to all who have fairly decent sized hardware and want to try it out, to see whether there are "gains", "no-changes" or "regressions" based on your workload. Also it will help if you report the number of CPUs when you respond back, to help collect feedback.
>
> Do you have a self-contained test case? I have several boxes with 16-cores worth of Xeon with 96GB I could try it on (though you might not care about having "only" 16 cores :P)

I don't have authority over iGen, but I am pretty sure that with sysbench we should be able to recreate the test case, or even with dbt-2.

That said, the patch should be pretty easy to apply to your own workloads (where more feedback is more appreciated).. On x64, 16 cores might bring out the problem faster too, since typically they are 2.5X higher clock frequency.. Try it out.. stock build vs patched builds.

-Jignesh

--
Jignesh Shah http://blogs.sun.com/jkshah
The New Sun Microsystems,Inc http://sun.com/postgresql
Re: [PERFORM] Proposal of tunable fix for scalability of 8.4
Simon Riggs wrote:
> On Wed, 2009-03-11 at 16:53 -0400, Jignesh K. Shah wrote:
>> 1200: 2000: Medium Throughput: -1781969.000 Avg Medium Resp: 0.019
>
> I think you need to iron out bugs in your test script before we put too much stock into the results generated. Your throughput should not be negative.
>
> I'd be interested in knowing the number of S and X locks requested, so we can think about this from first principles. My understanding is that the ratio of S:X is about 10:1. Do you have more exact numbers?

Simon, that's a known bug in the test: the first time it reaches the max number of users, it throws a negative number. But all other numbers are pretty much accurate.

Generally the users:transactions count depends on think time..

-Jignesh

--
Jignesh Shah http://blogs.sun.com/jkshah
The New Sun Microsystems,Inc http://sun.com/postgresql
Re: [PERFORM] Proposal of tunable fix for scalability of 8.4
Top posting because my email client will mess up the inline:

Re: advance insert pointer.

I have no idea how complicated that advance part is as you allude to. But can this be done without a lock at all? An atomic compare and exchange (or compare and set, etc) should do it. Although boundaries in buffers could make it a bit more complicated than that. Sounds potentially lockless to me.

CompareAndSet-like atomics would prevent context switches entirely and generally work fabulous if the item that needs locking is itself an atomic value like a pointer or int. This is similar to, but lighter weight than, a spin lock.

From: Tom Lane [...@sss.pgh.pa.us]
Sent: Saturday, March 14, 2009 9:09 AM
To: Heikki Linnakangas
Cc: Robert Haas; Scott Carey; Greg Smith; Jignesh K. Shah; Kevin Grittner; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Proposal of tunable fix for scalability of 8.4

Yeah, that's been seen to be an issue before. I had the germ of an idea about how to fix that:

    ... with no lock, determine size of WAL record ...
    obtain WALInsertLock
    identify WAL start address of my record, advance insert pointer
        past record end
    *release* WALInsertLock
    without lock, copy record into the space just reserved

The idea here is to allow parallelization of the copying of data into the buffers. The hold time on WALInsertLock would be very short. Maybe it could even become a spinlock, though I'm not sure, because the "advance insert pointer" bit is more complicated than it looks (you have to allow for the extra overhead when crossing a WAL page boundary).
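
For readers following the thread, here is a minimal sketch in C11 of how the two ideas above could fit together: reserve the record's space by advancing the insert pointer with a compare-and-exchange loop, then copy the record with no lock held. This is not PostgreSQL code. The names wal_buf, insert_pos, advance, reserve and copy_record, and the PAGE_SIZE and PAGE_HEADER values, are all hypothetical and chosen only for illustration. The sketch also assumes a buffer that never wraps around, and it ignores how a writer would learn that earlier copies have finished before flushing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE   8192u               /* hypothetical WAL page size */
#define PAGE_HEADER 24u                 /* hypothetical per-page header overhead */
#define BUF_SIZE    (4u * PAGE_SIZE)    /* sketch only: no wraparound handling */

static unsigned char wal_buf[BUF_SIZE];
/* The insert pointer always rests inside a page's payload area. */
static _Atomic uint64_t insert_pos = PAGE_HEADER;

/* Return the position just past a record of len payload bytes that starts
 * at start, skipping a page header every time a page boundary is crossed. */
static uint64_t advance(uint64_t start, uint64_t len)
{
    uint64_t pos = start;
    while (len > 0) {
        uint64_t room = PAGE_SIZE - (pos % PAGE_SIZE);
        if (len < room) {
            pos += len;
            len = 0;
        } else {
            len -= room;
            pos += room + PAGE_HEADER;
        }
    }
    return pos;
}

/* Reserve space for a record without taking a lock. The new pointer value
 * depends on page boundaries, so a plain fetch-and-add is not enough; the
 * loop recomputes it whenever another inserter got there first. */
static uint64_t reserve(uint64_t len)
{
    uint64_t start = atomic_load(&insert_pos);
    uint64_t end;
    do {
        end = advance(start, len);
        assert(end <= BUF_SIZE);        /* sketch only: buffer must not overflow */
    } while (!atomic_compare_exchange_weak(&insert_pos, &start, end));
    return start;                       /* on failure, start was reloaded for us */
}

/* Copy the record into the space reserved at start; no lock is held. */
static void copy_record(uint64_t start, const void *data, uint64_t len)
{
    const unsigned char *src = data;
    uint64_t pos = start;
    while (len > 0) {
        uint64_t room = PAGE_SIZE - (pos % PAGE_SIZE);
        uint64_t n = len < room ? len : room;
        memcpy(wal_buf + pos, src, n);
        src += n;
        len -= n;
        pos += n;
        if (pos % PAGE_SIZE == 0)
            pos += PAGE_HEADER;         /* skip the next page's header */
    }
}
```

A caller would do `uint64_t at = reserve(len); copy_record(at, rec, len);`. Whether this beats a very short hold on WALInsertLock is exactly the open question in the thread; the sketch only shows that the page-boundary arithmetic can live inside the retry loop.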