Re: [PERFORM] User concurrency thresholding: where do I look?
Yep, quite a few transactions.. But the piece that's slow is where it is clearing things up in CommitTransaction(). I am not sure how ProcArrayLock is designed to work, and hence not clear whether what we are seeing is what we should expect. Do we have some design doc on ProcArrayLock to understand its purpose?

Thanks.

Regards,
Jignesh

Josh Berkus wrote:
> Simon,
>
>> Well that's pretty weird. That code path clearly only happens once per
>> transaction and ought to be fast. The other code paths that take
>> ProcArrayLock like TransactionIdIsInProgress() and GetSnapshotData()
>> ought to spend more time holding the lock. Presumably you are running
>> with a fair number of SERIALIZABLE transactions?
>
> Given that this is TPCC-analog, I'd assume that we are. Jignesh?

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] User concurrency thresholding: where do I look?
Simon,

> Well that's pretty weird. That code path clearly only happens once per
> transaction and ought to be fast. The other code paths that take
> ProcArrayLock like TransactionIdIsInProgress() and GetSnapshotData()
> ought to spend more time holding the lock. Presumably you are running
> with a fair number of SERIALIZABLE transactions?

Given that this is TPCC-analog, I'd assume that we are. Jignesh?

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco
Re: [PERFORM] User concurrency thresholding: where do I look?
With CLOG 16 the drop comes at about 1150 users, with the following lock stats:

bash-3.00# ./4_lwlock_waits.d 16404

              Lock Id        Mode    Count
        ProcArrayLock      Shared        2
           XidGenLock   Exclusive        2
           XidGenLock      Shared        4
        WALInsertLock   Exclusive        7
      CLogControlLock      Shared        8
         WALWriteLock   Exclusive       46
        ProcArrayLock   Exclusive       64
      CLogControlLock   Exclusive      263

              Lock Id        Mode    Combined Time (ns)
           XidGenLock   Exclusive                528300
        ProcArrayLock      Shared                968800
        WALInsertLock   Exclusive               4090900
           XidGenLock      Shared              73987600
         WALWriteLock   Exclusive              86200700
        ProcArrayLock   Exclusive             130756000
      CLogControlLock      Shared             240471000
      CLogControlLock   Exclusive            4115158500

So I think 32 is a better option for CLOG buffers, before ProcArrayLock becomes the bottleneck. Though I haven't seen what we can do about the ProcArrayLock problem.

Regards,
Jignesh

Jignesh K. Shah wrote:
> Using CLOG Buffers 32 and the commit sibling check patch I still see a
> drop at 1200-1300 users..
> [...]
Re: [PERFORM] User concurrency thresholding: where do I look?
Using CLOG Buffers 32 and the commit sibling check patch I still see a drop at 1200-1300 users..

bash-3.00# ./4_lwlock_waits.d 18250

              Lock Id        Mode    Count
           XidGenLock      Shared        1
      CLogControlLock      Shared        2
        ProcArrayLock      Shared        2
           XidGenLock   Exclusive        4
      CLogControlLock   Exclusive       15
        WALInsertLock   Exclusive       18
         WALWriteLock   Exclusive       38
        ProcArrayLock   Exclusive       77

              Lock Id        Mode    Combined Time (ns)
           XidGenLock      Shared                 88700
        WALInsertLock   Exclusive              69556000
        ProcArrayLock      Shared              95656800
           XidGenLock   Exclusive             139634100
      CLogControlLock   Exclusive             148822200
      CLogControlLock      Shared                 16163
         WALWriteLock   Exclusive             332781800
        ProcArrayLock   Exclusive            5688265500

bash-3.00# ./4_lwlock_waits.d 18599

              Lock Id        Mode    Count
        ProcArrayLock      Shared        2
           XidGenLock   Exclusive        3
           XidGenLock      Shared        4
      CLogControlLock      Shared        5
        WALInsertLock   Exclusive       10
      CLogControlLock   Exclusive       21
         WALWriteLock   Exclusive       28
        ProcArrayLock   Exclusive       54

              Lock Id        Mode    Combined Time (ns)
           XidGenLock   Exclusive               5688800
        WALInsertLock   Exclusive              11424700
      CLogControlLock      Shared              55589100
        ProcArrayLock      Shared             135220400
         WALWriteLock   Exclusive             177906900
           XidGenLock      Shared             524146500
      CLogControlLock   Exclusive             524563900
        ProcArrayLock   Exclusive            5828744500

bash-3.00# ./6_lwlock_stack.d 4 18599

              Lock Id        Mode    Count
        ProcArrayLock      Shared        1
        ProcArrayLock   Exclusive       52

              Lock Id        Mode    Combined Time (ns)
        ProcArrayLock      Shared              41428300
        ProcArrayLock   Exclusive            3858386500

              Lock Id    Combined Time (ns)

              postgres`LWLockAcquire+0x1f0
              postgres`GetSnapshotData+0x120
              postgres`GetTransactionSnapshot+0x80
              postgres`PortalStart+0x198
              postgres`exec_bind_message+0x84c
              postgres`PostgresMain+0x17f8
              postgres`BackendRun+0x2f8
              postgres`ServerLoop+0x680
              postgres`PostmasterMain+0xda8
              postgres`main+0x3d0
              postgres`_start+0x17c
               Shared    41428300

              postgres`LWLockAcquire+0x1f0
              postgres`CommitTransaction+0x104
              postgres`CommitTransactionCommand+0xbc
              postgres`finish_xact_command+0x78
              postgres`exec_execute_message+0x42c
              postgres`PostgresMain+0x1838
              postgres`BackendRun+0x2f8
              postgres`ServerLoop+0x680
              postgres`PostmasterMain+0xda8
              postgres`main+0x3d0
              postgres`_start+0x17c
            Exclusive    3858386500

-Jignesh
Re: [PERFORM] User concurrency thresholding: where do I look?
I tried CLOG Buffers 32, and the performance is as good as with 64.. (I haven't tried 16 yet though..) I am going to try your second patch now..

Also, here is the breakup by mode. The combined time is the total time it waits, for all counts.

              Lock Id        Mode    Count
        ProcArrayLock      Shared        1
      CLogControlLock   Exclusive        4
      CLogControlLock      Shared        4
           XidGenLock      Shared        4
           XidGenLock   Exclusive        7
        WALInsertLock   Exclusive       21
         WALWriteLock   Exclusive       62
        ProcArrayLock   Exclusive       79

              Lock Id        Mode    Combined Time (ns)
      CLogControlLock   Exclusive                325200
      CLogControlLock      Shared               4509200
           XidGenLock   Exclusive              11839600
        ProcArrayLock      Shared              40506600
           XidGenLock      Shared             119013700
        WALInsertLock   Exclusive             148063100
         WALWriteLock   Exclusive             347052100
        ProcArrayLock   Exclusive            1054780600

Here is another one at higher user count, 1600:

bash-3.00# ./4_lwlock_waits.d 9208

              Lock Id        Mode    Count
      CLogControlLock   Exclusive        1
      CLogControlLock      Shared        2
           XidGenLock      Shared        7
        WALInsertLock   Exclusive       12
         WALWriteLock   Exclusive       50
        ProcArrayLock   Exclusive       82

              Lock Id        Mode    Combined Time (ns)
      CLogControlLock   Exclusive                 27300
           XidGenLock      Shared              14689300
      CLogControlLock      Shared              72664900
        WALInsertLock   Exclusive             101431300
         WALWriteLock   Exclusive             534357400
        ProcArrayLock   Exclusive            4110350300

Now I will try with your second patch.

Regards,
Jignesh

Simon Riggs wrote:
> On Thu, 2007-07-26 at 17:17 -0400, Jignesh K. Shah wrote:
>> Lock Id            Combined Time (ns)
>> XidGenLock                  194966200
>> WALInsertLock               517955000
>> CLogControlLock             679665100
>> WALWriteLock               2838716200
>> ProcArrayLock             44181002600
>
> Is this the time the lock is held for, or the time that we wait for
> that lock? It would be good to see the break down of time separately
> for shared and exclusive.
>
> Can we have a table like this:
> LockId, LockMode, SumTimeLockHeld, SumTimeLockWait
> [...]
Re: [PERFORM] User concurrency thresholding: where do I look?
Yes, I can try to break up the shared and exclusive time.. Also, yes, I use commit_delay = 10; it helps out a lot in reducing IO load.. I will try out the patch soon.

-Jignesh

Simon Riggs wrote:
> On Thu, 2007-07-26 at 17:17 -0400, Jignesh K. Shah wrote:
>> Lock Id            Combined Time (ns)
>> XidGenLock                  194966200
>> WALInsertLock               517955000
>> CLogControlLock             679665100
>> WALWriteLock               2838716200
>> ProcArrayLock             44181002600
>
> Is this the time the lock is held for, or the time that we wait for
> that lock? It would be good to see the break down of time separately
> for shared and exclusive.
>
> Can we have a table like this:
> LockId, LockMode, SumTimeLockHeld, SumTimeLockWait
>
> Are you running with commit_delay > 0? It's possible that the call to
> CountActiveBackends() is causing pinging of the procarray by other
> backends while we're trying to read it during CommitTransaction(). If
> so, try the attached patch.
> [...]
Re: [PERFORM] User concurrency thresholding: where do I look?
On Fri, 2007-07-27 at 04:58 -0400, Alvaro Herrera wrote:
> Jignesh K. Shah wrote:
>
>> Top Wait time seems to come from the following code path for
>> ProcArrayLock:
>>
>>     Lock Id        Mode    Count
>>     ProcArrayLock  Exclusive   21
>>
>>     Lock Id        Combined Time (ns)
>>     ProcArrayLock          5255937500
>>
>>     postgres`LWLockAcquire+0x1f0
>>     postgres`CommitTransaction+0x104
>
> Yeah, ProcArrayLock is pretty contended. I think it would be kinda neat
> if we could split it up in partitions. This lock is quite particular
> though.

Maybe, if we did, we should set the partitions according to the number of users, so that low user counts all stay in one partition.

--
Simon Riggs
EnterpriseDB   http://www.enterprisedb.com
Re: [PERFORM] User concurrency thresholding: where do I look?
On Thu, 2007-07-26 at 17:17 -0400, Jignesh K. Shah wrote:
> Lock Id            Combined Time (ns)
> XidGenLock                  194966200
> WALInsertLock               517955000
> CLogControlLock             679665100
> WALWriteLock               2838716200
> ProcArrayLock             44181002600

Is this the time the lock is held for or the time that we wait for that lock? It would be good to see the break down of time separately for shared and exclusive.

Can we have a table like this:
LockId, LockMode, SumTimeLockHeld, SumTimeLockWait

> Top Wait time seems to come from the following code path for
> ProcArrayLock:
>
>     Lock Id        Mode    Count
>     ProcArrayLock  Exclusive   21
>
>     Lock Id        Combined Time (ns)
>     ProcArrayLock          5255937500
>
>     postgres`LWLockAcquire+0x1f0
>     postgres`CommitTransaction+0x104
>     postgres`CommitTransactionCommand+0xbc
>     postgres`finish_xact_command+0x78

Well thats pretty weird. That code path clearly only happens once per transaction and ought to be fast. The other code paths that take ProcArrayLock like TransactionIdIsInProgress() and GetSnapshotData() ought to spend more time holding the lock. Presumably you are running with a fair number of SERIALIZABLE transactions?

Are you running with commit_delay > 0? Its possible that the call to CountActiveBackends() is causing pinging of the procarray by other backends while we're trying to read it during CommitTransaction(). If so, try the attached patch.

--
Simon Riggs
EnterpriseDB   http://www.enterprisedb.com

Index: src/backend/access/transam/xact.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/xact.c,v
retrieving revision 1.245
diff -c -r1.245 xact.c
*** src/backend/access/transam/xact.c	7 Jun 2007 21:45:58 -	1.245
--- src/backend/access/transam/xact.c	27 Jul 2007 09:09:08 -
***************
*** 820,827 ****
  	 * are fewer than CommitSiblings other backends with active
  	 * transactions.
  	 */
! 	if (CommitDelay > 0 && enableFsync &&
! 		CountActiveBackends() >= CommitSiblings)
  		pg_usleep(CommitDelay);

  	XLogFlush(recptr);
--- 820,826 ----
  	 * are fewer than CommitSiblings other backends with active
  	 * transactions.
  	 */
! 	if (CommitDelay > 0 && enableFsync)
  		pg_usleep(CommitDelay);

  	XLogFlush(recptr);

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] User concurrency thresholding: where do I look?
Jignesh K. Shah wrote:
> Top Wait time seems to come from the following code path for
> ProcArrayLock:
>
>     Lock Id        Mode    Count
>     ProcArrayLock  Exclusive   21
>
>     Lock Id        Combined Time (ns)
>     ProcArrayLock          5255937500
>
>     postgres`LWLockAcquire+0x1f0
>     postgres`CommitTransaction+0x104

Yeah, ProcArrayLock is pretty contended. I think it would be kinda neat if we could split it up in partitions. This lock is quite particular though.

--
Alvaro Herrera   http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [PERFORM] User concurrency thresholding: where do I look?
Will try 16 and 32 CLOG buffers tomorrow. But here is the locks data again, with increased profiling time (about 2 minutes), for the connection, with about 2000 users:

bash-3.00# time ./4_lwlock_waits.d 13583
^C

              Lock Id        Mode    Count
        ProcArrayLock      Shared        5
           XidGenLock   Exclusive       13
      CLogControlLock      Shared       14
           XidGenLock      Shared       15
      CLogControlLock   Exclusive       21
        WALInsertLock   Exclusive       77
         WALWriteLock   Exclusive      175
        ProcArrayLock   Exclusive      275

              Lock Id    Combined Time (ns)
           XidGenLock             194966200
        WALInsertLock             517955000
      CLogControlLock             679665100
         WALWriteLock            2838716200
        ProcArrayLock           44181002600

Top Wait time seems to come from the following code path for ProcArrayLock:

              Lock Id        Mode    Count
        ProcArrayLock   Exclusive       21

              Lock Id    Combined Time (ns)
        ProcArrayLock            5255937500

              postgres`LWLockAcquire+0x1f0
              postgres`CommitTransaction+0x104
              postgres`CommitTransactionCommand+0xbc
              postgres`finish_xact_command+0x78
              postgres`exec_execute_message+0x42c
              postgres`PostgresMain+0x1838
              postgres`BackendRun+0x2f8
              postgres`ServerLoop+0x680
              postgres`PostmasterMain+0xda8
              postgres`main+0x3d0
              postgres`_start+0x17c
           5255937500

Regards,
Jignesh

Simon Riggs wrote:
> On Thu, 2007-07-26 at 15:44 -0400, Jignesh K. Shah wrote:
>> BEAUTIFUL!!! Using the Patch I can now go up to 1300 users without
>> dropping.. But now it still repeats at 1300-1350 users..
>
> OK, can you try again with 16 and 32 buffers please? We need to know
> how many is enough and whether this number needs to be variable via a
> parameter, or just slightly larger by default. Thanks.
> [...]
Re: [PERFORM] User concurrency thresholding: where do I look?
On Thu, 2007-07-26 at 15:44 -0400, Jignesh K. Shah wrote:
> BEAUTIFUL!!!
>
> Using the Patch I can now go upto 1300 users without dropping.. But now
> it still repeats at 1300-1350 users..

OK, can you try again with 16 and 32 buffers please? We need to know how many is enough and whether this number needs to be variable via a parameter, or just slightly larger by default. Thanks.

> I corrected the Lock Descriptions based on what I got from lwlock.h and
> retried the whole scalability again with profiling again.. This time it
> looks like the ProcArrayLock

That's what I would expect with that many users.

>              Lock Id        Mode    Count
>           XidGenLock   Exclusive        1
>      CLogControlLock      Shared        2
>           XidGenLock      Shared        2
>         WALWriteLock   Exclusive        4
>        WALInsertLock   Exclusive        8
>      CLogControlLock   Exclusive        9
>        ProcArrayLock   Exclusive        9

...but as Tom mentioned, we need to do longer runs now so these counts get to somewhere in the hundreds so we have some statistical validity.

--
Simon Riggs
EnterpriseDB   http://www.enterprisedb.com
Re: [PERFORM] User concurrency thresholding: where do I look?
Tom Lane wrote:
> That path would be taking CLogControlLock ... so you're off by at least
> one. Compare the script to src/include/storage/lwlock.h.

Indeed, the indexing was off by one, due to the removal of BuffMappingLock in src/include/storage/lwlock.h between 8.1 and 8.2, which was not reflected in the DTrace script.

Thanks,
Robert
Re: [PERFORM] User concurrency thresholding: where do I look?
BEAUTIFUL!!!

Using the Patch I can now go up to 1300 users without dropping.. But now it still repeats at 1300-1350 users..

I corrected the Lock Descriptions based on what I got from lwlock.h and retried the whole scalability run again with profiling.. This time it looks like the ProcArrayLock:

bash-3.00# echo 600 users
600 users
bash-3.00# ./4_lwlock_waits.d 7056

              Lock Id        Mode    Count
           XidGenLock   Exclusive        1
      CLogControlLock      Shared        2
           XidGenLock      Shared        2
         WALWriteLock   Exclusive        4
        WALInsertLock   Exclusive        8
      CLogControlLock   Exclusive        9
        ProcArrayLock   Exclusive        9

              Lock Id    Combined Time (ns)
         WALWriteLock               2842300
           XidGenLock               4951000
      CLogControlLock              11151800
        WALInsertLock              13035600
        ProcArrayLock                  2004

bash-3.00# echo 700 users
700 users
bash-3.00# ./4_lwlock_waits.d 7056

              Lock Id        Mode    Count
           XidGenLock   Exclusive        1
         WALWriteLock   Exclusive        2
           XidGenLock      Shared        2
      CLogControlLock      Shared        3
      CLogControlLock   Exclusive        8
        WALInsertLock   Exclusive        9
        ProcArrayLock   Exclusive       22

              Lock Id    Combined Time (ns)
           XidGenLock               4093300
         WALWriteLock               4914800
        WALInsertLock               7389100
        ProcArrayLock              10248200
      CLogControlLock              11989400

bash-3.00# echo 800 users
800 users
bash-3.00# ./4_lwlock_waits.d 7056

              Lock Id        Mode    Count
         WALWriteLock   Exclusive        1
           XidGenLock      Shared        2
      CLogControlLock      Shared        3
      CLogControlLock   Exclusive        7
        WALInsertLock   Exclusive        7
        ProcArrayLock   Exclusive       31

              Lock Id    Combined Time (ns)
         WALWriteLock                319100
           XidGenLock               5388700
        WALInsertLock               9901400
      CLogControlLock              13465000
        ProcArrayLock              42979700

bash-3.00# echo 900 users
900 users
bash-3.00# ./4_lwlock_waits.d 7056

              Lock Id        Mode    Count
      CLogControlLock      Shared        1
           XidGenLock   Exclusive        1
         WALWriteLock   Exclusive        2
      CLogControlLock   Exclusive        6
        WALInsertLock   Exclusive        9
        ProcArrayLock   Exclusive       25

              Lock Id    Combined Time (ns)
           XidGenLock               3197700
         WALWriteLock               3887100
      CLogControlLock              15774500
        WALInsertLock              38268700
        ProcArrayLock             162073100

bash-3.00# ./6_lwlock_stack.d 4 7056

              Lock Id        Mode    Count
        ProcArrayLock      Shared        1
        ProcArrayLock   Exclusive       67

              Lock Id    Combined Time (ns)
        ProcArrayLock             216773800

              postgres`LWLockAcquire+0x1c8
              postgres`GetSnapshotData+0x118
              postgres`GetTransactionSnapshot+0x5c
              postgres`PortalStart+0x150
              postgres`exec_bind_message+0x81c
              postgres`PostgresMain+0x12b8
              postgres`BackendRun+0x278
              postgres`ServerLoop+0x63c
              postgres`PostmasterMain+0xc40
              postgres`main+0x394
              postgres`_start+0x108
           2779000

              postgres`LWLockAcquire+0x1c8
              postgres`CommitTransaction+0xe0
              postgres`CommitTransactionCommand+0x90
              postgres`finish_xact_command+0x60
              postgres`exec_execute_message+0x3d8
              postgres`PostgresMain+0x1300
              postgres`BackendRun+0x278
              postgres`ServerLoop+0x63c
              postgres`PostmasterMain+0xc40
              postgres`main+0x394
              postgres`_start+0x108
           213994800

bash-3.00# echo 1000 users - SLIGHT DROP
1000 users
bash-3.00# ./4_lwlock_waits.d 7056

              Lock Id        Mode    Count
         WALWriteLock   Exclusive        1
      CLogControlLock   Exclusive        2
           XidGenLock      Shared        2
      CLogControlLock      Shared        3
        WALInsertLock   Exclusive        4
        ProcArrayLock   Exclusive       26

              Lock Id    Combined Time (ns)
         WALWriteLock               1807400
           XidGenLock               2024000
Re: [PERFORM] User concurrency thresholding: where do I look?
On Thu, 2007-07-26 at 11:27 -0400, Jignesh K. Shah wrote:
> However at 900 Users where the big drop in throughput occurs:
> It gives a different top "consumer" of time:

> postgres`LWLockAcquire+0x1c8
> postgres`SimpleLruReadPage+0x1ac
> postgres`TransactionIdGetStatus+0x14
> postgres`TransactionLogFetch+0x58

TransactionIdGetStatus doesn't directly call SimpleLruReadPage(). Presumably the compiler has been rearranging things??

Looks like you're out of clog buffers. It seems like the clog buffers aren't big enough to hold clog pages for long enough, and the SELECT FOR SHARE processing is leaving lots of additional read locks that are increasing the number of clog requests for older xids.

Try the enclosed patch.

--
Simon Riggs
EnterpriseDB   http://www.enterprisedb.com

Index: src/include/access/clog.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/access/clog.h,v
retrieving revision 1.19
diff -c -r1.19 clog.h
*** src/include/access/clog.h	5 Jan 2007 22:19:50 -	1.19
--- src/include/access/clog.h	26 Jul 2007 15:44:58 -
***************
*** 29,35 ****

  /* Number of SLRU buffers to use for clog */
! #define NUM_CLOG_BUFFERS	8

  extern void TransactionIdSetStatus(TransactionId xid, XidStatus status);
--- 29,35 ----

  /* Number of SLRU buffers to use for clog */
! #define NUM_CLOG_BUFFERS	64

  extern void TransactionIdSetStatus(TransactionId xid, XidStatus status);
Re: [PERFORM] User concurrency thresholding: where do I look?
"Jignesh K. Shah" <[EMAIL PROTECTED]> writes:
> For 600-850 Users: that potential mislabeled CheckPointStartLock or
> LockID==12 comes from various sources where the top source (while system
> is still doing great) comes from:

> postgres`LWLockAcquire+0x1c8
> postgres`SimpleLruReadPage_ReadOnly+0xc
> postgres`TransactionIdGetStatus+0x14

That path would be taking CLogControlLock ... so you're off by at least one. Compare the script to src/include/storage/lwlock.h.

			regards, tom lane
Re: [PERFORM] User concurrency thresholding: where do I look?
I will look for runs with longer samples.. Now the script could have mislabeled lock names.. Anyway, digging into the one that seems to increase over time, I did stack profiles on how it increases.. and here are some numbers..

For 600-850 users, that potentially mislabeled CheckPointStartLock (LockID==12) comes from various sources, where the top source (while the system is still doing great) is:

              postgres`LWLockAcquire+0x1c8
              postgres`SimpleLruReadPage_ReadOnly+0xc
              postgres`TransactionIdGetStatus+0x14
              postgres`TransactionLogFetch+0x58
              postgres`TransactionIdDidCommit+0x4
              postgres`HeapTupleSatisfiesSnapshot+0x234
              postgres`heap_release_fetch+0x1a8
              postgres`index_getnext+0xf4
              postgres`IndexNext+0x7c
              postgres`ExecScan+0x8c
              postgres`ExecProcNode+0xb4
              postgres`ExecutePlan+0xdc
              postgres`ExecutorRun+0xb0
              postgres`PortalRunSelect+0x9c
              postgres`PortalRun+0x244
              postgres`exec_execute_message+0x3a0
              postgres`PostgresMain+0x1300
              postgres`BackendRun+0x278
              postgres`ServerLoop+0x63c
              postgres`PostmasterMain+0xc40
           8202100

              postgres`LWLockAcquire+0x1c8
              postgres`TransactionIdSetStatus+0x1c
              postgres`RecordTransactionCommit+0x2a8
              postgres`CommitTransaction+0xc8
              postgres`CommitTransactionCommand+0x90
              postgres`finish_xact_command+0x60
              postgres`exec_execute_message+0x3d8
              postgres`PostgresMain+0x1300
              postgres`BackendRun+0x278
              postgres`ServerLoop+0x63c
              postgres`PostmasterMain+0xc40
              postgres`main+0x394
              postgres`_start+0x108
           30822100

However at 900 users, where the big drop in throughput occurs, it gives a different top "consumer" of time:

              postgres`LWLockAcquire+0x1c8
              postgres`TransactionIdSetStatus+0x1c
              postgres`RecordTransactionCommit+0x2a8
              postgres`CommitTransaction+0xc8
              postgres`CommitTransactionCommand+0x90
              postgres`finish_xact_command+0x60
              postgres`exec_execute_message+0x3d8
              postgres`PostgresMain+0x1300
              postgres`BackendRun+0x278
              postgres`ServerLoop+0x63c
              postgres`PostmasterMain+0xc40
              postgres`main+0x394
              postgres`_start+0x108
           406601300

              postgres`LWLockAcquire+0x1c8
              postgres`SimpleLruReadPage+0x1ac
              postgres`TransactionIdGetStatus+0x14
              postgres`TransactionLogFetch+0x58
              postgres`TransactionIdDidCommit+0x4
              postgres`HeapTupleSatisfiesUpdate+0x360
              postgres`heap_lock_tuple+0x27c
              postgres`ExecutePlan+0x33c
              postgres`ExecutorRun+0xb0
              postgres`PortalRunSelect+0x9c
              postgres`PortalRun+0x244
              postgres`exec_execute_message+0x3a0
              postgres`PostgresMain+0x1300
              postgres`BackendRun+0x278
              postgres`ServerLoop+0x63c
              postgres`PostmasterMain+0xc40
              postgres`main+0x394
              postgres`_start+0x108
           444523100

              postgres`LWLockAcquire+0x1c8
              postgres`SimpleLruReadPage+0x1ac
              postgres`TransactionIdGetStatus+0x14
              postgres`TransactionLogFetch+0x58
              postgres`TransactionIdDidCommit+0x4
              postgres`HeapTupleSatisfiesSnapshot+0x234
              postgres`heap_release_fetch+0x1a8
              postgres`index_getnext+0xf4
              postgres`IndexNext+0x7c
              postgres`ExecScan+0x8c
              postgres`ExecProcNode+0xb4
              postgres`ExecutePlan+0xdc
              postgres`ExecutorRun+0xb0
              postgres`PortalRunSelect+0x9c
              postgres`PortalRun+0x244
              postgres`exec_execute_message+0x3a0
              postgres`PostgresMain+0x1300
              postgres`BackendRun+0x278
              postgres`ServerLoop+0x63c
              postgres`PostmasterMain+0xc40
           166130

Maybe you all will understand more than I do on what it does here.. Looks like IndexNext has a problem at high numbers of users to me.. but I could be wrong..

-Jignesh

Tom Lane wrote:
> "Jignesh K. Shah" <[EMAIL PROTECTED]> writes:
>> The count is only for a 10-second snapshot.. Plus remember there are
>> about 1000 users running so the connection being profiled only gets
>> 0.01 of the period on CPU.. And the count is for that CONNECTION only.
>
> OK, that explains the low absolute levels of the counts, but if the
> counts are for a regular backend then there's still a lot of bogosity
> here. Backends won't be taking the CheckpointLock at all, nor do they
> take CheckpointStartLock in exclusive mode. The bgwriter would do tha
Re: [PERFORM] User concurrency thresholding: where do I look?
On Thu, 2007-07-26 at 10:29 -0400, Jignesh K. Shah wrote:
> The count is only for a 10-second snapshot.. Plus remember there are
> about 1000 users running so the connection being profiled only gets
> 0.01 of the period on CPU.. And the count is for that CONNECTION only.

Is that for one process, or all processes aggregated in some way?

> CheckpointStartLock      Shared        6
> CheckpointStartLock   Exclusive      102

That's definitely whacked. Surely we didn't start 102 checkpoints yet attempt to commit 6 times?

--
Simon Riggs
EnterpriseDB   http://www.enterprisedb.com
Re: [PERFORM] User concurrency thresholding: where do I look?
"Jignesh K. Shah" <[EMAIL PROTECTED]> writes:
> The count is only for a 10-second snapshot.. Plus remember there are
> about 1000 users running so the connection being profiled only gets
> 0.01 of the period on CPU.. And the count is for that CONNECTION only.

OK, that explains the low absolute levels of the counts, but if the counts are for a regular backend then there's still a lot of bogosity here. Backends won't be taking the CheckpointLock at all, nor do they take CheckpointStartLock in exclusive mode. The bgwriter would do that but it'd not be taking most of these other locks. So I think the script is mislabeling the locks somehow.

Also, elementary statistics should tell you that a sample taken as above is going to have enormous amounts of noise. You should be sampling over a much longer period, say on the order of a minute of runtime, to have numbers that are trustworthy.

			regards, tom lane
Re: [PERFORM] User concurrency thresholding: where do I look?
The count is only for a 10-second snapshot.. Plus remember there are about 1000 users running so the connection being profiled only gets 0.01 of the period on CPU.. And the count is for that CONNECTION only.

Anyway, using the lock wait script, it shows the real picture as you requested. Here the combined time means time "spent waiting" for the lock.

bash-3.00# echo 500 users
500 users
bash-3.00# ./4_lwlock_waits.d 20764

              Lock Id        Mode    Count
           OidGenLock   Exclusive        1
  CheckpointStartLock      Shared        3
           OidGenLock      Shared        4
        WALInsertLock   Exclusive        7
        FreeSpaceLock   Exclusive        8
           XidGenLock   Exclusive       15
  CheckpointStartLock   Exclusive       16

              Lock Id    Combined Time (ns)
           XidGenLock               3825000
           OidGenLock               5307100
        WALInsertLock               6317800
        FreeSpaceLock               7244100
  CheckpointStartLock              22199200

bash-3.00# echo 600 users
600 users
bash-3.00# ./4_lwlock_waits.d 20764

              Lock Id        Mode    Count
           OidGenLock   Exclusive        1
        WALInsertLock   Exclusive        1
  CheckpointStartLock      Shared        4
  CheckpointStartLock   Exclusive       11
           XidGenLock   Exclusive       21

              Lock Id    Combined Time (ns)
           OidGenLock               1728200
        WALInsertLock               2040700
           XidGenLock              19878500
  CheckpointStartLock              24156500

bash-3.00# echo 700 users
700 users
bash-3.00# ./4_lwlock_waits.d 20764

              Lock Id        Mode    Count
           OidGenLock      Shared        1
           XidGenLock      Shared        1
  CheckpointStartLock      Shared        2
        WALInsertLock   Exclusive        4
  CheckpointStartLock   Exclusive        6
        FreeSpaceLock   Exclusive        6
           XidGenLock   Exclusive       13

              Lock Id    Combined Time (ns)
           OidGenLock                   173
        WALInsertLock               7253400
        FreeSpaceLock              10977700
  CheckpointStartLock              13356800
           XidGenLock              38220500

bash-3.00# echo 800 users
800 users
bash-3.00# ./4_lwlock_waits.d 20764

              Lock Id        Mode    Count
  CheckpointStartLock      Shared        1
        WALInsertLock   Exclusive        2
           XidGenLock      Shared        2
  CheckpointStartLock   Exclusive        5
        FreeSpaceLock   Exclusive        8
           XidGenLock   Exclusive       12

              Lock Id    Combined Time (ns)
        WALInsertLock               3746800
        FreeSpaceLock               7628900
  CheckpointStartLock              11297500
           XidGenLock              16649000

bash-3.00# echo 900 users - BIG DROP IN THROUGHPUT OCCURS...
900 users
bash-3.00# ./4_lwlock_waits.d 20764

              Lock Id        Mode    Count
           OidGenLock   Exclusive        1
           OidGenLock      Shared        1
           XidGenLock      Shared        1
        FreeSpaceLock   Exclusive        2
        WALInsertLock   Exclusive        2
  CheckpointStartLock      Shared        6
           XidGenLock   Exclusive       12
  CheckpointStartLock   Exclusive      121

              Lock Id    Combined Time (ns)
           OidGenLock               1968100
        FreeSpaceLock               2076300
        WALInsertLock               2190400
           XidGenLock              20259400
  CheckpointStartLock            1407294300

bash-3.00# echo 950 users
950 users
bash-3.00# ./4_lwlock_waits.d 20764

              Lock Id        Mode    Count
           OidGenLock   Exclusive        1
           OidGenLock      Shared        2
  CheckpointStartLock      Shared        3
        WALInsertLock   Exclusive        4
        FreeSpaceLock   Exclusive        5
           XidGenLock   Exclusive       11
  CheckpointStartLock   Exclusive       50

              Lock Id    Combined Time (ns)
        WALInsertLock               5577100
        FreeSpaceLock               9115900
           XidGenLock              13765800
           OidGenLock              50155500
  CheckpointStartLock             759872200

bash-3.00# echo 1000 users
1000 users
bash-3.00# ./4_lwlock_waits.d 20764

              Lock Id        Mode    Count
           OidGenLock      Shared        1
        WALInsertLock   Exclusive        1
           XidGenLock   Exclusive        5
  CheckpointStartLock      Shared        6
  CheckpointStartLock   Exclusi
Re: [PERFORM] User concurrency thresholding: where do I look?
"Jignesh K. Shah" <[EMAIL PROTECTED]> writes:
> Here is how I got the numbers..
> I had about 1600 users login into postgresql. Then started the run with
> 500 users and using DTrace I started tracking Postgresql Locking "as
> viewed from one user/connection". Echo statements indicate how many
> users were active at that point and how was throughput performing. All
> IO is done on /tmp which means on a RAM disk.

> bash-3.00# echo 500 users - baseline number
> 500 users
> bash-3.00# ./3_lwlock_acquires.d 19178
>
>              Lock Id            Mode   Count
>     FirstLockMgrLock       Exclusive       1
>     RelCacheInitLock       Exclusive       2
>           SInvalLock       Exclusive       2
>        WALInsertLock       Exclusive      10
>       BufMappingLock       Exclusive      12
>       CheckpointLock          Shared      29
>  CheckpointStartLock          Shared      29
>           OidGenLock       Exclusive      29
>           XidGenLock       Exclusive      29
>     FirstLockMgrLock          Shared      33
>  CheckpointStartLock       Exclusive      78
>        FreeSpaceLock       Exclusive     114
>           OidGenLock          Shared     126
>           XidGenLock          Shared     152
>        ProcArrayLock          Shared     482

I don't think I believe these numbers. For one thing, CheckpointLock is simply not ever taken in shared mode. The ratios of counts for different locks seems pretty improbable too, eg there is no way on earth that the LockMgrLocks are taken more often shared than exclusive (I would expect no shared acquires at all in the sort of test you are running). Not to mention that the absolute number of counts seems way too low. So I think the counting tool is broken.

>              Lock Id   Combined Time (ns)
>           SInvalLock                29800
>     RelCacheInitLock                30300
>       BufMappingLock               168800
>     FirstLockMgrLock               414300
>        FreeSpaceLock              1281700
>        ProcArrayLock              7869900
>        WALInsertLock                 3200
>  CheckpointStartLock             13494700
>           OidGenLock             25719100
>           XidGenLock             26443300
>       CheckpointLock            194267800

Combined time of what exactly? It looks like this must be the total duration the lock is held, at least assuming that the time for CheckpointLock is correctly reported. It'd be much more useful to see the total time spent waiting to acquire the lock.
regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] User concurrency thresholding: where do I look?
Here is how I got the numbers. I had about 1600 users logged into postgresql. Then I started the run with 500 users, and using DTrace I started tracking PostgreSQL locking "as viewed from one user/connection". Echo statements indicate how many users were active at that point and how throughput was performing. All IO is done on /tmp, which means on a RAM disk.

bash-3.00# echo 500 users - baseline number
500 users
bash-3.00# ./3_lwlock_acquires.d 19178

             Lock Id            Mode   Count
    FirstLockMgrLock       Exclusive       1
    RelCacheInitLock       Exclusive       2
          SInvalLock       Exclusive       2
       WALInsertLock       Exclusive      10
      BufMappingLock       Exclusive      12
      CheckpointLock          Shared      29
 CheckpointStartLock          Shared      29
          OidGenLock       Exclusive      29
          XidGenLock       Exclusive      29
    FirstLockMgrLock          Shared      33
 CheckpointStartLock       Exclusive      78
       FreeSpaceLock       Exclusive     114
          OidGenLock          Shared     126
          XidGenLock          Shared     152
       ProcArrayLock          Shared     482

             Lock Id   Combined Time (ns)
          SInvalLock                29800
    RelCacheInitLock                30300
      BufMappingLock               168800
    FirstLockMgrLock               414300
       FreeSpaceLock              1281700
       ProcArrayLock              7869900
       WALInsertLock                 3200
 CheckpointStartLock             13494700
          OidGenLock             25719100
          XidGenLock             26443300
      CheckpointLock            194267800

bash-3.00# echo 600 users - Throughput rising
600 users
bash-3.00# ./3_lwlock_acquires.d 19178

             Lock Id            Mode   Count
    RelCacheInitLock       Exclusive       1
          SInvalLock       Exclusive       1
      BufMappingLock       Exclusive       2
     CLogControlLock       Exclusive       2
       WALInsertLock       Exclusive      11
    FirstLockMgrLock          Shared      20
      CheckpointLock          Shared      24
 CheckpointStartLock          Shared      24
          OidGenLock       Exclusive      24
          XidGenLock       Exclusive      24
 CheckpointStartLock       Exclusive      72
       FreeSpaceLock       Exclusive     102
          OidGenLock          Shared     106
          XidGenLock          Shared     128
       ProcArrayLock          Shared     394

             Lock Id   Combined Time (ns)
          SInvalLock                15600
    RelCacheInitLock                15700
      BufMappingLock                31000
     CLogControlLock                41500
    FirstLockMgrLock               289000
       FreeSpaceLock              3045400
 CheckpointStartLock              7371800
       WALInsertLock              9383200
       ProcArrayLock             10457900
          OidGenLock             20005900
          XidGenLock             20331500
      CheckpointLock            187067900

bash-3.00# echo 700 users - Throughput rising
700 users
bash-3.00# ./3_lwlock_acquires.d 19178

             Lock Id            Mode   Count
    RelCacheInitLock       Exclusive       1
          SInvalLock       Exclusive       1
      BufMappingLock       Exclusive       2
       WALInsertLock       Exclusive      17
      CheckpointLock          Shared      33
 CheckpointStartLock          Shared      33
          OidGenLock       Exclusive      33
          XidGenLock       Exclusive      33
    FirstLockMgrLock          Shared      81
 CheckpointStartLock       Exclusive      87
       FreeSpaceLock       Exclusive     124
          OidGenLock          Shared     125
          XidGenLock          Shared     150
       ProcArrayLock          Shared     500

             Lock Id   Combined Time (ns)
    RelCacheInitLock                15100
          SInvalLock                15400
      BufMappingLock                47400
    FirstLockMgrLock              3021000
       FreeSpaceLock              3794300
       WALInsertLock              7567300
          XidGenLock             18427400
       ProcArrayLock             20884000
 CheckpointStartLock             24084900
          OidGenLock             26399500
      CheckpointLock            256549800

bash-3.00# echo 800 users - Throughput rising
800 users
bash-3.00# ./3_lwlock_acquires.d 19178

             Lock Id            Mode   Count
      BufMappingLock       Exclusive       1
    RelCacheInitLock       Exclusive       1
          SInvalLock       Exclusive       1
        WALWriteLock       Exclusive       1
       WALInsertLock
Re: [PERFORM] User concurrency thresholding: where do I look?
On Mon, 2007-07-23 at 14:19 -0400, Tom Lane wrote: > "Simon Riggs" <[EMAIL PROTECTED]> writes: > > currPos and markPos are defined as BTScanPosData, which is an array of > > BTScanPosItems. That makes BTScanOpaqueData up to 8232 bytes, which > > seems wasteful since markPos is only ever used during merge joins. Most > > of that space isn't even used during merge joins either, we just do that > > to slightly optimise the speed of the restore during merge joins. > > Ah. I was seeing it as 6600 bytes on HPPA and 6608 on x86_64, but > I forgot that both of those architectures have MAXALIGN = 8. On a > MAXALIGN = 4 machine, MaxIndexTuplesPerPage will be significantly > larger, leading to larger BTScanPosData. > > Not sure it's worth fooling with, given that these days almost everyone > who's seriously concerned about performance is probably using 64bit > hardware. One less malloc cycle per indexscan is never going to be a > measurable savings anyway... Oh sure, I was thinking to avoid Solaris' mutex by avoiding malloc() completely. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] User concurrency thresholding: where do I look?
"Simon Riggs" <[EMAIL PROTECTED]> writes: > currPos and markPos are defined as BTScanPosData, which is an array of > BTScanPosItems. That makes BTScanOpaqueData up to 8232 bytes, which > seems wasteful since markPos is only ever used during merge joins. Most > of that space isn't even used during merge joins either, we just do that > to slightly optimise the speed of the restore during merge joins. Ah. I was seeing it as 6600 bytes on HPPA and 6608 on x86_64, but I forgot that both of those architectures have MAXALIGN = 8. On a MAXALIGN = 4 machine, MaxIndexTuplesPerPage will be significantly larger, leading to larger BTScanPosData. Not sure it's worth fooling with, given that these days almost everyone who's seriously concerned about performance is probably using 64bit hardware. One less malloc cycle per indexscan is never going to be a measurable savings anyway... regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] User concurrency thresholding: where do I look?
On Mon, 2007-07-23 at 12:35 -0400, Tom Lane wrote:
> "Simon Riggs" <[EMAIL PROTECTED]> writes:
> > Well, I discover there is an allocation of 8232 (inflation...) made once
> > per statement by a memory context called... ExecutorState. Still not
> > sure exactly which allocation this is, but its definitely once per
> > statement on pgbench, which should narrow it down. Plan, query etc?
>
> Are you working with stock sources? The only allocation exceeding 1K
> that I can see during pgbench is BTScanOpaqueData, which is 6600 bytes.
> (Checked by setting a conditional breakpoint on AllocSetAlloc.) The
> path that allocates a single-chunk block is never taken at all.

I do have the bitmap patch currently applied, but it doesn't touch that part of the code.

(gdb) p size
$1 = 8232
(gdb) p sizeof(int)
$2 = 4
(gdb) p sizeof(BTScanPosData)
$3 = 4104

Since my notes say I got 8228 last year, seems reasonable.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] User concurrency thresholding: where do I look?
On Mon, 2007-07-23 at 16:48 +0100, Simon Riggs wrote:
> On Mon, 2007-07-23 at 10:54 -0400, Tom Lane wrote:
> > "Simon Riggs" <[EMAIL PROTECTED]> writes:
> > > I looked at this last May and my notes say "ExecutorState". I guess that
> > > was wrong, but my analysis showed there was a single malloc of 8228
> > > bytes happening once per query during my tests.
> >
> > Well, if you can track down where it's coming from, we could certainly
> > hack the containing context's parameters. But EState's not it.
>
> Well, I discover there is an allocation of 8232 (inflation...) made once
> per statement by a memory context called... ExecutorState. Still not
> sure exactly which allocation this is, but its definitely once per
> statement on pgbench, which should narrow it down. Plan, query etc?
>
> I don't see a way to hack the allocation, since the max chunk size is
> 8K.

It is the allocation of BTScanOpaqueData, called from btrescan() in nbtree.c.

currPos and markPos are defined as BTScanPosData, which contains an array of BTScanPosItems. That makes BTScanOpaqueData up to 8232 bytes, which seems wasteful since markPos is only ever used during merge joins. Most of that space isn't even used during merge joins either; we just do that to slightly optimise the speed of the restore during merge joins.

Seems like we should allocate the memory when we do the first mark.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] User concurrency thresholding: where do I look?
"Simon Riggs" <[EMAIL PROTECTED]> writes: > Well, I discover there is an allocation of 8232 (inflation...) made once > per statement by a memory context called... ExecutorState. Still not > sure exactly which allocation this is, but its definitely once per > statement on pgbench, which should narrow it down. Plan, query etc? Are you working with stock sources? The only allocation exceeding 1K that I can see during pgbench is BTScanOpaqueData, which is 6600 bytes. (Checked by setting a conditional breakpoint on AllocSetAlloc.) The path that allocates a single-chunk block is never taken at all. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] User concurrency thresholding: where do I look?
On Mon, 2007-07-23 at 10:54 -0400, Tom Lane wrote:
> "Simon Riggs" <[EMAIL PROTECTED]> writes:
> > I looked at this last May and my notes say "ExecutorState". I guess that
> > was wrong, but my analysis showed there was a single malloc of 8228
> > bytes happening once per query during my tests.
>
> Well, if you can track down where it's coming from, we could certainly
> hack the containing context's parameters. But EState's not it.

Well, I discover there is an allocation of 8232 (inflation...) made once per statement by a memory context called... ExecutorState. Still not sure exactly which allocation this is, but it's definitely once per statement on pgbench, which should narrow it down. Plan, query, etc.?

I don't see a way to hack the allocation, since the max chunk size is 8K.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] User concurrency thresholding: where do I look?
"Simon Riggs" <[EMAIL PROTECTED]> writes: > I looked at this last May and my notes say "ExecutorState". I guess that > was wrong, but my analysis showed there was a single malloc of 8228 > bytes happening once per query during my tests. Well, if you can track down where it's coming from, we could certainly hack the containing context's parameters. But EState's not it. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] User concurrency thresholding: where do I look?
On Mon, 2007-07-23 at 10:11 -0400, Tom Lane wrote: > "Simon Riggs" <[EMAIL PROTECTED]> writes: > > EState is about 8300 bytes, > > What? > > (gdb) p sizeof(EState) > $1 = 112 > > This is on a 32-bit machine, but even on 64-bit it wouldn't be more than > double that. > > > Would it be worth a special case in the palloc system to avoid having to > > repeatedly issue external memory allocation calls? > > The appropriate hack would be to change the AllocSetContextCreate > initial-size parameter for the containing context. But I really have > no idea what you're on about. I looked at this last May and my notes say "ExecutorState". I guess that was wrong, but my analysis showed there was a single malloc of 8228 bytes happening once per query during my tests. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] User concurrency thresholding: where do I look?
"Simon Riggs" <[EMAIL PROTECTED]> writes:
> EState is about 8300 bytes,

What?

(gdb) p sizeof(EState)
$1 = 112

This is on a 32-bit machine, but even on 64-bit it wouldn't be more than double that.

> Would it be worth a special case in the palloc system to avoid having to
> repeatedly issue external memory allocation calls?

The appropriate hack would be to change the AllocSetContextCreate initial-size parameter for the containing context. But I really have no idea what you're on about.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] User concurrency thresholding: where do I look?
On Fri, 2007-07-20 at 16:57 -0400, Tom Lane wrote: > "Jignesh K. Shah" <[EMAIL PROTECTED]> writes: > > What its saying is that there are holds/waits in trying to get locks > > which are locked at Solaris user library levels called from the > > postgresql functions: > > For example both the following functions are hitting on the same mutex > > lock 0x10059e280 in Solaris Library call: > > postgres`AllocSetDelete+0x98 > > postgres`AllocSetAlloc+0x1c4 > > That's a perfect example of the sort of useless overhead that I was > complaining of just now in pgsql-patches. Having malloc/free use > an internal mutex is necessary in multi-threaded programs, but the > backend isn't multi-threaded. And yet, apparently you can't turn > that off in Solaris. > > (Fortunately, the palloc layer is probably insulating us from malloc's > performance enough that this isn't a huge deal. But it's annoying.) There is one thing that the palloc layer doesn't handle: EState. All other memory contexts have a very well chosen initial allocation that prevents mallocs during low-medium complexity OLTP workloads. EState is about 8300 bytes, so just above the large allocation limit. This means that every time we request an EState, i.e. at least once per statement we need to malloc() and then later free(). Would it be worth a special case in the palloc system to avoid having to repeatedly issue external memory allocation calls? -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] User concurrency thresholding: where do I look?
David Boreham <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> Having malloc/free use >> an internal mutex is necessary in multi-threaded programs, but the >> backend isn't multi-threaded. > Hmm...confused. I'm not following why then there is contention for the > mutex. There isn't any contention for that mutex; Jignesh's results merely show that it was taken and released a lot of times. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] User concurrency thresholding: where do I look?
Tom Lane wrote: Having malloc/free use an internal mutex is necessary in multi-threaded programs, but the backend isn't multi-threaded. Hmm...confused. I'm not following why then there is contention for the mutex. Surely this has to be some other mutex that is in contention, not a heap lock ? It'd be handy to see the call stack for the wait state -- if the thing is spending a significant proportion of its time in contention it should be easy to get that with a simple tool such as pstack or a debugger. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] User concurrency thresholding: where do I look?
"Jignesh K. Shah" <[EMAIL PROTECTED]> writes: > Yes I did see increase in context switches and CPU migrations at that > point using mpstat. So follow that up --- try to determine which lock is being contended for. There's some very crude code in the sources that you can enable with -DLWLOCK_STATS, but probably DTrace would be a better tool. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] User concurrency thresholding: where do I look?
True, you can't switch off the locks, since libthread has been folded into libc in Solaris 10.

Anyway, just to give you an idea of the increase in context switching at the break point, here is the mpstat output (taken at 10-second intervals) on this 8-socket Sun Fire V890. icsw (involuntary context switches) stays low up to about the 950-1000 user mark; above the 1000-1050 user mark a context-switch storm starts and total throughput drops about 30% instantaneously. I will try rebuilding postgresql with the dtrace probes to get more clues. (NOTE: you will see one CPU (cpuid 22) doing more system work; that's the one handling the network interrupts.)

CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
[Four 10-second mpstat samples followed here; their column alignment did not survive archiving. The legible trend: per-CPU icsw runs around 1500-1800 in the two samples before the drop and climbs to roughly 2500-3300 in the two samples after it, while cpu 22 shows far higher intr/ithr counts and ~40-50% sys time throughout, consistent with it servicing the network interrupts.]
Re: [PERFORM] User concurrency thresholding: where do I look?
"Jignesh K. Shah" <[EMAIL PROTECTED]> writes: > What its saying is that there are holds/waits in trying to get locks > which are locked at Solaris user library levels called from the > postgresql functions: > For example both the following functions are hitting on the same mutex > lock 0x10059e280 in Solaris Library call: > postgres`AllocSetDelete+0x98 > postgres`AllocSetAlloc+0x1c4 That's a perfect example of the sort of useless overhead that I was complaining of just now in pgsql-patches. Having malloc/free use an internal mutex is necessary in multi-threaded programs, but the backend isn't multi-threaded. And yet, apparently you can't turn that off in Solaris. (Fortunately, the palloc layer is probably insulating us from malloc's performance enough that this isn't a huge deal. But it's annoying.) regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] User concurrency thresholding: where do I look?
Sorry, these are Solaris mutex locks used by the postgresql process. What it's saying is that there are holds/waits in trying to get locks which are taken at the Solaris user-library level, called from postgresql functions. For example, both of the following functions are hitting the same mutex lock 0x10059e280 in a Solaris library call:

postgres`AllocSetDelete+0x98
postgres`AllocSetAlloc+0x1c4

I need to enable the DTrace probes on my builds.

-Jignesh

Tom Lane wrote:
"Jignesh K. Shah" <[EMAIL PROTECTED]> writes:
Tom Lane wrote:
So follow that up --- try to determine which lock is being contended for. There's some very crude code in the sources that you can enable with -DLWLOCK_STATS, but probably DTrace would be a better tool.
Using plockstat -A -s 5 -p $pid

I don't know what that is, but it doesn't appear to have anything to do with Postgres LWLocks or spinlocks, which are the locks I was thinking of. Try asking Robert Lor about this --- IIRC he had some dtrace probes to work with our locks.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] User concurrency thresholding: where do I look?
"Jignesh K. Shah" <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> So follow that up --- try to determine which lock is being contended >> for. There's some very crude code in the sources that you can enable >> with -DLWLOCK_STATS, but probably DTrace would be a better tool. > Using plockstat -A -s 5 -p $pid I don't know what that is, but it doesn't appear to have anything to do with Postgres LWLocks or spinlocks, which are the locks I was thinking of. Try asking Robert Lor about this --- IIRC he had some dtrace probes to work with our locks. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] User concurrency thresholding: where do I look?
Tom Lane wrote:
"Jignesh K. Shah" <[EMAIL PROTECTED]> writes:
Yes I did see increase in context switches and CPU migrations at that point using mpstat.

So follow that up --- try to determine which lock is being contended for. There's some very crude code in the sources that you can enable with -DLWLOCK_STATS, but probably DTrace would be a better tool.

regards, tom lane

Using plockstat -A -s 5 -p $pid on the bgwriter doesn't report anything. On one of the many connections it reports the following (this one is hard to read easily). By default, plockstat monitors all lock contention events, gathers frequency and timing data about those events, and displays the data in decreasing frequency order, so that the most common events appear first.

bash-3.00# plockstat -A -s 5 -p 6401
^C
Mutex hold

-------------------------------------------------------------------------------
Count     nsec Lock                         Caller
   59   186888 0x10059e280                  libumem.so.1`process_free+0x12c

      nsec ---- Time Distribution ---- count Stack
     16384 |                         |     1 libumem.so.1`process_free+0x12c
     32768 |@                        |    14 postgres`AllocSetDelete+0x98
     65536 |@@                       |     5 postgres`MemoryContextDelete+0x78
    131072 |                         |     0 postgres`CommitTransaction+0x240
    262144 |@@@                      |    39
-------------------------------------------------------------------------------
Count     nsec Lock                         Caller
  530    12226 0x10059e280                  libumem.so.1`umem_cache_alloc+0x200

      nsec ---- Time Distribution ---- count Stack
      4096 |@@@                      |   338 libumem.so.1`umem_cache_alloc+0x200
      8192 |@                        |    24 libumem.so.1`umem_alloc+0x5c
     16384 |@                        |    37 libumem.so.1`malloc+0x40
     32768 |@                        |   131 postgres`AllocSetAlloc+0x1c4
-------------------------------------------------------------------------------
Count     nsec Lock                         Caller
  324    10214 0x100578030                  libumem.so.1`vmem_xfree+0x164

      nsec ---- Time Distribution ---- count Stack
      4096 |@@                       |   192 libumem.so.1`vmem_xfree+0x164
      8192 |                         |    56 libumem.so.1`process_free+0x12c
     16384 |@                        |    26 postgres`AllocSetDelete+0x98
     32768 |@@@                      |    50 postgres`MemoryContextDelete+0x78
-------------------------------------------------------------------------------
Count     nsec Lock                         Caller
  161    13585 0x10059e280                  libumem.so.1`process_free+0x12c

      nsec ---- Time Distribution ---- count Stack
      4096 |@                        |   118 libumem.so.1`process_free+0x12c
      8192 |                         |     4 postgres`AllocSetDelete+0x98
     16384 |@                        |    10 postgres`MemoryContextDelete+0x78
     32768 |@@@                      |    24 postgres`PortalDrop+0x160
     65536 |                         |     3
    131072 |                         |     0
    262144 |                         |     2
-------------------------------------------------------------------------------
Count     nsec Lock                         Caller
  326     6081 libumem.so.1`vmem0+0xc38     libumem.so.1`vmem_xalloc+0x630

      nsec ---- Time Distribution ---- count Stack
      4096 |                         |   170 libumem.so.1`vmem_xalloc+0x630
      8192 |@@@                      |   155 libumem.so.1`vmem_alloc+0x1f8
     16384 |                         |     1 libumem.so.1`vmem_xalloc+0x524
                                            libumem.so.1`vmem_alloc+0x1f8
-------------------------------------------------------------------------------
Count     nsec Lock                         Caller
  326     5867 libumem.so.1`vmem0+0x30      libumem.so.1`vmem_alloc+0x248

      nsec ---- Time Distribution ---- count Stack
      4096 |@                        |   185 libumem.so.1`vmem_alloc+0x248
      8192 |@@                       |   141 libumem.so.1`vmem_sbrk_alloc+0x30
                                            libumem.so.1`vmem_xalloc+0x524
                                            libumem.so.1`vmem_alloc+0x1f8
-------------------------------------------------------------------------------
Count     nsec Lock                         Caller
  318     5873 0x100578030                  libumem.so.1`vmem_alloc+0x1d0

      nsec ---- Time Distribution ---- count Stack
      4096 |@                        |   228 libumem.so.1`vmem_alloc+0x1d0
      8192 |@                        |    78 libumem.so.1`umem_alloc+0xec
     16384 |                         |     6 libumem.so.1`malloc+0x40
     32768 |                         |     6 postgres`AllocSetAlloc+0x1c4
-------------------------------------------------------------------------------
Count     nsec Lock                         Caller
  326     5591 0x100
Re: [PERFORM] User concurrency thresholding: where do I look?
Yes, I did see an increase in context switches and CPU migrations at that point using mpstat.

Regards,
Jignesh

Tom Lane wrote:
"Jignesh K. Shah" <[EMAIL PROTECTED]> writes:
There are no hard failures reported anywhere. Log min durations does show that queries are now slowing down and taking longer. OS is not swapping and also eliminated IO by putting the whole database on /tmp

Hmm. Do you see any evidence of a context swap storm (ie, a drastic increase in the context swaps/second reading reported by vmstat)?

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] User concurrency thresholding: where do I look?
"Jignesh K. Shah" <[EMAIL PROTECTED]> writes: > There are no hard failures reported anywhere. Log min durations does > show that queries are now slowing down and taking longer. > OS is not swapping and also eliminated IO by putting the whole database > on /tmp Hmm. Do you see any evidence of a context swap storm (ie, a drastic increase in the context swaps/second reading reported by vmstat)? regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] User concurrency thresholding: where do I look?
I forgot to add one more piece of information: I also tried the same test with 64-bit postgresql with 6GB shared_buffers, and the results are the same; it drops around the same point, which to me sounds like a bottleneck.

More later.

-Jignesh

Jignesh K. Shah wrote:
Awww Josh, I was just enjoying the chat on the picket fence! :-)

Anyway, the workload is mixed (reads, writes) with simple to medium queries. The workload is known to scale well. But in order to provide substantial input I am still trying to eliminate things that can bottleneck. Currently I have eliminated CPU (plenty free), RAM (memory is 48GB RAM in this server for a 32-bit postgresql instance), and IO storage (used the free RAM for a /tmp database to eliminate IO), and am still trying to eliminate any network bottlenecks so we can say for sure we have a problem in PostgreSQL. But yes, until that final thing is confirmed (and the network can very well be the case), it could be a problem somewhere else. However, the thing that worries me is more the big drop instead of remaining constant out there.

Anyway, more on this within a day or two, once I add more network NICs between the systems to eliminate network problems (even though stats don't show them as problems right now) and also reduce malloc lock penalties if any.

As for other questions: max_locks_per_transaction is set to the default (10, I believe); increasing it still seems to degrade the overall throughput number. max_connections is set to 1500 for now, till I get decent scaling to 1400-1500 users. There are no hard failures reported anywhere. Log min durations does show that queries are now slowing down and taking longer. OS is not swapping, and I also eliminated IO by putting the whole database on /tmp.

So while I finish adding more network connections between the two systems (need to get cards), do enjoy the following URL :-)

http://www.spec.org/jAppServer2004/results/res2007q3/jAppServer2004-20070703-00073.html

Of course all postgresql.conf settings still remain from the old test, so no flames on that one again :-)

Regards,
Jignesh

Josh Berkus wrote:
Tom,
Well, if the load is a lot of short writing transactions then you'd expect the throughput to depend on how fast stuff can be pushed down to WAL. What have you got wal_buffers set to? Are you using a commit delay? What's the I/O system anyway (any BB write cache on the WAL disk?) and what wal sync method are you using?
You know, I think Jignesh needs to be on this list so I can stop relaying questions on a workload I didn't design. Let me get him.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] User concurrency thresholding: where do I look?
Awww Josh, I was just enjoying the chat on the picket fence! :-)

Anyway, the workload is mixed (reads, writes) with simple to medium queries. The workload is known to scale well. But in order to provide substantial input I am still trying to eliminate things that can bottleneck. Currently I have eliminated CPU (plenty free), RAM (48GB in this server for a 32-bit PostgreSQL instance), and IO/storage (used the free RAM for a /tmp database to eliminate IO), and am still trying to eliminate any network bottlenecks so I can say for sure we have a problem in PostgreSQL. But yes, until that final thing is confirmed (and the network can very well be the case), it could be a problem somewhere else. However, what worries me is the big drop, instead of throughput remaining constant out there.

Anyway, more on this within a day or two, once I add more network NICs between the systems to eliminate network problems (even though stats don't show them as problems right now) and also reduce malloc lock penalties, if any.

As for the other questions: max_locks_per_transaction is set to the default (64, I believe); increasing it still seems to degrade the overall throughput number. max_connections is set to 1500 for now, until I get decent scaling up to 1400-1500 users. There are no hard failures reported anywhere. log_min_duration_statement does show that queries are now slowing down and taking longer. The OS is not swapping, and IO is also eliminated by putting the whole database on /tmp.

So while I finish adding more network connections between the two systems (need to get cards), do enjoy the following URL :-)

http://www.spec.org/jAppServer2004/results/res2007q3/jAppServer2004-20070703-00073.html

Of course, postgresql.conf still remains from the old test, so no flames on that one again :-)

Regards,
Jignesh

Josh Berkus wrote:
> Tom,
>> Well, if the load is a lot of short writing transactions then you'd
>> expect the throughput to depend on how fast stuff can be pushed down to
>> WAL. What have you got wal_buffers set to? Are you using a commit
>> delay? What's the I/O system anyway (any BB write cache on the WAL
>> disk?) and what wal sync method are you using?
> You know, I think Jignesh needs to be on this list so I can stop relaying
> questions on a workload I didn't design. Let me get him.

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] User concurrency thresholding: where do I look?
Greg,

> There's so much going on with a TPC-C kind of workload. Has anyone ever
> looked into quantifying scaling for more fundamental operations? There are
> so many places a complicated workload could get caught up that starting
> there is hard. I've found it's helpful to see the breaking points for
> simpler operations, then compare how things change as each new transaction
> element is introduced.

... eagerly awaiting Michael Doilson's PgUnitTest

--Josh

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] User concurrency thresholding: where do I look?
On Thu, 19 Jul 2007, Josh Berkus wrote:

> It's a TPCC-like workload, so heavy single-row updates, and the
> updates/inserts are what's being measured.

There's so much going on with a TPC-C kind of workload. Has anyone ever looked into quantifying scaling for more fundamental operations? There are so many places a complicated workload could get caught up that starting there is hard. I've found it's helpful to see the breaking points for simpler operations, then compare how things change as each new transaction element is introduced.

As an example, take a look at the MySQL SysBench tool: http://sysbench.sourceforge.net/docs/ Specifically their "oltp" tests. Note how you can get a handle on how simple selects scale, then simple inserts, then updates, and so on. The only thing I've thought of that they missed is testing a trivial operation that doesn't even touch the buffer cache ('SELECT 1'?), which would let you quantify general connection-scaling issues. It seems to me that you could narrow the list of possible causes here much more quickly if you had a good handle on the upper concurrency of lower-level operations.

[Note: it's possible to run SysBench against a PG database, but the code is very immature. Last time I tried there were plenty of crashes, and there seemed to be some transaction-wrapping issues that caused deadlocks with some tests.]

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)--- TIP 6: explain analyze is your friend
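[Editor's sketch] Greg's "trivial operation" idea, sweeping concurrency levels to find where a connection-scaling tipping point appears, can be sketched as a small harness. This is illustrative only: it uses in-memory SQLite as a stand-in for a real server, and Python's GIL limits true parallelism, so the shape of the harness is the point, not the absolute numbers.

```python
import sqlite3
import threading
import time

def measure_throughput(n_workers, duration=0.2):
    """Run a trivial query in n_workers threads; return total queries/sec."""
    counts = [0] * n_workers
    stop = time.monotonic() + duration

    def worker(i):
        # Each worker gets its own connection, as each benchmark user would.
        conn = sqlite3.connect(":memory:")
        while time.monotonic() < stop:
            conn.execute("SELECT 1").fetchone()
            counts[i] += 1
        conn.close()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(counts) / duration

# Sweep concurrency levels and watch where throughput stops scaling.
results = {n: measure_throughput(n) for n in (1, 2, 4, 8)}
```

Against a real server you would point each worker at its own database connection and extend the sweep toward the 1000-connection range where the drop was observed.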
Re: [PERFORM] User concurrency thresholding: where do I look?
"Tom Lane" <[EMAIL PROTECTED]> writes: > Josh Berkus <[EMAIL PROTECTED]> writes: > >> That's an interesting thought. Let me check lock counts and see if this is >> possibly the case. > > AFAIK you'd get hard failures, not slowdowns, if you ran out of lock > space entirely I assume you've checked the server logs and are sure that you aren't in fact getting errors. I could, for example, envision a situation where a fraction of the transactions are getting some error and those transactions are therefore not being counted against the txn/s result. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] User concurrency thresholding: where do I look?
Tom,

> Well, if the load is a lot of short writing transactions then you'd
> expect the throughput to depend on how fast stuff can be pushed down to
> WAL. What have you got wal_buffers set to? Are you using a commit
> delay? What's the I/O system anyway (any BB write cache on the WAL
> disk?) and what wal sync method are you using?

You know, I think Jignesh needs to be on this list so I can stop relaying questions on a workload I didn't design. Let me get him.

--
Josh Berkus PostgreSQL @ Sun San Francisco

---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] User concurrency thresholding: where do I look?
Josh Berkus <[EMAIL PROTECTED]> writes: >> Josh, what sort of workload is being tested here --- read-mostly, >> write-mostly, a mixture? > It's a TPCC-like workload, so heavy single-row updates, and the > updates/inserts are what's being measured. For that matter, when I've seen > this before it was with heavy-write workloads and we were measuring the > number of updates/inserts and not the number of reads. Well, if the load is a lot of short writing transactions then you'd expect the throughput to depend on how fast stuff can be pushed down to WAL. What have you got wal_buffers set to? Are you using a commit delay? What's the I/O system anyway (any BB write cache on the WAL disk?) and what wal sync method are you using? While I'm asking questions, exactly what were the data columns you presented? Txn/User doesn't make much sense to me, and I'm not sure what "response time" you were measuring either. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] User concurrency thresholding: where do I look?
Tom, all: > Also, have you tried watching vmstat or local equivalent to confirm that > the machine's not starting to swap? We're not swapping. > Josh, what sort of workload is being tested here --- read-mostly, > write-mostly, a mixture? It's a TPCC-like workload, so heavy single-row updates, and the updates/inserts are what's being measured. For that matter, when I've seen this before it was with heavy-write workloads and we were measuring the number of updates/inserts and not the number of reads. -- Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] User concurrency thresholding: where do I look?
On Thu, 19 Jul 2007, Josh Berkus wrote: What seems to be happening is that at some specific number of connections average throughput drops 30% and response time quadruples or worse. Could you characterize what each connection is doing and how you're generating the load? I don't know how productive speculating about the cause here will be until there's a test script available so other people can see where the tipping point is on their system. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] User concurrency thresholding: where do I look?
Alvaro Herrera wrote:
> So maybe this does not make much sense in normal operation, thus not
> applicable to what Josh Berkus is reporting. However I was talking to Josh
> Drake yesterday and he told me that pg_dump was spending some significant
> amount of time in LOCK TABLE when there are lots of tables (say 300k).

Fewer; 128k.

Joshua D. Drake

---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] User concurrency thresholding: where do I look?
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> AFAIK you'd get hard failures, not slowdowns, if you ran out of lock >> space entirely; > Well, if there still is shared memory available, the lock hash can > continue to grow, but it would slow down according to this comment in > ShmemInitHash: Right, but there's not an enormous amount of headroom in shared memory beyond the intended size of the hash tables. I'd think that you'd start seeing hard failures not very far beyond the point at which performance impacts became visible. Of course this is all speculation; I quite agree with varying the table-size parameters to see if it makes a difference. Josh, what sort of workload is being tested here --- read-mostly, write-mostly, a mixture? > However I was talking to Josh Drake yesterday and he told me that > pg_dump was spending some significant amount of time in LOCK TABLE when > there are lots of tables (say 300k). I wouldn't be too surprised if there's some O(N^2) effects when a single transaction holds that many locks, because of the linked-list proclock data structures. This would not be relevant to Josh's case though. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
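[Editor's sketch] Tom's O(N^2) speculation can be illustrated with a toy model. Assuming, purely for illustration and not as a description of PostgreSQL's actual code, that releasing each lock scans a linked list of the locks still held, the total work across all releases grows quadratically with the number of locks a single transaction holds:

```python
def total_scan_steps(n_locks):
    """If releasing each lock scans the remaining linked list of held locks,
    total work across all releases is n + (n-1) + ... + 1 = n*(n+1)/2."""
    return n_locks * (n_locks + 1) // 2

# A pg_dump-style transaction holding 300k table locks vs. a typical
# OLTP transaction holding ~10 locks:
print(total_scan_steps(300_000))  # 45000150000 list steps
print(total_scan_steps(10))       # 55 list steps
```

The ten-orders-of-magnitude gap shows why an effect invisible at OLTP lock counts could dominate a 300k-table pg_dump, while remaining irrelevant to the concurrency drop in this thread.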
Re: [PERFORM] User concurrency thresholding: where do I look?
Tom Lane wrote:
> Josh Berkus <[EMAIL PROTECTED]> writes:
>> Alvaro,
>>> Have you messed with max_connections and/or max_locks_per_transaction
>>> while testing this? The lock table is sized to max_locks_per_xact times
>>> max_connections, and shared memory hash tables get slower when they are
>>> full. Of course, the saturation point would depend on the avg number of
>>> locks acquired per user, which would explain why you are seeing a lower
>>> number for some users and higher for others (simpler/more complex
>>> queries).
>> That's an interesting thought. Let me check lock counts and see if this is
>> possibly the case.
> AFAIK you'd get hard failures, not slowdowns, if you ran out of lock
> space entirely;

Well, if there still is shared memory available, the lock hash can continue to grow, but it would slow down according to this comment in ShmemInitHash:

 * max_size is the estimated maximum number of hashtable entries. This is
 * not a hard limit, but the access efficiency will degrade if it is
 * exceeded substantially (since it's used to compute directory size and
 * the hash table buckets will get overfull).

For the lock hash tables this max_size is (MaxBackends + max_prepared_xacts) * max_locks_per_xact.

So maybe this does not make much sense in normal operation, thus not applicable to what Josh Berkus is reporting. However I was talking to Josh Drake yesterday and he told me that pg_dump was spending some significant amount of time in LOCK TABLE when there are lots of tables (say 300k).

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
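[Editor's sketch] The sizing formula Alvaro quotes can be plugged into a quick calculation, assuming MaxBackends is roughly max_connections and using the settings reported in this thread:

```python
def lock_table_capacity(max_connections, max_locks_per_xact, max_prepared_xacts=0):
    """Estimated lock hash table size, per the ShmemInitHash comment quoted
    above: (MaxBackends + max_prepared_xacts) * max_locks_per_xact.
    It is a soft limit: exceeding it overfills hash buckets and degrades
    access efficiency rather than failing outright."""
    return (max_connections + max_prepared_xacts) * max_locks_per_xact

# Settings from this thread: max_connections = 1500, default
# max_locks_per_transaction = 64 (max_prepared_transactions assumed 0 here).
print(lock_table_capacity(1500, 64))  # 96000 lock slots
```

With 1100 active users that budget allows roughly 87 concurrently held locks per backend before the table is nominally full, which is why the avg-locks-per-user question matters.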
Re: [PERFORM] User concurrency thresholding: where do I look?
Josh Berkus <[EMAIL PROTECTED]> writes: > Alvaro, >> Have you messed with max_connections and/or max_locks_per_transaction >> while testing this? The lock table is sized to max_locks_per_xact times >> max_connections, and shared memory hash tables get slower when they are >> full. Of course, the saturation point would depend on the avg number of >> locks acquired per user, which would explain why you are seeing a lower >> number for some users and higher for others (simpler/more complex >> queries). > That's an interesting thought. Let me check lock counts and see if this is > possibly the case. AFAIK you'd get hard failures, not slowdowns, if you ran out of lock space entirely; and the fact that you can continue the curve upwards says that you're not on the edge of running out. However I agree that it's worth experimenting with those two parameters to see if the curve moves around at all. Another resource that might be interesting is the number of open files. Also, have you tried watching vmstat or local equivalent to confirm that the machine's not starting to swap? regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] User concurrency thresholding: where do I look?
Alvaro, > Have you messed with max_connections and/or max_locks_per_transaction > while testing this? The lock table is sized to max_locks_per_xact times > max_connections, and shared memory hash tables get slower when they are > full. Of course, the saturation point would depend on the avg number of > locks acquired per user, which would explain why you are seeing a lower > number for some users and higher for others (simpler/more complex > queries). That's an interesting thought. Let me check lock counts and see if this is possibly the case. -- Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] User concurrency thresholding: where do I look?
Josh Berkus wrote:
> Folks,
>
> I've run into this a number of times with various PostgreSQL users, so we
> tested it at Sun. What seems to be happening is that at some specific number
> of connections average throughput drops 30% and response time quadruples or
> worse. The amount seems to vary per machine; I've seen it as variously 95,
> 1050, 1700 or 2800 connections. Tinkering with postgresql.conf parameters
> doesn't seem to affect this threshold.
>
> As an example of this behavior:
>
> Users  Txn/User  Resp. Time
>   50   105.38    0.01
>  100   113.05    0.01
>  150   114.05    0.01
>  200   113.51    0.01
>  250   113.38    0.01
>  300   112.14    0.01
>  350   112.26    0.01
>  400   111.43    0.01
>  450   110.72    0.01
>  500   110.44    0.01
>  550   109.36    0.01
>  600   107.01    0.02
>  650   105.71    0.02
>  700   106.95    0.02
>  750   107.69    0.02
>  800   106.78    0.02
>  850   108.59    0.02
>  900   106.03    0.02
>  950   106.13    0.02
> 1000    64.58    0.15
> 1050    52.32    0.23
> 1100    49.79    0.25
>
> Tinkering with shared_buffers has had no effect on this thresholding (the
> above was with 3GB to 6GB of shared_buffers). Any ideas on where we should
> look for the source of the bottleneck?

Have you messed with max_connections and/or max_locks_per_transaction while testing this? The lock table is sized to max_locks_per_xact times max_connections, and shared memory hash tables get slower when they are full. Of course, the saturation point would depend on the avg number of locks acquired per user, which would explain why you are seeing a lower number for some users and higher for others (simpler/more complex queries).

This is just a guess though. No profiling or measuring at all, really.

--
Alvaro Herrera http://www.amazon.com/gp/registry/5ZYLFMCVHXC
"How amazing is that? I call it a night and come back to find that a bug has been identified and patched while I sleep." (Robert Davidson) http://archives.postgresql.org/pgsql-sql/2006-03/msg00378.php

---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
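[Editor's sketch] The cliff Josh describes can be quantified directly from his table. A small sketch that finds the first concurrency level where per-user throughput falls more than 30% (his figure) below the best level seen so far:

```python
# (users, txn_per_user) pairs from the table above (response times omitted).
data = [
    (50, 105.38), (100, 113.05), (150, 114.05), (200, 113.51), (250, 113.38),
    (300, 112.14), (350, 112.26), (400, 111.43), (450, 110.72), (500, 110.44),
    (550, 109.36), (600, 107.01), (650, 105.71), (700, 106.95), (750, 107.69),
    (800, 106.78), (850, 108.59), (900, 106.03), (950, 106.13), (1000, 64.58),
    (1050, 52.32), (1100, 49.79),
]

def find_knee(points, drop=0.30):
    """Return the first load level whose throughput is more than `drop`
    below the best throughput seen at any lower load level."""
    best = points[0][1]
    for users, tput in points[1:]:
        if tput < best * (1 - drop):
            return users
        best = max(best, tput)
    return None

print(find_knee(data))  # 1000: 64.58 is ~43% below the 114.05 peak
```

The data is essentially flat from 50 through 950 users, then collapses in one step at 1000, which is what makes a hard resource threshold (rather than gradual contention) the natural suspect.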
Re: [PERFORM] User concurrency thresholding: where do I look?
Josh Berkus wrote:
> Folks,
>
>  650   105.71   0.02
>  700   106.95   0.02
>  750   107.69   0.02
>  800   106.78   0.02
>  850   108.59   0.02
>  900   106.03   0.02
>  950   106.13   0.02
> 1000    64.58   0.15
> 1050    52.32   0.23
> 1100    49.79   0.25
>
> Tinkering with shared_buffers has had no effect on this thresholding (the
> above was with 3GB to 6GB of shared_buffers). Any ideas on where we should
> look for the source of the bottleneck?

I have seen this as well. I always chalked it up to PG having to manage so many connections, but there are some interesting pieces of evidence to review.

The amount of memory each connection takes up: consider 4-11 MB per connection, depending on various things like the number of prepared queries.

The number of CPUs: obviously 500 connections over 4 CPUs isn't the same as 500 connections over 8 CPUs.

That number of connections generally means a higher velocity, and a higher velocity means more checkpoint segments. Wrong settings for your checkpoint segments, bgwriter and checkpoints will cause you to start falling down.

I would also note that our experience is that PG falls down a little higher, more toward 2500 connections last time I checked, but this was likely on different hardware.

Joshua D. Drake

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
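[Editor's sketch] Joshua's rough 4-11 MB per-connection figure, combined with the max_connections = 1500 reported earlier in the thread, gives a back-of-envelope view of the aggregate backend memory footprint (shared_buffers excluded):

```python
def connection_memory_gb(n_connections, mb_low=4, mb_high=11):
    """Rough total backend memory range in GB, using the 4-11 MB/connection
    figure from the message above (excludes shared_buffers)."""
    return (n_connections * mb_low / 1024, n_connections * mb_high / 1024)

low, high = connection_memory_gb(1500)
print(f"{low:.1f} - {high:.1f} GB")  # roughly 5.9 - 16.1 GB
```

On the 48GB test box this is survivable, but the upper end plus 6GB of shared_buffers consumes nearly half of RAM, which is one reason per-connection overhead is worth checking before blaming lock contention alone.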