Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
Added to TODO:

* Improve performance of shared invalidation queue for multiple CPUs
  http://archives.postgresql.org/pgsql-performance/2008-01/msg00023.php

--
Bruce Momjian <[EMAIL PROTECTED]>  http://momjian.us
EnterpriseDB                       http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
On Fri, 25 Jan 2008, Simon Riggs wrote:
> 1. Try to avoid having all the backends hit the queue at once. Instead
> of SIGUSR1'ing everybody at the same time, maybe hit only the process
> with the oldest message pointer, and have him hit the next oldest after
> he's done reading the queue.

My feeling was that an "obvious" way to deal with this is to implement
some sort of "random early detect". That is, randomly SIGUSR1 processes
as entries are added to the queue. The signals should become more
frequent as the queue length increases, until it reaches the current
cut-off of signalling everyone when the queue really is full. The hope
would be that that would never happen.

Matthew
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
On Mon, 2008-01-07 at 19:54 -0500, Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > Perhaps it would make sense to try to take the "fast path" in
> > SIDelExpiredDataEntries with only a shared lock rather than exclusive.
>
> I think the real problem here is that sinval catchup processing is well
> designed to create contention :-(.

Thinking some more about handling TRUNCATEs...

> Some ideas for improving matters:
>
> 1. Try to avoid having all the backends hit the queue at once. Instead
> of SIGUSR1'ing everybody at the same time, maybe hit only the process
> with the oldest message pointer, and have him hit the next oldest after
> he's done reading the queue.
>
> 2. Try to take more than one message off the queue per SInvalLock cycle.
> (There is a tuning tradeoff here, since it would mean holding the lock
> for longer at a time.)
>
> 3. Try to avoid having every backend run SIDelExpiredDataEntries every
> time through ReceiveSharedInvalidMessages. It's not critical to delete
> entries until the queue starts getting full --- maybe we could rejigger
> the logic so it only happens once when somebody notices the queue is
> getting full, or so that only the guy(s) who had nextMsgNum == minMsgNum
> do it, or something like that?

(2) is unnecessary if we can reduce the number of Exclusive lockers so
that repeated access to the backend's messages is not contended. (1)
would do this, but seems like it would be complex. We can reduce the
possibility of multiple re-signals though.

(3) seems like the easiest route, as long as we get a reasonable
algorithm for reducing the access rate to a reasonable level.

I'm posting a patch for discussion to -patches now that will do this.
It seems straightforward enough to include in 8.3; that may raise a few
eyebrows, but read the patch first.

--
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
> Okay, for a table of just a few entries I agree that DELETE is
> probably better. But don't forget you're going to need to have those
> tables vacuumed fairly regularly now, else they'll start to bloat.

I think we'll go with DELETE also for another reason: just after we
figured out the cause of the spikes we started to investigate a
long-term issue we had with PostgreSQL: pg_dump of a big database was
blocking some of our applications. And yes, we replaced TRUNCATE with
DELETE and everything is running as expected.

Looking at the docs now I see there is a new paragraph in the 8.3 docs
mentioning that TRUNCATE is not MVCC-safe, and also the blocking issue.
It's a pity that the warning wasn't there in 7.1 times :-)

Thanks,

Kuba
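For illustration, the switch being described amounts to something like
the following, with a hypothetical one-row table name; the explicit
VACUUM covers the point quoted above about the switched tables now
needing regular cleanup (autovacuum can do the same job):

    -- hypothetical one-row state table; previously: TRUNCATE TABLE app_state;
    -- DELETE avoids TRUNCATE's ACCESS EXCLUSIVE lock (which blocks pg_dump)
    -- and is MVCC-safe, but it leaves dead rows behind that need vacuuming.
    DELETE FROM app_state;
    VACUUM app_state;   -- or rely on autovacuum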
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
Jakub Ouhrabka <[EMAIL PROTECTED]> writes:
>>> Huh. One transaction truncating a dozen tables? That would match the
>>> sinval trace all right ...

> It should be 4 tables - the shown log looks like there were more truncates?

Actually, counting up the entries, there are close to 2 dozen relations
apparently being truncated in the trace you showed. But that might be
only four tables at the user level, since each index on these tables
would appear separately, and you might have a toast table plus index for
each one too.

If you want to dig down, the table OIDs are visible in the trace, in the
messages with type -1:

>> LOG: sending inval msg -1 0 30036 0 30700 3218341912
                               ^       ^
                               DBOID   RELOID

so you could look into pg_class to confirm what's what.

> Yes, performance was the initial reason to use truncate instead of
> delete many years ago. But today the truncated tables usually contain
> exactly one row - quick measurements now show that it's faster to issue
> delete instead of truncate in this case.

Okay, for a table of just a few entries I agree that DELETE is probably
better. But don't forget you're going to need to have those tables
vacuumed fairly regularly now, else they'll start to bloat.

regards, tom lane
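For reference, the pg_class lookup suggested above is just a query run
while connected to the database whose OID (DBOID) appears in the trace;
the RELOID used here is the one from the quoted log line:

    -- 30700 is the RELOID taken from the trace line above
    SELECT c.oid, c.relname, n.nspname, c.relkind
    FROM   pg_class c
           JOIN pg_namespace n ON n.oid = c.relnamespace
    WHERE  c.oid = 30700;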
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
Adrian Moisey <[EMAIL PROTECTED]> writes:
>> we've found it: TRUNCATE

> I haven't been following this thread. Can someone please explain to me
> why TRUNCATE causes these spikes?

It's not so much the TRUNCATE as the overhead of broadcasting the
resultant catalog changes to the many hundreds of (mostly idle) backends
he's got --- all of which respond by trying to lock the shared sinval
message queue at about the same time. You could see the whole thread as
an object lesson in why connection pooling is a good idea. But certainly
it seems that sinval is the next bottleneck in terms of being able to
scale Postgres up to very large numbers of backends.

regards, tom lane
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
> Huh. One transaction truncating a dozen tables? That would match the
> sinval trace all right ...

It should be 4 tables - the shown log looks like there were more truncates?

> You might be throwing the baby out with the bathwater,
> performance-wise.

Yes, performance was the initial reason to use truncate instead of
delete many years ago. But today the truncated tables usually contain
exactly one row - quick measurements now show that it's faster to issue
delete instead of truncate in this case.

Again, many thanks for your invaluable advice!

Kuba
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
Jakub Ouhrabka <[EMAIL PROTECTED]> writes:
> we've found it: TRUNCATE

Huh. One transaction truncating a dozen tables? That would match the
sinval trace all right ...

> One more question: is it ok to do mass regexp update of pg_proc.prosrc
> changing TRUNCATEs to DELETEs?

You might be throwing the baby out with the bathwater, performance-wise.
Mass DELETEs will require cleanup by VACUUM, and that'll likely eat more
cycles and I/O than you save. I'd think in terms of trying to spread out
the TRUNCATEs or check to see if you really need one (maybe the table's
already empty), rather than this.

I do plan to look at the sinval code once 8.3 is out the door, so
another possibility if you can wait a few weeks/months is to leave your
app alone and see if the eventual patch looks sane to back-patch. (I
don't think the community would consider doing so, but you could run a
locally modified Postgres with it.)

regards, tom lane
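A sketch of the "check whether you really need one" idea, assuming the
cleanup happens in a plpgsql function and using a hypothetical table
name: when the table is already empty, no TRUNCATE is issued, so no
pg_class update and no sinval broadcast happens.

    CREATE OR REPLACE FUNCTION truncate_if_nonempty() RETURNS void AS $$
    BEGIN
        -- only truncate when there is actually something to remove
        PERFORM 1 FROM work_queue LIMIT 1;
        IF FOUND THEN
            TRUNCATE TABLE work_queue;
        END IF;
    END;
    $$ LANGUAGE plpgsql;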
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
Hi

>> I can think of three things that might be producing this:

> we've found it: TRUNCATE

I haven't been following this thread. Can someone please explain to me
why TRUNCATE causes these spikes?

--
Adrian Moisey
System Administrator | CareerJunction | Your Future Starts Here.
Web: www.careerjunction.co.za | Email: [EMAIL PROTECTED]
Phone: +27 21 686 6820 | Mobile: +27 82 858 7830 | Fax: +27 21 686 6842
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
Hi Tom,

> I can think of three things that might be producing this:

we've found it: TRUNCATE

We'll try to eliminate the use of TRUNCATE and the periodical spikes
should go away. There will still be a possibility of spikes because of
database creation etc - we'll try to handle this by issuing trivial
commands from idle backends, and in the longer run by decreasing the
number of backends/databases running. This is the way to go, right?

One more question: is it ok to do mass regexp update of pg_proc.prosrc
changing TRUNCATEs to DELETEs? Anything we should be aware of? Sure,
we'll do our own testing, but in case you see any obvious downsides...

Many thanks for your guidance!

Kuba
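For discussion only, the mass edit being asked about could look roughly
like this; direct updates of pg_proc are unsupported, the pattern only
catches one exact spelling (and would mangle a TRUNCATE of several
tables at once), so the matched functions should be reviewed and backed
up before committing. All names and patterns here are illustrative.

    BEGIN;
    -- review what would be touched first
    SELECT proname FROM pg_proc WHERE prosrc ILIKE '%truncate table %';
    -- then rewrite the function bodies
    UPDATE pg_proc
    SET    prosrc = regexp_replace(prosrc, 'TRUNCATE TABLE ', 'DELETE FROM ', 'gi')
    WHERE  prosrc ILIKE '%truncate table %';
    COMMIT;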
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
Jakub Ouhrabka <[EMAIL PROTECTED]> writes:
> What does it mean?

Look at src/include/storage/sinval.h and src/include/utils/syscache.h.
What you seem to have here is a bunch of tuple updates in pg_class
(invalidating caches 29 and 30, which in 8.2 correspond to RELNAMENSP
and RELOID), followed by a bunch of SharedInvalRelcacheMsg and
SharedInvalSmgrMsg.

What I find interesting is that the hits are coming against
nearly-successive tuple CTIDs in pg_class, eg these are all on pages 25
and 26 of pg_class:

> LOG: sending inval msg 30 0 25 45 30036 4294936595
> LOG: sending inval msg 29 0 25 45 30036 2019111801
> LOG: sending inval msg 30 0 26 11 30036 4294936595
> LOG: sending inval msg 29 0 26 11 30036 2019111801
> LOG: sending inval msg 30 0 25 44 30036 4294936597
> LOG: sending inval msg 29 0 25 44 30036 3703878920
> LOG: sending inval msg 30 0 26 10 30036 4294936597
> LOG: sending inval msg 29 0 26 10 30036 3703878920
> LOG: sending inval msg 30 0 26 9 30036 4294936616
> LOG: sending inval msg 29 0 26 9 30036 3527122063
> LOG: sending inval msg 30 0 25 43 30036 4294936616
> LOG: sending inval msg 29 0 25 43 30036 3527122063

The ordering is a little strange --- not sure what's producing that.

I can think of three things that might be producing this:

1. DDL operations ... but most sorts of DDL on a table would touch more
catalogs than just pg_class, so this is a bit hard to credit.

2. VACUUM.

3. Some sort of direct update of pg_class.

The fact that we have a bunch of catcache invals followed by
relcache/smgr invals says that this all happened in one transaction,
else they'd have been intermixed better. That lets VACUUM off the hook,
because it processes each table in a separate transaction.

I am wondering if maybe your app does one of those sneaky things like
fooling with pg_class.reltriggers. If so, the problem might be soluble
by just avoiding unnecessary updates.

regards, tom lane
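The sort of "sneaky" direct pg_class update being alluded to is, for
example, the old pre-8.1 trick of disabling triggers during a bulk
operation by editing reltriggers; each such UPDATE touches the table's
pg_class row and broadcasts catcache/relcache invalidations, which
matches the traffic above. The table name here is hypothetical.

    -- disable triggers on a hypothetical table by zeroing its trigger count
    UPDATE pg_class SET reltriggers = 0 WHERE relname = 'work_queue';
    -- ... bulk operation here ...
    -- re-enable by restoring the real trigger count
    UPDATE pg_class
    SET    reltriggers = (SELECT count(*) FROM pg_trigger
                          WHERE tgrelid = 'work_queue'::regclass)
    WHERE  relname = 'work_queue';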
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
Hi Tom, > Strange. The best idea that comes to mind is to add some debugging > code to SendSharedInvalidMessage to log the content of each message > that's sent out. That would at least tell us *what* is going into > the queue, even if not directly *why*. we've patched postgresql and run one of our plpgsql complex procedures. There are many of sinval messages - log output is below. What does it mean? Thanks, Kuba LOG: sending inval msg 30 0 26 13 30036 4294936593 LOG: sending inval msg 29 0 26 13 30036 337030170 LOG: sending inval msg 30 0 25 46 30036 4294936593 LOG: sending inval msg 29 0 25 46 30036 337030170 LOG: sending inval msg 30 0 26 13 30036 4294936593 LOG: sending inval msg 29 0 26 13 30036 337030170 LOG: sending inval msg 30 0 25 45 30036 4294936595 LOG: sending inval msg 29 0 25 45 30036 2019111801 LOG: sending inval msg 30 0 26 11 30036 4294936595 LOG: sending inval msg 29 0 26 11 30036 2019111801 LOG: sending inval msg 30 0 25 44 30036 4294936597 LOG: sending inval msg 29 0 25 44 30036 3703878920 LOG: sending inval msg 30 0 26 10 30036 4294936597 LOG: sending inval msg 29 0 26 10 30036 3703878920 LOG: sending inval msg 30 0 26 9 30036 4294936616 LOG: sending inval msg 29 0 26 9 30036 3527122063 LOG: sending inval msg 30 0 25 43 30036 4294936616 LOG: sending inval msg 29 0 25 43 30036 3527122063 LOG: sending inval msg 30 0 26 9 30036 4294936616 LOG: sending inval msg 29 0 26 9 30036 3527122063 LOG: sending inval msg 30 0 25 41 30036 4294936618 LOG: sending inval msg 29 0 25 41 30036 2126866956 LOG: sending inval msg 30 0 26 7 30036 4294936618 LOG: sending inval msg 29 0 26 7 30036 2126866956 LOG: sending inval msg 30 0 25 40 30036 4294936620 LOG: sending inval msg 29 0 25 40 30036 1941919314 LOG: sending inval msg 30 0 26 5 30036 4294936620 LOG: sending inval msg 29 0 26 5 30036 1941919314 LOG: sending inval msg 30 0 26 4 30036 4294936633 LOG: sending inval msg 29 0 26 4 30036 544523647 LOG: sending inval msg 30 0 25 39 30036 4294936633 LOG: sending inval msg 29 0 25 39 30036 544523647 LOG: sending inval msg 30 0 26 4 30036 4294936633 LOG: sending inval msg 29 0 26 4 30036 544523647 LOG: sending inval msg 30 0 25 38 30036 4294936635 LOG: sending inval msg 29 0 25 38 30036 2557582018 LOG: sending inval msg 30 0 26 3 30036 4294936635 LOG: sending inval msg 29 0 26 3 30036 2557582018 LOG: sending inval msg 30 0 25 37 30036 4294936637 LOG: sending inval msg 29 0 25 37 30036 2207280630 LOG: sending inval msg 30 0 26 2 30036 4294936637 LOG: sending inval msg 29 0 26 2 30036 2207280630 LOG: sending inval msg 30 0 26 1 30036 4294936669 LOG: sending inval msg 29 0 26 1 30036 1310188568 LOG: sending inval msg 30 0 25 36 30036 4294936669 LOG: sending inval msg 29 0 25 36 30036 1310188568 LOG: sending inval msg 30 0 26 1 30036 4294936669 LOG: sending inval msg 29 0 26 1 30036 1310188568 LOG: sending inval msg 30 0 25 35 30036 4294936671 LOG: sending inval msg 29 0 25 35 30036 2633053415 LOG: sending inval msg 30 0 25 48 30036 4294936671 LOG: sending inval msg 29 0 25 48 30036 2633053415 LOG: sending inval msg 30 0 25 33 30036 4294936673 LOG: sending inval msg 29 0 25 33 30036 2049964857 LOG: sending inval msg 30 0 25 47 30036 4294936673 LOG: sending inval msg 29 0 25 47 30036 2049964857 LOG: sending inval msg -1 0 30036 0 30700 3218341912 LOG: sending inval msg -2 2084 1663 0 30036 50335 LOG: sending inval msg -2 0 1663 0 30036 50336 LOG: sending inval msg -1 2075 30036 0 30702 30036 LOG: sending inval msg -2 0 1663 0 30036 50324 LOG: sending inval msg -1 0 30036 0 30702 30036 LOG: 
sending inval msg -2 0 1663 0 30036 50336 LOG: sending inval msg -2 0 1663 0 30036 50323 LOG: sending inval msg -1 0 30036 0 30700 30036 LOG: sending inval msg -2 0 1663 0 30036 50335 LOG: sending inval msg -2 0 1663 0 30036 50322 LOG: sending inval msg -1 0 30036 0 30698 30036 LOG: sending inval msg -2 0 1663 0 30036 50334 LOG: sending inval msg -1 0 30036 0 30677 3218341912 LOG: sending inval msg -2 2084 1663 0 30036 50332 LOG: sending inval msg -2 0 1663 0 30036 50333 LOG: sending inval msg -1 2075 30036 0 30679 30036 LOG: sending inval msg -2 0 1663 0 30036 50321 LOG: sending inval msg -1 0 30036 0 30679 30036 LOG: sending inval msg -2 0 1663 0 30036 50333 LOG: sending inval msg -2 0 1663 0 30036 50320 LOG: sending inval msg -1 0 30036 0 30677 30036 LOG: sending inval msg -2 0 1663 0 30036 50332 LOG: sending inval msg -2 0 1663 0 30036 50319 LOG: sending inval msg -1 0 30036 0 30675 30036 LOG: sending inval msg -2 0 1663 0 30036 50331 LOG: sending inval msg -1 0 30036 0 30660 3218341912 LOG: sending inval msg -2 2084 1663 0 30036 50329 LOG: sending inval msg -2 0 1663 0 30036 50330 LOG: sending inval msg -1 2075 30036 0 30662 30036 LOG: sending inval msg -2 0 1663 0 30036 50318 LOG: sending inval msg -1 0 30036 0 30662 30036 LOG: sending inval msg -2 0 1663 0 30036 50330 LOG: sending inval msg -2 0 1663 0 30036 50317 LOG: s
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
Jakub Ouhrabka <[EMAIL PROTECTED]> writes:
> We've tried hard to identify what's the cause of filling sinval-queue.
> We went through query logs as well as function bodies stored in the
> database. We were not able to find any DDL, temp table creations etc.

Strange. The best idea that comes to mind is to add some debugging
code to SendSharedInvalidMessage to log the content of each message
that's sent out. That would at least tell us *what* is going into
the queue, even if not directly *why*. Try something like (untested)

 void
 SendSharedInvalidMessage(SharedInvalidationMessage *msg)
 {
     bool        insertOK;

+    elog(LOG, "sending inval msg %d %u %u %u %u %u",
+         msg->cc.id,
+         msg->cc.tuplePtr.ip_blkid.bi_hi,
+         msg->cc.tuplePtr.ip_blkid.bi_lo,
+         msg->cc.tuplePtr.ip_posid,
+         msg->cc.dbId,
+         msg->cc.hashValue);
+
     LWLockAcquire(SInvalLock, LW_EXCLUSIVE);
     insertOK = SIInsertDataEntry(shmInvalBuffer, msg);
     LWLockRelease(SInvalLock);

regards, tom lane
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
Hi Tom,

> I doubt we'd risk destabilizing 8.3 at this point, for a problem that
> affects so few people; let alone back-patching into 8.2.

Understood.

> OK, that confirms the theory that it's sinval-queue contention.

We've tried hard to identify what's the cause of filling the sinval
queue. We went through query logs as well as function bodies stored in
the database. We were not able to find any DDL, temp table creations etc.

We did the following experiment: stop one of our clients, so that a
queue of events (aka rows in the db) for it to process started to build
up. Then we started the client again; it started processing the queue -
that means calling simple selects, updates and complex plpgsql
function(s). And at this moment the spike started, even though it
shouldn't have, given the usual periodicity. It's consistent. We are
pretty sure that this client is not doing any DDL...

What should we look for to find the cause?

Thanks for any hints,

Kuba
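One way to do that kind of search over stored function bodies, for
illustration (the pattern list is not exhaustive and is our own
assumption about what to look for):

    -- look for statements in stored function sources that would generate
    -- catalog changes and hence sinval traffic
    SELECT proname
    FROM   pg_proc
    WHERE  prosrc ~* '(truncate|create temp|create table|alter table|cluster)';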
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
Jakub Ouhrabka <[EMAIL PROTECTED]> writes:
> Yes, I can confirm that it's triggered by SIGUSR1 signals.

OK, that confirms the theory that it's sinval-queue contention.

> If I understand it correctly we have following choices now:
> 1) Use only 2 cores (out of 8 cores)
> 2) Lower the number of idle backends - at least force backends to do
> something at different times to eliminate spikes - is "select 1" enough
> to force processing the queue?

Yeah, if you could get your clients to issue trivial queries every few
seconds (not all at the same time) the spikes should go away. If you
don't want to change your clients, one possible amelioration is to
reduce the signaling threshold in SIInsertDataEntry --- instead of 70%
of MAXNUMMESSAGES, maybe signal at 10% or 20%. That would make the
spikes more frequent but smaller, which might help ... or not.

> 3) Is there any chance of this being fixed/improved in 8.3 or even 8.2?

I doubt we'd risk destabilizing 8.3 at this point, for a problem that
affects so few people; let alone back-patching into 8.2.

There are some other known performance problems in the sinval signaling
(for instance, that a queue overflow results in cache resets for all
backends, not only the slowest), so I think addressing all of them at
once would be the thing to do. That would be a large enough patch that
it would certainly need to go through beta testing before I'd want to
inflict it on the world... This discussion has raised the priority of
the problem in my mind, so I'm thinking it should be worked on in 8.4;
but it's too late for 8.3.

regards, tom lane
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
> You could check this theory
> out by strace'ing some of the idle backends and seeing if their
> activity spikes are triggered by receipt of SIGUSR1 signals.

Yes, I can confirm that it's triggered by SIGUSR1 signals.

If I understand it correctly, we have the following choices now:

1) Use only 2 cores (out of 8 cores).

2) Lower the number of idle backends - at least force backends to do
something at different times to eliminate spikes - is "select 1" enough
to force processing the queue?

3) Is there any chance of this being fixed/improved in 8.3 or even 8.2?
It's a (performance) bug from our point of view. I realize we're the
first who noticed it and it's not a typical use case to have so many
idle backends. But a large installation with connection pooling is
something similar, and that's not that uncommon - one active backend
doing DDL can then cause unexpected spikes during otherwise quiet
hours...

4) Sure, we'll try to reduce the number of DDL statements (which in fact
we're not sure where exactly they are coming from), but I guess it would
only lengthen the time between spikes, not make them any smoother.

Any other suggestions?

Thanks,

Kuba
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Perhaps it would make sense to try to take the "fast path" in
> SIDelExpiredDataEntries with only a shared lock rather than exclusive.

I think the real problem here is that sinval catchup processing is well
designed to create contention :-(. Once we've decided that the message
queue is getting too full, we SIGUSR1 all the backends at once (or as
fast as the postmaster can do it anyway), then they all go off and try
to touch the sinval queue. Backends that haven't awoken even once
since the last time will have to process the entire queue contents,
and they're all trying to do that at the same time. What's worse, they
take and release the SInvalLock once for each message they take off the
queue. This isn't so horrid for one-core machines (since each process
will monopolize the CPU for probably less than one timeslice while it's
catching up) but it's pretty obvious where all the contention is coming
from on an 8-core.

Some ideas for improving matters:

1. Try to avoid having all the backends hit the queue at once. Instead
of SIGUSR1'ing everybody at the same time, maybe hit only the process
with the oldest message pointer, and have him hit the next oldest after
he's done reading the queue.

2. Try to take more than one message off the queue per SInvalLock cycle.
(There is a tuning tradeoff here, since it would mean holding the lock
for longer at a time.)

3. Try to avoid having every backend run SIDelExpiredDataEntries every
time through ReceiveSharedInvalidMessages. It's not critical to delete
entries until the queue starts getting full --- maybe we could rejigger
the logic so it only happens once when somebody notices the queue is
getting full, or so that only the guy(s) who had nextMsgNum == minMsgNum
do it, or something like that?

regards, tom lane
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
Jakub Ouhrabka <[EMAIL PROTECTED]> writes:
>>> Does your app create and destroy a tremendous number of temp tables,
>>> or anything else in the way of frequent DDL commands?

> Hmm. I can't think of anything like this. Maybe there are a few
> backends which create temp tables, but not a tremendous number. I don't
> think our applications issue DDL statements either. Can LOCK TABLE IN
> EXCLUSIVE MODE cause this?

No.

I did some experimenting to see exactly how large the sinval message
buffer is in today's terms, and what I find is that about 22 cycles of

    create temp table foo (f1 int, f2 text);
    drop table foo;

is enough to force a CatchupInterrupt to a sleeping backend. This case
is a bit more complex than it appears since the text column forces the
temp table to have a toast table; but even with only fixed-width
columns, if you were creating one temp table a second that would be
plenty to explain once-a-minute-or-so CatchupInterrupt processing. And
if you've got a lot of backends that normally *don't* wake up that
often, then they'd be sitting and failing to absorb the sinval traffic
in any more timely fashion.

So I'm betting that we've got the source of the spike identified. You
could check this theory out by strace'ing some of the idle backends and
seeing if their activity spikes are triggered by receipt of SIGUSR1
signals.

regards, tom lane
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
Jakub Ouhrabka <[EMAIL PROTECTED]> writes:
> We've tried several times to get stacktrace from some of the running
> backends during spikes, we got always this:

> 0x2b005d00a9a9 in semop () from /lib/libc.so.6
> #0 0x2b005d00a9a9 in semop () from /lib/libc.so.6
> #1 0x0054fe53 in PGSemaphoreLock (sema=0x2b00a04e5090,
>    interruptOK=0 '\0') at pg_sema.c:411
> #2 0x00575d95 in LWLockAcquire (lockid=SInvalLock,
>    mode=LW_EXCLUSIVE) at lwlock.c:455
> #3 0x0056fbfe in ReceiveSharedInvalidMessages
>    (invalFunction=0x5e9a30 ,
>    resetFunction=0x5e9df0 ) at sinval.c:159
> #4 0x00463505 in StartTransactionCommand () at xact.c:1439
> #5 0x0056fa4b in ProcessCatchupEvent () at sinval.c:347
> #6 0x0056fb20 in CatchupInterruptHandler
>    (postgres_signal_arg=) at sinval.c:221

CatchupInterruptHandler, eh? That seems to let NOTIFY off the hook, and
instead points in the direction of sinval processing; which is to say,
propagation of changes to system catalogs. Does your app create and
destroy a tremendous number of temp tables, or anything else in the way
of frequent DDL commands?

regards, tom lane
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
Jakub Ouhrabka wrote:
> We've tried several times to get stacktrace from some of the running
> backends during spikes, we got always this:
>
> 0x2b005d00a9a9 in semop () from /lib/libc.so.6
> #0 0x2b005d00a9a9 in semop () from /lib/libc.so.6
> #1 0x0054fe53 in PGSemaphoreLock (sema=0x2b00a04e5090,
>    interruptOK=0 '\0') at pg_sema.c:411
> #2 0x00575d95 in LWLockAcquire (lockid=SInvalLock,
>    mode=LW_EXCLUSIVE) at lwlock.c:455
> #3 0x0056fbfe in ReceiveSharedInvalidMessages
>    (invalFunction=0x5e9a30 ,
>    resetFunction=0x5e9df0 ) at sinval.c:159
> #4 0x00463505 in StartTransactionCommand () at xact.c:1439
> #5 0x0056fa4b in ProcessCatchupEvent () at sinval.c:347
> #6 0x0056fb20 in CatchupInterruptHandler
>    (postgres_signal_arg=) at sinval.c:221
> #7 0x2b005cf6f110 in killpg () from /lib/libc.so.6
> #8 0x in ?? ()

Perhaps it would make sense to try to take the "fast path" in
SIDelExpiredDataEntries with only a shared lock rather than exclusive.

--
Alvaro Herrera                       http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
Hi Tom & all, >> It sounds a bit like momentary contention for a spinlock, >> but exactly what isn't clear. > ok, we're going to try oprofile, will let you know... yes, it seems like contention for spinlock if I'm intepreting oprofile correctly, around 60% of time during spikes is in s_lock. [for details see below]. We've tried several times to get stacktrace from some of the running backends during spikes, we got always this: 0x2b005d00a9a9 in semop () from /lib/libc.so.6 #0 0x2b005d00a9a9 in semop () from /lib/libc.so.6 #1 0x0054fe53 in PGSemaphoreLock (sema=0x2b00a04e5090, interruptOK=0 '\0') at pg_sema.c:411 #2 0x00575d95 in LWLockAcquire (lockid=SInvalLock, mode=LW_EXCLUSIVE) at lwlock.c:455 #3 0x0056fbfe in ReceiveSharedInvalidMessages (invalFunction=0x5e9a30 , resetFunction=0x5e9df0 ) at sinval.c:159 #4 0x00463505 in StartTransactionCommand () at xact.c:1439 #5 0x0056fa4b in ProcessCatchupEvent () at sinval.c:347 #6 0x0056fb20 in CatchupInterruptHandler (postgres_signal_arg=) at sinval.c:221 #7 0x2b005cf6f110 in killpg () from /lib/libc.so.6 #8 0x in ?? () Is this enough info to guess what's happening? What should we try next? Thanks, Kuba oprofile results: [I've shortened path from /usr/local/pg... to just pg for better readilibity] # opreport --long-filenames CPU: Core 2, speed 2666.76 MHz (estimated) Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 10 CPU_CLK_UNHALT...| samples| %| -- 125577 90.7584 pg-8.2.4/bin/postgres 3792 2.7406 /lib/libc-2.3.6.so 3220 2.3272 /usr/src/linux-2.6.22.15/vmlinux 2145 1.5503 /usr/bin/oprofiled 1540 1.1130 /xfs 521 0.3765 pg-8.2.4/lib/plpgsql.so 441 0.3187 /cciss 374 0.2703 /oprofile ... CPU: Core 2, speed 2666.76 MHz (estimated) Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 10 samples %app name symbol name 8535561.6887 pg-8.2.4/bin/postgres s_lock 9803 7.0849 pg-8.2.4/bin/postgres LWLockRelease 5535 4.0003 pg-8.2.4/bin/postgres LWLockAcquire 3792 2.7406 /lib/libc-2.3.6.so (no symbols) 3724 2.6915 pg-8.2.4/bin/postgres DropRelFileNodeBuffers 2145 1.5503 /usr/bin/oprofiled (no symbols) 2069 1.4953 pg-8.2.4/bin/postgres GetSnapshotData 1540 1.1130 /xfs (no symbols) 1246 0.9005 pg-8.2.4/bin/postgres hash_search_with_hash_value 1052 0.7603 pg-8.2.4/bin/postgres AllocSetAlloc 1015 0.7336 pg-8.2.4/bin/postgres heapgettup 879 0.6353 pg-8.2.4/bin/postgres hash_any 862 0.6230 /usr/src/linux-2.6.22.15/vmlinux mwait_idle 740 0.5348 pg-8.2.4/bin/postgres hash_seq_search 674 0.4871 pg-8.2.4/bin/postgres HeapTupleSatisfiesNow 557 0.4026 pg-8.2.4/bin/postgres SIGetDataEntry 552 0.3989 pg-8.2.4/bin/postgres equal 469 0.3390 pg-8.2.4/bin/postgres SearchCatCache 441 0.3187 /cciss (no symbols) 433 0.3129 /usr/src/linux-2.6.22.15/vmlinux find_busiest_group 413 0.2985 pg-8.2.4/bin/postgres PinBuffer 393 0.2840 pg-8.2.4/bin/postgres MemoryContextAllocZeroAligned 374 0.2703 /oprofile(no symbols) 275 0.1988 pg-8.2.4/bin/postgres ExecInitExpr 253 0.1829 pg-8.2.4/bin/postgres base_yyparse 206 0.1489 pg-8.2.4/bin/postgres CatalogCacheFlushRelation 201 0.1453 pg-8.2.4/bin/postgres MemoryContextAlloc 194 0.1402 pg-8.2.4/bin/postgres _bt_compare 188 0.1359 /nf_conntrack(no symbols) 158 0.1142 /bnx2(no symbols) 147 0.1062 pg-8.2.4/bin/postgres pgstat_initstats 139 0.1005 pg-8.2.4/bin/postgres fmgr_info_cxt_security 132 0.0954 /usr/src/linux-2.6.22.15/vmlinux task_rq_lock 131 0.0947 /bin/bash(no symbols) 129 0.0932 pg-8.2.4/bin/postgres 
AllocSetFree 125 0.0903 pg-8.2.4/bin/postgres ReadBuffer 124 0.0896 pg-8.2.4/bin/postgres MemoryContextCreate 124 0.0896 pg-8.2.4/bin/postgres SyncOneBuffer 124 0.0896 pg-8.2.4/bin/postgres XLogInsert 123 0.0889 pg-8.2.4/bin/postgres _equalAggref 122 0.0882 pg-8.2.4/bin/postgres HeapTupleSatisfiesSnapshot 112 0.0809 pg-8.2.4/bin/postgres copyObject 102 0.0737 pg-8.2.4/bin/postgres UnpinBuffer 990.0716 pg-8.2.4/bin/postgres _SPI_execute_plan 990.0716 pg-8.2.4/bin/postgres nocachegetattr 980.0708 /usr/src/linux-2.6.22.15/vmlinux __wake_up_bit 970.0701 pg-8.2.4/bin/postgres TransactionIdIsInProgress 940.0679 pg-8.2.4/bin/postgres check_stack_depth 930.0672 pg-8.2.4/bin/postgres base_yylex 910.0658 pg-8.2.4/bin/postgres pfree 890.0643 pg-8.2.4/lib/plpgsql.
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
James Mansion wrote:
> Jakub Ouhrabka wrote:
>> How can we diagnose what is happening during the peaks?
> Can you try forcing a core from a bunch of the busy processes? (Hmm -
> does Linux have an equivalent to the useful Solaris pstacks?)

There's a 'pstack' for Linux, shipped at least in Red Hat distributions
(and possibly others, I'm not sure). It's a shell script wrapper around
gdb, so easily ported to any Linux.
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
Jakub Ouhrabka wrote:
> How can we diagnose what is happening during the peaks?

Can you try forcing a core from a bunch of the busy processes? (Hmm -
does Linux have an equivalent to the useful Solaris pstacks?)

James
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
Alvaro,

>>> - do an UNLISTEN if possible
>> Yes, we're issuing unlistens when appropriate.
>
> You are vacuuming pg_listener periodically, yes? Not that this seems
> to have any relationship to your problem, but ...

Yes, autovacuum should take care of this. But we're looking forward to
multiple workers in 8.3, as they should help us during high-load periods
(some tables might wait too long for autovacuum now - but it's not that
big a problem for us...).

Thanks for the great work!

Kuba
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
Jakub Ouhrabka wrote:
>> - do an UNLISTEN if possible
>
> Yes, we're issuing unlistens when appropriate.

You are vacuuming pg_listener periodically, yes? Not that this seems
to have any relationship to your problem, but ...

--
Alvaro Herrera                       http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
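For reference, the manual form of that maintenance is simply the
following; pg_listener is updated by every NOTIFY, so it accumulates
dead rows quickly if it is never vacuumed:

    VACUUM VERBOSE pg_listener;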
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
Hi Sven,

> I guess all backends do listen to the same notification.

Unfortunately no. The backends are listening to different notifications
in different databases. Usually there are only a few listens per
database, with one exception - there are many (hundreds of) listens in
one database, but all for different notifications.

> Can you change your implementation?
> - split your problem - create multiple notifications if possible

Yes, it is like this.

> - do an UNLISTEN if possible

Yes, we're issuing unlistens when appropriate.

> - use another signalling technique

We're planning to reduce the number of databases/backends/listens, but
anyway we'd like to run the system on 8 cores if it is running without
any problems on 2 cores...

Thanks for the suggestions!

Kuba
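A minimal illustration of the LISTEN/NOTIFY pattern discussed above,
with hypothetical channel names: one narrowly scoped channel per event
class, and an explicit UNLISTEN as soon as a backend stops caring, so
idle listeners are not signalled for traffic they will ignore.

    -- in the consuming backend
    LISTEN orders_changed;
    -- in some other session, after a relevant change
    NOTIFY orders_changed;
    -- back in the consumer, once these events are no longer needed
    UNLISTEN orders_changed;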
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
Hi Tom,

> Interesting. Maybe you could use oprofile to try to see what's
> happening? It sounds a bit like momentary contention for a spinlock,
> but exactly what isn't clear.

OK, we're going to try oprofile, will let you know...

> Perhaps. Have you tried logging executions of NOTIFY to see if they
> are correlated with the spikes?

We didn't log the notifies, but I don't think it's correlated. We'll
have a detailed look next time we try it (with oprofile).

Thanks for the suggestions!

Kuba
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
Jakub Ouhrabka <[EMAIL PROTECTED]> writes:
> we have a PostgreSQL dedicated Linux server with 8 cores (2xX5355). We
> came across a strange issue: when running with all 8 cores enabled,
> approximately once a minute (the period differs) the system is very
> busy for a few seconds (~5-10s) and we don't know why - this issue
> doesn't show up when we tell Linux to use only 2 cores; with 4 cores
> the problem is there but it is still better than with 8 cores - all on
> the same machine, same config, same workload. We don't see any apparent
> reason for these peaks.

Interesting. Maybe you could use oprofile to try to see what's
happening? It sounds a bit like momentary contention for a spinlock,
but exactly what isn't clear.

> Can this be connected with our heavy use of listen/notify and hundreds
> of backends in listen mode?

Perhaps. Have you tried logging executions of NOTIFY to see if they
are correlated with the spikes?

regards, tom lane
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
Hi Jakub,

I have a similar server (from DELL) which performs well with our
PostgreSQL application. I guess the peak in context switches is the only
thing you can see.

Anyhow, I think it is your LISTEN/NOTIFY approach which causes that
behaviour. I guess all backends listen to the same notification. I don't
know the exact implementation, but I can imagine that all backends
access the same section of shared memory, which causes the increase in
context switches. More cores means more access at the same time.

Can you change your implementation?
- split your problem - create multiple notifications if possible
- do an UNLISTEN if possible
- use another signalling technique

Regards
Sven

--
Sven Geisler <[EMAIL PROTECTED]>  Tel +49.30.921017.81  Fax .50
Senior Developer, AEC/communications GmbH & Co. KG Berlin, Germany
[PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
Hi all,

we have a PostgreSQL dedicated Linux server with 8 cores (2xX5355). We
came across a strange issue: when running with all 8 cores enabled,
approximately once a minute (the period differs) the system is very busy
for a few seconds (~5-10s) and we don't know why - this issue doesn't
show up when we tell Linux to use only 2 cores; with 4 cores the problem
is there but it is still better than with 8 cores - all on the same
machine, same config, same workload. We don't see any apparent reason
for these peaks. We'd like to investigate it further but we don't know
what to try next. Any suggestions? Any tuning tips for Linux+PostgreSQL
on an 8-way system? Can this be connected with our heavy use of
listen/notify and hundreds of backends in listen mode?

More details are below.

Thanks,

Kuba

System: HP DL360 2x5355, 8G RAM, P600+MSA50 - internal 2x72GB RAID 10
for OS, 10x72G disks RAID 10 for PostgreSQL data and wal
OS: Linux 2.6 64bit (kernel 2.6.21, 22, 23 makes little difference)
PostgreSQL: 8.2.4 (64bit), shared buffers 1G

Nothing other than PostgreSQL is running on the server. About 800
concurrent backends. The majority of backends are in LISTEN doing
nothing. The client interface for most backends is ecpg+libpq.

Problem description:

The system is usually running 80-95% idle. Approximately once a minute,
for about 5-10s, there is a peak in activity which looks like this:

vmstat (and top or atop) reports 0% idle, 100% in user mode, very low
iowait, low IO activity, a higher number of context switches than usual
but not exceedingly high (2000-4000 cs/s, usually 1500 cs/s), a few
hundred waiting processes per second (usually 0-1/s). From looking at
top and the running processes we can't see any obvious reason for the
peak. According to the PostgreSQL log, the long-running commands from
these moments are e.g. "begin transaction" lasting several seconds.

When only 2 cores are enabled (kernel command line) then everything runs
smoothly. 4 cores exhibit slightly better behavior than 8 cores but
worse than 2 cores - the peaks are visible.

We've tried kernel versions 2.6.21-23 (latest revisions as of the
beginning of December from kernel.org); the pattern slightly changed,
but it may also be that the workload slightly changed.

pgbench or any other stress testing runs smoothly on the server.

The only strange thing about our usage pattern I can think of is heavy
use of LISTEN/NOTIFY, especially hundreds of backends in listen mode.

When restarting our connected clients the peaks are not there from time
0; they are visible after a while - it seems something gets synchronized
and causes trouble then.

Since the server is PostgreSQL dedicated and none of our client
applications are running on it - and there is a difference when 2 and 8
cores are enabled - we think that the peaks are not caused by our client
applications.

How can we diagnose what is happening during the peaks?