Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-03-21 Thread Bruce Momjian

Added to TODO:

* Improve performance of shared invalidation queue for multiple CPUs

  http://archives.postgresql.org/pgsql-performance/2008-01/msg00023.php


---

Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > Perhaps it would make sense to try to take the "fast path" in
> > SIDelExpiredDataEntries with only a shared lock rather than exclusive.
> 
> I think the real problem here is that sinval catchup processing is well
> designed to create contention :-(.  Once we've decided that the message
> queue is getting too full, we SIGUSR1 all the backends at once (or as
> fast as the postmaster can do it anyway), then they all go off and try
> to touch the sinval queue.  Backends that haven't awoken even once
> since the last time will have to process the entire queue contents,
> and they're all trying to do that at the same time.  What's worse, they
> take and release the SInvalLock once for each message they take off the
> queue.  This isn't so horrid for one-core machines (since each process
> will monopolize the CPU for probably less than one timeslice while it's
> catching up) but it's pretty obvious where all the contention is coming
> from on an 8-core.
> 
> Some ideas for improving matters:
> 
> 1. Try to avoid having all the backends hit the queue at once.  Instead
> of SIGUSR1'ing everybody at the same time, maybe hit only the process
> with the oldest message pointer, and have him hit the next oldest after
> he's done reading the queue.
> 
> 2. Try to take more than one message off the queue per SInvalLock cycle.
> (There is a tuning tradeoff here, since it would mean holding the lock
> for longer at a time.)
> 
> 3. Try to avoid having every backend run SIDelExpiredDataEntries every
> time through ReceiveSharedInvalidMessages.  It's not critical to delete
> entries until the queue starts getting full --- maybe we could rejigger
> the logic so it only happens once when somebody notices the queue is
> getting full, or so that only the guy(s) who had nextMsgNum == minMsgNum
> do it, or something like that?
> 
>   regards, tom lane
> 
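As a concrete illustration of idea 2, here is a minimal toy sketch of the
locking pattern it describes: copy a batch of pending messages while holding
the lock once, then apply them with the lock released.  This is not the
PostgreSQL sinval code - the queue layout, BATCH_SIZE and the pthread mutex
standing in for SInvalLock are invented purely for illustration.

#include <pthread.h>

#define QUEUE_SIZE 4096          /* illustrative only */
#define BATCH_SIZE 64            /* tuning knob: a bigger batch means fewer
                                  * lock cycles, but longer lock hold times */

typedef struct
{
    int             msgs[QUEUE_SIZE];
    int             maxMsgNum;   /* next position to be written */
    pthread_mutex_t lock;        /* stands in for SInvalLock */
} ToyQueue;

/*
 * Copy up to BATCH_SIZE pending messages for one reader, holding the lock
 * only while copying.  The (possibly expensive) cache-invalidation work on
 * each message then happens without the lock held, so the lock is taken
 * once per batch instead of once per message.
 */
static int
drain_batch(ToyQueue *q, int *readerNextMsgNum, int out[BATCH_SIZE])
{
    int n = 0;

    pthread_mutex_lock(&q->lock);
    while (*readerNextMsgNum < q->maxMsgNum && n < BATCH_SIZE)
        out[n++] = q->msgs[(*readerNextMsgNum)++ % QUEUE_SIZE];
    pthread_mutex_unlock(&q->lock);

    return n;                    /* caller invalidates these entries lock-free */
}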

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-28 Thread Matthew

On Fri, 25 Jan 2008, Simon Riggs wrote:

>> 1. Try to avoid having all the backends hit the queue at once.  Instead
>> of SIGUSR1'ing everybody at the same time, maybe hit only the process
>> with the oldest message pointer, and have him hit the next oldest after
>> he's done reading the queue.


My feeling was that an "obvious" way to deal with this is to implement 
some sort of "random early detect". That is, randomly SIGUSR1 processes as 
entries are added to the queue. The signals should become more frequent as 
the queue length increases, until it reaches the current cut-off of 
signalling everyone when the queue really is full. The hope would be that 
that would never happen.
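A minimal sketch of that idea, for illustration only: the 70%-of-MAXNUMMESSAGES
hard cut-off is the figure mentioned elsewhere in this thread, MAXNUMMESSAGES =
4096 is assumed from the 8.2-era sinvaladt.h, and the 10% starting point and
the linear probability ramp are invented.

#include <stdbool.h>
#include <stdlib.h>

#define MAXNUMMESSAGES        4096                         /* assumed 8.2 value */
#define HARD_SIGNAL_THRESHOLD (MAXNUMMESSAGES * 70 / 100)  /* current cut-off */
#define RED_START_THRESHOLD   (MAXNUMMESSAGES * 10 / 100)  /* invented */

/*
 * Decide whether to nudge one randomly chosen laggard backend when a new
 * message is inserted.  Below RED_START_THRESHOLD stay quiet; above the
 * hard threshold keep the existing signal-everyone behaviour; in between,
 * the probability ramps up linearly so catch-up work is spread out instead
 * of arriving as a thundering herd.
 */
static bool
should_signal_a_backend(int queue_len)
{
    double p;

    if (queue_len >= HARD_SIGNAL_THRESHOLD)
        return true;
    if (queue_len < RED_START_THRESHOLD)
        return false;

    p = (double) (queue_len - RED_START_THRESHOLD) /
        (double) (HARD_SIGNAL_THRESHOLD - RED_START_THRESHOLD);
    return ((double) rand() / RAND_MAX) < p;
}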


Matthew



Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-25 Thread Simon Riggs
On Mon, 2008-01-07 at 19:54 -0500, Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > Perhaps it would make sense to try to take the "fast path" in
> > SIDelExpiredDataEntries with only a shared lock rather than exclusive.
> 
> I think the real problem here is that sinval catchup processing is well
> designed to create contention :-(.  

Thinking some more about handling TRUNCATEs...

> Some ideas for improving matters:
> 
> 1. Try to avoid having all the backends hit the queue at once.  Instead
> of SIGUSR1'ing everybody at the same time, maybe hit only the process
> with the oldest message pointer, and have him hit the next oldest after
> he's done reading the queue.
> 
> 2. Try to take more than one message off the queue per SInvalLock cycle.
> (There is a tuning tradeoff here, since it would mean holding the lock
> for longer at a time.)
> 
> 3. Try to avoid having every backend run SIDelExpiredDataEntries every
> time through ReceiveSharedInvalidMessages.  It's not critical to delete
> entries until the queue starts getting full --- maybe we could rejigger
> the logic so it only happens once when somebody notices the queue is
> getting full, or so that only the guy(s) who had nextMsgNum == minMsgNum
> do it, or something like that?

(2) is unnecessary if we can reduce the number of Exclusive lockers so
that repeated access to the backend's messages is not contended.

(1) would do this, but seems like it would be complex. We can reduce the
possibility of multiple re-signals though.

(3) seems like the easiest route, as long as we have a sensible
algorithm for reducing the access rate to a reasonable level.

I'm posting a patch for discussion to -patches now that will do this. It
seems straightforward enough to include in 8.3 - that may raise a few
eyebrows, but read the patch first.
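For reference, here is the rough shape such a policy could take, written as a
standalone decision function - this is a sketch, not the posted patch; the
field names (nextMsgNum, minMsgNum, maxMsgNum) and the 70% figure come from
the discussion elsewhere in this thread, and the exact tests are illustrative.

#include <stdbool.h>

/*
 * Sketch: when should a backend bother running SIDelExpiredDataEntries,
 * instead of doing so on every ReceiveSharedInvalidMessages call?
 */
static bool
should_delete_expired_entries(int myNextMsgNum, int minMsgNum,
                              int maxMsgNum, int maxQueueSize)
{
    /* I was the slowest reader: my catch-up may let minMsgNum advance,
     * so a cleanup pass is actually worth attempting now. */
    if (myNextMsgNum == minMsgNum)
        return true;

    /* The queue is getting full: someone has to trim it. */
    if (maxMsgNum - minMsgNum > maxQueueSize * 70 / 100)
        return true;

    return false;
}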

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com




Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-15 Thread Jakub Ouhrabka

> Okay, for a table of just a few entries I agree that DELETE is
> probably better.  But don't forget you're going to need to have those
> tables vacuumed fairly regularly now, else they'll start to bloat.

I think we'll go with DELETE also for another reason:

Just after we figured out the cause of the spikes we started to 
investigate a long-term issue we had with PostgreSQL: pg_dump of a big 
database was blocking some of our applications. And yes, we replaced 
TRUNCATE with DELETE and everything is running as expected.


Looking at the docs now I see there is a new paragraph in the 8.3 docs 
mentioning that TRUNCATE is not MVCC-safe, and also the blocking issue. 
It's a pity that the warning wasn't there back in the 7.1 days :-)


Thanks,

Kuba

Tom Lane wrote:
> Jakub Ouhrabka <[EMAIL PROTECTED]> writes:
>>>> Huh.  One transaction truncating a dozen tables?  That would match the
>>>> sinval trace all right ...
>
>> It should be 4 tables - the shown log looks like there were more truncates?
>
> Actually, counting up the entries, there are close to 2 dozen relations
> apparently being truncated in the trace you showed.  But that might be
> only four tables at the user level, since each index on these tables
> would appear separately, and you might have a toast table plus index
> for each one too.  If you want to dig down, the table OIDs are visible
> in the trace, in the messages with type -1:
>
>>> LOG:  sending inval msg -1 0 30036 0 30700 3218341912
>                                ^^^^^   ^^^^^
>                                DBOID   RELOID
>
> so you could look into pg_class to confirm what's what.
>
>> Yes, performance was the initial reason to use truncate instead of
>> delete many years ago. But today the truncated tables usually contain
>> exactly one row - quick measurements now show that it's faster to issue
>> delete instead of truncate in this case.
>
> Okay, for a table of just a few entries I agree that DELETE is probably
> better.  But don't forget you're going to need to have those tables
> vacuumed fairly regularly now, else they'll start to bloat.
>
> regards, tom lane




Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-15 Thread Tom Lane
Jakub Ouhrabka <[EMAIL PROTECTED]> writes:
>>> Huh.  One transaction truncating a dozen tables?  That would match the
>>> sinval trace all right ...

> It should be 4 tables - the shown log looks like there were more truncates?

Actually, counting up the entries, there are close to 2 dozen relations
apparently being truncated in the trace you showed.  But that might be
only four tables at the user level, since each index on these tables
would appear separately, and you might have a toast table plus index
for each one too.  If you want to dig down, the table OIDs are visible
in the trace, in the messages with type -1:

>> LOG:  sending inval msg -1 0 30036 0 30700 3218341912
                                ^^^^^   ^^^^^
                                DBOID   RELOID

so you could look into pg_class to confirm what's what.

> Yes, performance was the initial reason to use truncate instead of 
> delete many years ago. But today the truncated tables usually contain 
> exactly one row - quick measurements now show that it's faster to issue 
> delete instead of truncate in this case.

Okay, for a table of just a few entries I agree that DELETE is probably
better.  But don't forget you're going to need to have those tables
vacuumed fairly regularly now, else they'll start to bloat.

regards, tom lane



Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-15 Thread Tom Lane
Adrian Moisey <[EMAIL PROTECTED]> writes:
>> we've found it: TRUNCATE

> I haven't been following this thread.  Can someone please explain to me 
> why TRUNCATE causes these spikes?

It's not so much the TRUNCATE as the overhead of broadcasting the
resultant catalog changes to the many hundreds of (mostly idle)
backends he's got --- all of which respond by trying to lock the
shared sinval message queue at about the same time.

You could see the whole thread as an object lesson in why connection
pooling is a good idea.  But certainly it seems that sinval is the
next bottleneck in terms of being able to scale Postgres up to very
large numbers of backends.

regards, tom lane



Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-15 Thread Jakub Ouhrabka

> Huh.  One transaction truncating a dozen tables?  That would match the
> sinval trace all right ...

It should be 4 tables - the shown log looks like there were more truncates?

> You might be throwing the baby out with the bathwater,
> performance-wise.

Yes, performance was the initial reason to use truncate instead of 
delete many years ago. But today the truncated tables usually contain 
exactly one row - quick measurements now show that it's faster to issue 
delete instead of truncate in this case.


Again, many thanks for your invaluable advice!

Kuba



Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-15 Thread Tom Lane
Jakub Ouhrabka <[EMAIL PROTECTED]> writes:
> we've found it: TRUNCATE

Huh.  One transaction truncating a dozen tables?  That would match the
sinval trace all right ...

> One more question: is it ok to do mass regexp update of pg_proc.prosrc 
> changing TRUNCATEs to DELETEs?

You might be throwing the baby out with the bathwater, performance-wise.
Mass DELETEs will require cleanup by VACUUM, and that'll likely eat more
cycles and I/O than you save.  I'd think in terms of trying to spread
out the TRUNCATEs or check to see if you really need one (maybe the
table's already empty), rather than this.

I do plan to look at the sinval code once 8.3 is out the door, so
another possibility if you can wait a few weeks/months is to leave your
app alone and see if the eventual patch looks sane to back-patch.
(I don't think the community would consider doing so, but you could
run a locally modified Postgres with it.)

regards, tom lane



Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-15 Thread Adrian Moisey

Hi


>> I can think of three things that might be producing this:
>
> we've found it: TRUNCATE


I haven't been following this thread.  Can someone please explain to me 
why TRUNCATE causes these spikes?


--
Adrian Moisey
System Administrator | CareerJunction | Your Future Starts Here.
Web: www.careerjunction.co.za | Email: [EMAIL PROTECTED]
Phone: +27 21 686 6820 | Mobile: +27 82 858 7830 | Fax: +27 21 686 6842



Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-15 Thread Jakub Ouhrabka

Hi Tom,

> I can think of three things that might be producing this:

we've found it: TRUNCATE

We'll try to eliminate the use of TRUNCATE and the periodic spikes should 
go away. There will still be a possibility of spikes because of database 
creation etc. - we'll try to handle this by issuing trivial commands from 
idle backends and, in the longer run, by decreasing the number of 
backends/databases running. This is the way to go, right?


One more question: is it OK to do a mass regexp update of pg_proc.prosrc 
changing TRUNCATEs to DELETEs? Anything we should be aware of? Sure, 
we'll do our own testing, but in case you see any obvious downsides...


Many thanks for your guidance!

Kuba



Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-14 Thread Tom Lane
Jakub Ouhrabka <[EMAIL PROTECTED]> writes:
> What does it mean?

Look at src/include/storage/sinval.h and src/include/utils/syscache.h.
What you seem to have here is a bunch of tuple updates in pg_class
(invalidating caches 29 and 30, which in 8.2 correspond to RELNAMENSP
and RELOID), followed by a bunch of SharedInvalRelcacheMsg and
SharedInvalSmgrMsg.

What I find interesting is that the hits are coming against
nearly-successive tuple CTIDs in pg_class, eg these are all
on pages 25 and 26 of pg_class:

> LOG:  sending inval msg 30 0 25 45 30036 4294936595
> LOG:  sending inval msg 29 0 25 45 30036 2019111801
> LOG:  sending inval msg 30 0 26 11 30036 4294936595
> LOG:  sending inval msg 29 0 26 11 30036 2019111801
> LOG:  sending inval msg 30 0 25 44 30036 4294936597
> LOG:  sending inval msg 29 0 25 44 30036 3703878920
> LOG:  sending inval msg 30 0 26 10 30036 4294936597
> LOG:  sending inval msg 29 0 26 10 30036 3703878920
> LOG:  sending inval msg 30 0 26 9 30036 4294936616
> LOG:  sending inval msg 29 0 26 9 30036 3527122063
> LOG:  sending inval msg 30 0 25 43 30036 4294936616
> LOG:  sending inval msg 29 0 25 43 30036 3527122063

The ordering is a little strange --- not sure what's producing that.

I can think of three things that might be producing this:

1. DDL operations ... but most sorts of DDL on a table would touch
more catalogs than just pg_class, so this is a bit hard to credit.

2. VACUUM.

3. Some sort of direct update of pg_class.

The fact that we have a bunch of catcache invals followed by
relcache/smgr invals says that this all happened in one transaction,
else they'd have been intermixed better.  That lets VACUUM off the
hook, because it processes each table in a separate transaction.

I am wondering if maybe your app does one of those sneaky things like
fooling with pg_class.reltriggers.  If so, the problem might be soluble
by just avoiding unnecessary updates.

regards, tom lane



Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-14 Thread Jakub Ouhrabka

Hi Tom,

> Strange.  The best idea that comes to mind is to add some debugging
> code to SendSharedInvalidMessage to log the content of each message
> that's sent out.  That would at least tell us *what* is going into
> the queue, even if not directly *why*.

we've patched PostgreSQL and run one of our complex plpgsql procedures. 
There are many sinval messages - the log output is below.


What does it mean?

Thanks,

Kuba

LOG:  sending inval msg 30 0 26 13 30036 4294936593
LOG:  sending inval msg 29 0 26 13 30036 337030170
LOG:  sending inval msg 30 0 25 46 30036 4294936593
LOG:  sending inval msg 29 0 25 46 30036 337030170
LOG:  sending inval msg 30 0 26 13 30036 4294936593
LOG:  sending inval msg 29 0 26 13 30036 337030170
LOG:  sending inval msg 30 0 25 45 30036 4294936595
LOG:  sending inval msg 29 0 25 45 30036 2019111801
LOG:  sending inval msg 30 0 26 11 30036 4294936595
LOG:  sending inval msg 29 0 26 11 30036 2019111801
LOG:  sending inval msg 30 0 25 44 30036 4294936597
LOG:  sending inval msg 29 0 25 44 30036 3703878920
LOG:  sending inval msg 30 0 26 10 30036 4294936597
LOG:  sending inval msg 29 0 26 10 30036 3703878920
LOG:  sending inval msg 30 0 26 9 30036 4294936616
LOG:  sending inval msg 29 0 26 9 30036 3527122063
LOG:  sending inval msg 30 0 25 43 30036 4294936616
LOG:  sending inval msg 29 0 25 43 30036 3527122063
LOG:  sending inval msg 30 0 26 9 30036 4294936616
LOG:  sending inval msg 29 0 26 9 30036 3527122063
LOG:  sending inval msg 30 0 25 41 30036 4294936618
LOG:  sending inval msg 29 0 25 41 30036 2126866956
LOG:  sending inval msg 30 0 26 7 30036 4294936618
LOG:  sending inval msg 29 0 26 7 30036 2126866956
LOG:  sending inval msg 30 0 25 40 30036 4294936620
LOG:  sending inval msg 29 0 25 40 30036 1941919314
LOG:  sending inval msg 30 0 26 5 30036 4294936620
LOG:  sending inval msg 29 0 26 5 30036 1941919314
LOG:  sending inval msg 30 0 26 4 30036 4294936633
LOG:  sending inval msg 29 0 26 4 30036 544523647
LOG:  sending inval msg 30 0 25 39 30036 4294936633
LOG:  sending inval msg 29 0 25 39 30036 544523647
LOG:  sending inval msg 30 0 26 4 30036 4294936633
LOG:  sending inval msg 29 0 26 4 30036 544523647
LOG:  sending inval msg 30 0 25 38 30036 4294936635
LOG:  sending inval msg 29 0 25 38 30036 2557582018
LOG:  sending inval msg 30 0 26 3 30036 4294936635
LOG:  sending inval msg 29 0 26 3 30036 2557582018
LOG:  sending inval msg 30 0 25 37 30036 4294936637
LOG:  sending inval msg 29 0 25 37 30036 2207280630
LOG:  sending inval msg 30 0 26 2 30036 4294936637
LOG:  sending inval msg 29 0 26 2 30036 2207280630
LOG:  sending inval msg 30 0 26 1 30036 4294936669
LOG:  sending inval msg 29 0 26 1 30036 1310188568
LOG:  sending inval msg 30 0 25 36 30036 4294936669
LOG:  sending inval msg 29 0 25 36 30036 1310188568
LOG:  sending inval msg 30 0 26 1 30036 4294936669
LOG:  sending inval msg 29 0 26 1 30036 1310188568
LOG:  sending inval msg 30 0 25 35 30036 4294936671
LOG:  sending inval msg 29 0 25 35 30036 2633053415
LOG:  sending inval msg 30 0 25 48 30036 4294936671
LOG:  sending inval msg 29 0 25 48 30036 2633053415
LOG:  sending inval msg 30 0 25 33 30036 4294936673
LOG:  sending inval msg 29 0 25 33 30036 2049964857
LOG:  sending inval msg 30 0 25 47 30036 4294936673
LOG:  sending inval msg 29 0 25 47 30036 2049964857
LOG:  sending inval msg -1 0 30036 0 30700 3218341912
LOG:  sending inval msg -2 2084 1663 0 30036 50335
LOG:  sending inval msg -2 0 1663 0 30036 50336
LOG:  sending inval msg -1 2075 30036 0 30702 30036
LOG:  sending inval msg -2 0 1663 0 30036 50324
LOG:  sending inval msg -1 0 30036 0 30702 30036
LOG:  sending inval msg -2 0 1663 0 30036 50336
LOG:  sending inval msg -2 0 1663 0 30036 50323
LOG:  sending inval msg -1 0 30036 0 30700 30036
LOG:  sending inval msg -2 0 1663 0 30036 50335
LOG:  sending inval msg -2 0 1663 0 30036 50322
LOG:  sending inval msg -1 0 30036 0 30698 30036
LOG:  sending inval msg -2 0 1663 0 30036 50334
LOG:  sending inval msg -1 0 30036 0 30677 3218341912
LOG:  sending inval msg -2 2084 1663 0 30036 50332
LOG:  sending inval msg -2 0 1663 0 30036 50333
LOG:  sending inval msg -1 2075 30036 0 30679 30036
LOG:  sending inval msg -2 0 1663 0 30036 50321
LOG:  sending inval msg -1 0 30036 0 30679 30036
LOG:  sending inval msg -2 0 1663 0 30036 50333
LOG:  sending inval msg -2 0 1663 0 30036 50320
LOG:  sending inval msg -1 0 30036 0 30677 30036
LOG:  sending inval msg -2 0 1663 0 30036 50332
LOG:  sending inval msg -2 0 1663 0 30036 50319
LOG:  sending inval msg -1 0 30036 0 30675 30036
LOG:  sending inval msg -2 0 1663 0 30036 50331
LOG:  sending inval msg -1 0 30036 0 30660 3218341912
LOG:  sending inval msg -2 2084 1663 0 30036 50329
LOG:  sending inval msg -2 0 1663 0 30036 50330
LOG:  sending inval msg -1 2075 30036 0 30662 30036
LOG:  sending inval msg -2 0 1663 0 30036 50318
LOG:  sending inval msg -1 0 30036 0 30662 30036
LOG:  sending inval msg -2 0 1663 0 30036 50330
LOG:  sending inval msg -2 0 1663 0 30036 50317
LOG:  s

Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-11 Thread Tom Lane
Jakub Ouhrabka <[EMAIL PROTECTED]> writes:
> We'we tried hard to identify the cause of the sinval queue filling up.
> We went through query logs as well as function bodies stored in the 
> database. We were not able to find any DDL, temp table creations etc.

Strange.  The best idea that comes to mind is to add some debugging
code to SendSharedInvalidMessage to log the content of each message
that's sent out.  That would at least tell us *what* is going into
the queue, even if not directly *why*.  Try something like (untested)

void
SendSharedInvalidMessage(SharedInvalidationMessage *msg)
{
    bool        insertOK;

+   elog(LOG, "sending inval msg %d %u %u %u %u %u",
+   msg->cc.id,
+   msg->cc.tuplePtr.ip_blkid.bi_hi,
+   msg->cc.tuplePtr.ip_blkid.bi_lo,
+   msg->cc.tuplePtr.ip_posid,
+   msg->cc.dbId,
+   msg->cc.hashValue);
+
    LWLockAcquire(SInvalLock, LW_EXCLUSIVE);
    insertOK = SIInsertDataEntry(shmInvalBuffer, msg);
    LWLockRelease(SInvalLock);

regards, tom lane



Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-11 Thread Jakub Ouhrabka

Hi Tom,

> I doubt we'd risk destabilizing 8.3 at this point, for a problem that
> affects so few people; let alone back-patching into 8.2.

understand.

> OK, that confirms the theory that it's sinval-queue contention.

We've tried hard to identify the cause of the sinval queue filling up. 
We went through query logs as well as function bodies stored in the 
database. We were not able to find any DDL, temp table creations etc.


We did the following experiment: we stopped one of our clients, so a 
queue of events (i.e. rows in the db) built up for it to process. Then we 
started the client again and it started processing the queue - that means 
calling simple selects, updates and complex plpgsql function(s). And at 
that moment the spike started, even though it shouldn't have according to 
the usual periodicity. It's consistent. We are pretty sure that this 
client is not doing any DDL...


What should we look for to find the cause?

Thanks for any hints,

Kuba




Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-08 Thread Tom Lane
Jakub Ouhrabka <[EMAIL PROTECTED]> writes:
> Yes, I can confirm that it's triggered by SIGUSR1 signals.

OK, that confirms the theory that it's sinval-queue contention.

> If I understand it correctly we have the following choices now:

> 1) Use only 2 cores (out of 8 cores)

> 2) Lower the number of idle backends - at least force backends to do 
> something at different times to eliminate spikes - is "select 1" enough 
> to force processing the queue?

Yeah, if you could get your clients to issue trivial queries every few
seconds (not all at the same time) the spikes should go away.
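For what it's worth, a bare-bones sketch of such a client-side keepalive
using libpq (the clients in this thread use ecpg+libpq); the connection
string and the 5-10 second jittered interval are placeholders, not
recommendations.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <libpq-fe.h>

/*
 * Issue a trivial query at a jittered interval so an otherwise idle
 * listener absorbs sinval traffic a little at a time instead of being
 * woken up together with hundreds of other backends.
 */
int
main(void)
{
    PGconn *conn = PQconnectdb("dbname=app");       /* placeholder conninfo */

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    srand((unsigned) getpid());                     /* per-client jitter */

    for (;;)
    {
        PGresult *res = PQexec(conn, "SELECT 1");   /* forces catch-up processing */

        if (res == NULL || PQresultStatus(res) != PGRES_TUPLES_OK)
            fprintf(stderr, "keepalive query failed: %s", PQerrorMessage(conn));
        if (res != NULL)
            PQclear(res);

        sleep((unsigned) (5 + rand() % 6));         /* 5-10s, spread across clients */
    }
}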

If you don't want to change your clients, one possible amelioration is
to reduce the signaling threshold in SIInsertDataEntry --- instead of
70% of MAXNUMMESSAGES, maybe signal at 10% or 20%.  That would make the
spikes more frequent but smaller, which might help ... or not.


> 3) Is there any chance of this being fixed/improved in 8.3 or even 8.2? 

I doubt we'd risk destabilizing 8.3 at this point, for a problem that
affects so few people; let alone back-patching into 8.2.  There are some
other known performance problems in the sinval signaling (for instance,
that a queue overflow results in cache resets for all backends, not only
the slowest), so I think addressing all of them at once would be the
thing to do.  That would be a large enough patch that it would certainly
need to go through beta testing before I'd want to inflict it on the
world...

This discussion has raised the priority of the problem in my mind,
so I'm thinking it should be worked on in 8.4; but it's too late for
8.3.

regards, tom lane



Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-08 Thread Jakub Ouhrabka

> You could check this theory
> out by strace'ing some of the idle backends and seeing if their
> activity spikes are triggered by receipt of SIGUSR1 signals.

Yes, I can confirm that it's triggered by SIGUSR1 signals.

If I understand it correctly we have the following choices now:

1) Use only 2 cores (out of 8 cores)

2) Lower the number of idle backends - at least force backends to do 
something at different times to eliminate spikes - is "select 1" enough 
to force processing the queue?


3) Is there any chance of this being fixed/improved in 8.3 or even 8.2? 
It's a (performance) bug from our point of view. I realize we're the first 
who noticed it and it's not a typical use case to have so many idle 
backends. But a large installation with connection pooling is something 
similar and that's not that uncommon - one active backend doing DDL can 
then cause unexpected spikes during otherwise quiet hours...


4) Sure, we'll try to reduce the number of DDL statements (in fact we're 
not sure where exactly they are coming from), but I guess it would only 
lengthen the time between spikes, not make them any smoother.


Any other suggestions?

Thanks,

Kuba




Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-07 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Perhaps it would make sense to try to take the "fast path" in
> SIDelExpiredDataEntries with only a shared lock rather than exclusive.

I think the real problem here is that sinval catchup processing is well
designed to create contention :-(.  Once we've decided that the message
queue is getting too full, we SIGUSR1 all the backends at once (or as
fast as the postmaster can do it anyway), then they all go off and try
to touch the sinval queue.  Backends that haven't awoken even once
since the last time will have to process the entire queue contents,
and they're all trying to do that at the same time.  What's worse, they
take and release the SInvalLock once for each message they take off the
queue.  This isn't so horrid for one-core machines (since each process
will monopolize the CPU for probably less than one timeslice while it's
catching up) but it's pretty obvious where all the contention is coming
from on an 8-core.

Some ideas for improving matters:

1. Try to avoid having all the backends hit the queue at once.  Instead
of SIGUSR1'ing everybody at the same time, maybe hit only the process
with the oldest message pointer, and have him hit the next oldest after
he's done reading the queue.

2. Try to take more than one message off the queue per SInvalLock cycle.
(There is a tuning tradeoff here, since it would mean holding the lock
for longer at a time.)

3. Try to avoid having every backend run SIDelExpiredDataEntries every
time through ReceiveSharedInvalidMessages.  It's not critical to delete
entries until the queue starts getting full --- maybe we could rejigger
the logic so it only happens once when somebody notices the queue is
getting full, or so that only the guy(s) who had nextMsgNum == minMsgNum
do it, or something like that?

regards, tom lane
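A toy sketch of what idea 1 could look like; the per-backend struct below is
invented for illustration (it is not the real ProcState), and real code would
have to hook this into the sinval machinery and cope with the signalled
backend exiting before it passes the baton on.

#include <signal.h>
#include <sys/types.h>

typedef struct
{
    pid_t pid;          /* 0 means an unused slot */
    int   nextMsgNum;   /* next queue position this backend will read */
} ToyBackendState;

/*
 * Send the catch-up SIGUSR1 to the single backend that is furthest behind.
 * After that backend has drained the queue it would call this again,
 * passing the baton to the next-oldest reader, rather than everyone being
 * signalled at once.
 */
static void
signal_oldest_reader(const ToyBackendState *backends, int nbackends,
                     int maxMsgNum)
{
    int i;
    int target = -1;
    int oldest = maxMsgNum;

    for (i = 0; i < nbackends; i++)
    {
        if (backends[i].pid != 0 && backends[i].nextMsgNum < oldest)
        {
            oldest = backends[i].nextMsgNum;
            target = i;
        }
    }

    if (target >= 0)
        kill(backends[target].pid, SIGUSR1);
}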



Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-07 Thread Tom Lane
Jakub Ouhrabka <[EMAIL PROTECTED]> writes:
>>> Does your app create and destroy a tremendous number of temp tables,
>>> or anything else in the way of frequent DDL commands?

> Hmm. I can't think of anything like this. Maybe there are few backends 
> which create temp tables but not tremendous number. I don't think our 
> applications issue DDL statements either. Can LOCK TABLE IN EXCLUSIVE 
> MODE cause this?

No.

I did some experimenting to see exactly how large the sinval message
buffer is in today's terms, and what I find is that about 22 cycles
of

create temp table foo (f1 int, f2 text); drop table foo;

is enough to force a CatchupInterrupt to a sleeping backend.  This case
is a bit more complex than it appears since the text column forces the
temp table to have a toast table; but even with only fixed-width
columns, if you were creating one temp table a second that would be
plenty to explain once-a-minute-or-so CatchupInterrupt processing.
And if you've got a lot of backends that normally *don't* wake up
that often, then they'd be sitting and failing to absorb the sinval
traffic in any more timely fashion.  So I'm betting that we've got
the source of the spike identified.  You could check this theory
out by strace'ing some of the idle backends and seeing if their
activity spikes are triggered by receipt of SIGUSR1 signals.
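For anyone who wants to reproduce that experiment from a client, here is a
rough libpq sketch; the connection string is a placeholder, and the statement
and the roughly-22-cycle figure are taken from the paragraph above.

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn *conn = PQconnectdb("dbname=postgres");  /* placeholder conninfo */
    int     i;

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    /* ~22 create/drop cycles were enough to force a CatchupInterrupt above. */
    for (i = 0; i < 25; i++)
    {
        PGresult *res = PQexec(conn,
            "create temp table foo (f1 int, f2 text); drop table foo;");

        if (res == NULL || PQresultStatus(res) != PGRES_COMMAND_OK)
            fprintf(stderr, "cycle %d failed: %s", i, PQerrorMessage(conn));
        if (res != NULL)
            PQclear(res);
    }

    PQfinish(conn);
    return 0;
}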

regards, tom lane



Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-07 Thread Tom Lane
Jakub Ouhrabka <[EMAIL PROTECTED]> writes:
> We've tried several times to get stacktrace from some of the running 
> backends during spikes, we got always this:

> 0x2b005d00a9a9 in semop () from /lib/libc.so.6
> #0  0x2b005d00a9a9 in semop () from /lib/libc.so.6
> #1  0x0054fe53 in PGSemaphoreLock (sema=0x2b00a04e5090, 
> interruptOK=0 '\0') at pg_sema.c:411
> #2  0x00575d95 in LWLockAcquire (lockid=SInvalLock, 
> mode=LW_EXCLUSIVE) at lwlock.c:455
> #3  0x0056fbfe in ReceiveSharedInvalidMessages 
> (invalFunction=0x5e9a30 ,
> resetFunction=0x5e9df0 ) at sinval.c:159
> #4  0x00463505 in StartTransactionCommand () at xact.c:1439
> #5  0x0056fa4b in ProcessCatchupEvent () at sinval.c:347
> #6  0x0056fb20 in CatchupInterruptHandler 
> (postgres_signal_arg=) at sinval.c:221

CatchupInterruptHandler, eh?  That seems to let NOTIFY off the hook, and
instead points in the direction of sinval processing; which is to say,
propagation of changes to system catalogs.  Does your app create and
destroy a tremendous number of temp tables, or anything else in the way
of frequent DDL commands?

regards, tom lane



Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-07 Thread Alvaro Herrera
Jakub Ouhrabka wrote:

> We've tried several times to get stacktrace from some of the running 
> backends during spikes, we got always this:
>
> 0x2b005d00a9a9 in semop () from /lib/libc.so.6
> #0  0x2b005d00a9a9 in semop () from /lib/libc.so.6
> #1  0x0054fe53 in PGSemaphoreLock (sema=0x2b00a04e5090, 
> interruptOK=0 '\0') at pg_sema.c:411
> #2  0x00575d95 in LWLockAcquire (lockid=SInvalLock, 
> mode=LW_EXCLUSIVE) at lwlock.c:455
> #3  0x0056fbfe in ReceiveSharedInvalidMessages 
> (invalFunction=0x5e9a30 ,
>resetFunction=0x5e9df0 ) at sinval.c:159
> #4  0x00463505 in StartTransactionCommand () at xact.c:1439
> #5  0x0056fa4b in ProcessCatchupEvent () at sinval.c:347
> #6  0x0056fb20 in CatchupInterruptHandler 
> (postgres_signal_arg=) at sinval.c:221
> #7  0x2b005cf6f110 in killpg () from /lib/libc.so.6
> #8  0x in ?? ()

Perhaps it would make sense to try to take the "fast path" in
SIDelExpiredDataEntries with only a shared lock rather than exclusive.
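A toy model of that fast path, with a pthread rwlock standing in for the
LWLock and invented fields standing in for the real sinval segment state;
the only point is the check-under-shared-lock / recheck-under-exclusive
pattern.

#include <pthread.h>
#include <stdbool.h>

typedef struct
{
    int              minMsgNum;            /* oldest entry still kept */
    int              slowestReaderMsgNum;  /* position of the slowest backend */
    pthread_rwlock_t lock;                 /* stands in for SInvalLock */
} ToySinvalSeg;

/*
 * Fast path: peek under a shared lock; in the common case nothing can be
 * expired yet and other readers are not blocked.  Only when there is real
 * work do we take the exclusive lock, re-check, and trim the queue.
 */
static void
del_expired_entries(ToySinvalSeg *seg)
{
    bool work;

    pthread_rwlock_rdlock(&seg->lock);
    work = (seg->slowestReaderMsgNum > seg->minMsgNum);
    pthread_rwlock_unlock(&seg->lock);

    if (!work)
        return;

    pthread_rwlock_wrlock(&seg->lock);
    if (seg->slowestReaderMsgNum > seg->minMsgNum)
        seg->minMsgNum = seg->slowestReaderMsgNum;
    pthread_rwlock_unlock(&seg->lock);
}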

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-07 Thread Jakub Ouhrabka

Hi Tom & all,

>> It sounds a bit like momentary contention for a spinlock,
>> but exactly what isn't clear.

> ok, we're going to try oprofile, will let you know...

yes, it seems like contention for a spinlock if I'm interpreting oprofile 
correctly: around 60% of the time during spikes is spent in s_lock. [for 
details see below].


We've tried several times to get stacktrace from some of the running 
backends during spikes, we got always this:


0x2b005d00a9a9 in semop () from /lib/libc.so.6
#0  0x2b005d00a9a9 in semop () from /lib/libc.so.6
#1  0x0054fe53 in PGSemaphoreLock (sema=0x2b00a04e5090, 
interruptOK=0 '\0') at pg_sema.c:411
#2  0x00575d95 in LWLockAcquire (lockid=SInvalLock, 
mode=LW_EXCLUSIVE) at lwlock.c:455
#3  0x0056fbfe in ReceiveSharedInvalidMessages (invalFunction=0x5e9a30 ,
    resetFunction=0x5e9df0 ) at sinval.c:159
#4  0x00463505 in StartTransactionCommand () at xact.c:1439
#5  0x0056fa4b in ProcessCatchupEvent () at sinval.c:347
#6  0x0056fb20 in CatchupInterruptHandler (postgres_signal_arg=) at sinval.c:221
#7  0x2b005cf6f110 in killpg () from /lib/libc.so.6
#8  0x in ?? ()

Is this enough info to guess what's happening? What should we try next?

Thanks,

Kuba

oprofile results:

[I've shortened the path from /usr/local/pg... to just pg for better 
readability]


# opreport --long-filenames
CPU: Core 2, speed 2666.76 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a 
unit mask of 0x00 (Unhalted core cycles) count 10

CPU_CLK_UNHALT...|
 samples|  %|
--
  125577 90.7584 pg-8.2.4/bin/postgres
3792  2.7406 /lib/libc-2.3.6.so
3220  2.3272 /usr/src/linux-2.6.22.15/vmlinux
2145  1.5503 /usr/bin/oprofiled
1540  1.1130 /xfs
 521  0.3765 pg-8.2.4/lib/plpgsql.so
 441  0.3187 /cciss
 374  0.2703 /oprofile

...

CPU: Core 2, speed 2666.76 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a 
unit mask of 0x00 (Unhalted core cycles) count 10

samples  %        app name              symbol name
85355 61.6887  pg-8.2.4/bin/postgres s_lock
9803  7.0849  pg-8.2.4/bin/postgres LWLockRelease
5535  4.0003  pg-8.2.4/bin/postgres LWLockAcquire
3792  2.7406  /lib/libc-2.3.6.so   (no symbols)
3724  2.6915  pg-8.2.4/bin/postgres DropRelFileNodeBuffers
2145  1.5503  /usr/bin/oprofiled   (no symbols)
2069  1.4953  pg-8.2.4/bin/postgres GetSnapshotData
1540  1.1130  /xfs (no symbols)
1246  0.9005  pg-8.2.4/bin/postgres hash_search_with_hash_value
1052  0.7603  pg-8.2.4/bin/postgres AllocSetAlloc
1015  0.7336  pg-8.2.4/bin/postgres heapgettup
879   0.6353  pg-8.2.4/bin/postgres hash_any
862   0.6230  /usr/src/linux-2.6.22.15/vmlinux mwait_idle
740   0.5348  pg-8.2.4/bin/postgres hash_seq_search
674   0.4871  pg-8.2.4/bin/postgres HeapTupleSatisfiesNow
557   0.4026  pg-8.2.4/bin/postgres SIGetDataEntry
552   0.3989  pg-8.2.4/bin/postgres equal
469   0.3390  pg-8.2.4/bin/postgres SearchCatCache
441   0.3187  /cciss   (no symbols)
433   0.3129  /usr/src/linux-2.6.22.15/vmlinux find_busiest_group
413   0.2985  pg-8.2.4/bin/postgres PinBuffer
393   0.2840  pg-8.2.4/bin/postgres MemoryContextAllocZeroAligned
374   0.2703  /oprofile(no symbols)
275   0.1988  pg-8.2.4/bin/postgres ExecInitExpr
253   0.1829  pg-8.2.4/bin/postgres base_yyparse
206   0.1489  pg-8.2.4/bin/postgres CatalogCacheFlushRelation
201   0.1453  pg-8.2.4/bin/postgres MemoryContextAlloc
194   0.1402  pg-8.2.4/bin/postgres _bt_compare
188   0.1359  /nf_conntrack(no symbols)
158   0.1142  /bnx2(no symbols)
147   0.1062  pg-8.2.4/bin/postgres pgstat_initstats
139   0.1005  pg-8.2.4/bin/postgres fmgr_info_cxt_security
132   0.0954  /usr/src/linux-2.6.22.15/vmlinux task_rq_lock
131   0.0947  /bin/bash(no symbols)
129   0.0932  pg-8.2.4/bin/postgres AllocSetFree
125   0.0903  pg-8.2.4/bin/postgres ReadBuffer
124   0.0896  pg-8.2.4/bin/postgres MemoryContextCreate
124   0.0896  pg-8.2.4/bin/postgres SyncOneBuffer
124   0.0896  pg-8.2.4/bin/postgres XLogInsert
123   0.0889  pg-8.2.4/bin/postgres _equalAggref
122   0.0882  pg-8.2.4/bin/postgres HeapTupleSatisfiesSnapshot
112   0.0809  pg-8.2.4/bin/postgres copyObject
102   0.0737  pg-8.2.4/bin/postgres UnpinBuffer
99    0.0716  pg-8.2.4/bin/postgres _SPI_execute_plan
99    0.0716  pg-8.2.4/bin/postgres nocachegetattr
98    0.0708  /usr/src/linux-2.6.22.15/vmlinux __wake_up_bit
97    0.0701  pg-8.2.4/bin/postgres TransactionIdIsInProgress
94    0.0679  pg-8.2.4/bin/postgres check_stack_depth
93    0.0672  pg-8.2.4/bin/postgres base_yylex
91    0.0658  pg-8.2.4/bin/postgres pfree
89    0.0643  pg-8.2.4/lib/plpgsql.

Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-04 Thread David Boreham

James Mansion wrote:
> Jakub Ouhrabka wrote:
>> How can we diagnose what is happening during the peaks?
> Can you try forcing a core from a bunch of the busy processes?  (Hmm -
> does Linux have an equivalent to the useful Solaris pstacks?)

There's a 'pstack' for Linux, shipped at least in Red Hat distributions
(and possibly others, I'm not sure). It's a shell script wrapper around
gdb, so easily ported to any Linux.






Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-04 Thread James Mansion

Jakub Ouhrabka wrote:
> How can we diagnose what is happening during the peaks?

Can you try forcing a core from a bunch of the busy processes?  (Hmm -
does Linux have an equivalent to the useful Solaris pstacks?)


James



Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-03 Thread Jakub Ouhrabka

Alvaro,

>>> - do an UNLISTEN if possible
>> Yes, we're issuing unlistens when appropriate.
>
> You are vacuuming pg_listener periodically, yes?  Not that this seems
> to have any relationship to your problem, but ...

yes, autovacuum should take care of this. But we're looking forward to 
multiple workers in 8.3 as it should help us during high-load periods 
(some tables might wait too long for autovacuum now - but it's not that 
big a problem for us...).


Thanks for great work!

Kuba



Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-03 Thread Alvaro Herrera
Jakub Ouhrabka wrote:

> > - do an UNLISTEN if possible
>
> Yes, we're issuing unlistens when appropriate.

You are vacuuming pg_listener periodically, yes?  Not that this seems to
have any relationship to your problem, but ...

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-03 Thread Jakub Ouhrabka

Hi Sven,

> I guess all backends do listen to the same notification.

Unfortunately no. The backends are listening to different notifications 
in different databases. Usually there are only a few listens per database, 
with only one exception - there are many (hundreds of) listens in one 
database, but all for different notifications.


> Can you change your implementation?
> - split your problem - create multiple notifications if possible

Yes, it is like this.

> - do an UNLISTEN if possible

Yes, we're issuing unlistens when appropriate.

> - use another signalling technique

We're planning to reduce the number of databases/backends/listens, but 
anyway we'd like to run the system on 8 cores if it runs without any 
problems on 2 cores...


Thanks for the suggestions!

Kuba

Sven Geisler wrote:

Hi Jakub,

I do have a similar server (from DELL), which performs well with our
PostgreSQL application. I guess the peak in context switches is the only
thing you can see.

Anyhow, I think it is your LISTEN/NOTIFY approach which causes that
behaviour. I guess all backends do listen to the same notification.
I don't know the exact implementation, but I can imagine that all
backends access the same section of the shared memory, which causes
the increase in context switches. More cores mean more accesses at the
same time.

Can you change your implementation?
- split your problem - create multiple notifications if possible
- do an UNLISTEN if possible
- use another signalling technique

Regards
Sven


Jakub Ouhrabka wrote:

Hi all,

we have a PostgreSQL dedicated Linux server with 8 cores (2xX5355). We
came across a strange issue: when running with all 8 cores enabled,
approximately once a minute (the period differs) the system is very busy
for a few seconds (~5-10s) and we don't know why - this issue doesn't
show up when we tell Linux to use only 2 cores; with 4 cores the problem
is there but it is still better than with 8 cores - all on the same
machine, same config, same workload. We don't see any apparent reason
for these peaks. We'd like to investigate it further but we don't know
what to try next. Any suggestions? Any tuning tips for Linux+PostgreSQL
on an 8-way system? Can this be connected with our heavy use of
listen/notify and hundreds of backends in listen mode?

More details are below.

Thanks,

Kuba

System: HP DL360 2x5355, 8G RAM, P600+MSA50 - internal 2x72GB RAID 10
for OS, 10x72G disks RAID 10 for PostgreSQL data and wal
OS: Linux 2.6 64bit (kernel 2.6.21, 22, 23 makes little difference)
PostgreSQL: 8.2.4 (64bit), shared buffers 1G

Nothing else than PostgreSQL is running on the server. Cca 800
concurrent backends. Majority of backends in LISTEN doing nothing.
Client interface for most backends is ecpg+libpq.

Problem description:

The system is usually running 80-95% idle. Approximatly once a minute
for cca 5-10s there is a peak in activity which looks like this:

vmstat (and top or atop) reports 0% idle, 100% in user mode, very low
iowait, low IO activity, a higher number of context switches than usual
but not exceedingly high (2000-4000cs/s, usually 1500cs/s), and a few
hundred waiting processes per second (usually 0-1/s). From looking at top and
running processes we can't see any obvious reason for the peak.
According to PostgreSQL log the long running commands from these moments
are e.g. begin transaction lasting several seconds.

When only 2 cores are enabled (kernel command line) then everything is
running smoothly. 4 cores exhibits slightly better behavior than 8 cores
but worse than 2 cores - the peaks are visible.

We've tried kernel versions 2.6.21-23 (latest revisions as of beginning
December from kernel.org); the pattern slightly changed, but it may also
be that the workload slightly changed.

pgbench or any other stress testing runs smoothly on the server.

The only strange thing about our usage pattern I can think of is heavy
use of LISTEN/NOTIFY, especially hundreds of backends in listen mode.

When restarting our connected clients the peaks are not there from time
0; they become visible after a while - it seems something gets
synchronized and then causes trouble.

Since the server is dedicated to PostgreSQL and none of our client
applications are running on it - and there is a difference when 2 and 8
cores are enabled - we think that the peaks are not caused by our client
applications.

How can we diagnose what is happening during the peaks?







Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-03 Thread Jakub Ouhrabka

Hi Tom,

> Interesting.  Maybe you could use oprofile to try to see what's
> happening?  It sounds a bit like momentary contention for a spinlock,
> but exactly what isn't clear.

ok, we're going to try oprofile, will let you know...

> Perhaps.  Have you tried logging executions of NOTIFY to see if they
> are correlated with the spikes?

We didn't log the notifies but I think it's not correlated. We'll have a 
detailed look next time we try it (with oprofile).


Thanks for suggestions!

Kuba




Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-03 Thread Tom Lane
Jakub Ouhrabka <[EMAIL PROTECTED]> writes:
> we have a PostgreSQL dedicated Linux server with 8 cores (2xX5355). We
> came across a strange issue: when running with all 8 cores enabled,
> approximately once a minute (the period differs) the system is very busy
> for a few seconds (~5-10s) and we don't know why - this issue doesn't
> show up when we tell Linux to use only 2 cores; with 4 cores the problem
> is there but it is still better than with 8 cores - all on the same
> machine, same config, same workload. We don't see any apparent reason for
> these peaks.

Interesting.  Maybe you could use oprofile to try to see what's
happening?  It sounds a bit like momentary contention for a spinlock,
but exactly what isn't clear.

> Can this be connected with our heavy use of listen/notify and hundreds
> of backends in listen mode?

Perhaps.  Have you tried logging executions of NOTIFY to see if they are
correlated with the spikes?

regards, tom lane



Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-03 Thread Sven Geisler
Hi Jakub,

I do have a similar server (from DELL), which performs well with our
PostgreSQL application. I guess the peak in context switches is the only
thing you can see.

Anyhow, I think it is your LISTEN/NOTIFY approach which causes that
behaviour. I guess all backends do listen to the same notification.
I don't know the exact implementation, but I can imagine that all
backends access the same section of the shared memory, which causes
the increase in context switches. More cores mean more accesses at the
same time.

Can you change your implementation?
- split your problem - create multiple notifications if possible
- do an UNLISTEN if possible
- use another signalling technique

Regards
Sven


Jakub Ouhrabka wrote:
> Hi all,
> 
> we have a PostgreSQL dedicated Linux server with 8 cores (2xX5355). We
> came across a strange issue: when running with all 8 cores enabled,
> approximately once a minute (the period differs) the system is very busy
> for a few seconds (~5-10s) and we don't know why - this issue doesn't
> show up when we tell Linux to use only 2 cores; with 4 cores the problem
> is there but it is still better than with 8 cores - all on the same
> machine, same config, same workload. We don't see any apparent reason
> for these peaks. We'd like to investigate it further but we don't know
> what to try next. Any suggestions? Any tuning tips for Linux+PostgreSQL
> on an 8-way system? Can this be connected with our heavy use of
> listen/notify and hundreds of backends in listen mode?
> 
> More details are below.
> 
> Thanks,
> 
> Kuba
> 
> System: HP DL360 2x5355, 8G RAM, P600+MSA50 - internal 2x72GB RAID 10
> for OS, 10x72G disks RAID 10 for PostgreSQL data and wal
> OS: Linux 2.6 64bit (kernel 2.6.21, 22, 23 makes little difference)
> PostgreSQL: 8.2.4 (64bit), shared buffers 1G
> 
> Nothing else than PostgreSQL is running on the server. Cca 800
> concurrent backends. Majority of backends in LISTEN doing nothing.
> Client interface for most backends is ecpg+libpq.
> 
> Problem description:
> 
> The system is usually running 80-95% idle. Approximatly once a minute
> for cca 5-10s there is a peak in activity which looks like this:
> 
> vmstat (and top or atop) reports 0% idle, 100% in user mode, very low
> iowait, low IO activity, a higher number of context switches than usual
> but not exceedingly high (2000-4000cs/s, usually 1500cs/s), and a few
> hundred waiting processes per second (usually 0-1/s). From looking at top and
> running processes we can't see any obvious reason for the peak.
> According to PostgreSQL log the long running commands from these moments
> are e.g. begin transaction lasting several seconds.
> 
> When only 2 cores are enabled (kernel command line) then everything is
> running smoothly. 4 cores exhibits slightly better behavior than 8 cores
> but worse than 2 cores - the peaks are visible.
> 
> We've tried kernel versions 2.6.21-23 (latest revisions as of beginning
> December from kernel.org); the pattern slightly changed, but it may also
> be that the workload slightly changed.
> 
> pgbench or any other stress testing runs smoothly on the server.
> 
> The only strange thing about our usage pattern I can think of is heavy
> use of LISTEN/NOTIFY, especially hundreds of backends in listen mode.
> 
> When restarting our connected clients the peaks are not there from time
> 0; they become visible after a while - it seems something gets
> synchronized and then causes trouble.
> 
> Since the server is dedicated to PostgreSQL and none of our client
> applications are running on it - and there is a difference when 2 and 8
> cores are enabled - we think that the peaks are not caused by our client
> applications.
> 
> How can we diagnose what is happening during the peaks?
> 

-- 
Sven Geisler <[EMAIL PROTECTED]>   Tel +49.30.921017.81  Fax .50
Senior Developer, AEC/communications GmbH & Co. KG Berlin, Germany



[PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores

2008-01-03 Thread Jakub Ouhrabka

Hi all,

we have a PostgreSQL dedicated Linux server with 8 cores (2xX5355). We 
came across a strange issue: when running with all 8 cores enabled, 
approximately once a minute (the period differs) the system is very busy 
for a few seconds (~5-10s) and we don't know why - this issue doesn't 
show up when we tell Linux to use only 2 cores; with 4 cores the problem 
is there but it is still better than with 8 cores - all on the same 
machine, same config, same workload. We don't see any apparent reason for 
these peaks. We'd like to investigate it further but we don't know what 
to try next. Any suggestions? Any tuning tips for Linux+PostgreSQL on an 
8-way system? Can this be connected with our heavy use of listen/notify 
and hundreds of backends in listen mode?


More details are below.

Thanks,

Kuba

System: HP DL360 2x5355, 8G RAM, P600+MSA50 - internal 2x72GB RAID 10 
for OS, 10x72G disks RAID 10 for PostgreSQL data and wal

OS: Linux 2.6 64bit (kernel 2.6.21, 22, 23 makes little difference)
PostgreSQL: 8.2.4 (64bit), shared buffers 1G

Nothing else than PostgreSQL is running on the server. Cca 800 
concurrent backends. Majority of backends in LISTEN doing nothing. 
Client interface for most backends is ecpg+libpq.


Problem description:

The system is usually running 80-95% idle. Approximatly once a minute 
for cca 5-10s there is a peak in activity which looks like this:


vmstat (and top or atop) reports 0% idle, 100% in user mode, very low 
iowait, low IO activity, a higher number of context switches than usual 
but not exceedingly high (2000-4000cs/s, usually 1500cs/s), and a few 
hundred waiting processes per second (usually 0-1/s). From looking at top and 
running processes we can't see any obvious reason for the peak. 
According to PostgreSQL log the long running commands from these moments 
are e.g. begin transaction lasting several seconds.


When only 2 cores are enabled (kernel command line) then everything is 
running smoothly. 4 cores exhibits slightly better behavior than 8 cores 
but worse than 2 cores - the peaks are visible.


We've tried kernel versions 2.6.21-23 (latest revisions as of beginning 
December from kernel.org); the pattern slightly changed, but it may also 
be that the workload slightly changed.


pgbench or any other stress testing runs smoothly on the server.

The only strange thing about our usage pattern I can think of is heavy 
use of LISTEN/NOTIFY, especially hundreds of backends in listen mode.


When restarting our connected clients the peaks are not there from time 
0; they become visible after a while - it seems something gets 
synchronized and then causes trouble.


Since the server is dedicated to PostgreSQL and none of our client 
applications are running on it - and there is a difference when 2 and 8 
cores are enabled - we think that the peaks are not caused by our client 
applications.


How can we diagnose what is happening during the peaks?
