Re: [HACKERS] Load spikes on 8.1.11

2008-07-21 Thread Andrew Sullivan
On Sat, Jul 19, 2008 at 07:09:46AM +0530, Gurjeet Singh wrote:

 Will try this option, at least in the next schema upgrade or when setting up
 Slony.

As I've already suggested, however, if you try to set up slony on a
loaded database, you're going to see all manner of problems.  Slony
takes some heavy-duty locks when it does its setup work.  It's
designed that you should have an application outage for this sort of
work.  Please see previous discussion on the Slony mailing list.

A
-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Load spikes on 8.1.11

2008-07-21 Thread Decibel!
On Fri, Jul 18, 2008 at 02:23:43AM -0400, Andrew Sullivan wrote:
 On Fri, Jul 18, 2008 at 10:41:36AM +0530, Gurjeet Singh wrote:
  
  Just started INIT cluster Slonik command and that spiked too.. for more than
  10 minutes now!!
 
 Are you attempting to do Slony changes (such as install Slony) on an
 active database?  I strongly encourage you to read the Slony manual.
 Slony, frankly, sucks for this use case.  The manual says as much,
 although in more orotund phrases than that.

FWIW, I've had few problems getting londiste up and running on a heavily
loaded database. You might need to be a bit careful about when you add
very large tables due to the copy overhead, but other than that I
haven't had issues.
-- 
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Load spikes on 8.1.11

2008-07-21 Thread Gurjeet Singh
On Tue, Jul 22, 2008 at 1:29 AM, Andrew Sullivan [EMAIL PROTECTED]
wrote:

 On Sat, Jul 19, 2008 at 07:09:46AM +0530, Gurjeet Singh wrote:

  Will try this option, at least in the next schema upgrade or when setting
 up
  Slony.

 As I've already suggested, however, if you try to set up slony on a
 loaded database, you're going to see all manner of problems.  Slony
 takes some heavy-duty locks when it does its setup work.  It's
 designed that you should have an application outage for this sort of
 work.  Please see previous discussion on the Slony mailing list.


Well, a very low activity period of the application (after 11 PM EST) is
chosen as the maintenance window. The application is not down, but has just
the connections open, and almost all of them sitting IDLE.

I am aware of the heavy locking involved with Slony, which should mean that
it blocks the application connections; that's be completely acceptable,
given all the warnings in the Slony docs. But what I am concerned about and
trying to hunt down is why IDLE backend processes are all consuming up all
of CPU (!!!) so much so that I am unable to fire up any new process!

Another possible cause we are looking at now is the role Xeon hyperthreading
can play here. Will keep you all updated.

Thanks and best regards,
-- 
[EMAIL PROTECTED]
[EMAIL PROTECTED] gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB http://www.enterprisedb.com

Mail sent from my BlackLaptop device


Re: [HACKERS] Load spikes on 8.1.11

2008-07-21 Thread Andrew Sullivan
On Tue, Jul 22, 2008 at 02:41:55AM +0530, Gurjeet Singh wrote:

 I am aware of the heavy locking involved with Slony, which should mean that
 it blocks the application connections; that's be completely acceptable,
 given all the warnings in the Slony docs. But what I am concerned about and
 trying to hunt down is why IDLE backend processes are all consuming up all
 of CPU (!!!) so much so that I am unable to fire up any new process!

Ah, well, then, yes, the spinlock improvements probably will help
you.  But you should disabuse yourself of the idea that IDLE
processes have no cost.  You still have to talk to all those
connections when doing schema changes.

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Load spikes on 8.1.11

2008-07-21 Thread Tom Lane
Andrew Sullivan [EMAIL PROTECTED] writes:
 On Tue, Jul 22, 2008 at 02:41:55AM +0530, Gurjeet Singh wrote:
 I am aware of the heavy locking involved with Slony, which should mean that
 it blocks the application connections; that's be completely acceptable,
 given all the warnings in the Slony docs. But what I am concerned about and
 trying to hunt down is why IDLE backend processes are all consuming up all
 of CPU (!!!) so much so that I am unable to fire up any new process!

 Ah, well, then, yes, the spinlock improvements probably will help
 you.  But you should disabuse yourself of the idea that IDLE
 processes have no cost.  You still have to talk to all those
 connections when doing schema changes.

Yeah.  In fact this is sounding more and more like the known problem
with sinval message response causing a thundering herd effect: the
idle processes all sit idle until the sinval message queue gets long
enough to rouse alarm bells, and then they all get signaled at once
and all try to clean the message queue at once, leading to very
heavy contention for the SInvalLock.  That code's been rewritten in
CVS HEAD to try to alleviate the problem, but no existing release
has the fix.

See thread here for prior report:
http://archives.postgresql.org/pgsql-performance/2008-01/msg1.php

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Load spikes on 8.1.11

2008-07-18 Thread Andrew Sullivan
On Fri, Jul 18, 2008 at 10:41:36AM +0530, Gurjeet Singh wrote:
 
 Just started INIT cluster Slonik command and that spiked too.. for more than
 10 minutes now!!

Are you attempting to do Slony changes (such as install Slony) on an
active database?  I strongly encourage you to read the Slony manual.
Slony, frankly, sucks for this use case.  The manual says as much,
although in more orotund phrases than that.

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Load spikes on 8.1.11

2008-07-18 Thread David Fetter
On Fri, Jul 18, 2008 at 10:15:42AM +0530, Gurjeet Singh wrote:
 On Fri, Jul 18, 2008 at 10:05 AM, Gurjeet Singh [EMAIL PROTECTED]
 wrote:
 
  On Fri, Jul 18, 2008 at 9:58 AM, Tom Lane [EMAIL PROTECTED] wrote:
 
  Gurjeet Singh [EMAIL PROTECTED] writes:
   During these spikes, in the 'top' sessions we see the 'idle' PG
   processes consuming between 2 and 5 % CPU, and since the box has 8 CPUS
  (2
   sockets and each CPU is a quad core Intel Xeon processors) and somewhere
   around 200 Postgres processes, the load spikes to above 200; and it does
   this very sharply.
 
  This looks like heavy contention for a spinlock.  You need to get a
  higher-level analysis of what's happening before anyone can say much
  more than that.
 
  Note that 8.1 is pretty much ancient history as far as scalability to
  8-core hardware goes.  You should probably consider updating to 8.3
  before investing too much time in tracking down what's happening.
  If you can still show the problem on 8.3 then there would be some
  interest in fixing it ...
 
 
  Upgrading is on the cards, but not as high priority as I would like it to
  be! This is a production box, and we desperatly need some respite from these
  spikes.
 
  Can you please elaborate on what high level diagnosis would you need?
 
  I just ran DROP SCHEMA _slony schema CASCADE; and it spiked again, on a
  very low loaded box!!
 
  Thanks for all you help.
 
 
 Would reducing the number of connections on the DB help in reducing the
 spike?

Just generally, reducing the number of connections to the DB will help
in reducing resource consumption.

When you first get a chance, use or set up a test environment where
you can test the upgrade to 8.3.latest.

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Load spikes on 8.1.11

2008-07-18 Thread Gurjeet Singh
On Fri, Jul 18, 2008 at 7:15 PM, David Fetter [EMAIL PROTECTED] wrote:

 On Fri, Jul 18, 2008 at 10:15:42AM +0530, Gurjeet Singh wrote:
  On Fri, Jul 18, 2008 at 10:05 AM, Gurjeet Singh [EMAIL PROTECTED]
 
  wrote:
 
   On Fri, Jul 18, 2008 at 9:58 AM, Tom Lane [EMAIL PROTECTED] wrote:
  
   Gurjeet Singh [EMAIL PROTECTED] writes:
During these spikes, in the 'top' sessions we see the 'idle' PG
processes consuming between 2 and 5 % CPU, and since the box has 8
 CPUS
   (2
sockets and each CPU is a quad core Intel Xeon processors) and
 somewhere
around 200 Postgres processes, the load spikes to above 200; and it
 does
this very sharply.
  
   This looks like heavy contention for a spinlock.  You need to get a
   higher-level analysis of what's happening before anyone can say much
   more than that.
  
   Note that 8.1 is pretty much ancient history as far as scalability to
   8-core hardware goes.  You should probably consider updating to 8.3
   before investing too much time in tracking down what's happening.
   If you can still show the problem on 8.3 then there would be some
   interest in fixing it ...
  
  
   Upgrading is on the cards, but not as high priority as I would like it
 to
   be! This is a production box, and we desperatly need some respite from
 these
   spikes.
  
   Can you please elaborate on what high level diagnosis would you need?
  
   I just ran DROP SCHEMA _slony schema CASCADE; and it spiked again, on
 a
   very low loaded box!!
  
   Thanks for all you help.
  
 
  Would reducing the number of connections on the DB help in reducing the
  spike?

 Just generally, reducing the number of connections to the DB will help
 in reducing resource consumption.


Will try this option, at least in the next schema upgrade or when setting up
Slony.



 When you first get a chance, use or set up a test environment where
 you can test the upgrade to 8.3.latest.


Based on the thread above, we seem to be moving towards greater consensus on
upgrade. One of the major hurdles in our environment's upgrade is the loss
of implicit casts in 8.3.

Following is the environment we have:

select version();
  version
---
 PostgreSQL 8.1.11 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
3.4.6 20060404 (Red Hat 3.4.6-9)
(1 row)

I cannot see oprofile installed on this box, so will try to get that
installed and get you guys some more details when this happens next.

Thanks,
-- 
[EMAIL PROTECTED]
[EMAIL PROTECTED] gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB http://www.enterprisedb.com

Mail sent from my BlackLaptop device


[HACKERS] Load spikes on 8.1.11

2008-07-17 Thread Gurjeet Singh
Hi All,

I have been perplexed by random load spikes on an 8.1.11 instance. many
a times they are random, in the sense we cannot tie a particular scenario as
the cause for it! But a few times we can see that when we are executing huge
scripts, which include DDL as well as DML, the load on the box spikes to
above 200. We see similar load spikes other times too when we are not
running any such task on the DB.

During these spikes, in the 'top' sessions we see the 'idle' PG
processes consuming between 2 and 5 % CPU, and since the box has 8 CPUS (2
sockets and each CPU is a quad core Intel Xeon processors) and somewhere
around 200 Postgres processes, the load spikes to above 200; and it does
this very sharply.

We are running the scripts using psql -f, but we can see the load even
while running the commands on by one!

When there's no load, an strace session on an 'idle' PG process looks
like:

[EMAIL PROTECTED] data]$ strace -p 9375
Process 9375 attached - interrupt to quit
recvfrom(9,  unfinished ...
Process 9375 detached


But under these heavy load onditions, an 'idle' PG process' strace looks
like:

[EMAIL PROTECTED] data]$ strace -p 22994
Process 22994 attached - interrupt to quit
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 11000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 14000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 17000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 31000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 51000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 2000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 4000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 5000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 2000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 2000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 3000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 6000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 12000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 12000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 23000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 27000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 47000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 7}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 2000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 4000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 7000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 11000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 16000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 19000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 35000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 53000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 75000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 76000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 102000}) = 0 (Timeout)
Process 22994 detached


So I guess there's something very wrong with the above 'select' calls.

Can somebody please shed some light on this? Let me know what
OS/hardware specs you need.

Any help is greatly appreciated.

Thanks in advance,

-- 
[EMAIL PROTECTED]
[EMAIL PROTECTED] gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB http://www.enterprisedb.com

Mail sent from my BlackLaptop device


Re: [HACKERS] Load spikes on 8.1.11

2008-07-17 Thread Gurjeet Singh
Just an addition... the strace o/p with selects timing out just runs almost
continuously, it doesn't seem to pause anywhere!

On Fri, Jul 18, 2008 at 9:16 AM, Gurjeet Singh [EMAIL PROTECTED]
wrote:

 Hi All,

 I have been perplexed by random load spikes on an 8.1.11 instance. many
 a times they are random, in the sense we cannot tie a particular scenario as
 the cause for it! But a few times we can see that when we are executing huge
 scripts, which include DDL as well as DML, the load on the box spikes to
 above 200. We see similar load spikes other times too when we are not
 running any such task on the DB.

 During these spikes, in the 'top' sessions we see the 'idle' PG
 processes consuming between 2 and 5 % CPU, and since the box has 8 CPUS (2
 sockets and each CPU is a quad core Intel Xeon processors) and somewhere
 around 200 Postgres processes, the load spikes to above 200; and it does
 this very sharply.

 We are running the scripts using psql -f, but we can see the load even
 while running the commands on by one!

 When there's no load, an strace session on an 'idle' PG process looks
 like:

 [EMAIL PROTECTED] data]$ strace -p 9375
 Process 9375 attached - interrupt to quit
 recvfrom(9,  unfinished ...
 Process 9375 detached


 But under these heavy load onditions, an 'idle' PG process' strace
 looks like:

 [EMAIL PROTECTED] data]$ strace -p 22994
 Process 22994 attached - interrupt to quit
 select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 1}) = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 11000}) = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 14000}) = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 17000}) = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 31000}) = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 51000}) = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 2000})  = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 4000})  = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 5000})  = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 2000})  = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 2000})  = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 3000})  = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 6000})  = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 12000}) = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 12000}) = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 23000}) = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 27000}) = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 47000}) = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 7}) = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 2000})  = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 4000})  = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 7000})  = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 11000}) = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 16000}) = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 19000}) = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 35000}) = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 53000}) = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 75000}) = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 76000}) = 0 (Timeout)
 select(0, NULL, NULL, NULL, {0, 102000}) = 0 (Timeout)
 Process 22994 detached


 So I guess there's something very wrong with the above 'select' calls.

 Can somebody please shed some light on this? Let me know what
 OS/hardware specs you need.

 Any help is greatly appreciated.

 Thanks in advance,

 --
 [EMAIL PROTECTED]
 [EMAIL PROTECTED] gmail | hotmail | indiatimes | yahoo }.com

 EnterpriseDB http://www.enterprisedb.com

 Mail sent from my BlackLaptop device




-- 
[EMAIL PROTECTED]
[EMAIL PROTECTED] gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB http://www.enterprisedb.com

Mail sent from my BlackLaptop device


Re: [HACKERS] Load spikes on 8.1.11

2008-07-17 Thread Tom Lane
Gurjeet Singh [EMAIL PROTECTED] writes:
 During these spikes, in the 'top' sessions we see the 'idle' PG
 processes consuming between 2 and 5 % CPU, and since the box has 8 CPUS (2
 sockets and each CPU is a quad core Intel Xeon processors) and somewhere
 around 200 Postgres processes, the load spikes to above 200; and it does
 this very sharply.

This looks like heavy contention for a spinlock.  You need to get a
higher-level analysis of what's happening before anyone can say much
more than that.

Note that 8.1 is pretty much ancient history as far as scalability to
8-core hardware goes.  You should probably consider updating to 8.3
before investing too much time in tracking down what's happening.
If you can still show the problem on 8.3 then there would be some
interest in fixing it ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Load spikes on 8.1.11

2008-07-17 Thread Gurjeet Singh
On Fri, Jul 18, 2008 at 9:58 AM, Tom Lane [EMAIL PROTECTED] wrote:

 Gurjeet Singh [EMAIL PROTECTED] writes:
  During these spikes, in the 'top' sessions we see the 'idle' PG
  processes consuming between 2 and 5 % CPU, and since the box has 8 CPUS
 (2
  sockets and each CPU is a quad core Intel Xeon processors) and somewhere
  around 200 Postgres processes, the load spikes to above 200; and it does
  this very sharply.

 This looks like heavy contention for a spinlock.  You need to get a
 higher-level analysis of what's happening before anyone can say much
 more than that.

 Note that 8.1 is pretty much ancient history as far as scalability to
 8-core hardware goes.  You should probably consider updating to 8.3
 before investing too much time in tracking down what's happening.
 If you can still show the problem on 8.3 then there would be some
 interest in fixing it ...


Upgrading is on the cards, but not as high priority as I would like it to
be! This is a production box, and we desperatly need some respite from these
spikes.

Can you please elaborate on what high level diagnosis would you need?

I just ran DROP SCHEMA _slony schema CASCADE; and it spiked again, on a
very low loaded box!!

Thanks for all you help.



-- 
[EMAIL PROTECTED]
[EMAIL PROTECTED] gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB http://www.enterprisedb.com

Mail sent from my BlackLaptop device


Re: [HACKERS] Load spikes on 8.1.11

2008-07-17 Thread Gurjeet Singh
On Fri, Jul 18, 2008 at 10:05 AM, Gurjeet Singh [EMAIL PROTECTED]
wrote:

 On Fri, Jul 18, 2008 at 9:58 AM, Tom Lane [EMAIL PROTECTED] wrote:

 Gurjeet Singh [EMAIL PROTECTED] writes:
  During these spikes, in the 'top' sessions we see the 'idle' PG
  processes consuming between 2 and 5 % CPU, and since the box has 8 CPUS
 (2
  sockets and each CPU is a quad core Intel Xeon processors) and somewhere
  around 200 Postgres processes, the load spikes to above 200; and it does
  this very sharply.

 This looks like heavy contention for a spinlock.  You need to get a
 higher-level analysis of what's happening before anyone can say much
 more than that.

 Note that 8.1 is pretty much ancient history as far as scalability to
 8-core hardware goes.  You should probably consider updating to 8.3
 before investing too much time in tracking down what's happening.
 If you can still show the problem on 8.3 then there would be some
 interest in fixing it ...


 Upgrading is on the cards, but not as high priority as I would like it to
 be! This is a production box, and we desperatly need some respite from these
 spikes.

 Can you please elaborate on what high level diagnosis would you need?

 I just ran DROP SCHEMA _slony schema CASCADE; and it spiked again, on a
 very low loaded box!!

 Thanks for all you help.


Would reducing the number of connections on the DB help in reducing the
spike?



-- 
[EMAIL PROTECTED]
[EMAIL PROTECTED] gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB http://www.enterprisedb.com

Mail sent from my BlackLaptop device


Re: [HACKERS] Load spikes on 8.1.11

2008-07-17 Thread Tom Lane
Gurjeet Singh [EMAIL PROTECTED] writes:
 Can you please elaborate on what high level diagnosis would you need?

Well, we'd need some idea of which spinlock is being contended for...

 I just ran DROP SCHEMA _slony schema CASCADE; and it spiked again, on a
 very low loaded box!!

That *might* mean that the problem is contention for SInvalLock, since
dropping a large schema would result in a lot of sinval traffic.  Or
maybe it's something else.  Do your spikes correspond to large DDL
changes?

If your platform has oprofile or DTrace or some such then getting an
execution profile with that type of tool would tell something.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Load spikes on 8.1.11

2008-07-17 Thread Andrew Sullivan
On Fri, Jul 18, 2008 at 10:05:33AM +0530, Gurjeet Singh wrote:

 I just ran DROP SCHEMA _slony schema CASCADE; and it spiked again, on a
 very low loaded box!!

Ah, well, if slony is involved, then you have possible locking
problems in the database _also_ to contend with, along with the
spinlock problems.  This will for sure cause spikes.

You need to tell us more about what you're doing.  And I bet some of
it belongs on the slony lists.

A
-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Load spikes on 8.1.11

2008-07-17 Thread Gurjeet Singh
On Fri, Jul 18, 2008 at 10:21 AM, Andrew Sullivan [EMAIL PROTECTED]
wrote:

 On Fri, Jul 18, 2008 at 10:05:33AM +0530, Gurjeet Singh wrote:

  I just ran DROP SCHEMA _slony schema CASCADE; and it spiked again, on a
  very low loaded box!!

 Ah, well, if slony is involved, then you have possible locking
 problems in the database _also_ to contend with, along with the
 spinlock problems.  This will for sure cause spikes.

 You need to tell us more about what you're doing.  And I bet some of
 it belongs on the slony lists.


I am in the eye of the storm right now.

Just started INIT cluster Slonik command and that spiked too.. for more than
10 minutes now!!


-- 
[EMAIL PROTECTED]
[EMAIL PROTECTED] gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB http://www.enterprisedb.com

Mail sent from my BlackLaptop device