Hi All,

    I have been perplexed by random load spikes on an 8.1.11 instance. Most
of the time they appear random, in the sense that we cannot tie them to any
particular scenario. A few times, though, we have seen that when we execute
huge scripts, which include DDL as well as DML, the load on the box spikes
to above 200. We see similar load spikes at other times too, when we are not
running any such task on the DB.

    During these spikes, 'top' shows the 'idle' PG processes consuming
between 2 and 5% CPU each. Since the box has 8 cores (2 sockets, each with
a quad-core Intel Xeon processor) and somewhere around 200 Postgres
processes, the load spikes to above 200, and it does so very sharply.

    We are running the scripts using psql -f, but we can see the load spike
even while running the commands one by one!

    When there's no load, an strace session on an 'idle' PG process looks
like:

[EMAIL PROTECTED] data]$ strace -p 9375
Process 9375 attached - interrupt to quit
recvfrom(9,  <unfinished ...>
Process 9375 detached


    But under these heavy load conditions, the strace of an 'idle' PG
process looks like:

[EMAIL PROTECTED] data]$ strace -p 22994
Process 22994 attached - interrupt to quit
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 11000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 14000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 17000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 31000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 51000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 2000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 4000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 5000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 2000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 2000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 3000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 6000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 12000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 12000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 23000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 27000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 47000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 70000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 2000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 4000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 7000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 11000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 16000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 19000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 35000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 53000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 75000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 76000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 102000}) = 0 (Timeout)
Process 22994 detached


    So I guess there's something very wrong with the above 'select' calls.
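
    In case it helps anyone reading the trace: the timeouts seem to grow and
then drop back to 1000 microseconds, which looks to me like some kind of
back-off-and-retry loop, though that is only my reading of the output. Below
is a rough Python sketch for summarising such a trace, assuming the strace
output has been saved as text. The function names (parse_delays, split_runs)
are my own, and the rule that a new run starts whenever the timeout shrinks
is an assumption, not something taken from the Postgres source.

```python
import re

# Matches lines such as: select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
SELECT_RE = re.compile(r"select\(0, NULL, NULL, NULL, \{(\d+), (\d+)\}\)")


def parse_delays(trace_text):
    """Return every select() timeout in microseconds, in trace order."""
    delays = []
    for line in trace_text.splitlines():
        match = SELECT_RE.search(line)
        if match:
            seconds, microseconds = int(match.group(1)), int(match.group(2))
            delays.append(seconds * 1_000_000 + microseconds)
    return delays


def split_runs(delays):
    """Group delays into runs; a new run starts when the delay shrinks.

    This grouping rule is an assumption made for illustration only.
    """
    runs = []
    for delay in delays:
        if runs and delay >= runs[-1][-1]:
            runs[-1].append(delay)
        else:
            runs.append([delay])
    return runs


if __name__ == "__main__":
    sample = """\
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
"""
    for run in split_runs(parse_delays(sample)):
        print(len(run), "sleeps,", sum(run), "usec slept, longest", max(run))
```

    Run against the full trace above, this should show how many separate
runs there are and how long the process slept in each one.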

    Can somebody please shed some light on this? Let me know what
OS/hardware specs you need.

    Any help is greatly appreciated.

Thanks in advance,

-- 
[EMAIL PROTECTED]
[EMAIL PROTECTED] gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB http://www.enterprisedb.com

Mail sent from my BlackLaptop device
