Just an addition: the strace output with the select() calls timing out runs almost continuously; it doesn't seem to pause anywhere!
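In case it helps anyone look at the pattern in the quoted trace, here is a minimal sketch in Python that pulls the select() timeouts out of an strace capture and groups them into runs, starting a new run whenever a timeout is smaller than the one before it. The function names are my own and purely illustrative, the regular expression only matches the exact `select(0, NULL, NULL, NULL, {sec, usec})` form shown in the trace, and the sample input is just the first eight select lines from the quoted message.

```python
import re

# Matches calls such as: select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
# Not anchored, so quoted ("> ") or flattened text is handled as well.
SELECT_RE = re.compile(r"select\(0, NULL, NULL, NULL, \{(\d+), (\d+)\}\)")


def parse_timeouts(text):
    """Return each select() timeout in microseconds, in order of appearance."""
    return [int(sec) * 1_000_000 + int(usec)
            for sec, usec in SELECT_RE.findall(text)]


def split_runs(timeouts):
    """Group timeouts into runs; a smaller value than the previous starts a new run."""
    runs = []
    for t in timeouts:
        if runs and t >= runs[-1][-1]:
            runs[-1].append(t)
        else:
            runs.append([t])
    return runs


if __name__ == "__main__":
    # First eight select() lines of the trace quoted in this message.
    sample = """\
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 11000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 14000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 17000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 31000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 51000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
"""
    for run in split_runs(parse_timeouts(sample)):
        print(len(run), "calls, slept", sum(run), "us:", run)
```

Saving the strace output to a file and passing its contents to `parse_timeouts` should work the same way. Equal consecutive timeouts are kept in the same run, which is an arbitrary choice on my part.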

On Fri, Jul 18, 2008 at 9:16 AM, Gurjeet Singh <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> I have been perplexed by random load spikes on an 8.1.11 instance. Many
> times they are random, in the sense that we cannot tie them to a particular
> cause. But a few times we can see that when we are executing huge
> scripts, which include DDL as well as DML, the load on the box spikes to
> above 200. We see similar load spikes at other times too, when we are not
> running any such task on the DB.
>
> During these spikes, in the 'top' sessions we see the 'idle' PG
> processes consuming between 2 and 5% CPU, and since the box has 8 cores (2
> sockets, each with a quad-core Intel Xeon processor) and somewhere
> around 200 Postgres processes, the load spikes to above 200; and it does
> this very sharply.
>
> We are running the scripts using psql -f, but we can see the load even
> while running the commands one by one!
>
> When there's no load, an strace session on an 'idle' PG process looks
> like:
>
> [EMAIL PROTECTED] data]$ strace -p 9375
> Process 9375 attached - interrupt to quit
> recvfrom(9, <unfinished ...>
> Process 9375 detached
>
> But under these heavy load conditions, an 'idle' PG process' strace
> looks like:
>
> [EMAIL PROTECTED] data]$ strace -p 22994
> Process 22994 attached - interrupt to quit
> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 11000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 14000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 17000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 31000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 51000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 2000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 4000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 5000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 2000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 2000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 3000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 6000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 12000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 12000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 23000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 27000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 47000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 70000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 2000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 4000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 7000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 11000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 16000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 19000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 35000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 53000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 75000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 76000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 102000}) = 0 (Timeout)
> Process 22994 detached
>
> So I guess there's something very wrong with the above 'select' calls.
>
> Can somebody please shed some light on this? Let me know what
> OS/hardware specs you need.
>
> Any help is greatly appreciated.
>
> Thanks in advance,
>
> --
> [EMAIL PROTECTED]
> [EMAIL PROTECTED] gmail | hotmail | indiatimes | yahoo }.com
>
> EnterpriseDB http://www.enterprisedb.com
>
> Mail sent from my BlackLaptop device

--
[EMAIL PROTECTED]
[EMAIL PROTECTED] gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB http://www.enterprisedb.com

Mail sent from my BlackLaptop device