[PERFORM] Simple query showing 270 hours of CPU time

2007-07-20 Thread Dan Harris
Today, I looked at 'top' on my PG server and saw a pid that reported 270 hours 
of CPU time.  Considering this is a very simple query, I was surprised to say 
the least.  I was about to just kill the pid, but I figured I'd try and see 
exactly what it was stuck doing for so long.


Here's the strace summary as run for a few second sample:

% time seconds  usecs/call callserrors syscall
-- --- --- - - 
 97.250.671629  92  7272   semop
  1.760.012171 40630   recvfrom
  0.570.003960  6660   gettimeofday
  0.360.002512  2890   sendto
  0.050.000317  1032   lseek
  0.010.49   148   select
-- --- --- - - 
100.000.690638  7532   total

Here's the query:

select id from eventkeywords where word = '3322'

If I run the query manually, it completes in about 500ms, which is very 
reasonable.

There are 408563 rows in this table.  I just noticed there is no index on word ( 
there should be! ).  Would this have caused the problem?


This is 8.0.12

Linux sunrise 2.6.15-26-amd64-server #1 SMP Fri Sep 8 20:33:15 UTC 2006 x86_64 
GNU/Linux


Any idea what might have set it into this loop?

-Dan

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Simple query showing 270 hours of CPU time

2007-07-20 Thread Tom Lane
Dan Harris [EMAIL PROTECTED] writes:
 Here's the strace summary as run for a few second sample:

 % time seconds  usecs/call callserrors syscall
 -- --- --- - - 
   97.250.671629  92  7272   semop
1.760.012171 40630   recvfrom
0.570.003960  6660   gettimeofday
0.360.002512  2890   sendto
0.050.000317  1032   lseek
0.010.49   148   select
 -- --- --- - - 
 100.000.690638  7532   total

 Here's the query:
 select id from eventkeywords where word = '3322'

How sure are you that (a) that's really what it's doing and (b) you are
not observing multiple executions of the query?  There are no recvfrom
calls in the inner loops of the backend AFAIR, so this looks to me like
the execution of 30 different queries.  The number of semops is
distressingly high, but that's a contention issue not an
amount-of-runtime issue.  I think you're looking at a backend that has
simply executed one heckuva lot of queries on behalf of its client,
and that inquiring into what the client thinks it's doing might be the
first order of business.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] Simple query showing 270 hours of CPU time

2007-07-20 Thread PFC


Today, I looked at 'top' on my PG server and saw a pid that reported 270  
hours of CPU time.  Considering this is a very simple query, I was  
surprised to say the least.  I was about to just kill the pid, but I  
figured I'd try and see exactly what it was stuck doing for so long.


	If you are using connection pooling, or if your client keeps the  
connections for a long time, this backend could be very old...
	With PHP's persistent connections, for instance, backends restart when  
you restart the webserver, which isn't usually very often.


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] Simple query showing 270 hours of CPU time

2007-07-20 Thread Dan Harris

Tom Lane wrote:

Dan Harris [EMAIL PROTECTED] writes:

Here's the strace summary as run for a few second sample:



% time seconds  usecs/call callserrors syscall
-- --- --- - - 
  97.250.671629  92  7272   semop
   1.760.012171 40630   recvfrom
   0.570.003960  6660   gettimeofday
   0.360.002512  2890   sendto
   0.050.000317  1032   lseek
   0.010.49   148   select
-- --- --- - - 
100.000.690638  7532   total



Here's the query:
select id from eventkeywords where word = '3322'


How sure are you that (a) that's really what it's doing and (b) you are
not observing multiple executions of the query?  There are no recvfrom
calls in the inner loops of the backend AFAIR, so this looks to me like
the execution of 30 different queries.  The number of semops is
distressingly high, but that's a contention issue not an
amount-of-runtime issue.  


You were absolutely right.  This is one connection that is doing a whole lot of 
( slow ) queries.  I jumped the gun on this and assumed it was a single query 
taking this long.  Sorry to waste time and bandwidth.


Since you mentioned the number of semops is distressingly high, does this 
indicate a tuning problem?  The machine has 64GB of RAM and as far as I can tell 
about 63GB is all cache.  I wonder if this is a clue to an undervalued 
memory-related setting somewhere?


-Dan

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Simple query showing 270 hours of CPU time

2007-07-20 Thread Tom Lane
Dan Harris [EMAIL PROTECTED] writes:
 Since you mentioned the number of semops is distressingly high, does this 
 indicate a tuning problem?

More like an old-version problem.  We've done a lot of work on
concurrent performance since 8.0.x, and most likely you are hitting
one of the bottlenecks that have been solved since then.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate