Re: [GENERAL] PG9.1.2 -- 3 day old orphaned/non-responsive query -- pg_(cancel)(terminate)_backend has no effect

2012-01-14 Thread Josh Kupershmidt
On Fri, Jan 13, 2012 at 12:02 PM, Reid Thompson reid.thomp...@ateb.com wrote:

 The select query has been apparently orphaned by it's originating
 terminal or script or application.  pg_/cancel/terminate/_backend both
 return t, but the query does not cancel/terminate.   sigterm via the OS
 has no effect either.

You're sure you're killing the right backend (looks like the process
ID should be 32697, from what you posted)? What does strace tell you
the backend is doing?

 I'm not sure whether the vacuum is active, and just slow, or if it's
 spinning, or...?  I've taken no action toward the vacuum process.

Above you talk about the select query has been apparently orphaned.
Are you trying to kill off the vacuum or the SELECT query?

Josh

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] PG9.1.2 -- 3 day old orphaned/non-responsive query -- pg_(cancel)(terminate)_backend has no effect

2012-01-13 Thread Reid Thompson
Looking for some quidance or suggestions.

PostgreSQL 9.1.2 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.5 
20110214 (Red Hat 4.4.5-6), 64-bit
$ uname -a
Linux db1.hw.ateb.com 2.6.32-131.21.1.el6.x86_64 #1 SMP Tue Nov 22 19:48:09 GMT 
2011 x86_64 x86_64 x86_64 GNU/Linux


   datid|  datname  | procpid |  usesysid  | usename  | application_name | 
client_addr  | client_hostname | client_port | backend_start |  
xact_start   |  query_start  | waiting |  
  current_query 
+---+-++--+--+--+-+-+---+---+---+-+--

 2034098332 | reporting |   30479 | 2033174751 | jbaucom  |  | 
172.16.48.79 | |1860 | 2012-01-10 10:55:03.784225-05 | 
2012-01-10 10:56:28.726281-05 | 2012-01-10 10:56:28.738283-05 | f   | s
elect * from data_vw where clientid = 164 and filldate  '01-01-2012'
 2034098332 | reporting |   32697 | 10 | postgres |  |  
| | | 2012-01-12 19:02:06.911721-05 | 
2012-01-12 19:02:06.941263-05 | 2012-01-12 19:02:06.941263-05 | f   | a
utovacuum: VACUUM public.data_201201
(2 rows)


There are 3 db's in this cluster. These above two activities are
essentially hitting the same underlying child table.  data_vw is on
data_parent which has data_MM children.

The select query has been apparently orphaned by it's originating
terminal or script or application.  pg_/cancel/terminate/_backend both
return t, but the query does not cancel/terminate.   sigterm via the OS
has no effect either.

I'm not sure whether the vacuum is active, and just slow, or if it's
spinning, or...?  I've taken no action toward the vacuum process.

reporting=# select count(*) from data_201201;
  count  
-
 9170468
(1 row)

There are approx 500K records inserted daily, so the table had approx
7.5M records when the query started.

$ cat /proc/30479/wchan 
sk_stream_wait_memory


Thanks,
reid


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general