Re: [PERFORM] Ineffective autovacuum

2011-09-27 Thread Marti Raudsepp
1. First things first: vacuum cannot delete tuples that are still
visible to any old running transactions. You might have some very long
queries or transactions that prevent it from cleaning properly:

select * from pg_stat_activity where xact_start  now()-interval '10 minutes';

2. On 8.3 and earlier servers with large tables, it's critical that
your max_fsm_pages and max_fsm_relations are tuned properly. Failing
that, autovacuum will permanently leak space that can only be fixed
with a VACUUM FULL (which will take an exclusive lock and run for a
very long time)

PostgreSQL version 8.4 addressed this problem, but for the
unfortunate, you have to follow the tuning advice here:
https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server#autovacuum_max_fsm_pages.2C_max_fsm_relations

On Tue, Sep 27, 2011 at 08:08, Royce Ausburn royce...@inomial.com wrote:
 I've noticed that the same tables seem to be auto vacuum'd over and over 
 again… Some of the tables are a bit surprising in that they're updated 
 semi-regularly, but not enough (I'd think) to warrant an autovacuum every few 
 minutes… Is this unusual?

Maybe they're just auto-analyze processes? Those get triggered on
insert-only tables too, when vacuum normally wouldn't run.

 Perhaps unrelated: I've done some digging around and happened across a 
 nightly task doing:
 select pg_stat_reset()

AFAIK (but I could be wrong), vacuum uses a separate set of statistics
not affected by pg_stat_reset.

Regards,
Marti

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Ineffective autovacuum

2011-09-27 Thread Tom Lane
Royce Ausburn royce...@inomial.com writes:
 Since sending this first email I've up'd the autovacuum log level and I've 
 noticed that the same tables seem to be auto vacuum'd over and over again… 
 Some of the tables are a bit surprising in that they're updated 
 semi-regularly, but not enough (I'd think) to warrant an autovacuum every few 
 minutes… Is this unusual?

Well, that proves autovacuum isn't getting blocked anyway.  At this
point I suspect that Marti has fingered the correct issue: you likely
need to increase the FSM settings.  You should try running a manual
VACUUM VERBOSE and see if it suggests that more FSM space is needed
(there'll be some FSM stats at the end of the verbose printout).

 Perhaps unrelated: I've done some digging around and happened across a 
 nightly task doing:
 select pg_stat_reset()
 on each of the databases in the cluster…. I've no idea why we're doing that 
 (and our usual sysadmin / DBA has resigned, so I doubt I'll ever know).  
 There must have been a reason at the time, but I wonder if this might be 
 interfering with things?

Hmm, it's not helping any.  Anything that needs vacuuming, but less
often than once a day, would get missed due to the stats getting
forgotten.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Ineffective autovacuum

2011-09-27 Thread Scott Marlowe
On Tue, Sep 27, 2011 at 7:49 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Royce Ausburn royce...@inomial.com writes:
 Since sending this first email I've up'd the autovacuum log level and I've 
 noticed that the same tables seem to be auto vacuum'd over and over again… 
 Some of the tables are a bit surprising in that they're updated 
 semi-regularly, but not enough (I'd think) to warrant an autovacuum every 
 few minutes… Is this unusual?

 Well, that proves autovacuum isn't getting blocked anyway.  At this
 point I suspect that Marti has fingered the correct issue: you likely
 need to increase the FSM settings.  You should try running a manual
 VACUUM VERBOSE and see if it suggests that more FSM space is needed
 (there'll be some FSM stats at the end of the verbose printout).

That's the probably the best first step, a good second one might be to
increase the aggressiveness of autovac by lowering the delay, and
increasing the cost limit.

OP: You need to watch it a little closer during the day.  first do as
suggested and increase the max_fsm_pages.  High settings on it don't
cost a lot as they're only 6 bytes per page.  So 1M max_fsm_pages
costs 6M of shared RAM.  After that run vacuum verbose every hour or
two to keep an eye on the trend of how many pages it says are needed.
If that number doesn't stabilize, but just keeps growing then you're
not vacuuming aggressively enough.  Up autovacuum_vacuum_cost_limit by
a couple of factors, and lower autovacuum_vacuum_cost_delay to 5ms or
less.  Make sure you don't swamp your IO subsystem.  On big machines
with lots of spindles it's hard to swamp the IO.  On smaller
workstation class machines it's pretty easy.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Ineffective autovacuum

2011-09-27 Thread Royce Ausburn

On 27/09/2011, at 8:29 PM, Marti Raudsepp wrote:

 1. First things first: vacuum cannot delete tuples that are still
 visible to any old running transactions. You might have some very long
 queries or transactions that prevent it from cleaning properly:
 
 select * from pg_stat_activity where xact_start  now()-interval '10 minutes';

Thanks -- that query is very handy.  I suspect this might be the cause of our 
woes as this query results in a handful of long lived connections, however 
they're connections to databases other than the one that I'm having trouble 
with.  

I've checked up on the FSM as you suggested, I don't think that's the problem 
as there're no warnings in the verbose output nor the logs.  But another clue:

DETAIL:  93 dead row versions cannot be removed yet.

After clearing those stuffed transactions vacuum verbose manages to clear away 
all the dead rows… That's confirmation enough for me - Now to find the 
application bugs - Thanks Tom, Marti  Scott for your help!

--Royce



Re: [PERFORM] Ineffective autovacuum

2011-09-26 Thread Tom Lane
Royce Ausburn royce...@inomial.com writes:
 I have a problem with autovacuum apparently not doing the job I need it to do.

Hm, I wonder whether you're getting bit by bug #5759, which was fixed
after 8.3.12.

 I have a table named datasession that is frequently inserted, updated and 
 deleted from.  Typically the table will have a few thousand rows in it.  Each 
 row typically survives a few days and is updated every 5 - 10 mins.  The 
 application receives unreliable, potentially duplicate data from its source, 
 so this table is heavily used for synchronising application threads as well.  
 A typical access pattern is:

 - tx begin
 - SELECT FOR UPDATE on a single row
 - Do some application processing (1 - 100 ms)
 - Possibly UPDATE the row
 - tx commit

Transactions of that form would not interfere with autovacuum.  You'd
need something that wants exclusive lock, like a schema change.

 I've read some recent threads and found a discussion (below) on auto vacuum 
 that mentions auto vacuum will be cancelled when a client requests a lock 
 that auto vacuum is using… My questions:
 1) Does it look like I'm affected by the same problem as in the below 
 discussion?

Not unless you're seeing a lot of canceling autovacuum task messages
in the postmaster log.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Ineffective autovacuum

2011-09-26 Thread Royce Ausburn



On 27/09/2011, at 2:21 PM, Tom Lane wrote:

 Royce Ausburn royce...@inomial.com writes:
 I have a problem with autovacuum apparently not doing the job I need it to 
 do.
 
 Hm, I wonder whether you're getting bit by bug #5759, which was fixed
 after 8.3.12.

If this were the case, would I see lots of auto vacuum worker processes in ps 
that are essentially doing nothing because they're sleeping all the time?  If 
so, then I think perhaps not.

 
 I have a table named datasession that is frequently inserted, updated and 
 deleted from.  Typically the table will have a few thousand rows in it.  
 Each row typically survives a few days and is updated every 5 - 10 mins.  
 The application receives unreliable, potentially duplicate data from its 
 source, so this table is heavily used for synchronising application threads 
 as well.  A typical access pattern is:
 
 - tx begin
 - SELECT FOR UPDATE on a single row
 - Do some application processing (1 - 100 ms)
 - Possibly UPDATE the row
 - tx commit
 
 Transactions of that form would not interfere with autovacuum.  You'd
 need something that wants exclusive lock, like a schema change.
 
 I've read some recent threads and found a discussion (below) on auto vacuum 
 that mentions auto vacuum will be cancelled when a client requests a lock 
 that auto vacuum is using∑ My questions:
 1) Does it look like I'm affected by the same problem as in the below 
 discussion?
 
 Not unless you're seeing a lot of canceling autovacuum task messages
 in the postmaster log.

Okay - This is not the case.

Since sending this first email I've up'd the autovacuum log level and I've 
noticed that the same tables seem to be auto vacuum'd over and over again… Some 
of the tables are a bit surprising in that they're updated semi-regularly, but 
not enough (I'd think) to warrant an autovacuum every few minutes… Is this 
unusual?


Perhaps unrelated: I've done some digging around and happened across a nightly 
task doing:

select pg_stat_reset()

on each of the databases in the cluster…. I've no idea why we're doing that 
(and our usual sysadmin / DBA has resigned, so I doubt I'll ever know).  There 
must have been a reason at the time, but I wonder if this might be interfering 
with things?

At any rate, I think the logs might glean some more interesting information, 
I'll let it alone for a few hours and hopefully I'll have some more useful 
information.

--Royce


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance