Re: [PERFORM] Ineffective autovacuum
1. First things first: vacuum cannot delete tuples that are still visible to any old running transactions. You might have some very long queries or transactions that prevent it from cleaning properly: select * from pg_stat_activity where xact_start now()-interval '10 minutes'; 2. On 8.3 and earlier servers with large tables, it's critical that your max_fsm_pages and max_fsm_relations are tuned properly. Failing that, autovacuum will permanently leak space that can only be fixed with a VACUUM FULL (which will take an exclusive lock and run for a very long time) PostgreSQL version 8.4 addressed this problem, but for the unfortunate, you have to follow the tuning advice here: https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server#autovacuum_max_fsm_pages.2C_max_fsm_relations On Tue, Sep 27, 2011 at 08:08, Royce Ausburn royce...@inomial.com wrote: I've noticed that the same tables seem to be auto vacuum'd over and over again… Some of the tables are a bit surprising in that they're updated semi-regularly, but not enough (I'd think) to warrant an autovacuum every few minutes… Is this unusual? Maybe they're just auto-analyze processes? Those get triggered on insert-only tables too, when vacuum normally wouldn't run. Perhaps unrelated: I've done some digging around and happened across a nightly task doing: select pg_stat_reset() AFAIK (but I could be wrong), vacuum uses a separate set of statistics not affected by pg_stat_reset. Regards, Marti -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Ineffective autovacuum
Royce Ausburn royce...@inomial.com writes: Since sending this first email I've up'd the autovacuum log level and I've noticed that the same tables seem to be auto vacuum'd over and over again⦠Some of the tables are a bit surprising in that they're updated semi-regularly, but not enough (I'd think) to warrant an autovacuum every few minutes⦠Is this unusual? Well, that proves autovacuum isn't getting blocked anyway. At this point I suspect that Marti has fingered the correct issue: you likely need to increase the FSM settings. You should try running a manual VACUUM VERBOSE and see if it suggests that more FSM space is needed (there'll be some FSM stats at the end of the verbose printout). Perhaps unrelated: I've done some digging around and happened across a nightly task doing: select pg_stat_reset() on each of the databases in the clusterâ¦. I've no idea why we're doing that (and our usual sysadmin / DBA has resigned, so I doubt I'll ever know). There must have been a reason at the time, but I wonder if this might be interfering with things? Hmm, it's not helping any. Anything that needs vacuuming, but less often than once a day, would get missed due to the stats getting forgotten. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Ineffective autovacuum
On Tue, Sep 27, 2011 at 7:49 AM, Tom Lane t...@sss.pgh.pa.us wrote: Royce Ausburn royce...@inomial.com writes: Since sending this first email I've up'd the autovacuum log level and I've noticed that the same tables seem to be auto vacuum'd over and over again… Some of the tables are a bit surprising in that they're updated semi-regularly, but not enough (I'd think) to warrant an autovacuum every few minutes… Is this unusual? Well, that proves autovacuum isn't getting blocked anyway. At this point I suspect that Marti has fingered the correct issue: you likely need to increase the FSM settings. You should try running a manual VACUUM VERBOSE and see if it suggests that more FSM space is needed (there'll be some FSM stats at the end of the verbose printout). That's the probably the best first step, a good second one might be to increase the aggressiveness of autovac by lowering the delay, and increasing the cost limit. OP: You need to watch it a little closer during the day. first do as suggested and increase the max_fsm_pages. High settings on it don't cost a lot as they're only 6 bytes per page. So 1M max_fsm_pages costs 6M of shared RAM. After that run vacuum verbose every hour or two to keep an eye on the trend of how many pages it says are needed. If that number doesn't stabilize, but just keeps growing then you're not vacuuming aggressively enough. Up autovacuum_vacuum_cost_limit by a couple of factors, and lower autovacuum_vacuum_cost_delay to 5ms or less. Make sure you don't swamp your IO subsystem. On big machines with lots of spindles it's hard to swamp the IO. On smaller workstation class machines it's pretty easy. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Ineffective autovacuum
On 27/09/2011, at 8:29 PM, Marti Raudsepp wrote: 1. First things first: vacuum cannot delete tuples that are still visible to any old running transactions. You might have some very long queries or transactions that prevent it from cleaning properly: select * from pg_stat_activity where xact_start now()-interval '10 minutes'; Thanks -- that query is very handy. I suspect this might be the cause of our woes as this query results in a handful of long lived connections, however they're connections to databases other than the one that I'm having trouble with. I've checked up on the FSM as you suggested, I don't think that's the problem as there're no warnings in the verbose output nor the logs. But another clue: DETAIL: 93 dead row versions cannot be removed yet. After clearing those stuffed transactions vacuum verbose manages to clear away all the dead rows… That's confirmation enough for me - Now to find the application bugs - Thanks Tom, Marti Scott for your help! --Royce
Re: [PERFORM] Ineffective autovacuum
Royce Ausburn royce...@inomial.com writes: I have a problem with autovacuum apparently not doing the job I need it to do. Hm, I wonder whether you're getting bit by bug #5759, which was fixed after 8.3.12. I have a table named datasession that is frequently inserted, updated and deleted from. Typically the table will have a few thousand rows in it. Each row typically survives a few days and is updated every 5 - 10 mins. The application receives unreliable, potentially duplicate data from its source, so this table is heavily used for synchronising application threads as well. A typical access pattern is: - tx begin - SELECT FOR UPDATE on a single row - Do some application processing (1 - 100 ms) - Possibly UPDATE the row - tx commit Transactions of that form would not interfere with autovacuum. You'd need something that wants exclusive lock, like a schema change. I've read some recent threads and found a discussion (below) on auto vacuum that mentions auto vacuum will be cancelled when a client requests a lock that auto vacuum is using My questions: 1) Does it look like I'm affected by the same problem as in the below discussion? Not unless you're seeing a lot of canceling autovacuum task messages in the postmaster log. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Ineffective autovacuum
On 27/09/2011, at 2:21 PM, Tom Lane wrote: Royce Ausburn royce...@inomial.com writes: I have a problem with autovacuum apparently not doing the job I need it to do. Hm, I wonder whether you're getting bit by bug #5759, which was fixed after 8.3.12. If this were the case, would I see lots of auto vacuum worker processes in ps that are essentially doing nothing because they're sleeping all the time? If so, then I think perhaps not. I have a table named datasession that is frequently inserted, updated and deleted from. Typically the table will have a few thousand rows in it. Each row typically survives a few days and is updated every 5 - 10 mins. The application receives unreliable, potentially duplicate data from its source, so this table is heavily used for synchronising application threads as well. A typical access pattern is: - tx begin - SELECT FOR UPDATE on a single row - Do some application processing (1 - 100 ms) - Possibly UPDATE the row - tx commit Transactions of that form would not interfere with autovacuum. You'd need something that wants exclusive lock, like a schema change. I've read some recent threads and found a discussion (below) on auto vacuum that mentions auto vacuum will be cancelled when a client requests a lock that auto vacuum is using∑ My questions: 1) Does it look like I'm affected by the same problem as in the below discussion? Not unless you're seeing a lot of canceling autovacuum task messages in the postmaster log. Okay - This is not the case. Since sending this first email I've up'd the autovacuum log level and I've noticed that the same tables seem to be auto vacuum'd over and over again… Some of the tables are a bit surprising in that they're updated semi-regularly, but not enough (I'd think) to warrant an autovacuum every few minutes… Is this unusual? Perhaps unrelated: I've done some digging around and happened across a nightly task doing: select pg_stat_reset() on each of the databases in the cluster…. I've no idea why we're doing that (and our usual sysadmin / DBA has resigned, so I doubt I'll ever know). There must have been a reason at the time, but I wonder if this might be interfering with things? At any rate, I think the logs might glean some more interesting information, I'll let it alone for a few hours and hopefully I'll have some more useful information. --Royce -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance