Re: [PERFORM] Ineffective autovacuum
1. First things first: vacuum cannot delete tuples that are still visible to any old running transactions. You might have some very long queries or transactions that prevent it from cleaning properly: select * from pg_stat_activity where xact_start now()-interval '10 minutes'; 2. On 8.3 and earlier servers with large tables, it's critical that your max_fsm_pages and max_fsm_relations are tuned properly. Failing that, autovacuum will permanently leak space that can only be fixed with a VACUUM FULL (which will take an exclusive lock and run for a very long time) PostgreSQL version 8.4 addressed this problem, but for the unfortunate, you have to follow the tuning advice here: https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server#autovacuum_max_fsm_pages.2C_max_fsm_relations On Tue, Sep 27, 2011 at 08:08, Royce Ausburn royce...@inomial.com wrote: I've noticed that the same tables seem to be auto vacuum'd over and over again… Some of the tables are a bit surprising in that they're updated semi-regularly, but not enough (I'd think) to warrant an autovacuum every few minutes… Is this unusual? Maybe they're just auto-analyze processes? Those get triggered on insert-only tables too, when vacuum normally wouldn't run. Perhaps unrelated: I've done some digging around and happened across a nightly task doing: select pg_stat_reset() AFAIK (but I could be wrong), vacuum uses a separate set of statistics not affected by pg_stat_reset. Regards, Marti -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Ineffective autovacuum
Royce Ausburn royce...@inomial.com writes: Since sending this first email I've up'd the autovacuum log level and I've noticed that the same tables seem to be auto vacuum'd over and over again⦠Some of the tables are a bit surprising in that they're updated semi-regularly, but not enough (I'd think) to warrant an autovacuum every few minutes⦠Is this unusual? Well, that proves autovacuum isn't getting blocked anyway. At this point I suspect that Marti has fingered the correct issue: you likely need to increase the FSM settings. You should try running a manual VACUUM VERBOSE and see if it suggests that more FSM space is needed (there'll be some FSM stats at the end of the verbose printout). Perhaps unrelated: I've done some digging around and happened across a nightly task doing: select pg_stat_reset() on each of the databases in the clusterâ¦. I've no idea why we're doing that (and our usual sysadmin / DBA has resigned, so I doubt I'll ever know). There must have been a reason at the time, but I wonder if this might be interfering with things? Hmm, it's not helping any. Anything that needs vacuuming, but less often than once a day, would get missed due to the stats getting forgotten. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Ineffective autovacuum
On Tue, Sep 27, 2011 at 7:49 AM, Tom Lane t...@sss.pgh.pa.us wrote: Royce Ausburn royce...@inomial.com writes: Since sending this first email I've up'd the autovacuum log level and I've noticed that the same tables seem to be auto vacuum'd over and over again… Some of the tables are a bit surprising in that they're updated semi-regularly, but not enough (I'd think) to warrant an autovacuum every few minutes… Is this unusual? Well, that proves autovacuum isn't getting blocked anyway. At this point I suspect that Marti has fingered the correct issue: you likely need to increase the FSM settings. You should try running a manual VACUUM VERBOSE and see if it suggests that more FSM space is needed (there'll be some FSM stats at the end of the verbose printout). That's the probably the best first step, a good second one might be to increase the aggressiveness of autovac by lowering the delay, and increasing the cost limit. OP: You need to watch it a little closer during the day. first do as suggested and increase the max_fsm_pages. High settings on it don't cost a lot as they're only 6 bytes per page. So 1M max_fsm_pages costs 6M of shared RAM. After that run vacuum verbose every hour or two to keep an eye on the trend of how many pages it says are needed. If that number doesn't stabilize, but just keeps growing then you're not vacuuming aggressively enough. Up autovacuum_vacuum_cost_limit by a couple of factors, and lower autovacuum_vacuum_cost_delay to 5ms or less. Make sure you don't swamp your IO subsystem. On big machines with lots of spindles it's hard to swamp the IO. On smaller workstation class machines it's pretty easy. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Ineffective autovacuum
On 27/09/2011, at 8:29 PM, Marti Raudsepp wrote: 1. First things first: vacuum cannot delete tuples that are still visible to any old running transactions. You might have some very long queries or transactions that prevent it from cleaning properly: select * from pg_stat_activity where xact_start now()-interval '10 minutes'; Thanks -- that query is very handy. I suspect this might be the cause of our woes as this query results in a handful of long lived connections, however they're connections to databases other than the one that I'm having trouble with. I've checked up on the FSM as you suggested, I don't think that's the problem as there're no warnings in the verbose output nor the logs. But another clue: DETAIL: 93 dead row versions cannot be removed yet. After clearing those stuffed transactions vacuum verbose manages to clear away all the dead rows… That's confirmation enough for me - Now to find the application bugs - Thanks Tom, Marti Scott for your help! --Royce
[PERFORM] Ineffective autovacuum
Hi all, I have a problem with autovacuum apparently not doing the job I need it to do. I have a table named datasession that is frequently inserted, updated and deleted from. Typically the table will have a few thousand rows in it. Each row typically survives a few days and is updated every 5 - 10 mins. The application receives unreliable, potentially duplicate data from its source, so this table is heavily used for synchronising application threads as well. A typical access pattern is: - tx begin - SELECT FOR UPDATE on a single row - Do some application processing (1 - 100 ms) - Possibly UPDATE the row - tx commit In a few instances of our application we're seeing this table grow obscenely to the point where our monitoring servers get us out of bed to manually vacuum. I like sleep, so I want to fix this =D I've read some recent threads and found a discussion (below) on auto vacuum that mentions auto vacuum will be cancelled when a client requests a lock that auto vacuum is using… My questions: 1) Does it look like I'm affected by the same problem as in the below discussion? 2) Are there better solutions to this problem than a periodic task that vacuums/truncates-and-rebuilds the table? Perhaps relevant info: # select version(); version -- PostgreSQL 8.3.12 on x86_64-pc-linux-gnu, compiled by GCC cc (GCC) 4.2.4 (Ubuntu 4.2.4-1ubuntu3) (1 row) Auto vacuum and vacuum parameters are set to the factory defaults. Cheers, --Royce From: Tom Lane t...@sss.pgh.pa.us Subject: Re: [GENERAL] Vacuum as easily obtained locks Date: 4 August 2011 1:52:02 AM AEST To: Michael Graham mgra...@bloxx.com Cc: Pavan Deolasee pavan.deola...@gmail.com, pgsql-gene...@postgresql.org On Wed, 2011-08-03 at 11:40 -0400, Tom Lane wrote: The other problem is that once autovacuum has gotten the lock, it has to keep it for long enough to re-scan the truncatable pages (to make sure they're still empty). And it is set up so that any access to the table will kick autovacuum off the lock. An access pattern like that would very likely prevent it from ever truncating, if there are a lot of pages that need to be truncated. (There's been some discussion of modifying this behavior, but nothing's been done about it yet.) Michael Graham mgra...@bloxx.com writes: Ah! This looks like it is very much the issue. Since I've got around 150GB of data that should be truncatable and a select every ~2s. Just to confirm would postgres write: 2011-08-03 16:09:55 BST ERROR: canceling autovacuum task 2011-08-03 16:09:55 BST CONTEXT: automatic vacuum of table traffic.public.logdata5queue Under those circumstances? Yup ... If you do a manual VACUUM, it won't allow itself to get kicked off the lock ... but as noted upthread, that will mean your other queries get blocked till it's done. Not sure there's any simple fix for this that doesn't involve some downtime. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-gene...@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [PERFORM] Ineffective autovacuum
Royce Ausburn royce...@inomial.com writes: I have a problem with autovacuum apparently not doing the job I need it to do. Hm, I wonder whether you're getting bit by bug #5759, which was fixed after 8.3.12. I have a table named datasession that is frequently inserted, updated and deleted from. Typically the table will have a few thousand rows in it. Each row typically survives a few days and is updated every 5 - 10 mins. The application receives unreliable, potentially duplicate data from its source, so this table is heavily used for synchronising application threads as well. A typical access pattern is: - tx begin - SELECT FOR UPDATE on a single row - Do some application processing (1 - 100 ms) - Possibly UPDATE the row - tx commit Transactions of that form would not interfere with autovacuum. You'd need something that wants exclusive lock, like a schema change. I've read some recent threads and found a discussion (below) on auto vacuum that mentions auto vacuum will be cancelled when a client requests a lock that auto vacuum is using My questions: 1) Does it look like I'm affected by the same problem as in the below discussion? Not unless you're seeing a lot of canceling autovacuum task messages in the postmaster log. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Ineffective autovacuum
On 27/09/2011, at 2:21 PM, Tom Lane wrote: Royce Ausburn royce...@inomial.com writes: I have a problem with autovacuum apparently not doing the job I need it to do. Hm, I wonder whether you're getting bit by bug #5759, which was fixed after 8.3.12. If this were the case, would I see lots of auto vacuum worker processes in ps that are essentially doing nothing because they're sleeping all the time? If so, then I think perhaps not. I have a table named datasession that is frequently inserted, updated and deleted from. Typically the table will have a few thousand rows in it. Each row typically survives a few days and is updated every 5 - 10 mins. The application receives unreliable, potentially duplicate data from its source, so this table is heavily used for synchronising application threads as well. A typical access pattern is: - tx begin - SELECT FOR UPDATE on a single row - Do some application processing (1 - 100 ms) - Possibly UPDATE the row - tx commit Transactions of that form would not interfere with autovacuum. You'd need something that wants exclusive lock, like a schema change. I've read some recent threads and found a discussion (below) on auto vacuum that mentions auto vacuum will be cancelled when a client requests a lock that auto vacuum is using∑ My questions: 1) Does it look like I'm affected by the same problem as in the below discussion? Not unless you're seeing a lot of canceling autovacuum task messages in the postmaster log. Okay - This is not the case. Since sending this first email I've up'd the autovacuum log level and I've noticed that the same tables seem to be auto vacuum'd over and over again… Some of the tables are a bit surprising in that they're updated semi-regularly, but not enough (I'd think) to warrant an autovacuum every few minutes… Is this unusual? Perhaps unrelated: I've done some digging around and happened across a nightly task doing: select pg_stat_reset() on each of the databases in the cluster…. I've no idea why we're doing that (and our usual sysadmin / DBA has resigned, so I doubt I'll ever know). There must have been a reason at the time, but I wonder if this might be interfering with things? At any rate, I think the logs might glean some more interesting information, I'll let it alone for a few hours and hopefully I'll have some more useful information. --Royce -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance