Re: [GENERAL] Enabling Autovacuum Postgres 9.1 (was Unexpectedly high disk space usage)
Greg Williamson wrote:

> > > Long running transactions can cause autovacuum processes to stall out or be autocancelled.
> >
> > Long running transactions - how long? In our system it's rare to have a transaction (even a prepared transaction) last much longer than a few minutes. Is that enough time to cause problems with AutoVacuum?
>
> > The only situation where I would expect that to be a problem is in a very small table which is updated hundreds of times per second.
>
> Could you elaborate on this, or point me to a previous thread?

I had a situation where there was a need for consecutive numbers (i.e., no gaps, to satisfy financial auditors), so we needed assignment of these numbers to be transactional rather than using SEQUENCE or SERIAL objects. There was a very small table for assigning these numbers, which was very frequently updated. In a quiescent state all rows in the table would fit in one page.

Before tuning autovacuum to be more aggressive, the table bloated, causing performance to fall off. Then autovacuum would kick in and it would get even worse. So naturally, my first response was to make autovacuum less aggressive, which caused the table to bloat even more under normal load, and caused autovacuum to have an even bigger impact when it did kick in. The table bloated to thousands of pages.

Then I tried the opposite approach: I cranked up autovacuum to be very aggressive. Under normal load the table settled in at five to ten pages and performance was great. However, any long-running transaction could still cause some bloat, so a big report could still cause this one table to become a performance problem. I found that a CLUSTER ran sub-second, because autovacuum did pare the index down to just the needed entries, so I set up a crontab job to CLUSTER this one table a few times per day. That worked out great for me.
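For illustration, "cranking up autovacuum" for one small, hot table can be done with per-table storage parameters (available in 8.4 and later); the table name and threshold values below are hypothetical, not from Kevin's system:

```sql
-- Hypothetical hot one-page table holding the next auditable number.
-- Make autovacuum fire after only a handful of dead rows and run
-- unthrottled, instead of waiting for the default thresholds.
ALTER TABLE invoice_number SET (
    autovacuum_vacuum_threshold = 10,      -- vacuum after ~10 dead rows
    autovacuum_vacuum_scale_factor = 0.0,  -- don't scale with table size
    autovacuum_vacuum_cost_delay = 0       -- no throttling for this table
);
```

The periodic CLUSTER could then be a crontab line along the lines of `0 */6 * * * psql -d mydb -c 'CLUSTER invoice_number USING invoice_number_pkey'` (again, database, table, and index names are illustrative).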
I think the trick is to try to make autovacuum keep up as much as possible, identify any issues it is not handling, and narrowly target those particular areas with extraordinary maintenance. -Kevin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Enabling Autovacuum Postgres 9.1 (was Unexpectedly high disk space usage)
Lists wrote:

> There's a wealth of how-to-tune-PG instruction that's old and (based on this thread alone) often stale enough to be classified as disinformative. For example, nearest I can tell, the entirety of this page is just wrong and/or irrelevant for 9.x and up: http://wiki.postgresql.org/wiki/VACUUM_FULL. In contrast to the advice to avoid Vacuum Full [...] comes the insight that vacuum full is necessary to clean up stale data that is not at the end of the table. (See Jeff Janes' 11/10/2012 email)

I thought about posting a clarification to what Jeff said there -- it is only necessary to run VACUUM FULL to free up space in the sense of having the database give disk space back to the OS, so that the next time you need space in the table it must be re-allocated from the OS. This can be a bad thing in terms of slower allocation of space for new tuples and allocation of non-contiguous disk space. You are much better off running autovacuum aggressively enough that you don't need to run VACUUM FULL than running it periodically. I have gone for years at a time on large databases (up to several TB) without ever running VACUUM FULL on a table. It is a valuable tool to recover from a bad state due to failure to properly maintain the database; it is not a substitute for doing things right.

> > Long running transactions can cause autovacuum processes to stall out or be autocancelled.
>
> Long running transactions - how long? In our system it's rare to have a transaction (even a prepared transaction) last much longer than a few minutes. Is that enough time to cause problems with AutoVacuum?

The only situation where I would expect that to be a problem is in a very small table which is updated hundreds of times per second.

> Should I increase the max_workers field from the default of 3 to (perhaps) 10? Noting that my solution to the disk space problem is effectively a max_worker of 1 since it's all done sequentially, I wonder if reducing max_workers would actually be better?
I would increase autovacuum_max_workers.

> Also, what's the thread count? Is that max_workers?

I believe that's what was meant.

> Why would I want to reduce the cost delay to 0, and how does this relate to cost_limit?

I've never gone all the way to 0. I would be reluctant to go below 10; rather, I would increase the cost limit. Autovacuum adds up costs for what it is doing, and when it hits the limit it sleeps for the cost delay interval. I would think you would want to allow the process to work on other things periodically.

> Given my substantial I/O subsystem, I'd want to drop cost_delay to near zero and set the cost_limit really high, which is a rough restatement of the last quoted paragraph above. (I think)

I agree.

> Assuming that I make these suggestions and notice a subsequent system load problem, what information should I be gathering in order to provide better post-incident forensics?

Run a report on bloat periodically, to make sure it is staying under control. You might want to monitor for long-running transactions and for prepared transactions which the transaction manager has lost track of. These can be real killers. During any incident of poor performance, it is good to gather iostat or vmstat samples at an interval of 1 to 10 seconds for a few minutes, and to capture the contents of pg_stat_activity and pg_locks. Turning on logging of checkpoint activity, autovacuum activity, and slow queries can provide useful information when you match times from the logging up against the times of slow periods.

> Lastly, there's the question of reindexing before full vacuum. I've observed that not doing a manual reindex prior to vacuum full did not, in fact, free up the space, even though I've been told that reindex is implicit in the vacuum process. (?!) I'm confident that I can reproduce this behavior given a bit of time to allow one of our database servers to bloat back up.

You never did say which version that was on.
In 9.0 and later, VACUUM FULL will drop and recreate all indexes from scratch after rebuilding the heap, so it's pretty hard to imagine why dropping and recreating all indexes right *before* rebuilding the heap would have much impact. Prior to 9.0, VACUUM FULL would bloat indexes, so it was always a good idea to REINDEX in addition to running VACUUM FULL, although it was much more effective to run it *after* the VACUUM FULL. -Kevin
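Kevin's forensics suggestions above - watching for long-running transactions and for orphaned prepared transactions - can be sketched with a couple of catalog queries. The column names below are for 9.1 (`procpid` and `current_query` were renamed to `pid` and `query` in 9.2), and the five-minute threshold is an arbitrary example:

```sql
-- Transactions that have been open longer than five minutes.
SELECT procpid, usename, xact_start, current_query
  FROM pg_stat_activity
 WHERE xact_start < now() - interval '5 minutes'
 ORDER BY xact_start;

-- Prepared transactions; anything old here may have been forgotten
-- by the transaction manager and will block vacuum cleanup.
SELECT gid, prepared, owner, database
  FROM pg_prepared_xacts
 ORDER BY prepared;
```

The logging Kevin mentions corresponds to `log_checkpoints = on`, `log_autovacuum_min_duration = 0`, and a suitable `log_min_duration_statement` in postgresql.conf.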
Re: [GENERAL] Enabling Autovacuum Postgres 9.1 (was Unexpectedly high disk space usage)
Kevin --

You wrote:

> > > ... running transactions can cause autovacuum processes to stall out or be autocancelled.
> >
> > Long running transactions - how long? In our system it's rare to have a transaction (even a prepared transaction) last much longer than a few minutes. Is that enough time to cause problems with AutoVacuum?
>
> The only situation where I would expect that to be a problem is in a very small table which is updated hundreds of times per second.

Could you elaborate on this, or point me to a previous thread?

I've got precisely such a database; currently we do an hourly reindex on all user tables in addition to some fairly standard autovac settings. The tables themselves don't seem to bloat much during ordinary operations, but if we don't reindex, performance tanks; when the site has issues we sometimes see table bloat, but it seems to be dealt with by autovac. This is a Postgres 9.1 instance w/ SSDs, lots of RAM (24 gigs) and relatively small tables (maybe a few thousand rows in most cases, total footprint of the database is 240 megs), being updated in the low thousands of times per second.

Thanks!

Greg Williamson
Re: [GENERAL] Enabling Autovacuum Postgres 9.1 (was Unexpectedly high disk space usage)
On 11/13/2012 04:04 AM, Lists wrote:

> There's a wealth of how-to-tune-PG instruction that's old and (based on this thread alone) often stale enough to be classified as disinformative. For example, nearest I can tell, the entirety of this page is just wrong and/or irrelevant for 9.x and up: http://wiki.postgresql.org/wiki/VACUUM_FULL.

Agreed, that needs fixing. I'll sort that out. That information was important before the VACUUM FULL rewrite, but is now severely outdated. It needs to be split into pre-9.1 and 9.1+ sections.

> In contrast to the advice to avoid Vacuum Full (http://wiki.postgresql.org/wiki/Introduction_to_VACUUM,_ANALYZE,_EXPLAIN,_and_COUNT) comes the insight that vacuum full is necessary to clean up stale data that is not at the end of the table. (See Jeff Janes' 11/10/2012 email)

That part isn't quite right AFAIK. If you keep table bloat under control, ordinary VACUUM is perfectly sufficient. You only need/want to VACUUM FULL if you wish to truncate a table, reducing its size by compacting rows. In an actively used database that's usually pretty pointless, since new rows will then be added to the end, growing the table. You might as well just keep on re-using the space, rather than going through those compact-and-expand cycles. You'll suffer from less file fragmentation that way and won't be paying the costs of file system allocations.

If you have a table that you've dramatically reduced in size (say, by deleting the vast majority of it) and you won't be adding more rows to replace the old ones, that's when VACUUM FULL makes sense.

It's a bit like those utilities that claim to "free" or "clean" or "de-fragment" memory. They seem good, but they're actually grossly counter-productive, because the system then has to re-read cached data and otherwise fight to get back to its old equilibrium. It's typically the same for Pg: you want to aim for equilibrium, not free space that'll just promptly get re-allocated.
If you do have a bad bloat problem, I'd set a non-default FILLFACTOR before doing a VACUUM FULL, so you still have some free space within the table after the vacuum completes. That way you won't be immediately paying the cost of allocating space for new rows as soon as any UPDATEs or INSERTs come in.

> non-full vacuum can only free space from the end of the table. This would imply that a full analyze is a good idea, at least periodically (weekly/monthly/quarterly) in a database that combines the use of temp tables and periodic changes to persistent objects. Does autovacuum ever do a full analyze? What about autovacuum and the reindex question at the end of this email?

I'm not sure there's any such thing as a "full analyze". VACUUM FULL ANALYZE means "do a VACUUM FULL and an ANALYZE", not "do a full analyze". Autovacuum should be taking care of analyze and table statistics. If it isn't, adjust the autovacuum parameters so that it does.

These two can be killers:

> > Long running transactions can cause autovacuum processes to stall out or be autocancelled.
>
> Long running transactions - how long? In our system it's rare to have a transaction (even a prepared transaction) last much longer than a few minutes. Is that enough time to cause problems with AutoVacuum?

Not generally, no. PostgreSQL can't clean up rows that are still visible to a transaction. So if your transactions are three minutes long, that's a three minute delay before VACUUM can clean up DELETEd rows or dead rows left by UPDATEs. Not a biggie even on a pretty high load DB. You should generally be concerned only when transactions are open over user think time or are abandoned by buggy applications - cases where the transaction length is many minutes or hours, potentially unbounded. Uncommitted prepared transactions are also a problem for similar reasons.

> Lastly, there's the question of reindexing before full vacuum.
> I've observed that not doing a manual reindex prior to vacuum full did not, in fact, free up the space, even though I've been told that reindex is implicit in the vacuum process. (?!)

VACUUM FULL, post-9.1, should take care of index bloat. Pre-9.1 VACUUM FULL could make index bloat worse. Ordinary VACUUM will not truncate indexes AFAIK, only mark free space within them so it can be re-used. Same deal as with the table itself: this is usually what you want.

-- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
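Craig's FILLFACTOR suggestion might look like the following sketch; the table name and percentage are illustrative, and on 9.0+ the VACUUM FULL rewrite honors the table's fillfactor:

```sql
-- Leave ~20% of each heap page free after the rewrite, so the first
-- wave of UPDATEs/INSERTs doesn't immediately extend the table.
ALTER TABLE bloated_table SET (fillfactor = 80);
VACUUM FULL bloated_table;
ANALYZE bloated_table;  -- refresh statistics after the rewrite
```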
Re: [GENERAL] Enabling Autovacuum Postgres 9.1 (was Unexpectedly high disk space usage)
On 11/13/2012 10:29 AM, Craig Ringer wrote:

> On 11/13/2012 04:04 AM, Lists wrote:
> > There's a wealth of how to tune PG instruction that's old and (based on this thread alone) often stale enough to be classified as disinformative. For example, nearest I can tell, the entirety of this page is just wrong and/or irrelevant for 9.x and up: http://wiki.postgresql.org/wiki/VACUUM_FULL.
>
> Agreed, that needs fixing. I'll sort that out. That information was important before the VACUUM FULL rewrite, but is now severely outdated. It needs to be split into pre-9.1 and 9.1+ sections.

Correction - pre-9.0 and 9.0+.

-- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [GENERAL] enabling autovacuum
Chander Ganesan wrote:

> Jeremy Harris wrote:
> > Version: PostgreSQL 8.2.4 on i686-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20070418 (Red Hat 4.1.2-10). We have one problematic table, which has a steady stream of entries and a weekly mass-delete of ancient history. The bloat query from Greg Sabino Mullane (thanks to Greg Smith for pointing it out) returns: [...]
>
> Inserts don't generate dead tuples, and AVD looks at obsolete tuples. As such, I wouldn't expect AVD to kick off until after you did a mass delete... assuming that delete was sizable enough to trigger a vacuum.

Ah, that would explain it - thank you. So I need to retreat to the question of why the weekly vacuum permits the observed bloat. Any ideas? More information that I could gather?

Thanks,
Jeremy

---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
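To make Chander's point concrete: autovacuum's trigger condition is roughly `obsolete tuples > autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples`. Assuming 8.2's defaults (base threshold 500, scale factor 0.2 - worth double-checking against your own postgresql.conf), the trigger point can be estimated from pg_class:

```sql
-- Approximate number of dead rows needed before autovacuum will
-- vacuum this table, assuming default 8.2 autovacuum settings.
SELECT relname, reltuples,
       500 + 0.2 * reltuples AS approx_dead_rows_to_trigger
  FROM pg_class
 WHERE relname = 'rcpt_audit';
```

For a table of ~1.3M rows that works out to a few hundred thousand dead rows, which a pure insert stream will never produce - hence no autovacuum between the weekly mass-deletes.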
Re: [GENERAL] enabling autovacuum
Jeremy Harris wrote:

> Chander Ganesan wrote:
> > Inserts don't generate dead tuples, and AVD looks at obsolete tuples. As such, I wouldn't expect AVD to kick off until after you did a mass delete... assuming that delete was sizable enough to trigger a vacuum.
>
> Ah, that would explain it - thank you. So I need to retreat to the question of why the weekly vacuum permits the observed bloat. Any ideas? More information that I could gather?

Autovacuum will kick off an analyze if you do enough inserts, however.
Re: [GENERAL] enabling autovacuum
Jeremy Harris wrote:

> Hi, We're starting to run autovacuum for the first time on a system that's been running with nightly cron-driven vacuum for some time. Version: PostgreSQL 8.2.4 on i686-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20070418 (Red Hat 4.1.2-10). We have one problematic table, which has a steady stream of entries and a weekly mass-delete of ancient history. The bloat query from Greg Sabino Mullane (thanks to Greg Smith for pointing it out) returns: [...]

Inserts don't generate dead tuples, and AVD looks at obsolete tuples. As such, I wouldn't expect AVD to kick off until after you did a mass delete... assuming that delete was sizable enough to trigger a vacuum.

-- Chander Ganesan, Open Technology Group, Inc. One Copley Parkway, Suite 210, Morrisville, NC 27560 Phone: 877-258-8987/919-463-0999 http://www.otg-nc.com Ask me about our Expert PostgreSQL & PostGIS Training
Re: [GENERAL] enabling autovacuum
On Jan 28, 2008 10:17 PM, Jeremy Harris [EMAIL PROTECTED] wrote:

> Hi, We're starting to run autovacuum for the first time on a system that's been running with nightly cron-driven vacuum for some time.
>
> Version: PostgreSQL 8.2.4 on i686-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20070418 (Red Hat 4.1.2-10)
>
> We have one problematic table, which has a steady stream of entries and a weekly mass-delete of ancient history. The bloat query from Greg Sabino Mullane (thanks to Greg Smith for pointing it out) returns:
>
>  schemaname | tablename  | reltuples | relpages | otta | tbloat | wastedpages | wastedbytes | wastedsize | iname                       | ituples | ipages | iotta | ibloat | wastedipages | wastedibytes | wastedisize
> ------------+------------+-----------+----------+------+--------+-------------+-------------+------------+-----------------------------+---------+--------+-------+--------+--------------+--------------+-------------
>  public     | rcpt_audit |   1300300 |   152149 | 6365 |   23.9 |      145784 |  1194262528 | 1139 MB    | rcpt_audit_msg_audit_id_idx | 1300300 |   6798 |  3819 |    1.8 |         2979 |     24403968 | 23 MB
>  public     | rcpt_audit |   1300300 |   152149 | 6365 |   23.9 |      145784 |  1194262528 | 1139 MB    | rcpt_audit_id_idx           | 1300300 |   4727 |  3819 |    1.2 |          908 |      7438336 | 7264 kB
>
> We have uncommented "autovacuum = on" in postgresql.conf and run "service postgresql reload". pg_stat_all_tables shows 4 tables as autoanalyzed at about that time; 3 of which were also autovacuumed. The problem table is not included; no other autos are logged there in the succeeding 24 hours. Is other action needed to enable autovacuum? The autovacuum tuning parameters are all at default settings. We have max_fsm_pages = 200 max_fsm_relations = 10 ... Are there any other changes we should make to stop this table getting so bloated?

Is it possible that this table didn't see many updates, today? You could add an entry to pg_catalog.pg_autovacuum to customize the handling of your Favorite Table. http://www.postgresql.org/docs/8.2/static/catalog-pg-autovacuum.html You might lower the thresholds for that table...
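For reference, an 8.2-era pg_autovacuum entry along the lines Christopher suggests might look like this sketch (the threshold values are illustrative only; -1 means "use the system default" for that column):

```sql
-- 8.2: per-table autovacuum settings lived in the pg_autovacuum catalog.
INSERT INTO pg_autovacuum
       (vacrelid, enabled,
        vac_base_thresh, vac_scale_factor,
        anl_base_thresh, anl_scale_factor,
        vac_cost_delay, vac_cost_limit,
        freeze_min_age, freeze_max_age)
VALUES ('rcpt_audit'::regclass, true,
        200, 0.05,   -- vacuum much sooner than the defaults
        100, 0.02,   -- analyze sooner as well
        -1, -1,      -- keep default cost settings
        -1, -1);     -- keep default freeze ages
```

On 8.4 and later this catalog was removed in favor of `ALTER TABLE ... SET (autovacuum_...)` storage parameters.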
-- http://linuxfinances.info/info/linuxdistributions.html "The definition of insanity is doing the same thing over and over and expecting different results." -- assortedly attributed to Albert Einstein, Benjamin Franklin, Rita Mae Brown, and Rudyard Kipling
Re: [GENERAL] enabling autovacuum
Christopher Browne wrote:

> Is it possible that this table didn't see many updates, today?

Nope; about 24000 (according to the id sequence).

- Jeremy
[OT] Re: [GENERAL] enabling autovacuum
On Mon, 2008-01-28 at 22:17 +, Jeremy Harris wrote:

> We have one problematic table, which has a steady stream of entries and a weekly mass-delete of ancient history. The bloat query from Greg Sabino Mullane (thanks to Greg Smith for pointing it out) returns: [bloat query output snipped]

Can you let me know what SQL is used to generate such a nice summary of the tables?
Re: [OT] Re: [GENERAL] enabling autovacuum
On Tue, 29 Jan 2008, Ow Mun Heng wrote:

> Can you let me know what is the sql used to generate such a nice summary of the tables?

Might as well dupe the old text; this went out to the performance list:

Greg Sabino Mullane released a Nagios plug-in for PostgreSQL that you can grab at http://bucardo.org/nagios_postgres/ , and while that is itself nice, the thing I found most remarkable is the bloat check. The majority of that code is an impressive bit of SQL that anyone could use even if you have no interest in Nagios, which is why I point it out for broader attention. Look in check_postgres.pl for the check_bloat routine and the big statement starting at the aptly labeled "This was fun to write" section. If you pull that out of there and replace $MINPAGES and $MINIPAGES near the end with real values, you can pop that into a standalone query and execute it directly. That's what gives the summary Jeremy included in his message.

-- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
Re: [OT] Re: [GENERAL] enabling autovacuum
On Mon, 2008-01-28 at 20:57 -0500, Greg Smith wrote:

> On Tue, 29 Jan 2008, Ow Mun Heng wrote:
> > Can you let me know what is the sql used to generate such a nice summary of the tables?
>
> Might as well dupe the old text; this went out to the performance list: Greg Sabino Mullane released a Nagios plug-in for PostgreSQL that you can grab at http://bucardo.org/nagios_postgres/ [...] Look in check_postgres.pl for the check_bloat routine and the big statement starting at the aptly labeled "This was fun to write" section. If you pull that out of there and replace $MINPAGES and $MINIPAGES near the end with real values, you can pop that into a standalone query and execute it directly.

I'm subscribed to the perf list and I _did_ take a look at the tool previously. However, something happened and I didn't manage to look at it thoroughly or something. I'll take another look at it, and thanks for the pointers.