Re: [GENERAL] Enabling Autovacuum Postgres 9.1 (was Unexpectedly high disk space usage)

2012-11-13 Thread Kevin Grittner
Greg Williamson wrote:

 Long running transactions can cause autovacuum processes to stall
 out or be autocancelled. Long running transactions - how long is
 "long"? In our system it's rare to have a transaction (even a
 prepared transaction) last much longer than a few minutes. Is
 that enough time to cause problems with AutoVacuum?

 The only situation where I would expect that to be a problem is in
 a very small table which is updated hundreds of times per second.
 
 Could you elaborate on this, or point me to a previous thread ?

I had a situation where there was a need for consecutive numbers
(i.e., no gaps, to satisfy financial auditors), so we needed
assignment of these numbers to be transactional rather than using
SEQUENCE or SERIAL objects. There was a very small table for
assigning these numbers, which was very frequently updated.  In a
quiescent state all rows in the table would fit in one page. Before
tuning autovacuum to be more aggressive, the table bloated, causing
performance to fall off. Then autovacuum would kick in and it would
get even worse. So naturally, my first response was to make
autovacuum less aggressive, which caused the table to bloat even more
under normal load, and caused autovacuum to have an even bigger
impact when it did kick in. The table bloated to thousands of pages.

Then I tried the opposite approach: I cranked up autovacuum to be
very aggressive. Under normal load the table settled in at five to
ten pages and performance was great. However, any long-running
transaction could cause some bloat, so a big report could still cause
this one table to become a performance problem. I found that a
CLUSTER ran sub-second, because autovacuum did pare the index down to
just the needed entries, so I set up a crontab job to CLUSTER this
one table a few times per day. That worked out great for me.

I think the trick is to try to make autovacuum keep up as much as
possible, identify any issues it is not handling, and narrowly target
those particular areas with extraordinary maintenance.
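As a concrete sketch of that approach (table name and values are hypothetical, not from the original thread; the per-table storage-parameter syntax is available in 8.4 and later):

```sql
-- Hypothetical hot counter table like the one described above.
-- Make autovacuum far more aggressive for just this one relation:
ALTER TABLE invoice_counter SET (
    autovacuum_vacuum_threshold = 50,      -- vacuum after ~50 dead rows
    autovacuum_vacuum_scale_factor = 0.0,  -- ignore table size entirely
    autovacuum_vacuum_cost_delay = 0       -- never throttle on this table
);
```

The periodic CLUSTER could then be a crontab entry along the lines of `0 */6 * * * psql -c 'CLUSTER invoice_counter' mydb` (again, names are illustrative).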

-Kevin


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Enabling Autovacuum Postgres 9.1 (was Unexpectedly high disk space usage)

2012-11-12 Thread Kevin Grittner
Lists wrote:
 There's a wealth of "how to tune PG" instruction that's old and
 (based on this thread alone) often stale enough to be classified
 as disinformative. For example, nearest I can tell, the entirety of
 this page is just wrong and/or irrelevant for 9.x and up: 
 http://wiki.postgresql.org/wiki/VACUUM_FULL. In contrast to the
 advice to avoid Vacuum Full [...] comes the insight that vacuum
 full is necessary to clean up stale data that is not at the end
 of the table. (See Jeff Janes' 11/10/2012 email)

I thought about posting a clarification to what Jeff said there -- it
is only necessary to run VACUUM FULL to "free up space" in the sense
of having the database give disk space back to the OS, so that the
next time you need space in the table it must be re-allocated from
the OS. This can be a bad thing in terms of slower allocation of
space for new tuples and allocation of non-contiguous disk space. You
are much better off running autovacuum aggressively enough that you
don't need to run VACUUM FULL than to run it periodically. I have
gone for years at a time on large databases (up to several TB)
without ever running VACUUM FULL on a table. It is a valuable tool to
recover from a bad state due to failure to properly maintain the
database; it is not a substitute for doing things right.

 Long running transactions can cause autovacuum processes to stall
 out or be autocancelled. Long running transactions - how long is
 "long"? In our system it's rare to have a transaction (even a
 prepared transaction) last much longer than a few minutes. Is that
 enough time to cause problems with AutoVacuum?

The only situation where I would expect that to be a problem is in a
very small table which is updated hundreds of times per second.

 Should I increase the max_workers field from the default of 3 to 
 (perhaps) 10? Noting that my solution to the disk space problem is 
 effectively a max_worker of 1 since it's all done sequentially, I
 wonder if reducing max_workers would actually be better?

I would increase autovacuum_max_workers.

 Also, what's the "thread count"? Is that max_workers?

I believe that's what was meant.

 Why would I want to reduce the cost delay to 0, and how does this
 relate to cost_limit?

I've never gone all the way to 0. I would be reluctant to go below
10; rather I would increase the cost limit. Autovacuum adds costs for
what it is doing, and when it hits the limit it sleeps for the cost
delay interval. I would think you would want to allow the process to
work on other things periodically.
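As a sketch, these are the postgresql.conf knobs under discussion; the values are illustrative rather than recommendations, with the 9.1-era defaults noted in parentheses:

```
# postgresql.conf - autovacuum throttling (illustrative values)
autovacuum_max_workers = 10          # (default 3) more tables vacuumed concurrently
autovacuum_vacuum_cost_delay = 10ms  # (default 20ms) sleep time when the cost limit is hit
autovacuum_vacuum_cost_limit = 2000  # (default -1, which falls back to vacuum_cost_limit = 200)
```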

 given my substantial I/O subsystem, I'd want to drop cost_delay to
 near zero and set the cost_limit really high, which is a rough
 restatement of the last quoted paragraph above. (I think)

I agree.

 Assuming that I make these suggestions and notice a subsequent
 system load problem, what information should I be gathering in
 order to provide better post-incident forensics?

Run a report on bloat periodically, to make sure it is staying under
control. You might want to monitor for long-running transactions and
prepared transactions which the transaction manager has lost track
of. They can be real killers.
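A sketch of such monitoring queries, using the 9.1 catalogs (note that 9.1's pg_stat_activity still has "procpid" and "current_query" rather than the later "pid" and "query"; the five-minute threshold is illustrative):

```sql
-- Transactions that have been open longer than five minutes:
SELECT procpid, usename, xact_start, now() - xact_start AS open_for
FROM pg_stat_activity
WHERE xact_start < now() - interval '5 minutes'
ORDER BY xact_start;

-- Prepared transactions a transaction manager may have abandoned:
SELECT gid, prepared, owner, database
FROM pg_prepared_xacts
ORDER BY prepared;
```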

During any incident of poor performance, it is good to gather iostat
or vmstat samples at an interval of 1 to 10 seconds for a few
minutes, and to capture the contents of pg_stat_activity and
pg_locks.

Turning on logging of checkpoint activity, autovacuum activity, and
slow queries can provide useful information when you match times from
the logging up against the times of slow periods.
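A hedged example of the logging settings mentioned (the thresholds are illustrative starting points, not recommendations):

```
# postgresql.conf - logging for post-incident forensics
log_checkpoints = on                # checkpoint start/end and timing details
log_autovacuum_min_duration = 0     # log every autovacuum action (ms; -1 disables)
log_min_duration_statement = 500    # log statements slower than 500 ms
log_lock_waits = on                 # log waits longer than deadlock_timeout
```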

 Lastly, there's the question of reindexing before full vacuum.
 I've observed that not doing a manual reindex prior to vacuum full
 did not, in fact, free up the space, even though I've been told
 that reindex is implicit in the vacuum process. (?!) I'm confident
 that I can reproduce this behavior given a bit of time to allow one
 of our database servers to bloat back up.

You never did say what version that was on. In 9.0 and later, VACUUM
FULL will drop and recreate all indexes from scratch after rebuilding
the heap, so it's pretty hard to imagine why dropping and recreating
all indexes right *before* rebuilding the heap would have much
impact. Prior to 9.0, VACUUM FULL would bloat indexes, so it was
always a good idea to REINDEX in addition to running VACUUM FULL,
although it was much more effective to run it *after* the VACUUM
FULL.

-Kevin




Re: [GENERAL] Enabling Autovacuum Postgres 9.1 (was Unexpectedly high disk space usage)

2012-11-12 Thread Greg Williamson
Kevin --

You wrote:
...
 Long running transactions can cause autovacuum processes to stall
 out or be autocancelled. Long running transactions - how long is
 "long"? In our system it's rare to have a transaction (even a
 prepared transaction) last much longer than a few minutes. Is that
 enough time to cause problems with AutoVacuum?

The only situation where I would expect that to be a problem is in a
very small table which is updated hundreds of times per second.


Could you elaborate on this, or point me to a previous thread?

I've got precisely such a database; currently we do an hourly reindex on all 
user tables in addition to some fairly standard autovac settings.

The tables themselves don't seem to bloat much during ordinary operations but 
if we don't reindex performance tanks; when the site has issues we sometimes 
see table bloat but it seems to be dealt with by autovac.

This is a postgres 9.1 instance w/ SSDs, lots of RAM (24 gigs) and relatively 
small tables (maybe a few thousands of rows in most cases, total footprint of 
the database is 240 megs) and being updated in the low thousands of times per 
second.

Thanks!

Greg Williamson





Re: [GENERAL] Enabling Autovacuum Postgres 9.1 (was Unexpectedly high disk space usage)

2012-11-12 Thread Craig Ringer
On 11/13/2012 04:04 AM, Lists wrote:

 There's a wealth of "how to tune PG" instruction that's old and (based
 on this thread alone) often stale enough to be classified as
 disinformative. For example, nearest I can tell, the entirety of this
 page is just wrong and/or irrelevant for 9.x and up:
 http://wiki.postgresql.org/wiki/VACUUM_FULL.
Agreed, that needs fixing. I'll sort that out. That information was
important before the VACUUM FULL rewrite, but is now severely outdated.
It needs to be split into pre-9.1 and 9.1+ sections.

 In contrast to the advice to avoid Vacuum Full (
 http://wiki.postgresql.org/wiki/Introduction_to_VACUUM,_ANALYZE,_EXPLAIN,_and_COUNT)
 comes the insight that vacuum full is necessary to clean up stale data
 that is not at the end of the table. (See Jeff Janes 11/10/2012 email)
That part isn't quite right AFAIK.

If you keep table bloat under control, ordinary VACUUM is perfectly
sufficient. You only need/want to VACUUM FULL if you wish to truncate a
table, reducing its size by compacting rows. In an actively used
database that's usually pretty pointless, since new rows will then be
added to the end, growing the table. You might as well just keep on
re-using the space, rather than going through those compact-and-expand
cycles. You'll suffer from less file fragmentation that way and won't be
paying the costs of file system allocations.

If you have a table that you've dramatically reduced in size (say, by
deleting the vast majority of it) and you won't be adding more rows to
replace the old ones, that's when VACUUM FULL makes sense.

It's a bit like those utilities that claim to "free" or "clean" or
"de-fragment" memory. They seem good, but they're actually grossly
counter-productive, because the system then has to re-read cached data
and otherwise fight to get back to its old equilibrium. It's typically
the same for Pg: you want to aim for equilibrium, not free space that'll
just promptly get re-allocated.

If you do have a bad bloat problem, I'd set a non-default FILLFACTOR
before doing a VACUUM FULL, so you still have some free space within the
table after vacuum completes. That way you won't be immediately paying
the cost of allocating space for new rows as soon as any UPDATEs or
INSERTs come in.
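A sketch of that recovery sequence (the table name and fillfactor value are illustrative; the table rewrite honors the new fillfactor):

```sql
-- Leave 30% free space in each rebuilt page, so the first wave of
-- UPDATEs and INSERTs after the rebuild doesn't immediately have to
-- extend the table with freshly allocated pages:
ALTER TABLE bloated_table SET (fillfactor = 70);
VACUUM FULL bloated_table;
```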


 non-full vacuum can
 only free space from the end of the table.
 This would imply that a "full analyze" is a good idea, at least
 periodically (weekly/monthly/quarterly) in a database that combines
 the use of temp tables and periodic changes to persistent objects.
 Does autovacuum ever do a full analyze? What about autovacuum and
 the reindex question at the end of this email?
I'm not sure there's any such thing as a "full analyze". VACUUM FULL
ANALYZE is "do a VACUUM FULL and an ANALYZE", not "do a full analyze".

Autovacuum should be taking care of analyze and table statistics. If it
isn't, adjust autovacuum parameters so that it does.

 These two can be killers.  Long running transactions can cause
 autovacuum processes to stall out or be autocancelled.
 Long running transactions - how long is "long"? In our system it's
 rare to have a transaction (even a prepared transaction) last much
 longer than a few minutes. Is that enough time to cause problems
 with AutoVacuum?

Not generally, no.

PostgreSQL can't clean up rows that are still visible to a transaction.
So if your transactions are three minutes long, that's a three minute
delay before VACUUM can clean up DELETEd rows or dead rows left by
UPDATEs. Not a biggie even on a pretty high load DB.

You should generally be concerned only when transactions are open over
user "think time" or are abandoned by buggy applications - cases where
the transaction length is many minutes or hours, potentially unbounded.
Uncommitted prepared transactions are also a problem for similar reasons.

 Lastly, there's the question of reindexing before full vacuum. I've
 observed that not doing a manual reindex prior to vacuum full did not,
 in fact, free up the space, even though I've been told that reindex is
 implicit in the vacuum process. (?!)

VACUUM FULL, post-9.1, should take care of index bloat.

Pre-9.1 VACUUM FULL could make index bloat worse.

Ordinary VACUUM will not truncate indexes AFAIK, only mark free space
within them so it can be re-used. Same deal as with the table itself:
this is usually what you want.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [GENERAL] Enabling Autovacuum Postgres 9.1 (was Unexpectedly high disk space usage)

2012-11-12 Thread Craig Ringer
On 11/13/2012 10:29 AM, Craig Ringer wrote:
 On 11/13/2012 04:04 AM, Lists wrote:

 There's a wealth of "how to tune PG" instruction that's old and (based
 on this thread alone) often stale enough to be classified as
 disinformative. For example, nearest I can tell, the entirety of this
 page is just wrong and/or irrelevant for 9.x and up:
 http://wiki.postgresql.org/wiki/VACUUM_FULL.
 Agreed, that needs fixing. I'll sort that out. That information was
 important before the VACUUM FULL rewrite, but is now severely
 outdated. It needs to be split into pre-9.1 and 9.1+ sections.
Correction - pre-9.0 and 9.0+

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [GENERAL] enabling autovacuum

2008-01-30 Thread Jeremy Harris

Chander Ganesan wrote:

Jeremy Harris wrote:

Version:
 PostgreSQL 8.2.4 on i686-redhat-linux-gnu, compiled by GCC gcc (GCC) 
4.1.2 20070418 (Red Hat 4.1.2-10)


We have one problematic table, which has a steady stream of entries
and a weekly mass-delete of ancient history.  The bloat query from
Greg Sabino Mullane (thanks to Greg Smith for pointing it out) returns:
Inserts don't generate dead tuples, and AVD looks at obsolete tuples.
As such, I wouldn't expect AVD to kick off until after you did a mass
delete...assuming that delete was sizable enough to trigger a vacuum.


Ah, that would explain it - thank you.   So I need to retreat to
the question of why the weekly vacuum permits the observed bloat.
Any ideas?   More information that I could gather?

Thanks,
  Jeremy

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] enabling autovacuum

2008-01-30 Thread Matthew T. O'Connor

Jeremy Harris wrote:

Chander Ganesan wrote:
Inserts don't generate dead tuples, and AVD looks at obsolete 
tuples..  As such,  I wouldn't expect AVD to kick off until after you 
did a mass delete...assuming that delete was sizable enough to 
trigger a vacuum.


Ah, that would explain it - thank you.   So I need to retreat to
the question of why the weekly vacuum permits the observed bloat.
Any ideas?   More information that I could gather? 


Autovacuum will kick off an analyze if you do enough inserts however.




Re: [GENERAL] enabling autovacuum

2008-01-29 Thread Chander Ganesan

Jeremy Harris wrote:

Hi,

We're starting to run autovacuum for the first time on a system
that's been running with nightly cron-driven vacuum for some time.

Version:
 PostgreSQL 8.2.4 on i686-redhat-linux-gnu, compiled by GCC gcc (GCC) 
4.1.2 20070418 (Red Hat 4.1.2-10)


We have one problematic table, which has a steady stream of entries
and a weekly mass-delete of ancient history.  The bloat query from
Greg Sabino Mullane (thanks to Greg Smith for pointing it out) returns:
Inserts don't generate dead tuples, and AVD looks at obsolete tuples..  
As such,  I wouldn't expect AVD to kick off until after you did a mass 
delete...assuming that delete was sizable enough to trigger a vacuum.


--
Chander Ganesan
Open Technology Group, Inc.
One Copley Parkway, Suite 210
Morrisville, NC  27560
Phone: 877-258-8987/919-463-0999
http://www.otg-nc.com
Ask me about our Expert PostgreSQL & PostGIS Training




Re: [GENERAL] enabling autovacuum

2008-01-28 Thread Christopher Browne
On Jan 28, 2008 10:17 PM, Jeremy Harris [EMAIL PROTECTED] wrote:
 Hi,

 We're starting to run autovacuum for the first time on a system
 that's been running with nightly cron-driven vacuum for some time.

 Version:
   PostgreSQL 8.2.4 on i686-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 
 20070418 (Red Hat 4.1.2-10)

 We have one problematic table, which has a steady stream of entries
 and a weekly mass-delete of ancient history.  The bloat query from
 Greg Sabino Mullane (thanks to Greg Smith for pointing it out) returns:

  (table columns shown once; the query returned one row per index on the same table)

  schemaname | tablename  | reltuples | relpages | otta | tbloat | wastedpages | wastedbytes | wastedsize
  public     | rcpt_audit |   1300300 |   152149 | 6365 |   23.9 |      145784 |  1194262528 | 1139 MB

  iname                       | ituples | ipages | iotta | ibloat | wastedipages | wastedibytes | wastedisize
  rcpt_audit_msg_audit_id_idx | 1300300 |   6798 |  3819 |    1.8 |         2979 |     24403968 | 23 MB
  rcpt_audit_id_idx           | 1300300 |   4727 |  3819 |    1.2 |          908 |      7438336 | 7264 kB
 We have uncommented "autovacuum = on" in postgresql.conf and run
 "service postgresql reload".  pg_stat_all_tables shows 4 tables
 as autoanalyzed at about that time; 3 of which were also
 autovacuumed.  The problem table is not included; no other autos
 are logged there in the succeeding 24 hours.
   Is other action needed to enable autovacuum?


 The autovacuum tuning parameters are all at default settings.
 We have
  max_fsm_pages = 200
  max_fsm_relations = 10
...
 Are there any other changes we should make to stop this table
 getting so bloated?

Is it possible that this table didn't see many updates, today?

You could add an entry to pg_catalog.pg_autovacuum to customize the
handling of your Favorite Table.

http://www.postgresql.org/docs/8.2/static/catalog-pg-autovacuum.html

You might lower the thresholds for that table...
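On 8.2 that per-table override is an INSERT into the catalog itself. A sketch (the -1 values mean "fall back to the global GUC", and the column list follows the 8.2 documentation; verify against your own catalog before use, and the thresholds here are illustrative):

```sql
-- Vacuum rcpt_audit once roughly 10% of its rows are dead, rather
-- than waiting for the 8.2 default of 20% of a 1.3M-row table:
INSERT INTO pg_autovacuum
       (vacrelid, enabled, vac_base_thresh, vac_scale_factor,
        anl_base_thresh, anl_scale_factor, vac_cost_delay,
        vac_cost_limit, freeze_min_age, freeze_max_age)
VALUES ('rcpt_audit'::regclass, true, 500, 0.1, -1, -1, -1, -1, -1, -1);
```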
-- 
http://linuxfinances.info/info/linuxdistributions.html
The definition of insanity is doing the same thing over and over and
expecting different results.  -- assortedly attributed to Albert
Einstein, Benjamin Franklin, Rita Mae Brown, and Rudyard Kipling



Re: [GENERAL] enabling autovacuum

2008-01-28 Thread Jeremy Harris

Christopher Browne wrote:

Is it possible that this table didn't see many updates, today?


Nope; about 24000 (according to the id sequence).
- Jeremy



[OT] Re: [GENERAL] enabling autovacuum

2008-01-28 Thread Ow Mun Heng

On Mon, 2008-01-28 at 22:17 +, Jeremy Harris wrote:
 We have one problematic table, which has a steady stream of entries
 and a weekly mass-delete of ancient history.  The bloat query from
 Greg Sabino Mullane (thanks to Greg Smith for pointing it out) returns:
 
  (table columns shown once; the query returned one row per index on the same table)

  schemaname | tablename  | reltuples | relpages | otta | tbloat | wastedpages | wastedbytes | wastedsize
  public     | rcpt_audit |   1300300 |   152149 | 6365 |   23.9 |      145784 |  1194262528 | 1139 MB

  iname                       | ituples | ipages | iotta | ibloat | wastedipages | wastedibytes | wastedisize
  rcpt_audit_msg_audit_id_idx | 1300300 |   6798 |  3819 |    1.8 |         2979 |     24403968 | 23 MB
  rcpt_audit_id_idx           | 1300300 |   4727 |  3819 |    1.2 |          908 |      7438336 | 7264 kB
 
 
Can you let me know what is the sql used to generate such a nice summary
of the tables?




Re: [OT] Re: [GENERAL] enabling autovacuum

2008-01-28 Thread Greg Smith

On Tue, 29 Jan 2008, Ow Mun Heng wrote:


Can you let me know what is the sql used to generate such a nice summary
of the tables?


Might as well dupe the old text; this went out to the performance list:

Greg Sabino Mullane released a Nagios plug-in for PostgreSQL that you can 
grab at http://bucardo.org/nagios_postgres/ , and while that is itself 
nice the thing I found most remarkable is the bloat check. The majority of 
that code is an impressive bit of SQL that anyone could use even if you 
have no interest in Nagios, which is why I point it out for broader 
attention. Look in check_postgres.pl for the check_bloat routine and the 
big statement starting at the aptly labeled "This was fun to write" 
section. If you pull that out of there and replace $MINPAGES and 
$MINIPAGES near the end with real values, you can pop that into a 
standalone query and execute it directly.


That's what gives the summary Jeremy included in his message.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [OT] Re: [GENERAL] enabling autovacuum

2008-01-28 Thread Ow Mun Heng

On Mon, 2008-01-28 at 20:57 -0500, Greg Smith wrote:
 On Tue, 29 Jan 2008, Ow Mun Heng wrote:
 
  Can you let me know what is the sql used to generate such a nice summary
  of the tables?
 
 Might as well dupe the old text; this went out to the performance list:
 
 Greg Sabino Mullane released a Nagios plug-in for PostgreSQL that you can 
 grab at http://bucardo.org/nagios_postgres/ , and while that is itself 
 nice the thing I found most remarkable is the bloat check. The majority of 
 that code is an impressive bit of SQL that anyone could use even if you 
 have no interest in Nagios, which is why I point it out for broader 
 attention. Look in check_postgres.pl for the check_bloat routine and the 
 big statement starting at the aptly labeled "This was fun to write" 
 section. If you pull that out of there and replace $MINPAGES and 
 $MINIPAGES near the end with real values, you can pop that into a 
 standalone query and execute it directly.

I'm subscribed to the perf list and I _did_ take a look at the tool
previously. However, something happened and I didn't manage to look at
it thoroughly.

I'll take another look at it and thanks for the pointers..
