[GENERAL] Bloated Table
Hello everybody. I'm using a view (http://pgsql.tapoueh.org/site/html/news/20080131.bloat.html) to display the bloat (unused disk space) of the tables in a PostgreSQL database. Using this data I want to implement a database maintenance script automatically exectuting a VACUUM FULL on these tables. Unfortunately I am finding a table to have bloat which can't be reclaimed. I have tried VACUUM, REINDEX, VACUUM FULL ANALYZE with REINDEX, and even dump and restore. The view always shows 375MB of bloat for the table. Is this normal? Here's the table structure: Table public.foobar_log Column | Type | Modifiers +--+ +--+--- foorbarid | integer | not null default nextval('foobar_log_id_seq'::regclass) created_at | timestamp with time zone | not null foo| character varying(50)| not null bar| character varying(16)| not null chit | integer | not null chat | boolean | not null default false Indexes: bar_index btree (bar) foobarid_foobar_log_key btree (foobarid) chit_foobar_log_key btree (chit) The table consists of approximately 2.4 million entries. Any help is appreciated. Kind regards, Alex -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Bloated Table
On Wed, May 27, 2009 at 3:54 PM, Alexander Schöcke a...@turtle-entertainment.de wrote: Hello everybody. I'm using a view (http://pgsql.tapoueh.org/site/html/news/20080131.bloat.html) to display the bloat (unused disk space) of the tables in a PostgreSQL database. Using this data I want to implement a database maintenance script automatically exectuting a VACUUM FULL on these tables. Unfortunately I am finding a table to have bloat which can't be reclaimed. I have tried VACUUM, REINDEX, VACUUM FULL ANALYZE with REINDEX, and even dump and restore. The view always shows 375MB of bloat for the table. Is this normal? yes it is normal, if you do updates on it, and it doesn't use HOT (for instance if you update indexed column), now, in real life you shouldn't really ever need to vacuum full. Reindex - yes, but not vacuum full. Make sure, that you adjust fsm_* settings in your config, best thing is tu run vacuum analyze verbose, and see if it warns you about it being too small. -- GJ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Bloated Table
In response to Alexander Schöcke a...@turtle-entertainment.de: Hello everybody. I'm using a view (http://pgsql.tapoueh.org/site/html/news/20080131.bloat.html) to display the bloat (unused disk space) of the tables in a PostgreSQL database. Using this data I want to implement a database maintenance script automatically exectuting a VACUUM FULL on these tables. Unfortunately I am finding a table to have bloat which can't be reclaimed. I have tried VACUUM, REINDEX, VACUUM FULL ANALYZE with REINDEX, and even dump and restore. The view always shows 375MB of bloat for the table. Is this normal? Here's the table structure: Table public.foobar_log Column | Type | Modifiers +--+ +--+--- foorbarid | integer | not null default nextval('foobar_log_id_seq'::regclass) created_at | timestamp with time zone | not null foo| character varying(50)| not null bar| character varying(16)| not null chit | integer | not null chat | boolean | not null default false Indexes: bar_index btree (bar) foobarid_foobar_log_key btree (foobarid) chit_foobar_log_key btree (chit) The table consists of approximately 2.4 million entries. Any help is appreciated. What is the output of VACUUM VERBOSE foobar_log? -- Bill Moran http://www.potentialtech.com http://people.collaborativefusion.com/~wmoran/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Bloated Table
Hello everybody. I'm using a view (http://pgsql.tapoueh.org/site/html/news/20080131.bloat.html) to display the bloat (unused disk space) of the tables in a PostgreSQL database. Using this data I want to implement a database maintenance script automatically exectuting a VACUUM FULL on these tables. Unfortunately I am finding a table to have bloat which can't be reclaimed. I have tried VACUUM, REINDEX, VACUUM FULL ANALYZE with REINDEX, and even dump and restore. The view always shows 375MB of bloat for the table. Is this normal? Here's the table structure: Table public.foobar_log Column | Type | Modifiers +--+--- foorbarid | integer | not null default nextval('foobar_log_id_seq'::regclass) created_at | timestamp with time zone | not null foo| character varying(50)| not null bar| character varying(16)| not null chit | integer | not null chat | boolean | not null default false Indexes: bar_index btree (bar) foobarid_foobar_log_key btree (foobarid) chit_foobar_log_key btree (chit) The table consists of approximately 2.4 million entries. Any help is appreciated. Kind regards, Alex -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Bloated Table
=?iso-8859-1?Q?Alexander_Sch=F6cke?= a...@turtle-entertainment.de writes: I'm using a view (http://pgsql.tapoueh.org/site/html/news/20080131.bloat.html) to display the bloat (unused disk space) of the tables in a PostgreSQL database. I wouldn't trust the calculations that view does in the least. You might look at contrib/pgstattuple if you want numbers that have some relationship to reality (and are correspondingly more expensive to get :-() regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Bloated Table
What is the output of VACUUM VERBOSE foobar_log? INFO: vacuuming public.foobar_log INFO: index bar_index now contains 23832007 row versions in 118151 pages DETAIL: 0 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.02s/0.00u sec elapsed 64.10 sec. INFO: index foobarid_foobar_log_key now contains 23832007 row versions in 65347 pages DETAIL: 0 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.02s/0.00u sec elapsed 32.60 sec. INFO: index foo_foobar_log_key now contains 23832007 row versions in 65347 pages DETAIL: 0 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.03s/0.00u sec elapsed 25.14 sec. INFO: foobar_log: found 0 removable, 23832007 nonremovable row versions in 269835 pages DETAIL: 0 dead row versions cannot be removed yet. There were 0 unused item pointers. 1 pages contain useful free space. 0 pages are entirely empty. CPU 0.16s/0.09u sec elapsed 299.55 sec. VACUUM -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Bloated Table
On Wed, 2009-05-27 at 11:15 -0400, Tom Lane wrote: =?iso-8859-1?Q?Alexander_Sch=F6cke?= a...@turtle-entertainment.de writes: I'm using a view (http://pgsql.tapoueh.org/site/html/news/20080131.bloat.html) to display the bloat (unused disk space) of the tables in a PostgreSQL database. I wouldn't trust the calculations that view does in the least. You might look at contrib/pgstattuple if you want numbers that have some relationship to reality (and are correspondingly more expensive to get :-() regards, tom lane Is the referenced query reliable for even estimating, or is it flat our wrong? Co-workers that were PGCon are saying that this is becoming a popular/accepted way to check for bloated tables. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Bloated Table
Brad Nicholson wrote: On Wed, 2009-05-27 at 11:15 -0400, Tom Lane wrote: =?iso-8859-1?Q?Alexander_Sch=F6cke?= a...@turtle-entertainment.de writes: I'm using a view (http://pgsql.tapoueh.org/site/html/news/20080131.bloat.html) to display the bloat (unused disk space) of the tables in a PostgreSQL database. I wouldn't trust the calculations that view does in the least. You might look at contrib/pgstattuple if you want numbers that have some relationship to reality (and are correspondingly more expensive to get :-() Is the referenced query reliable for even estimating, or is it flat our wrong? Co-workers that were PGCon are saying that this is becoming a popular/accepted way to check for bloated tables. If ma is supposed to be maxalign, then this code is broken because it only reports mingw32 as 8, all others as 4, which is wrong. However I think the big problem is that it relies on pg_class.relpages and reltuples which are only accurate just after VACUUM, only a sample-based estimate just after ANALYZE, and wrong at any other time (assuming the table has any movement). -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Bloated Table
Alvaro Herrera alvhe...@commandprompt.com writes: Brad Nicholson wrote: On Wed, 2009-05-27 at 11:15 -0400, Tom Lane wrote: I wouldn't trust the calculations that view does in the least. If ma is supposed to be maxalign, then this code is broken because it only reports mingw32 as 8, all others as 4, which is wrong. I didn't bother to go through every detail of the calculations, but there are multiple small errors there. (Handling alignment for the tuple header and not anyplace else is pretty pointless, for instance, even if you had the correct alignment number for the machine.) However ... However I think the big problem is that it relies on pg_class.relpages and reltuples which are only accurate just after VACUUM, only a sample-based estimate just after ANALYZE, and wrong at any other time (assuming the table has any movement). That's big problem number one, and big problem number two is that it has no good idea of the width of variable-width fields. (Should I even mention TOAST?) It's an interesting exercise in trying to estimate bloat without groveling through the whole relation, but I seriously doubt you could ever get numbers this way that are trustworthy enough to drive maintenance decisions. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Bloated Table
Tom Lane wrote: It's an interesting exercise in trying to estimate bloat without groveling through the whole relation, but I seriously doubt you could ever get numbers this way that are trustworthy enough to drive maintenance decisions. Well, pg_stattuple is way too expensive to be running every minute ... -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Bloated Table
Alvaro Herrera alvhe...@commandprompt.com writes: Tom Lane wrote: It's an interesting exercise in trying to estimate bloat without groveling through the whole relation, but I seriously doubt you could ever get numbers this way that are trustworthy enough to drive maintenance decisions. Well, pg_stattuple is way too expensive to be running every minute ... Agreed. I was idly wondering whether it'd be useful to have an option in pg_stattuple to sample some small fraction of the pages in a relation, rather than exhaustively examining each one. It'd be subject to all the normal ills of sampling, of course, but I'd still trust it way more than this view. (Especially noting that the view is already dependent on sampling anyway, since it's looking at pg_stats...) regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Bloated Table
On Wed, 27 May 2009, Tom Lane wrote: It's an interesting exercise in trying to estimate bloat without groveling through the whole relation Interesting and an extremely common request. I just added an item to the Vacuum section of the TODO list while you were listing issues and potential solutions here: Provide more information in order to improve user-side estimates of dead space bloat in relations links to this thread. As pointed out upthread, there's a fair number of people who have picked up on this particular query for bloat estimation. It seemed accurate enough for gross maintenance use when I did a quick check of its results before, but those were tables without variable widths, TOAST, etc. This is one of those boring tasks that DBAs really want more monitoring-friendly visibility into. -- * Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Bloated Table
On May 27, 2009, at 3:25 PM, Greg Smith wrote: Interesting and an extremely common request. I just added an item to the Vacuum section of the TODO list while you were listing issues and potential solutions here: Provide more information in order to improve user-side estimates of dead space bloat in relations links to this thread. Given this, is there is a recommended way of detecting bloat in indices automatically, rather than just looking at them once in a while and saying, Hm, that looks pretty big? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Bloated Table
Christophe x...@thebuild.com writes: Given this, is there is a recommended way of detecting bloat in indices automatically, rather than just looking at them once in a while and saying, Hm, that looks pretty big? contrib/pgstattuple works; it's just too expensive to run every few minutes ... regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general