[GENERAL] Bloated Table

2009-05-27 Thread Alexander Schöcke
Hello everybody.

I'm using a view (http://pgsql.tapoueh.org/site/html/news/20080131.bloat.html) 
to display the bloat (unused disk space) of the tables in a PostgreSQL 
database. Using this data I want to implement a database maintenance script 
automatically exectuting a VACUUM FULL on these tables.
Unfortunately I am finding a table to have bloat which can't be reclaimed. I 
have tried VACUUM, REINDEX, VACUUM FULL ANALYZE with REINDEX, and even dump and 
restore. The view always shows 375MB of bloat for the table.

Is this normal? Here's the table structure:

Table public.foobar_log
   Column   |   Type   | Modifiers
+--+
+--+---
 foorbarid  | integer  | not null default 
nextval('foobar_log_id_seq'::regclass)
 created_at | timestamp with time zone | not null
 foo| character varying(50)| not null
 bar| character varying(16)| not null
 chit   | integer  | not null
 chat   | boolean  | not null default false
Indexes:
bar_index btree (bar)
foobarid_foobar_log_key btree (foobarid)
chit_foobar_log_key btree (chit)


The table consists of approximately 2.4 million entries.


Any help is appreciated.
Kind regards, Alex

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Bloated Table

2009-05-27 Thread Grzegorz Jaśkiewicz
On Wed, May 27, 2009 at 3:54 PM, Alexander Schöcke
a...@turtle-entertainment.de wrote:
 Hello everybody.

 I'm using a view 
 (http://pgsql.tapoueh.org/site/html/news/20080131.bloat.html) to display the 
 bloat (unused disk space) of the tables in a PostgreSQL database. Using this 
 data I want to implement a database maintenance script automatically 
 exectuting a VACUUM FULL on these tables.
 Unfortunately I am finding a table to have bloat which can't be reclaimed. I 
 have tried VACUUM, REINDEX, VACUUM FULL ANALYZE with REINDEX, and even dump 
 and restore. The view always shows 375MB of bloat for the table.

 Is this normal?
yes it is normal, if you do updates on it, and it doesn't use HOT (for
instance if you update indexed column),

now, in real life you shouldn't really ever need to vacuum full.
Reindex - yes, but not vacuum full.
Make sure, that you adjust fsm_* settings in your config, best thing
is tu run vacuum analyze verbose, and see if it warns you about it
being too small.




-- 
GJ

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Bloated Table

2009-05-27 Thread Bill Moran
In response to Alexander Schöcke a...@turtle-entertainment.de:

 Hello everybody.
 
 I'm using a view 
 (http://pgsql.tapoueh.org/site/html/news/20080131.bloat.html) to display the 
 bloat (unused disk space) of the tables in a PostgreSQL database. Using this 
 data I want to implement a database maintenance script automatically 
 exectuting a VACUUM FULL on these tables.
 Unfortunately I am finding a table to have bloat which can't be reclaimed. I 
 have tried VACUUM, REINDEX, VACUUM FULL ANALYZE with REINDEX, and even dump 
 and restore. The view always shows 375MB of bloat for the table.
 
 Is this normal? Here's the table structure:
 
 Table public.foobar_log
Column   |   Type   | Modifiers
 +--+
 +--+---
  foorbarid  | integer  | not null default 
 nextval('foobar_log_id_seq'::regclass)
  created_at | timestamp with time zone | not null
  foo| character varying(50)| not null
  bar| character varying(16)| not null
  chit   | integer  | not null
  chat   | boolean  | not null default false
 Indexes:
 bar_index btree (bar)
 foobarid_foobar_log_key btree (foobarid)
 chit_foobar_log_key btree (chit)
 
 
 The table consists of approximately 2.4 million entries.
 
 Any help is appreciated.

What is the output of VACUUM VERBOSE foobar_log?

-- 
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Bloated Table

2009-05-27 Thread Alexander Schöcke
Hello everybody.

I'm using a view (http://pgsql.tapoueh.org/site/html/news/20080131.bloat.html) 
to display the bloat (unused disk space) of the tables in a PostgreSQL 
database. Using this data I want to implement a database maintenance script 
automatically exectuting a VACUUM FULL on these tables.
Unfortunately I am finding a table to have bloat which can't be reclaimed. I 
have tried VACUUM, REINDEX, VACUUM FULL ANALYZE with REINDEX, and even dump and 
restore. The view always shows 375MB of bloat for the table.

Is this normal? Here's the table structure:

Table public.foobar_log
   Column   |   Type   | Modifiers
+--+---
 foorbarid  | integer  | not null default 
nextval('foobar_log_id_seq'::regclass)
 created_at | timestamp with time zone | not null
 foo| character varying(50)| not null
 bar| character varying(16)| not null
 chit   | integer  | not null
 chat   | boolean  | not null default false
Indexes:
bar_index btree (bar)
foobarid_foobar_log_key btree (foobarid)
chit_foobar_log_key btree (chit)


The table consists of approximately 2.4 million entries.


Any help is appreciated.
Kind regards, Alex

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Bloated Table

2009-05-27 Thread Tom Lane
=?iso-8859-1?Q?Alexander_Sch=F6cke?= a...@turtle-entertainment.de writes:
 I'm using a view
 (http://pgsql.tapoueh.org/site/html/news/20080131.bloat.html) to
 display the bloat (unused disk space) of the tables in a PostgreSQL
 database.

I wouldn't trust the calculations that view does in the least.
You might look at contrib/pgstattuple if you want numbers that
have some relationship to reality (and are correspondingly more
expensive to get :-()

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Bloated Table

2009-05-27 Thread Alexander Schöcke
What is the output of VACUUM VERBOSE foobar_log?

INFO:  vacuuming public.foobar_log
INFO:  index bar_index now contains 23832007 row versions in 118151 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.02s/0.00u sec elapsed 64.10 sec.
INFO:  index foobarid_foobar_log_key now contains 23832007 row versions in 
65347 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.02s/0.00u sec elapsed 32.60 sec.
INFO:  index foo_foobar_log_key now contains 23832007 row versions in 65347 
pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.03s/0.00u sec elapsed 25.14 sec.
INFO:  foobar_log: found 0 removable, 23832007 nonremovable row versions in 
269835 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
1 pages contain useful free space.
0 pages are entirely empty.
CPU 0.16s/0.09u sec elapsed 299.55 sec.
VACUUM

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Bloated Table

2009-05-27 Thread Brad Nicholson
On Wed, 2009-05-27 at 11:15 -0400, Tom Lane wrote:
 =?iso-8859-1?Q?Alexander_Sch=F6cke?= a...@turtle-entertainment.de writes:
  I'm using a view
  (http://pgsql.tapoueh.org/site/html/news/20080131.bloat.html) to
  display the bloat (unused disk space) of the tables in a PostgreSQL
  database.
 
 I wouldn't trust the calculations that view does in the least.
 You might look at contrib/pgstattuple if you want numbers that
 have some relationship to reality (and are correspondingly more
 expensive to get :-()
 
   regards, tom lane

Is the referenced query reliable for even estimating, or is it flat our
wrong?

Co-workers that were PGCon are saying that this is becoming a
popular/accepted way to check for bloated tables.

-- 
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Bloated Table

2009-05-27 Thread Alvaro Herrera
Brad Nicholson wrote:
 On Wed, 2009-05-27 at 11:15 -0400, Tom Lane wrote:
  =?iso-8859-1?Q?Alexander_Sch=F6cke?= a...@turtle-entertainment.de writes:
   I'm using a view
   (http://pgsql.tapoueh.org/site/html/news/20080131.bloat.html) to
   display the bloat (unused disk space) of the tables in a PostgreSQL
   database.
  
  I wouldn't trust the calculations that view does in the least.
  You might look at contrib/pgstattuple if you want numbers that
  have some relationship to reality (and are correspondingly more
  expensive to get :-()
 
 Is the referenced query reliable for even estimating, or is it flat our
 wrong?
 
 Co-workers that were PGCon are saying that this is becoming a
 popular/accepted way to check for bloated tables.

If ma is supposed to be maxalign, then this code is broken because
it only reports mingw32 as 8, all others as 4, which is wrong.

However I think the big problem is that it relies on pg_class.relpages
and reltuples which are only accurate just after VACUUM, only a
sample-based estimate just after ANALYZE, and wrong at any other time
(assuming the table has any movement).

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Bloated Table

2009-05-27 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Brad Nicholson wrote:
 On Wed, 2009-05-27 at 11:15 -0400, Tom Lane wrote:
 I wouldn't trust the calculations that view does in the least.

 If ma is supposed to be maxalign, then this code is broken because
 it only reports mingw32 as 8, all others as 4, which is wrong.

I didn't bother to go through every detail of the calculations, but
there are multiple small errors there.  (Handling alignment for the
tuple header and not anyplace else is pretty pointless, for instance,
even if you had the correct alignment number for the machine.)
However ...

 However I think the big problem is that it relies on pg_class.relpages
 and reltuples which are only accurate just after VACUUM, only a
 sample-based estimate just after ANALYZE, and wrong at any other time
 (assuming the table has any movement).

That's big problem number one, and big problem number two is that it has
no good idea of the width of variable-width fields.  (Should I even
mention TOAST?)

It's an interesting exercise in trying to estimate bloat without
groveling through the whole relation, but I seriously doubt you could
ever get numbers this way that are trustworthy enough to drive
maintenance decisions.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Bloated Table

2009-05-27 Thread Alvaro Herrera
Tom Lane wrote:

 It's an interesting exercise in trying to estimate bloat without
 groveling through the whole relation, but I seriously doubt you could
 ever get numbers this way that are trustworthy enough to drive
 maintenance decisions.

Well, pg_stattuple is way too expensive to be running every minute ...

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Bloated Table

2009-05-27 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Tom Lane wrote:
 It's an interesting exercise in trying to estimate bloat without
 groveling through the whole relation, but I seriously doubt you could
 ever get numbers this way that are trustworthy enough to drive
 maintenance decisions.

 Well, pg_stattuple is way too expensive to be running every minute ...

Agreed.  I was idly wondering whether it'd be useful to have an option
in pg_stattuple to sample some small fraction of the pages in a
relation, rather than exhaustively examining each one.  It'd be subject
to all the normal ills of sampling, of course, but I'd still trust it
way more than this view.  (Especially noting that the view is already
dependent on sampling anyway, since it's looking at pg_stats...)

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Bloated Table

2009-05-27 Thread Greg Smith

On Wed, 27 May 2009, Tom Lane wrote:


It's an interesting exercise in trying to estimate bloat without
groveling through the whole relation


Interesting and an extremely common request.  I just added an item to the 
Vacuum section of the TODO list while you were listing issues and 
potential solutions here:  Provide more information in order to improve 
user-side estimates of dead space bloat in relations links to this 
thread.


As pointed out upthread, there's a fair number of people who have picked 
up on this particular query for bloat estimation.  It seemed accurate 
enough for gross maintenance use when I did a quick check of its results 
before, but those were tables without variable widths, TOAST, etc.  This 
is one of those boring tasks that DBAs really want more 
monitoring-friendly visibility into.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Bloated Table

2009-05-27 Thread Christophe


On May 27, 2009, at 3:25 PM, Greg Smith wrote:
Interesting and an extremely common request.  I just added an item  
to the Vacuum section of the TODO list while you were listing issues  
and potential solutions here:  Provide more information in order to  
improve user-side estimates of dead space bloat in relations links  
to this thread.


Given this, is there is a recommended way of detecting bloat in  
indices automatically, rather than just looking at them once in a  
while and saying, Hm, that looks pretty big?


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Bloated Table

2009-05-27 Thread Tom Lane
Christophe x...@thebuild.com writes:
 Given this, is there is a recommended way of detecting bloat in  
 indices automatically, rather than just looking at them once in a  
 while and saying, Hm, that looks pretty big?

contrib/pgstattuple works; it's just too expensive to run every
few minutes ...

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general