Re: [PERFORM] Identifying bloated tables

2006-08-28 Thread Peter Childs

On 28/08/06, Michal Taborsky - Internet Mall <[EMAIL PROTECTED]> wrote:

Markus Schaber napsal(a):
> Hi, Michal,
>
> Michal Taborsky - Internet Mall wrote:
>
>> When using this view, you are interested in tables, which have the
>> "bloat" column higher that say 2.0 (in freshly dump/restored/analyzed
>> database they should all be around 1.0).
>
> I just noticed some columns in pg_catalog with a bloat value <1 and a
> negative "wasted space" - is this due to the pseudo nature of them?

It is more likely due to the fact, that these numbers are just
estimates, based on collected table statistics, so for small or
non-standard tables the statistical error is greater that the actual
value. You are usually not interested in tables, which have wasted space
of 1000kB or -1000kB. Also the database must be ANALYZEd properly for
these numbers to carry any significance.



I was just playing around with this table and noticed it preforms the
badly in tables with very small record sizes. This seams to be because
it ignores the system overhead (oid, xmin ctid etc) which seams to be
about 28 bytes per a record this can be quite significate in small
record tables and can cause trouble even with a smal numbers of
record.  Hence I've got a table thats static and fresly "vacuum full"
which reads with a bloat of 4.

Easy to recreate problem to

Create table regionpostcode (area varchar(4), regionid int);

then insert 12 records.

Peter.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Identifying bloated tables

2006-08-28 Thread Michal Taborsky - Internet Mall

Markus Schaber napsal(a):

Hi, Michal,

Michal Taborsky - Internet Mall wrote:


When using this view, you are interested in tables, which have the
"bloat" column higher that say 2.0 (in freshly dump/restored/analyzed
database they should all be around 1.0).


I just noticed some columns in pg_catalog with a bloat value <1 and a
negative "wasted space" - is this due to the pseudo nature of them?


It is more likely due to the fact, that these numbers are just 
estimates, based on collected table statistics, so for small or 
non-standard tables the statistical error is greater that the actual 
value. You are usually not interested in tables, which have wasted space 
of 1000kB or -1000kB. Also the database must be ANALYZEd properly for 
these numbers to carry any significance.


--
Michal Táborský
IT operations chief
Internet Mall, a.s.

Internet Mall - obchody, které si oblíbíte


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Identifying bloated tables

2006-08-28 Thread Markus Schaber
Hi, Michal,

Michal Taborsky - Internet Mall wrote:

> When using this view, you are interested in tables, which have the
> "bloat" column higher that say 2.0 (in freshly dump/restored/analyzed
> database they should all be around 1.0).

I just noticed some columns in pg_catalog with a bloat value <1 and a
negative "wasted space" - is this due to the pseudo nature of them?

Markus


-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Identifying bloated tables

2006-08-28 Thread Alvaro Herrera
Brad Nicholson wrote:
> On Mon, 2006-08-28 at 16:39 +0200, Michal Taborsky - Internet Mall
> wrote:
> > I just put together a view, which helps us in indentifying which 
> > database tables are suffering from space bloat, ie. they take up much 
> > more space than they actually should. I though this might be useful for 
> > some folk here, because the questions about bloat-related performance 
> > degradation are quite common.
> 
> Are you sure you haven't reinvented the wheel?  Have you checked out
> contrib/pgstattuple ?

Actually, pgstattuple needs to scan the whole table, so I think having a
cheap workaround that gives approximate figures is a good idea anyway.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Identifying bloated tables

2006-08-28 Thread Michal Taborsky - Internet Mall

Brad Nicholson napsal(a):
I just put together a view, which helps us in indentifying which 
database tables are suffering from space bloat, ie. they take up much 



Are you sure you haven't reinvented the wheel?  Have you checked out
contrib/pgstattuple ?


Well, I wasn't aware of it, so I guess I did reinvent the wheel. I 
Googled for a solution to this problem, but Googled poorly I suppose.


On the other hand, pgstattuple might be a bit difficult to use for 
not-so-experienced users in answering the question "Which table should I 
shrink?", as you have to first install it from contrib and then come up 
with a select to pick the "worst" relations.


Anyway, if someone finds this view useful, good. If not, ignore it.

Bye.

--
Michal Táborský
IT operations chief
Internet Mall, a.s.


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Identifying bloated tables

2006-08-28 Thread Brad Nicholson
On Mon, 2006-08-28 at 16:39 +0200, Michal Taborsky - Internet Mall
wrote:
> I just put together a view, which helps us in indentifying which 
> database tables are suffering from space bloat, ie. they take up much 
> more space than they actually should. I though this might be useful for 
> some folk here, because the questions about bloat-related performance 
> degradation are quite common.

Are you sure you haven't reinvented the wheel?  Have you checked out
contrib/pgstattuple ?

Brad.


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[PERFORM] Identifying bloated tables

2006-08-28 Thread Michal Taborsky - Internet Mall
I just put together a view, which helps us in indentifying which 
database tables are suffering from space bloat, ie. they take up much 
more space than they actually should. I though this might be useful for 
some folk here, because the questions about bloat-related performance 
degradation are quite common.


When using this view, you are interested in tables, which have the 
"bloat" column higher that say 2.0 (in freshly dump/restored/analyzed 
database they should all be around 1.0).


The bloat problem can be one-time fixed either by VACUUM FULL or 
CLUSTER, but if the problem is coming back after while, you should 
consider doing VACUUM more often or increasing you FSM settings in 
postgresql.conf.


I hope I did the view right, it is more or less accurate, for our 
purposes (for tables of just few pages the numbers may be off, but then 
again, you are usually not much concerned about these tiny 5-page tables 
performance-wise).


Hope this helps someone.

Here comes the view.


CREATE OR REPLACE VIEW "public"."relbloat" (
nspname,
relname,
reltuples,
relpages,
avgwidth,
expectedpages,
bloat,
wastedspace)
AS
SELECT pg_namespace.nspname, pg_class.relname, pg_class.reltuples,
pg_class.relpages, rowwidths.avgwidth, ceil(((pg_class.reltuples *
(rowwidths.avgwidth)::double precision) /
(current_setting('block_size'::text))::double precision)) AS 
expectedpages,

((pg_class.relpages)::double precision / ceil(((pg_class.reltuples *
(rowwidths.avgwidth)::double precision) /
(current_setting('block_size'::text))::double precision))) AS bloat,
ceil(pg_class.relpages)::double precision *
(current_setting('block_size'::text))::double precision) -
ceil((pg_class.reltuples * (rowwidths.avgwidth)::double precision))) /
(1024)::double precision)) AS wastedspace
FROM (((
SELECT pg_statistic.starelid, sum(pg_statistic.stawidth) AS avgwidth
FROM pg_statistic
GROUP BY pg_statistic.starelid
) rowwidths JOIN pg_class ON ((rowwidths.starelid = pg_class.oid))) 
JOIN

pg_namespace ON ((pg_namespace.oid = pg_class.relnamespace)))
WHERE (pg_class.relpages > 1);


Bye.

--
Michal Táborský
IT operations chief
Internet Mall, a.s.


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly