Re: [GENERAL] Table using more disk space than expected

2015-09-23 Thread Francisco Olarte
Hi Steve:

On Wed, Sep 23, 2015 at 7:25 PM, Steve Pritchard
 wrote:
> I thought that autovacuum should recover the free space, however I see now
> from the documentation that it doesn't (and that this is deliberate):

> I'll do a VACUUM FULL, which I expect to reduce table_len.

VACUUM FULL is for when you've made a drastic, one-off reduction in a
table. Some tables oscillate in size: they grow, shrink, and regrow.
Those do not benefit from VACUUM FULL in the long run, because if a
table oscillates between, say, 10G and 20G, you need 20G of disk space
anyway; if you shrink it and the reclaimed space fills up with other
uses, the server will fail on the next growth. ( Some very special
cases may be different, but in general, if you have free space it is
because you create and delete rows, directly or via MVCC updates, so
keeping that space around for the next use is not so bad ).
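
A sketch of the two commands on the table from this thread (locking
behavior as described in the manual):

```sql
-- Plain VACUUM: marks dead row space as reusable within the table,
-- and runs alongside normal reads and writes.
VACUUM ANALYZE observation;

-- VACUUM FULL: rewrites the table into a new file with no dead space
-- and returns the difference to the OS, but takes an ACCESS EXCLUSIVE
-- lock and needs enough free disk for the temporary copy.
VACUUM FULL observation;
```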

Francisco Olarte.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Table using more disk space than expected

2015-09-23 Thread Steve Pritchard
Tom,

Thanks for the tip about pgstattuple - I hadn't discovered that (and I
hadn't realised that it's bundled in the 9.4.4 distribution).

This is what I get:

SELECT * FROM pgstattuple('observation');

table_len           21,954,740,224
tuple_count             34,373,274
tuple_len            9,307,650,026
tuple_percent                   42
dead_tuple_count           198,281
dead_tuple_len          52,524,135
dead_tuple_percent             0.2
free_space          12,093,978,284
free_percent                    55
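
The columns above should roughly account for the whole table; a sanity
check on the reported figures (arithmetic only, using the numbers above):

```sql
SELECT pg_size_pretty(
         9307650026::bigint   -- tuple_len (live rows)
       + 52524135             -- dead_tuple_len
       + 12093978284          -- free_space
       );
-- Sums to 21,454,152,445 bytes of the 21,954,740,224-byte table_len;
-- the ~0.5 GB remainder is per-page overhead (page headers, item
-- pointers, alignment padding).
```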

So I think my estimate of the space used by live rows wasn't far out
(albeit I accept the method I used may not be reliable).

I thought that autovacuum should recover the free space, however I see now
from the documentation that it doesn't (and that this is deliberate):

"The standard form of VACUUM removes dead row versions in tables and
indexes and marks the space available for future reuse. However, it will
not return the space to the operating system, except... In contrast,
VACUUM FULL actively compacts tables by writing a complete new version of
the table file with no dead space. This minimizes the size of the table,
but can take a long time...
The usual goal of routine vacuuming is to do standard VACUUMs often enough
to avoid needing VACUUM FULL. The autovacuum daemon attempts to work this
way, and in fact will never issue VACUUM FULL."
http://www.postgresql.org/docs/9.4/static/routine-vacuuming.html#VACUUM-FOR-SPACE-RECOVERY

I'll do a VACUUM FULL, which I expect to reduce table_len.
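
A possible before/after check (a sketch; it assumes pgstattuple is
installed as above, and that there is enough free disk for the rewritten
copy of the table, since VACUUM FULL writes a new file under an ACCESS
EXCLUSIVE lock):

```sql
SELECT pg_size_pretty(pg_relation_size('observation'));  -- before
VACUUM FULL observation;
SELECT pg_size_pretty(pg_relation_size('observation'));  -- after the rewrite
```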

Steve

On 23 September 2015 at 17:50, Tom Lane  wrote:

> Steve Pritchard  writes:
> > -- Average length of a row in bytes:
> > select avg(octet_length(t.*::text)) FROM observation t;
> > -- 287 bytes
>
> That measurement technique doesn't have a lot to do with reality,
> I'm afraid.
>
> The contrib/pgstattuple module could give you a more reliable idea of
> how much space is in use or not in the table.
>
> regards, tom lane
>


Re: [GENERAL] Table using more disk space than expected

2015-09-23 Thread Jimit Amin
Could you please check this after running VACUUM ANALYZE? I know there may
not be a big difference.

As per my analysis, this is free space available within the table, but not
free with respect to the server's disk space. A table contains three types
of space:

1. Live row space
2. Dead row space
3. Free space available to that table (before vacuum it is dead rows; after
vacuum it is free to use for new inserts or updates)

VACUUM FULL will give this space back to the server (it takes an exclusive
lock on the table).
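
The three kinds of space correspond roughly to pgstattuple's columns (a
sketch, assuming the pgstattuple extension discussed earlier in the
thread is installed):

```sql
SELECT tuple_len      AS live_row_space,
       dead_tuple_len AS dead_row_space,
       free_space     AS reusable_space
FROM pgstattuple('observation');
```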
On 23 Sep 2015 22:07, "Steve Pritchard"  wrote:

> I have a large table in Postgres 9.4.4 called 'observation', which is by
> far the largest item in the database. When I pg_dump this table the
> resulting file is about 9GB. However I've noticed that the pg data
> directory uses about 30GB (excluding pg_xlog).
>
> Looking at the space usage:
>
> -- Size of 'warehouse' database:
> select pg_size_pretty(pg_database_size('warehouse'));
> -- 29GB
>
> -- Total space used by observation table* including indexes*:
> select pg_size_pretty(pg_total_relation_size('observation'));
> -- 29GB
>
> -- Excluding indexes:
> select pg_size_pretty(pg_relation_size('observation'));
> -- 20GB
>
> -- Percentage of dead tuples:
> select pg_stat_get_dead_tuples('observation'::regclass)::numeric * 100 /
> pg_stat_get_live_tuples('observation'::regclass);
> -- 13%
>
> -- Average length of a row in bytes:
> select avg(octet_length(t.*::text)) FROM observation t;
> -- 287 bytes
>
> -- Number of rows * average size of row:
> select pg_size_pretty(count(obs_id) * 287) from observation;
> -- 9.4 GB
>
> If the live tuples take 9.4GB, then dead ones take 1.2GB = 10.6GB total.
>
> What accounts for the remaining 9.4GB? (20GB - 10.6GB)
>
> Steve Pritchard
> British Trust for Ornithology
>


Re: [GENERAL] Table using more disk space than expected

2015-09-23 Thread Tom Lane
Steve Pritchard  writes:
> -- Average length of a row in bytes:
> select avg(octet_length(t.*::text)) FROM observation t;
> -- 287 bytes

That measurement technique doesn't have a lot to do with reality,
I'm afraid.

The contrib/pgstattuple module could give you a more reliable idea of
how much space is in use or not in the table.

regards, tom lane




[GENERAL] Table using more disk space than expected

2015-09-23 Thread Steve Pritchard
I have a large table in Postgres 9.4.4 called 'observation', which is by
far the largest item in the database. When I pg_dump this table the
resulting file is about 9GB. However I've noticed that the pg data
directory uses about 30GB (excluding pg_xlog).

Looking at the space usage:

-- Size of 'warehouse' database:
select pg_size_pretty(pg_database_size('warehouse'));
-- 29GB

-- Total space used by observation table* including indexes*:
select pg_size_pretty(pg_total_relation_size('observation'));
-- 29GB

-- Excluding indexes:
select pg_size_pretty(pg_relation_size('observation'));
-- 20GB

-- Percentage of dead tuples:
select pg_stat_get_dead_tuples('observation'::regclass)::numeric * 100 /
pg_stat_get_live_tuples('observation'::regclass);
-- 13%

-- Average length of a row in bytes:
select avg(octet_length(t.*::text)) FROM observation t;
-- 287 bytes

-- Number of rows * average size of row:
select pg_size_pretty(count(obs_id) * 287) from observation;
-- 9.4 GB

If the live tuples take 9.4GB, then dead ones take 1.2GB = 10.6GB total.

What accounts for the remaining 9.4GB? (20GB - 10.6GB)

Steve Pritchard
British Trust for Ornithology