Have a good day. This Friday I'm going to migrate the entire database setup of a government institution in my country: four databases of about 2 GB each. So I was preparing for what I'm going to do tomorrow. They currently have PostgreSQL 9.1, installed from source I think, and they will update the RHEL version to 5.7. Since that will be a reinstall of the operating system, beforehand I will back up the databases, copy the contents of pg_hba.conf and so on, and after the RHEL reinstallation I'm going to install PostgreSQL 9.3 from the official repositories, along with contrib and plpython.
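Concretely, the backup step before the OS reinstall would look roughly like this (just a sketch; db1 through db4 and the paths are placeholders, since I still have to confirm the database names and data directory on their server):

    # cluster-wide objects (roles, tablespaces) are not included by pg_dump,
    # so dump them separately
    pg_dumpall -h localhost -p 5432 -U postgres --globals-only > globals.sql

    # custom-format dump of each database
    for db in db1 db2 db3 db4; do
        pg_dump -h localhost -p 5432 -U postgres -Fc -f "$db.backup" "$db"
    done

    # keep copies of the old configuration files for reference
    cp /usr/local/pgsql/data/pg_hba.conf /usr/local/pgsql/data/postgresql.conf /root/pg91-config/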
I've already tested that on a CentOS 5.7 VM and it works fine. But well, yesterday I was practicing creating and restoring backups on my laptop, which has the following specs: Core i7 2.2 GHz, 8 GB RAM, 30 GB of free space, Windows 8.1. On my laptop I also have PostgreSQL 9.3.

First I used the sample database pgdellstore: I created a database and ran the sample's SQL file into it. After that, I created a backup of that pgdellstore database with pg_dump, in custom format, without compression:

    pg_dump -h localhost -p 5432 -U postgres -W -Fc -Z0 -C -d pgdellstore > pgdellstore.backup

Everything went as normal. Then I created a second database and restored my custom backup with pg_restore, trying to speed it up with the -j option:

    pg_restore -h localhost -p 5432 -U postgres -W -d pgdellstore -j4 pgdellstore.backup

Also, before that, I had backed up my postgresql.conf and edited the active one with the following parameters, based on some research I did on Google about performance tuning for pg_restore:

    shared_buffers = 2GB
    maintenance_work_mem = 1GB
    work_mem = 128MB
    wal_buffers = 16MB
    checkpoint_segments = 8
    autovacuum = off
    archive_mode = off
    fsync = off
    full_page_writes = off
    checkpoint_timeout = 15min
    checkpoint_completion_target = 0.9
    track_counts = off
    synchronous_commit = off
    bgwriter_delay = 50ms

It restored in about 2 seconds, I think. To me that was extremely fast (it's a database of approximately 25 MB) and I had doubts, so I took some queries I have to check table sizes and compared the results of the first database (the one created from the original SQL script) against the second one (the one created from the custom backup of the first), and I got this:

DB created from the original SQL script:

     schemaname |  tablename   | reltuples |  tamanio   | tamanioord
    ------------+--------------+-----------+------------+------------
     public     | customers    |     20000 | 5016 kB    |    5136384
     public     | orderlines   |     60350 | 4440 kB    |    4546560
     public     | cust_hist    |     60350 | 3976 kB    |    4071424
     public     | products     |     10000 | 1552 kB    |    1589248
     public     | orders       |     12000 | 1384 kB    |    1417216
     public     | inventory    |     10000 | 704 kB     |     720896
     public     | categories   |        16 | 24 kB      |      24576
     public     | afiliado     |         4 | 24 kB      |      24576
     public     | pruebafechas |         0 | 8192 bytes |       8192
     public     | reorder      |         0 | 0 bytes    |          0
    (10 rows)

DB created from the custom backup of the first DB:

     schemaname |  tablename   | reltuples |  tamanio   | tamanioord
    ------------+--------------+-----------+------------+------------
     public     | customers    |     20000 | 4992 kB    |    5111808
     public     | orderlines   |     60350 | 4416 kB    |    4521984
     public     | cust_hist    |     60350 | 3952 kB    |    4046848
     public     | products     |     10000 | 1528 kB    |    1564672
     public     | orders       |     12000 | 1360 kB    |    1392640
     public     | inventory    |     10000 | 680 kB     |     696320
     public     | afiliado     |         4 | 24 kB      |      24576
     public     | categories   |        16 | 24 kB      |      24576
     public     | pruebafechas |         0 | 8192 bytes |       8192
     public     | reorder      |         0 | 0 bytes    |          0
    (10 rows)

This is the query I used to get those results:

    SELECT tbl.schemaname, tbl.tablename, obj.reltuples,
           pg_size_pretty(pg_total_relation_size(text('"' || tbl.schemaname || '"."' || tbl.tablename || '"'))) AS tamanio,
           pg_total_relation_size(text('"' || tbl.schemaname || '"."' || tbl.tablename || '"')) AS tamanioord
    FROM pg_tables tbl, pg_class obj
    WHERE tbl.tablename = obj.relname
      AND tbl.schemaname NOT IN ('pg_catalog', 'information_schema')
    ORDER BY tamanioord DESC;

As you can see, there are certain variations in size: the tuple counts stay the same, but the tamanio (formatted size) and tamanioord (size in bytes) columns differ; for example, the customers table shows 5016 kB versus 4992 kB. I'm afraid that I somehow lost data, or perhaps, in your experience, this is normal and doesn't mean the information is corrupted. Tomorrow, when I do this with the production databases, it will be chaos if the information gets corrupted.
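To reassure myself, I was also thinking of comparing exact row counts table by table between the two databases, with something like this sketch (pgdellstore2 stands in for whatever the restored copy is actually named):

    for t in customers orderlines cust_hist products orders inventory categories afiliado pruebafechas reorder; do
        orig=$(psql -h localhost -p 5432 -U postgres -Atc "SELECT count(*) FROM $t" pgdellstore)
        rest=$(psql -h localhost -p 5432 -U postgres -Atc "SELECT count(*) FROM $t" pgdellstore2)
        echo "$t: original=$orig restored=$rest"
    done

Unlike reltuples, which is just an estimate maintained by ANALYZE, count(*) reads the actual rows, so matching counts on every table would tell me that nothing was dropped.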
Also, my second concern is the following: 4 months ago I did routine maintenance on those production databases, and first, as usual, I backed up all of them. I took one of the backups of a development database for testing purposes. The backup file weighs 279 MB and the database is around 1.8 GB. So, yesterday I also practiced restoring this backup into a new database, with the same configuration as above. The command I used was this:

    pg_restore -h localhost -p 5432 -U postgres -W --create -d debdbrest -j4 devdb.backup

And it took only about 1:30 minutes. To me that is suspiciously fast, because when I was reading about pg_restore and how to improve performance while restoring backups, I found various reports about pg_restore being too slow in previous PostgreSQL versions. Also, I had tried the same restore of the same backup 2 days ago, without the postgresql.conf parameters modified and without the -j flag, and after 2 hours it still hadn't finished, so I cancelled it. In that first try I noticed it printed a lot of output to the console, while in this second try it printed nothing. That time reduction, from more than 2 hours to 1:30 minutes, is too suspicious for me, but I don't know whether, as with my previous question, you consider this normal (a 279 MB backup of a 1840 MB database restored in just 1:30 minutes).

Regards.

***************************
Oscar Calderon
Systems Analyst
Soluciones Aplicativas S.A. de C.V.
www.solucionesaplicativas.com
Cel. (503) 7741 7850  Tel. (503) 2522-2834