On 05/31/2010 02:45 PM, Isabella Ghiurea wrote:

Hi Andy,
Thank you, please see below my answers:
Andy Colson wrote:

On 05/31/2010 11:05 AM, Isabella Ghiurea wrote:
> Hello PG list,
> I'm looking for some tips/advice to improve PG backup performance.
> Presently running pg_dumpall with the compressed option on a RAID 0
> array, getting approx 14GB written in 45 min; I'm backing up an
> approx 200GB database cluster daily.
> How can I improve this performance with the present hardware and PG
> version 8.3.6? Can I run parallel backups in PG?
> Thank you
> Isabella
>

Short answer, yes, you can.
Long answer, we need more info.

We need to know what the slow part is.

Are you CPU bound or IO bound?

CPU

Are you backing up over a network?

No, on local disks.

How many tables? (well, big tables... how many really big tables).

Around 20 big tables, all of them in one separate schema, on a separate
tablespace from the rest of the schemas. I have already started backing up
individual schemas.
My big concern is that IF I have to recover from backups it will take me at
least twice as much time, around 1-2 days I expect.

How many cores/cpu's do you have?

A server with 4 quad-core CPUs. How can I make PG use multithreading?


I bet all 200GB hardly changes; are you sure you need to back it all
up over and over again?  Have you thought of replication?

Yes, but I'm waiting for a PG version with more robust built-in
replication, aka PG 9.1.
Isabella




cool.. my second astronomer :-)

You are cpu bound because pg_dump will use one cpu, and if you have pg_dump 
do the compression, that runs on the same cpu.  But if you use a pipe (pg_dump | gzip) 
then each gets its own cpu.  And instead of using pg_dumpall, which dumps one db at a 
time, use pg_dump and run several in parallel.

try something like:

pg_dump big1 | gzip > big1.sql.gz & pg_dump big2 | gzip > big2.sql.gz

This will dump two databases at a time, using 4 cpu's (two for the pg_dumps, two for the gzips).
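
If you have more than two big databases (or per-schema dumps) to cover, a
small script can batch them; this is only a sketch, and the database names
big1..big4 are placeholders for your own:

#!/bin/bash
# Sketch: each pg_dump | gzip pipeline keeps two cpu's busy.
# Start a batch of dumps in the background, then wait for the whole
# batch to finish before launching the next one.
pg_dump big1 | gzip > big1.sql.gz &
pg_dump big2 | gzip > big2.sql.gz &
wait    # blocks until both background pipelines are done
pg_dump big3 | gzip > big3.sql.gz &
pg_dump big4 | gzip > big4.sql.gz &
wait

With 16 cores you could run bigger batches; how far you push it depends on
how much your disks can take.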

Having a backup file for each big table will also make restore faster, because 
you can restore them all at the same time, whereas a pg_dumpall dump restores one 
thing at a time.
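
For example, to restore two of those plain-format dumps at the same time
(again just a sketch, the file and database names are placeholders):

gunzip -c big1.sql.gz | psql big1 &
gunzip -c big2.sql.gz | psql big2 &
wait

Each restore gets its own psql session, so they load in parallel instead of
one after the other.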

The compression is probably going to take the most time... if your data is not 
very compressible then you'll get a big speed boost by dropping the compression entirely.
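
If you'd rather keep some compression, you can trade ratio for speed with
gzip -1, or, if you have it installed, use pigz (a parallel gzip) as a
drop-in replacement in the pipe; both are just suggestions, nothing pg_dump
itself requires:

pg_dump big1 | gzip -1 > big1.sql.gz    # fastest gzip setting
pg_dump big1 | pigz > big1.sql.gz       # pigz compresses on all cores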

-Andy
