On Sun, Jan 1, 2012 at 23:09, Daniel Farina <dan...@heroku.com> wrote:
> On Sun, Jan 1, 2012 at 6:13 AM, Magnus Hagander <mag...@hagander.net> wrote:
>> It also doesn't affect backups taken through pg_basebackup - but I
>> guess you have good reasons for not being able to use that?
>
> Parallel archiving/de-archiving and segmentation of the backup into
> pieces and rate limiting are the most clear gaps.  I don't know if
> there are performance implications either, but I do pass all my bytes
> through unoptimized Python right now -- not exactly a speed demon.
>
> The approach I use is:
>
> * Scan the directory tree immediately after pg_start_backup, taking
> note of existing files and their sizes
> * Split those files into volumes, none of which can exceed 1.5GB.
> These volumes are all disjoint
> * When creating the tar file, set the header for a tar member to have
> as many bytes as recorded in the first pass.  If the file has been
> truncated, pad with zeros (this is also the behavior of GNU Tar).  If
> it grew, only read the number of bytes recorded.
> * Generate and compress these tar files in parallel
> * All the while, the rate of reading files is subject to optional rate 
> limiting

Well, that certainly goes into enough detail for me to agree that no,
that can't be done with only minor modifications to pg_basebackup. Nor
could you get around it by having your Python program talk directly to
the walsender backend. But you probably already considered that :D
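
For readers following along, the quoted steps (scan, split into disjoint
volumes, write tar members using the sizes recorded in the first pass)
could be sketched roughly as below. This is only an illustration under
stated assumptions, not Daniel's actual code: the names scan,
split_volumes, FixedSizeReader and write_volume are hypothetical, the
packing is a simple greedy pass, and parallelism and rate limiting are
left out.

```python
import os
import tarfile

VOLUME_LIMIT = 1536 * 1024 * 1024  # the 1.5GB cap mentioned above


def scan(root):
    """First pass: record (relative path, size) for every file under root."""
    manifest = []
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            full = os.path.join(dirpath, name)
            manifest.append((os.path.relpath(full, root), os.path.getsize(full)))
    return manifest


def split_volumes(manifest, limit=VOLUME_LIMIT):
    """Greedily pack files into disjoint volumes (assumed strategy).

    A single file larger than limit would still get a volume of its own.
    """
    volumes, current, used = [], [], 0
    for path, size in manifest:
        if current and used + size > limit:
            volumes.append(current)
            current, used = [], 0
        current.append((path, size))
        used += size
    if current:
        volumes.append(current)
    return volumes


class FixedSizeReader:
    """Serve exactly `size` bytes from fileobj.

    If the file was truncated, the tail is padded with zeros; if it grew,
    nothing past the recorded size is read.
    """

    def __init__(self, fileobj, size):
        self._f = fileobj
        self._left = size

    def read(self, n=-1):
        if n is None or n < 0 or n > self._left:
            n = self._left
        data = self._f.read(n)
        data += b"\0" * (n - len(data))  # zero-pad a short read
        self._left -= n
        return data


def write_volume(root, members, out_path):
    """Write one compressed tar volume using the sizes from the first pass."""
    with tarfile.open(out_path, "w:gz") as tar:
        for rel, size in members:
            full = os.path.join(root, rel)
            info = tar.gettarinfo(full, arcname=rel)
            info.size = size  # header carries the recorded size, not the current one
            with open(full, "rb") as f:
                tar.addfile(info, FixedSizeReader(f, size))
```

Each volume returned by split_volumes could then be handed to its own
worker calling write_volume. A file that disappears between the scan and
the tar step would raise here; how to handle that is not covered by the
description above.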


> As important is the fact that each volume can be downloaded and
> decompressed in a pipeline (no on-disk transformations to de-archive)
> with a tunable amount of concurrency, as the tar files do not
> overlap for any file, and no file needs to span two tar files thanks
> to Postgres's refusal to deal in files too large for old platforms.
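
A rough sketch of what such a pipelined restore might look like, again
only as an illustration: restore_volume and restore_all are hypothetical
names, and each "opener" is assumed to be a callable returning a readable
binary stream (a download in practice, a local file in a test). Because
the volumes are disjoint, the workers need no coordination beyond sharing
the destination directory.

```python
import tarfile
from concurrent.futures import ThreadPoolExecutor

# Newer Python versions want an explicit extraction filter; older ones
# do not accept the argument, so only pass it when it is supported.
_EXTRACT_KWARGS = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}


def restore_volume(open_stream, dest):
    """Decompress and untar one volume as a stream, with no temp file."""
    with open_stream() as stream:
        # "r|gz" reads the archive strictly sequentially, so the input
        # does not need to be seekable (e.g. a network response body).
        with tarfile.open(fileobj=stream, mode="r|gz") as tar:
            tar.extractall(dest, **_EXTRACT_KWARGS)


def restore_all(stream_openers, dest, concurrency=4):
    """Restore all volumes with a tunable number of concurrent workers."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(restore_volume, opener, dest)
                   for opener in stream_openers]
        for fut in futures:
            fut.result()  # re-raise any error from a worker
```

Threads are used here only to keep the sketch short; with the
decompression done in Python, separate processes may well be the better
choice. It may also be safer to create any shared parent directories up
front rather than rely on concurrent workers creating them.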

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
