On 8/28/18 8:32 AM, Stephen Frost wrote:
>
> * hubert depesz lubaczewski (dep...@depesz.com) wrote:
>> I'm in a situation where we quite often generate more WAL than we can
>> archive. The thing is - archiving takes a long(ish) time, but it's
>> a multi-step process and includes talking to remote servers over the
>> network.
>>
>> I tested that simply by running archiving in parallel I can easily get
>> 2-3 times higher throughput.
>>
>> But - I'd prefer to keep postgresql knowing what is archived and what
>> is not, so I can't do the parallelization on my own.
>>
>> So, the question is: is it technically possible to have parallel
>> archiving, and would anyone be willing to work on it? (sorry, my
>> C skills are basically none, so I can't realistically hack it myself)
>
> Not entirely sure what the concern is around "postgresql knowing what is
> archived", but pgbackrest already does exactly this parallel archiving
> for environments where the WAL volume is larger than a single thread can
> handle, and we've been rewriting it in C specifically to make it fast
> enough to be able to keep PG up-to-date regarding what's been pushed
> already.
To be clear, pgBackRest uses the .ready files in archive_status to
parallelize archiving, but it still notifies PostgreSQL of completion via
the archive_command mechanism. We do not modify .ready files to .done
directly.

However, we have optimized the C code to provide ~200 notifications/second
(3.2GB/s of WAL transfer with 16MB segments), which is enough to keep up
with the workloads we have seen. The larger WAL segment sizes allowed in
PG11 (up to 1GB) would theoretically increase this to 200GB/s, though in
practice the CPU needed for compression will become a major bottleneck,
not to mention the network, etc.
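For anyone curious how the look-ahead part fits together, here is a minimal
sketch of an archive_command that pushes several .ready segments in parallel
while leaving PostgreSQL in charge of what counts as archived. This is not
pgBackRest's code; the paths, the push_segment() helper, and the worker count
are all hypothetical placeholders.

#!/usr/bin/env python3
# Hypothetical sketch only -- not pgBackRest.  Invoked by PostgreSQL as:
#   archive_command = 'parallel_archive.py %p %f'
#
# When PostgreSQL asks us to archive one segment, we also look ahead at the
# other *.ready files in archive_status and push a batch of them in parallel.
# Segments pushed ahead of time are remembered in a local state directory, so
# the later archive_command call for them can return success immediately.
# PostgreSQL itself still tracks completion only through our exit status.

import os
import sys
import shutil
from concurrent.futures import ThreadPoolExecutor

STATE_DIR = '/var/lib/parallel_archive/pushed'   # hypothetical local state
ARCHIVE_DIR = '/mnt/wal_archive'                 # hypothetical destination
WORKERS = 4                                      # hypothetical parallelism

def push_segment(wal_dir, seg):
    """Copy one WAL segment to the archive and record it as pushed."""
    shutil.copy(os.path.join(wal_dir, seg), os.path.join(ARCHIVE_DIR, seg))
    open(os.path.join(STATE_DIR, seg), 'w').close()

def main(wal_path, seg_name):
    os.makedirs(STATE_DIR, exist_ok=True)

    # Already pushed by an earlier look-ahead pass: just acknowledge it.
    if os.path.exists(os.path.join(STATE_DIR, seg_name)):
        os.unlink(os.path.join(STATE_DIR, seg_name))
        return 0

    # %p is relative to the data directory, which is the cwd when
    # PostgreSQL runs archive_command.
    wal_dir = os.path.dirname(os.path.abspath(wal_path))
    status_dir = os.path.join(wal_dir, 'archive_status')

    # The requested segment plus any other segments PostgreSQL has marked
    # .ready but not yet asked us to archive.
    ready = [f[:-len('.ready')] for f in os.listdir(status_dir)
             if f.endswith('.ready')]
    batch = [seg_name] + [s for s in ready
                          if s != seg_name
                          and not os.path.exists(os.path.join(STATE_DIR, s))]

    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        futures = {s: pool.submit(push_segment, wal_dir, s) for s in batch}

    # Only the segment PostgreSQL asked about decides our exit status; a
    # failed look-ahead push is simply retried when its own call arrives.
    futures[seg_name].result()
    os.unlink(os.path.join(STATE_DIR, seg_name))
    return 0

if __name__ == '__main__':
    sys.exit(main(sys.argv[1], sys.argv[2]))

A production archiver obviously needs more than this: durable writes (fsync),
atomic placement in the archive, compression, and careful error handling, all
of which the sketch skips.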
Regards,
-- 
-David
da...@pgmasters.net