I have an unusual need:  I need Pg to slow down. I know, we all want our DB to 
go faster, but in this case it's speed is working against me in 1 area.

We have systems that are geo-redundant for HA, with the redundancy being 
handled by DRBD to keep the disks in sync, which it does at the block level. 
For normal operations, it actually works out fairly well. That said, we 
recognize that what we really need to do is one of the forms of streaming (ch 
26 of the manual) which I believe would help this problem a lot if not solve it 
-- but we don't have the time to do that at the moment. I plan and hope to get 
there by the end of the year. The part that hurts so bad is when we do 
maintenance operations that are DB heavy, like deleting really old records out 
of archives (weekly), moving older records from current tables to archive 
tables plus an analyze (every night), running pg_backup (every night), other 
archiving (weekly), and vacuum full to remove bloat (once a quarter). All of 
this generates a lot of disk writes, to state the obvious.

The local server can handle it all just fine, but the network can't handle it 
as it tries to sync to the other server. Sometimes we can add network 
bandwidth, many times we can't as it depends on others. To borrow a phrase from 
the current times, we need to flatten the curve. 😊

A few parts of our maintenance process I've tamed by doing "nice -20" on the 
process (e.g. log rotation); but I can't really do that for Pg because the work 
gets handed off to a background process that's not a direct child process … and 
I don't want to slow the DB as a whole because other work is going on (like 
handling incoming data).

Part of the process I've slowed down by doing the work in chunks of 10K rows at 
a time with a pause between each chunk to allow the network to catch up 
(instead of an entire table in 1 statement). This sort of works, but some 
work/SQL is between hard to next-to-impossible to break up like that. That also 
produces some hard spikes, but that's better than the alternative (next 
sentence). Still, large portions of the process are hard to control and just 
punch the network to full capacity and hold it there for far too long.

So, do I have any other options to help slow down some of the Pg operations? Or 
maybe some other short-term mitigations we can do with Pg configurations? Or is 
this a case where we've already done all we can do and the only answer is move 
to WAL streaming as fast as possible?

If it matters, this is being run on Linux servers. Pg 12.2 is in final testing 
and will be rolled out to production soon -- so feel free to offer suggestions 
that only apply to 12.x.

Thanks,
Kevin
This e-mail transmission, and any documents, files or previous e-mail messages 
attached to it, may contain confidential information. If you are not the 
intended recipient, or a person responsible for delivering it to the intended 
recipient, you are hereby notified that any disclosure, distribution, review, 
copy or use of any of the information contained in or attached to this message 
is STRICTLY PROHIBITED. If you have received this transmission in error, please 
immediately notify us by reply e-mail, and destroy the original transmission 
and its attachments without reading them or saving them to disk. Thank you.

Reply via email to