drbdsetup allows you to control the sync rates. On Tue, Apr 21, 2020 at 14:30 Kevin Brannen <kbran...@efji.com> wrote:
> I have an unusual need: I need Pg to slow down. I know, we all want our > DB to go faster, but in this case it's speed is working against me in 1 > area. > > > > We have systems that are geo-redundant for HA, with the redundancy being > handled by DRBD to keep the disks in sync, which it does at the block > level. For normal operations, it actually works out fairly well. That said, > we recognize that what we really need to do is one of the forms of > streaming (ch 26 of the manual) which I believe would help this problem a > lot if not solve it -- but we don't have the time to do that at the moment. > I plan and hope to get there by the end of the year. The part that hurts so > bad is when we do maintenance operations that are DB heavy, like deleting > really old records out of archives (weekly), moving older records from > current tables to archive tables plus an analyze (every night), running > pg_backup (every night), other archiving (weekly), and vacuum full to > remove bloat (once a quarter). All of this generates a lot of disk writes, > to state the obvious. > > > > The local server can handle it all just fine, but the network can't handle > it as it tries to sync to the other server. Sometimes we can add network > bandwidth, many times we can't as it depends on others. To borrow a phrase > from the current times, we need to flatten the curve. 😊 > > > > A few parts of our maintenance process I've tamed by doing "nice -20" on > the process (e.g. log rotation); but I can't really do that for Pg because > the work gets handed off to a background process that's not a direct child > process … and I don't want to slow the DB as a whole because other work is > going on (like handling incoming data). > > > > Part of the process I've slowed down by doing the work in chunks of 10K > rows at a time with a pause between each chunk to allow the network to > catch up (instead of an entire table in 1 statement). This sort of works, > but some work/SQL is between hard to next-to-impossible to break up like > that. That also produces some hard spikes, but that's better than the > alternative (next sentence). Still, large portions of the process are hard > to control and just punch the network to full capacity and hold it there > for far too long. > > > > So, do I have any other options to help slow down some of the Pg > operations? Or maybe some other short-term mitigations we can do with Pg > configurations? Or is this a case where we've already done all we can do > and the only answer is move to WAL streaming as fast as possible? > > > > If it matters, this is being run on Linux servers. Pg 12.2 is in final > testing and will be rolled out to production soon -- so feel free to offer > suggestions that only apply to 12.x. > > > > Thanks, > > Kevin > This e-mail transmission, and any documents, files or previous e-mail > messages attached to it, may contain confidential information. If you are > not the intended recipient, or a person responsible for delivering it to > the intended recipient, you are hereby notified that any disclosure, > distribution, review, copy or use of any of the information contained in or > attached to this message is STRICTLY PROHIBITED. If you have received this > transmission in error, please immediately notify us by reply e-mail, and > destroy the original transmission and its attachments without reading them > or saving them to disk. Thank you. > -- "Genius might be described as a supreme capacity for getting its possessors into trouble of all kinds." -- Samuel Butler