On Fri, Dec 3, 2021 at 3:01 AM Bossart, Nathan <bossa...@amazon.com> wrote:
>
> On 12/1/21, 6:48 PM, "Bharath Rupireddy" <bharath.rupireddyforpostg...@gmail.com> wrote:
> > +1 for the overall idea of making the checkpoint faster. In fact, we
> > here at our team have been thinking about this problem for a while. If
> > there are a lot of files that checkpoint has to loop over and remove,
> > IMO, that task can be delegated to someone else (maybe a background
> > worker called background cleaner or bg cleaner, of course, we can have
> > a GUC to enable or disable it). The checkpoint can just write some
>
> Right. IMO it isn't optimal to have critical things like startup and
> checkpointing depend on somewhat-unrelated tasks. I understand the
> desire to avoid adding additional processes, and maybe it is a bigger
> hammer than what is necessary to reduce the impact, but it seemed like
> a natural solution for this problem. That being said, I'm all for
> exploring other ways to handle this.

Having a generic background cleaner process (controllable via a few
GUCs), which can delete a bunch of files (snapshot, mapping, old WAL,
temp files etc.) or do some other task on behalf of the checkpointer,
seems to be the easiest solution. I'm open to other ideas too.

> > Another idea could be to parallelize the checkpoint i.e. IIUC, the
> > tasks that checkpoint do in CheckPointGuts are independent and if we
> > have some counters like (how many snapshot/mapping files that the
> > server generated)
>
> Could you elaborate on this? Is your idea that the checkpointer would
> create worker processes like autovacuum does?

Yes, I was thinking that the checkpointer creates one or more dynamic
background workers (we can assume one background worker for now) to
delete the files. If the number of files crosses a threshold (for
example, the snapshot file count exceeds it), a new worker is spawned
to enumerate the files and delete the unneeded ones, while the
checkpointer proceeds with its other tasks and finishes the checkpoint.

Having said this, I prefer the background cleaner approach over the
dynamic background worker. The advantage of the background cleaner is
that it can also take on other tasks (like other kinds of file
deletion).

Another idea could be to use the existing background writer to do the
file deletion while the checkpoint is happening. But again, this might
cause problems because the bg writer's flushing of dirty buffers would
get delayed.

Regards,
Bharath Rupireddy.
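
For illustration only, the threshold-based delegation described above
can be sketched as a small standalone Python simulation. This is not
PostgreSQL code: a thread stands in for the dynamic background worker,
and the names checkpoint_cleanup, remove_unneeded, is_snapshot_file,
the ".snap" suffix and the threshold parameter are all hypothetical.
The file count is passed in, assuming the server keeps a counter as
suggested in the quoted text.

```python
import os
import threading


def is_snapshot_file(name):
    # Hypothetical predicate: treat "*.snap" files as removable.
    return name.endswith(".snap")


def remove_unneeded(directory, is_unneeded=is_snapshot_file):
    """Enumerate a directory and delete the files that are no longer needed."""
    removed = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file() and is_unneeded(entry.name):
                os.unlink(entry.path)
                removed += 1
    return removed


def checkpoint_cleanup(directory, file_count, threshold):
    """Delegate cleanup to a worker when file_count exceeds threshold.

    Returns the started worker thread if cleanup was delegated, or None
    if the files were removed inline by the caller (the "checkpointer").
    """
    if file_count > threshold:
        # Stand-in for spawning a dynamic background worker.
        worker = threading.Thread(
            target=remove_unneeded, args=(directory,), daemon=True
        )
        worker.start()
        return worker  # the caller carries on with its other tasks
    remove_unneeded(directory)
    return None
```

With only a few files, the caller does the deletion itself and pays
the cost inline; above the threshold it returns as soon as the worker
has started, which is the point of the proposal.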