Hi all, Thanks Thomas.

When the bgwriter flushes (cleans) a dirty Postgres buffer, it issues a
write() syscall of its own, which I think must increase the number of
dirty cache buffers in the Linux kernel (temporarily, until the kernel
actually flushes those cache buffers to disk). If so, it temporarily
increases the risk of a write stall (in any process, not just Postgres
backends). Is that correct?

I suppose that if dirty buffers are being cleaned regularly, then it
reduces the risk that:

(1) a Postgres backend which is writing (dirtying buffers) suddenly
    needs an empty buffer when there are no clean buffers to evict, so
    it has to flush a dirty one itself, and

(2) the resulting write() syscall takes the kernel over its background
    dirty limit, so the kernel must flush it immediately and make the
    backend wait.

By that mechanism I can see that the bgwriter might reduce the chance of
backends having to wait, but by writing more in general (as above) it
could also increase it.

So when the documentation says "It writes shared buffers so server
processes handling user queries seldom or never need to wait for a write
to occur", is that really justified, or is that sentence incorrect and
should we remove it? Or have I missed something?

Thanks, Chris.

On Sun, 1 Nov 2020 at 21:00, Thomas Munro <thomas.mu...@gmail.com> wrote:

> On Fri, Oct 30, 2020 at 11:24 AM PG Doc comments form
> <nore...@postgresql.org> wrote:
> > The following documentation comment has been logged on the website:
> >
> > Page: https://www.postgresql.org/docs/13/runtime-config-resource.html
> > Description:
> >
> > https://www.postgresql.org/docs/13/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-BACKGROUND-WRITER
> > says:
> >
> > "There is a separate server process called the background writer, whose
> > function is to issue writes of “dirty” (new or modified) shared buffers. It
> > writes shared buffers so server processes handling user queries seldom or
> > never need to wait for a write to occur."
> >
> > It's not clear what "wait for a write to occur" means: a write() syscall or
> > an fsync() syscall?
>
> It means pwrite(). That could block if your kernel cache is swamped,
> but hopefully it just copies the data into the kernel and returns.
> There is an fsync() call, but it's usually queued up for handling by
> the checkpointer process some time later.
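
To make the pwrite()/fsync() distinction above concrete, here is a minimal sketch in Python. It is my own illustration, not PostgreSQL code: the function name write_then_sync and the constant BLOCK_SIZE are hypothetical, and 8192 is only the default PostgreSQL block size. It writes a few pages with pwrite(), then calls fsync() once, and times the two phases separately. On an idle Linux machine I would expect the pwrite() phase to be quick because the data is only copied into the kernel cache, but that is an assumption about the system's state, so the script prints timings rather than promising any particular numbers.

```python
import os
import tempfile
import time

BLOCK_SIZE = 8192  # assumption: default PostgreSQL block size


def write_then_sync(path, nblocks=16):
    """Write nblocks pages with pwrite(), then fsync() once.

    Returns (bytes_written, seconds_in_pwrite, seconds_in_fsync).
    """
    page = b"\x00" * BLOCK_SIZE
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        t0 = time.perf_counter()
        written = 0
        for i in range(nblocks):
            # pwrite() normally returns once the data is in the kernel cache;
            # it can block if the kernel has too many dirty pages already.
            written += os.pwrite(fd, page, i * BLOCK_SIZE)
        t1 = time.perf_counter()
        # fsync() waits until the kernel has pushed the data to storage.
        os.fsync(fd)
        t2 = time.perf_counter()
    finally:
        os.close(fd)
    return written, t1 - t0, t2 - t1


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        written, write_s, fsync_s = write_then_sync(os.path.join(tmpdir, "demo"))
        print(f"wrote {written} bytes: pwrite {write_s:.6f}s, fsync {fsync_s:.6f}s")
```

Running it while something else is generating heavy write traffic should show whether the pwrite() phase itself starts to stall, which is the case my question is about.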