On 2016-01-07 21:17:32 +0100, Andres Freund wrote:
> On 2016-01-07 21:08:10 +0100, Fabien COELHO wrote:
> > Hmmm. What I understood is that the workloads that have some performance
> > regressions (regressions that I have *not* seen in the many tests I ran) are
> > not due to checkpointer IOs, but rather in settings where most of the writes
> > is done by backends or bgwriter.
>
> As far as I can see you've not run many tests where the hot/warm data
> set is larger than memory (the full machine's memory, not
> shared_buffers). That quite drastically alters the performance
> characteristics here, because you suddenly have lots of synchronous read
> IO thrown into the mix.
>
> Whether it's bgwriter or not I've not fully been able to establish, but
> it's a working theory.

Hm. New theory: The current flush interface does the flushing inside
FlushBuffer()->smgrwrite()->mdwrite()->FileWrite()->FlushContextSchedule().
The problem with that is that at that point we (need to) hold a content
lock on the buffer!

Especially on a system that's bottlenecked on IO, that means we'll
frequently hold content locks for a noticeable amount of time, while
flushing blocks, without any need to.

Even if that's not the reason for the slowdowns I observed, I think this
fact lends further credence to the idea that the current "pending
flushes" tracking resides at the wrong level.

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
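
A rough sketch of the locking concern described above, as a toy model
rather than PostgreSQL code. Every name in it (issue_flush,
PendingFlushes, write_flush_inside, write_flush_deferred, and so on) is
hypothetical and only stands in for the real call chain. It counts how
many flush requests are issued while a simulated content lock is held,
first when the flush is triggered from inside the write path, then when
the write path only remembers the block and the flush is issued after
the lock has been released.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are PostgreSQL APIs. */
#define MAX_PENDING 8

typedef struct PendingFlushes
{
    int blocks[MAX_PENDING];
    int count;
} PendingFlushes;

static bool content_lock_held = false;
static int  flushes_total = 0;
static int  flushes_under_lock = 0;

static void lock_content(void)   { content_lock_held = true; }
static void unlock_content(void) { content_lock_held = false; }

/* Stand-in for a flush request that may block on a busy IO system. */
static void
issue_flush(int block)
{
    (void) block;
    flushes_total++;
    if (content_lock_held)
        flushes_under_lock++;
}

/* Issue all remembered flush requests; called without the lock held. */
static void
drain_pending(PendingFlushes *pending)
{
    for (int i = 0; i < pending->count; i++)
        issue_flush(pending->blocks[i]);
    pending->count = 0;
}

/* Variant A: the flush is scheduled inside the write path, lock still held. */
static void
write_flush_inside(int block)
{
    lock_content();
    /* ... the block would be written here ... */
    issue_flush(block);
    unlock_content();
}

/* Variant B: the write path only writes; the flush is remembered afterwards. */
static void
write_flush_deferred(int block, PendingFlushes *pending)
{
    lock_content();
    /* ... the block would be written here ... */
    unlock_content();

    assert(pending->count < MAX_PENDING);
    pending->blocks[pending->count++] = block;
    if (pending->count == MAX_PENDING)
        drain_pending(pending);
}

/* Write nblocks blocks and report how many flushes ran under the lock. */
static int
count_flushes_under_lock(bool deferred, int nblocks)
{
    PendingFlushes pending = {{0}, 0};

    flushes_total = 0;
    flushes_under_lock = 0;

    for (int block = 0; block < nblocks; block++)
    {
        if (deferred)
            write_flush_deferred(block, &pending);
        else
            write_flush_inside(block);
    }
    drain_pending(&pending);

    return flushes_under_lock;
}
```

In this model, count_flushes_under_lock(false, n) reports every one of
the n flush requests as issued under the lock, while
count_flushes_under_lock(true, n) reports none, because the pending
list is only drained after unlock_content(). Whether that is what
explains the observed slowdowns is a separate question; the sketch only
illustrates why tracking pending flushes above the locked section
shortens the lock hold time.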