On 1/11/12 4:33 AM, Florian Weimer wrote:
Isn't this pretty much like tuning vm.dirty_bytes?  We generally set it
to pretty low values, and that seems to help smooth out the checkpoints.

When I experimented with reducing the actual size of the cache, checkpoint spikes improved, but things like VACUUM ran terribly slowly. On a typical medium to large server nowadays (let's say 16GB+), PostgreSQL needs to have gigabytes of write cache for good performance.
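
For anyone wanting to check what their own server is set to before experimenting, here is a minimal sketch that reads the writeback tunables from /proc/sys/vm. The function name and the choice of which four files to read are my own for illustration; the files themselves are the standard Linux ones.

```python
import os

# Standard Linux writeback tunables under /proc/sys/vm.
DIRTY_TUNABLES = (
    "dirty_bytes",
    "dirty_background_bytes",
    "dirty_ratio",
    "dirty_background_ratio",
)

def read_vm_dirty_settings(base="/proc/sys/vm"):
    """Return {tunable: integer value} for each tunable file found under base."""
    settings = {}
    for name in DIRTY_TUNABLES:
        path = os.path.join(base, name)
        try:
            with open(path) as f:
                settings[name] = int(f.read().strip())
        except (OSError, ValueError):
            # Missing or unreadable on this kernel; skip rather than guess.
            continue
    return settings
```

A value of 0 in dirty_bytes means the kernel is using dirty_ratio instead, and the same goes for the background pair.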

What we're aiming for here is to keep the benefits of having that much write cache, while allowing checkpoint-related work to send increasingly strong suggestions about the order of what it needs written soon. There are basically three primary states on Linux to be concerned about here:

Dirty:  in the cache via standard write
|
v  pdflush does writeback at 5 or 10% dirty || sync_file_range push
|
Writeback
|
v  write happens in the background || fsync call
|
Stored on disk
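
The first two states in the diagram above can be watched on a running system, since /proc/meminfo reports "Dirty" and "Writeback" totals in kB. What follows is only a rough sketch of pulling those two numbers out; the function name is made up for this example.

```python
def parse_meminfo(text):
    """Return {field: kilobytes} for the Dirty and Writeback lines of meminfo text."""
    wanted = {"Dirty", "Writeback"}
    sizes = {}
    for line in text.splitlines():
        # Lines look like "Dirty:            123456 kB".
        name, _, rest = line.partition(":")
        if name in wanted:
            sizes[name] = int(rest.split()[0])
    return sizes
```

Feeding it the contents of /proc/meminfo every second or so during a checkpoint should show how much data is sitting in each state as the fsync calls arrive.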

The systems with bad checkpoint problems will typically have gigabytes in the "Dirty" state, which is necessary for good performance. Linux is very lazy about pushing that data toward "Writeback" though. Getting the oldest portions of the outstanding writes into the Writeback queue more aggressively should make the eventual fsync less likely to block.
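
To make the "sync_file_range push" step concrete, here is a small sketch of the call sequence: write, hint that writeback should start, then fsync. This is not the PostgreSQL patch, only an illustration. It reaches sync_file_range through ctypes because Python's os module does not wrap it, and the helper names hint_writeback and write_with_early_writeback are invented for this example. It assumes Linux with a libc that exports sync_file_range.

```python
import ctypes
import os

SYNC_FILE_RANGE_WRITE = 2  # flag value from <linux/fs.h>

def _load_sync_file_range():
    """Return libc's sync_file_range, or None when it is not available."""
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        func = libc.sync_file_range
    except (OSError, AttributeError, TypeError):
        return None
    # int sync_file_range(int fd, off64_t offset, off64_t nbytes, unsigned int flags)
    func.argtypes = [ctypes.c_int, ctypes.c_int64, ctypes.c_int64, ctypes.c_uint]
    func.restype = ctypes.c_int
    return func

_sync_file_range = _load_sync_file_range()

def hint_writeback(fd, offset=0, nbytes=0):
    """Ask the kernel to start writeback on a range; nbytes=0 means through end of file.

    Returns True if the kernel accepted the hint, False otherwise.
    """
    if _sync_file_range is None:
        return False
    return _sync_file_range(fd, offset, nbytes, SYNC_FILE_RANGE_WRITE) == 0

def write_with_early_writeback(path, chunks):
    """Write chunks to path, push them toward Writeback, then fsync."""
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
        f.flush()                   # data is now Dirty in the OS cache
        hint_writeback(f.fileno())  # Dirty -> Writeback, without waiting
        os.fsync(f.fileno())        # wait until it is stored on disk
```

In real use the hint would be issued well ahead of the fsync, so the background write has time to finish and the fsync has little left to wait for.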


--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
