Hi,
> Sure, that's what the WAL does. But you still have to checkpoint > eventually. > > Sure, when you run pg_ctl stop. Unlike the WAL it only needs two files, shared_buffers size. I did bogus tests by replacing mask |= BM_PERMANENT; with mask = -1 in BufferSync() and simulating checkpoint with a periodic dd if=/dev/zero of=foo conv=fsync On a saturated storage with %usage locked solid at 100% I got up to 30% speed improvement and fsync latency down by one order of magnitude, some fsync were still slow of course if buffers were already in OS cache. But it's the upper bound, it's was done one a slow storage with bad ratios : (OS cache write)/(disk sequential write) in 50, (sequential write)/(effective random write) in 10 range and a proper implementation would have a 'little' more work to do... (only checkpoint task can write BM_CHECKPOINT_NEEDED buffers keeping them dirty and so on) Didier