On Sun, Dec 13, 2015 at 4:24 AM, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:
> On 12/12/2015 11:39 PM, Andres Freund wrote:
>> On 2015-12-12 23:28:33 +0100, Tomas Vondra wrote:
>>> On 12/12/2015 11:20 PM, Andres Freund wrote:
>>>> On 2015-12-12 22:14:13 +0100, Tomas Vondra wrote:
>>>>> this is the second improvement proposed in the thread [1] about
>>>>> ext4 data loss issue. It adds another field to control file,
>>>>> tracking the last known WAL segment. This does not eliminate the
>>>>> data loss, just the silent part of it when the last segment gets
>>>>> lost (due to forgetting the rename, deleting it by mistake or
>>>>> whatever). The patch makes sure the cluster refuses to start if
>>>>> that happens.
>>>>
>>>> Uh, that's fairly expensive. In many cases it'll significantly
>>>> increase the number of fsyncs.
>>>
>>> It should do exactly 1 additional fsync per WAL segment. Or do you
>>> think otherwise?
>>
>> Which is nearly doubling the number of fsyncs, for a good number of
>> workloads. And it does so to a separate file, i.e. it's not like
>> these writes and the flushes can be combined. In workloads where
>> pg_xlog is on a separate partition it'll add the only source of
>> fsyncs besides checkpoint to the main data directory.

I also think so.

> I doubt it will make any difference in practice, at least on
> reasonable hardware (which you should have, if fsync performance
> matters to you).
>
> But some performance testing will be necessary, I don't expect this
> to go in without that. It'd be helpful if you could describe the
> workload.

I think to start with you can try to test pgbench read-write workload
when the data fits in shared_buffers.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com