On Thu, Nov 12, 2009 at 6:27 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
> I agree with you, though it has taken some time to understand what you
> said and at first my reaction was to disagree. I think the responses you
> got on this are because you dived straight in with a question before
> explaining other things around this.

Thanks for clarifying this topic ;)

> If recovery starts reading WAL records that have not been fsynced then
> we may need to flush a shared buffer to disk that depends upon a
> non-fsynced(yet) WAL record. Fsyncing WAL after *every* WAL record is
> going to make performance suck even worse and is completely out of the
> question. So implementing the fsync-WAL-before-buffer-flush rule during
> recovery makes much more sense. It's also only small change during
> XlogFlush().

Agreed. This approach has less impact on performance. But, as I said
in my first post on this thread, even such an infrequent
fsync-WAL-before-buffer-flush might cause a response time spike on the
primary, because the walreceiver must sleep during that fsync. I think
that leaving the WAL-logging business to another process such as the
walwriter is a good idea for further reducing the impact on the
walreceiver. In the typical case:

* The walreceiver receives WAL records, returns the ACK to the primary,
  saves them in the wal_buffers, and notifies the startup process of
  their arrival.
* The walwriter writes and fsyncs the WAL records in the wal_buffers.
* The startup process applies the WAL records in the wal_buffers when
  it is notified of their arrival.
* The startup process and the bgwriter fsync the WAL before flushing
  a buffer.

Of course, since this approach is too complicated, it's out of scope
for v8.5 development.

> But I also agree with Heikki. Let's plan to do this later in this
> release.

Okay. I will implement nothing around this topic until the core part of
asynchronous replication has been committed.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
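
For illustration only, the fsync-WAL-before-buffer-flush rule discussed above can be sketched as a toy model. This is not PostgreSQL code: WalState, Buffer, apply_record and flush_buffer are hypothetical names invented for this sketch, and LSNs are plain integers. The point it shows is that WAL is fsynced lazily, only when a dirty buffer that depends on not-yet-fsynced WAL is about to be written, rather than after every received record.

```python
from dataclasses import dataclass


@dataclass
class WalState:
    """Toy model of WAL positions on the standby (hypothetical, not PostgreSQL's)."""
    received_lsn: int = 0  # WAL received into memory, not necessarily on disk
    flushed_lsn: int = 0   # WAL known to have been fsynced
    fsync_count: int = 0   # number of simulated fsync calls

    def receive(self, upto_lsn: int) -> None:
        # The receiver only records the arrival; it does not fsync here.
        self.received_lsn = max(self.received_lsn, upto_lsn)

    def flush(self, upto_lsn: int) -> bool:
        """Stand-in for an XLogFlush-like call: fsync only if needed."""
        if upto_lsn <= self.flushed_lsn:
            return False  # already durable, nothing to do
        if upto_lsn > self.received_lsn:
            raise ValueError("cannot flush WAL that has not been received")
        self.flushed_lsn = upto_lsn  # simplification: flush exactly this far
        self.fsync_count += 1
        return True


@dataclass
class Buffer:
    page_lsn: int = 0    # LSN of the last WAL record applied to this page
    dirty: bool = False


def apply_record(wal: WalState, buf: Buffer, record_lsn: int) -> None:
    # Recovery may apply a record that was received but not yet fsynced.
    if record_lsn > wal.received_lsn:
        raise ValueError("record has not arrived yet")
    buf.page_lsn = record_lsn
    buf.dirty = True


def flush_buffer(wal: WalState, buf: Buffer) -> None:
    # The rule: make WAL durable up to the page's LSN before writing the page.
    if not buf.dirty:
        return
    wal.flush(buf.page_lsn)
    buf.dirty = False  # the page write itself is omitted in this sketch


wal = WalState()
buf = Buffer()
wal.receive(300)
for lsn in (100, 200, 300):
    apply_record(wal, buf, lsn)  # no fsync happens per record
flush_buffer(wal, buf)           # a single fsync covers all three records
```

In this model, applying three records costs no fsync at all; the cost is paid once, at buffer-flush time, which is why the rule is cheaper than fsyncing after every record. Whoever performs that fsync still blocks while it runs, which is the motivation for moving it out of the walreceiver.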