On 21/06/10 12:08, Fujii Masao wrote:
> On Wed, Jun 16, 2010 at 5:06 AM, Robert Haas <robertmh...@gmail.com> wrote:
>> In 9.0, I think we can fix this problem by (1) only streaming WAL that
>> has been fsync'd and (2) PANIC-ing if the problem occurs anyway. But
>> in 9.1, with sync rep and the performance demands that entails, I
>> think that we're going to need to rethink it.
>
> The problem is not that the master streams non-fsync'd WAL, but that the
> standby can replay that. So I'm thinking that we can send non-fsync'd WAL
> safely if the standby makes the recovery wait until the master has fsync'd
> WAL. That is, walsender sends not only non-fsync'd WAL but also the WAL
> flush location to walreceiver, and the standby applies only the WAL which
> the master has already fsync'd. Thoughts?

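
To make the proposal concrete, here is a minimal sketch of the idea in
Python, not PostgreSQL code. WalMessage, Standby, and the use of small
integers as WAL positions are all hypothetical names and simplifications
chosen for illustration; the real walsender/walreceiver protocol is not
shown here.

```python
from dataclasses import dataclass, field


@dataclass
class WalMessage:
    """Hypothetical walsender message: WAL records plus the master's flush location."""
    start_lsn: int          # position of the first record in this message
    records: list           # one record per position, for simplicity
    master_flush_lsn: int   # highest position the master has fsync'd


@dataclass
class Standby:
    """Toy standby that holds received WAL and replays only what the master has fsync'd."""
    received: dict = field(default_factory=dict)  # position -> record
    master_flush_lsn: int = 0
    applied_lsn: int = 0
    applied: list = field(default_factory=list)

    def receive(self, msg: WalMessage) -> None:
        # Store everything that arrives, fsync'd on the master or not.
        for offset, record in enumerate(msg.records):
            self.received[msg.start_lsn + offset] = record
        # The reported flush location only ever moves forward.
        self.master_flush_lsn = max(self.master_flush_lsn, msg.master_flush_lsn)

    def replay(self) -> None:
        # Recovery waits at the master's flush location.
        while (self.applied_lsn < self.master_flush_lsn
               and self.applied_lsn + 1 in self.received):
            self.applied_lsn += 1
            self.applied.append(self.received[self.applied_lsn])
```

In this model, records past master_flush_lsn sit in the standby until a
later message reports that the master has flushed them.
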
I guess, but you have to be very careful to correctly refrain from
applying the WAL. For example, a naive implementation might write the
WAL to disk in walreceiver immediately, but refrain from telling the
startup process about it. If walreceiver is then killed because the
connection is broken (and it will be, because the master just crashed),
the startup process will read the streamed WAL from the file in pg_xlog
and go ahead and apply it anyway.
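
The failure mode can be modelled in a few lines of Python. This is only a
toy: StandbyDisk stands in for pg_xlog on the standby, and all function
names are hypothetical. The gated variant assumes the startup process can
still learn the master's last reported flush location after walreceiver
exits; how that would be remembered is not settled in this thread.

```python
class StandbyDisk:
    """Stands in for pg_xlog on the standby; its contents outlive walreceiver."""

    def __init__(self):
        self.wal_file = []  # records in the order walreceiver wrote them


def walreceiver_write(disk, records):
    # Naive walreceiver: write streamed WAL to disk immediately,
    # whether or not the master has fsync'd it yet.
    disk.wal_file.extend(records)


def naive_startup_replay(disk):
    # With walreceiver gone, the startup process falls back to reading
    # the file and replays every record it finds there.
    return list(disk.wal_file)


def gated_startup_replay(disk, master_flushed_count):
    # Assumption: the number of records the master had fsync'd is still
    # known here, so replay stops at that point.
    return disk.wal_file[:master_flushed_count]


disk = StandbyDisk()
walreceiver_write(disk, ["r1", "r2", "r3"])  # master fsync'd only r1 and r2, then crashed
replayed_naive = naive_startup_replay(disk)
replayed_gated = gated_startup_replay(disk, master_flushed_count=2)
```

Here replayed_naive contains r3, which the master never made durable, while
replayed_gated stops after r2.
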
So maybe there's some room for optimization there, but given the
round-trip required for the acknowledgment anyway it might not buy you
much, and the implementation is not very straightforward. This is
clearly 9.1 material, if worth optimizing at all.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)