> > Don't understand. I am referring to the logic at the top of
> > AdvanceXLInsertBuffer(). We would need to wait for all people reading
> > the contents of wal_buffers.
>
> Oh, I see.
>
> If a slave falls behind, how does it catch up? I guess you're saying
> that it can't fall behind, because the master will block before that
> happens. Also in asynchronous replication? And what about when the slave
> is first set up, and needs to catch up with the master?

I think the WAL sender needs the ability to read the WAL files directly.
When it has fallen behind, or has just started, it needs to be able to
catch up.

So it seems we either need to copy the WAL buffer into local memory
before sending, or "lock" the WAL buffer until the send has finished.
Useful network timeouts are in the >= 5-10 sec range (even on a GbE LAN),
so I don't think locking WAL buffers is feasible. Thus the WAL sender
needs to copy the needed portion of the current WAL buffer before sending
(or use an async send that returns as soon as the buffer has been copied
into the network stack).
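
A minimal sketch of the copy-before-send idea, not PostgreSQL code: the
class WalBuffer and the function send_from are hypothetical names, and a
plain Python lock stands in for whatever protects wal_buffers. The point
is only that the lock is held for the copy and released before the
(possibly slow) network send.

```python
import threading


class WalBuffer:
    """Toy stand-in for wal_buffers: a byte array guarded by a lock (assumption)."""

    def __init__(self, data: bytes = b"") -> None:
        self._lock = threading.Lock()
        self._data = bytearray(data)

    def append(self, record: bytes) -> None:
        # Writers take the same lock, so they only wait for the short copy.
        with self._lock:
            self._data += record

    def copy_range(self, start: int, end: int) -> bytes:
        # Hold the lock only for the in-memory copy, never across network I/O.
        with self._lock:
            return bytes(self._data[start:end])


def send_from(buf: WalBuffer, start: int, end: int, send) -> int:
    """Copy [start, end) out of the buffer, then send the private copy.

    `send` is any callable taking bytes; it may block for seconds without
    holding up writers, because the buffer lock is already released.
    Returns the offset up to which data has been handed to `send`.
    """
    chunk = buf.copy_range(start, end)
    send(chunk)
    return start + len(chunk)
```

With this shape, a stalled network peer delays only the sender itself;
backends appending to the buffer wait at most for the copy.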

When the WAL sender is ready to continue, it either still finds the next
WAL buffer (or the rest of the current buffer), or it needs to fall back
to Plan B and read the WAL files again. A sync client could still wait
for the replica, even if the local WAL has already advanced massively.
The checkpointer would need the LSN info from the WAL senders so that it
does not reuse any WAL files that are still needed, although in that case
it might be time to declare the replica broken.
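
The fallback and the checkpointer constraint could look roughly like the
sketch below. Everything here is an assumption for illustration: LSNs are
plain integers, the buffer is a bytes object covering a contiguous LSN
range, and next_chunk, oldest_needed_lsn, and read_wal_file are
hypothetical names, not existing functions.

```python
def next_chunk(sent_lsn, buffer_start_lsn, buffer, read_wal_file, max_len):
    """Pick the source for the next piece of WAL to send.

    `buffer` holds the bytes for LSNs
    [buffer_start_lsn, buffer_start_lsn + len(buffer)).
    `read_wal_file(lsn, n)` is a hypothetical callable that reads up to n
    bytes of WAL from disk starting at lsn.
    Returns a (source, data) pair.
    """
    buffer_end_lsn = buffer_start_lsn + len(buffer)
    if sent_lsn >= buffer_end_lsn:
        # Sender is fully caught up; nothing new has been written yet.
        return "wait", b""
    if sent_lsn >= buffer_start_lsn:
        # The wanted position is still in memory: send from the buffer.
        off = sent_lsn - buffer_start_lsn
        return "buffer", buffer[off:off + max_len]
    # Plan B: the buffer was recycled while we were sending; read the file.
    return "file", read_wal_file(sent_lsn, max_len)


def oldest_needed_lsn(sender_lsns, checkpoint_redo_lsn):
    """Oldest LSN the checkpointer must keep: WAL files entirely before it
    may be reused, files at or after it may not."""
    return min([checkpoint_redo_lsn, *sender_lsns])
```

A policy for declaring a replica broken would sit on top of
oldest_needed_lsn, for example by ignoring senders that lag by more than
some limit; that limit is not specified here.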

Ideally the WAL sender also knows whether the client is waiting, so it
can decide to send part of a buffer. The WAL sender should wake and act
whenever a "network packet" full of WAL buffer is ready, regardless of
commits, using whatever send size seems appropriate here (might be one
WAL page). The WAL sender should only need to expect a response when it
has sent a commit record, ideally only if a client is waiting (and once
in a while, at least for every log switch).
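
Those rules can be written down as two small predicates. This is only a
sketch of the policy described above: SenderState, should_send, and
expect_reply are hypothetical names, and the packet size of 8192 bytes
(one WAL page at the default block size) is an assumption.

```python
from dataclasses import dataclass

PACKET_SIZE = 8192  # assumption: send unit of one WAL page


@dataclass
class SenderState:
    pending_bytes: int    # WAL written locally but not yet sent
    has_commit: bool      # the pending data contains a commit record
    client_waiting: bool  # a backend is blocked waiting for the replica
    log_switched: bool    # a log switch happened since the last reply


def should_send(s: SenderState) -> bool:
    # Send whenever a full packet is ready, regardless of commits, or send
    # a partial buffer early if a client is waiting on it.
    if s.pending_bytes >= PACKET_SIZE:
        return True
    return s.client_waiting and s.pending_bytes > 0


def expect_reply(s: SenderState) -> bool:
    # Wait for an acknowledgement only after a commit record that a client
    # is waiting on, plus at least once per log switch.
    return (s.has_commit and s.client_waiting) or s.log_switched
```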

All in all a useful streamer seems like a lot of work.

Andreas

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
