On Thu, Sep 17, 2009 at 10:08, Heikki Linnakangas
<heikki.linnakan...@enterprisedb.com> wrote:
> Fujii Masao wrote:
>> On Tue, Sep 15, 2009 at 7:53 PM, Heikki Linnakangas
>> <heikki.linnakan...@enterprisedb.com> wrote:
>>> After playing with this a little bit, I think we need logic in the slave
>>> to reconnect to the master if the connection is broken for some reason,
>>> or can't be established in the first place. At the moment, that is
>>> considered as the end of recovery, and the slave starts up. You have the
>>> trigger file mechanism to stop that, but it only gives you a chance to
>>> manually kill and restart the slave before it chooses a new timeline and
>>> starts up, it doesn't reconnect automatically.
>>
>> I was thinking that the automatic reconnection capability is the TODO item
>> for the later CF. The infrastructure for it has already been introduced in
>> the current patch. Please see the macro MAX_WALRCV_RETRIES
>> (backend/postmaster/walreceiver.c). This is the maximum number of times to
>> retry walreceiver. In the current version, this is the fixed value, but we
>> can make this user-configurable (parameter of recovery.conf is suitable, I
>> think).
>
> Ah, I see.
>
> Robert Haas suggested a while ago that walreceiver could be a
> stand-alone utility, not requiring postmaster at all. That would allow
> you to set up streaming replication as another way to implement WAL
> archiving. Looking at how the processes interact, there really isn't
> much communication between walreceiver and the rest of the system, so
> that sounds pretty attractive.

Yes, that would be very very useful.

> Walreceiver is really a slave to the startup process. The startup
> process decides when it's launched, and it's the startup process that
> then waits for it to advance. But the way it's set up at the moment, the
> startup process needs to ask the postmaster to start it up, and it
> doesn't look very robust to me. For example, if launching walreceiver
> fails for some reason, startup process will just hang waiting for it.
>
> I'm thinking that walreceiver should be a stand-alone program that the
> startup process launches, similar to how it invokes restore_command in
> PITR recovery. Instead of using system(), though, it would use
> fork+exec, and a pipe to communicate.

Not having looked at all into the details, that sounds like a nice
improvement :-)

-- 
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
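
For illustration only, the fork+exec-and-pipe launch described in the quoted text might look roughly like the sketch below. It is not taken from the patch: HelperProc, launch_helper and drain_and_reap are hypothetical names, and it assumes the helper simply writes to its stdout. The point it tries to show is that when exec fails, the parent sees EOF on the pipe and an exit status of 127, so it has something to act on rather than hanging.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical handle for a launched helper; not a PostgreSQL structure. */
typedef struct
{
    pid_t pid;      /* child process id */
    int   read_fd;  /* read end of the pipe wired to the child's stdout */
} HelperProc;

/*
 * Launch argv[0] with fork+exec, connecting its stdout to a pipe.
 * Returns 0 once the child has been forked, -1 if pipe() or fork() fails.
 * An exec failure shows up later as the child exiting with status 127.
 */
static int
launch_helper(char *const argv[], HelperProc *out)
{
    int   fds[2];
    pid_t pid;

    assert(argv != NULL && argv[0] != NULL && out != NULL);

    if (pipe(fds) < 0)
        return -1;

    pid = fork();
    if (pid < 0)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0)
    {
        /* Child: make the pipe's write end our stdout, then exec. */
        close(fds[0]);
        if (fds[1] != STDOUT_FILENO)
        {
            if (dup2(fds[1], STDOUT_FILENO) < 0)
                _exit(127);
            close(fds[1]);
        }
        execvp(argv[0], argv);
        _exit(127);             /* only reached if exec failed */
    }

    /* Parent: keep only the read end. */
    close(fds[1]);
    out->pid = pid;
    out->read_fd = fds[0];
    return 0;
}

/*
 * Read what the helper writes until EOF (or until buf is full), then reap it.
 * Returns the helper's exit status, or -1 on error or abnormal termination.
 * EOF arrives as soon as the child's write end is closed, which also happens
 * when the child dies or never manages to exec.
 */
static int
drain_and_reap(HelperProc *proc, char *buf, size_t buflen)
{
    size_t  used = 0;
    ssize_t n;
    int     status;

    assert(proc != NULL && buf != NULL && buflen > 0);

    while (used < buflen - 1 &&
           (n = read(proc->read_fd, buf + used, buflen - 1 - used)) != 0)
    {
        if (n < 0)
        {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, retry */
            break;
        }
        used += (size_t) n;
    }
    buf[used] = '\0';
    close(proc->read_fd);

    while (waitpid(proc->pid, &status, 0) < 0)
    {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would fill an argv array with the helper's path and arguments, call launch_helper(), and then treat a non-zero result from drain_and_reap() as "walreceiver did not run", at which point it could retry or give up. A real implementation would presumably read the pipe incrementally instead of draining it to EOF, and would need a Windows equivalent, neither of which this sketch attempts.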