Hi hackers,

If you shut down a primary server, a standby that is streaming from it says:
LOG: replication terminated by primary server
DETAIL: End of WAL reached on timeline 1 at 0/14F4B68.
FATAL: could not send end-of-streaming message to primary: no COPY in progress

Isn't that FATAL ereport a bug? I haven't worked out the root cause, but the immediate problem seems to be that libpqrcv_endstreaming calls PQputCopyEnd, which doesn't like the state that the libpq connection is in, namely PGASYNC_BUSY. That state seems to have been established by the call to walrcv_receive that returned -1 (end of copy). It doesn't happen in the similar case of promotion of the remote server.

How is clean server shutdown supposed to work? It looks like walsender sends COPY 0 and then just hangs up. Meanwhile, walreceiver has to distinguish between that case and the new-timeline case, which involves a further exchange of messages. Is an explicit message at the end of the copy stream saying either "goodbye" or "but wait, there's more" lacking here? Or is there some other way that walreceiver could distinguish between clean shutdown of the remote server (no error necessary), unclean shutdown of the remote server, and timeline negotiation?

-- 
Thomas Munro
http://www.enterprisedb.com
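
P.S. To make the suspected sequence concrete, here is a toy model of it in C. It is only a sketch of the behaviour described above, not libpq or walreceiver code: ToyConn, ToyAsyncState and every toy_* function are hypothetical names invented for illustration, and the assumption that the end-of-copy receive leaves the connection "busy" is taken from the observation above, not from a reading of the libpq source.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the connection states discussed above. */
typedef enum { TOY_IDLE, TOY_BUSY, TOY_COPY_BOTH } ToyAsyncState;

typedef struct {
    ToyAsyncState state;
    const char   *last_error;   /* NULL when no error has been recorded */
} ToyConn;

/*
 * Models the receive call that reports end of copy: it returns -1 and,
 * by assumption, leaves the connection in the busy state.
 */
int toy_receive_end_of_copy(ToyConn *conn)
{
    conn->state = TOY_BUSY;
    return -1;
}

/*
 * Models the precondition that produces "no COPY in progress": ending
 * the copy is only accepted while a copy is still in progress.
 * Returns 1 on success, -1 on failure.
 */
int toy_put_copy_end(ToyConn *conn)
{
    if (conn->state != TOY_COPY_BOTH) {
        conn->last_error = "no COPY in progress";
        return -1;
    }
    conn->state = TOY_BUSY;     /* now waiting for the command result */
    conn->last_error = NULL;
    return 1;
}

/*
 * The sequence from the report: the remote side ends the copy first,
 * then the local side still tries to send its own end-of-copy.
 */
int toy_end_streaming_after_remote_close(ToyConn *conn)
{
    if (toy_receive_end_of_copy(conn) != -1)
        return -1;
    return toy_put_copy_end(conn);
}
```

Starting from a ToyConn in TOY_COPY_BOTH, toy_put_copy_end() succeeds when called directly, but toy_end_streaming_after_remote_close() fails and records "no COPY in progress". In this model, that is the same shape as the FATAL above.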