This bug seems to have snuck in with the introduction of walmethods.
AFAICT we are testing the result of sync() backwards, so whenever
pg_receivewal finds an existing partial segment, opening it fails. It then
unlinks the file, so the retry 5 seconds later works.
The failure is also not logged. Oops.
The attached patch reverses the check and adds a failure message. I'd
appreciate a quick review in case I have the logic backwards in my head...
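
To make the inversion concrete, here is a minimal standalone sketch. It assumes, as the patched check implies, that sync() returns 0 on success and nonzero on failure. mock_sync() and the two helper functions are hypothetical stand-ins for illustration, not code from receivelog.c.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for stream->walmethod->sync().
 * Assumed convention: 0 on success, -1 on failure.
 */
static int
mock_sync(bool should_succeed)
{
	return should_succeed ? 0 : -1;
}

/* Old check, if (!sync(f)): takes the error path when sync returns 0. */
static bool
old_check_takes_error_path(int rc)
{
	return !rc;
}

/* Patched check, if (sync(f) != 0): takes the error path on nonzero. */
static bool
new_check_takes_error_path(int rc)
{
	return rc != 0;
}
```

With these definitions, a successful sync sends the old check down the error path (the file gets unlinked), and a failed sync slips through it unnoticed. The patched check does the opposite in both cases.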
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c
index f415135..8511e57 100644
--- a/src/bin/pg_basebackup/receivelog.c
+++ b/src/bin/pg_basebackup/receivelog.c
@@ -132,8 +132,11 @@ open_walfile(StreamCtl *stream, XLogRecPtr startpoint)
 			}
 
 			/* fsync file in case of a previous crash */
-			if (!stream->walmethod->sync(f))
+			if (stream->walmethod->sync(f) != 0)
 			{
+				fprintf(stderr,
+						_("%s: could not sync existing transaction log file \"%s\": %s\n"),
+						progname, fn, stream->walmethod->getlasterror());
 				stream->walmethod->close(f, CLOSE_UNLINK);
 				return false;
 			}
--
Sent via pgsql-hackers mailing list