This bug seems to have snuck in with the introduction of walmethods.
AFAICT we are testing the result of sync() backwards, so whenever
pg_receivewal finds an existing partial segment, opening it fails even
though the sync itself succeeded. It then unlinks the file, so the retry
5 seconds later works.
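
To spell out the inversion, here is a minimal standalone sketch. It assumes
sync() returns 0 on success and nonzero on failure, which is what the
patched check implies. The stub and helper names are made up for
illustration and are not in the tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for walmethod->sync(): 0 on success, nonzero on failure. */
typedef int (*sync_fn) (void *handle);

int
sync_ok(void *handle)
{
	(void) handle;
	return 0;					/* success */
}

int
sync_fail(void *handle)
{
	(void) handle;
	return -1;					/* failure */
}

/* The old test: "!sync(f)" is true exactly when sync() succeeded. */
bool
treated_as_failure_old(sync_fn do_sync, void *handle)
{
	assert(do_sync != NULL);
	return !do_sync(handle);
}

/* The patched test: anything other than 0 is a failure. */
bool
treated_as_failure_new(sync_fn do_sync, void *handle)
{
	assert(do_sync != NULL);
	return do_sync(handle) != 0;
}
```

With the old test a successful sync takes the error path (close with
CLOSE_UNLINK, return false), while a real sync failure would be silently
accepted.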

It also doesn't log the failure. Oops.

Attached patch reverses the check, and adds a failure message. I'd
appreciate a quick review in case I have the logic backwards in my head...

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/
diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c
index f415135..8511e57 100644
--- a/src/bin/pg_basebackup/receivelog.c
+++ b/src/bin/pg_basebackup/receivelog.c
@@ -132,8 +132,11 @@ open_walfile(StreamCtl *stream, XLogRecPtr startpoint)
 			}
 
 			/* fsync file in case of a previous crash */
-			if (!stream->walmethod->sync(f))
+			if (stream->walmethod->sync(f) != 0)
 			{
+				fprintf(stderr,
+						_("%s: could not sync existing transaction log file \"%s\": %s\n"),
+						progname, fn, stream->walmethod->getlasterror());
 				stream->walmethod->close(f, CLOSE_UNLINK);
 				return false;
 			}