On 25.12.2019 13:45, Andrey Borodin wrote:
25 дек. 2019 г., в 15:28, Maksim Milyutin <milyuti...@gmail.com> написал(а):

Synchronous replication
does not guarantee that a committed write is actually on any replica,
but it does in general guarantee that a commit has been replicated
before sending a response to the client. That's arguably more
important because the rest of what the application might depend on the
transaction completing and replicating successfully. I don't know of
cases other than cancellation in which a response is sent to the
client without replication when synchronous replication is enabled.

Yes, at query canceling (e.g. by timeout from client driver) client receives 
response about completed transaction (though with warning which not all client 
drivers can handle properly) and the guarantee about successfully replicated 
transaction *violates*.
We obviously need a design discussion here to address the issue. But the 
immediate question is should we add this topic to January CF items?


+1 on posting this topic to January CF.

Andrey, some fixes from me:

1) pulled out the cancelling of QueryCancelPending from internal branch where synchronous_commit_cancelation is set so that to avoid dummy iterations with printing message "canceling the wait for ..."

2) rewrote errdetail message under cancelling query: I hold in this case we cannot assert that transaction committed locally because its changes are not visible as yet so I propose to write about locally flushed commit wal record.

Updated patch is attached.

--
Best regards,
Maksim Milyutin

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 5353b6ab0b..844ce36d79 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -80,6 +80,7 @@ bool		DefaultXactDeferrable = false;
 bool		XactDeferrable;
 
 int			synchronous_commit = SYNCHRONOUS_COMMIT_ON;
+bool		synchronous_commit_cancelation = true;
 
 /*
  * When running as a parallel worker, we place only a single
diff --git a/src/backend/replication/syncrep.c b/src/backend/replication/syncrep.c
index 7bf7a1b7f8..0a533d2d79 100644
--- a/src/backend/replication/syncrep.c
+++ b/src/backend/replication/syncrep.c
@@ -250,13 +250,21 @@ SyncRepWaitForLSN(XLogRecPtr lsn, bool commit)
 		 */
 		if (ProcDiePending)
 		{
-			ereport(WARNING,
+			if (synchronous_commit_cancelation)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_ADMIN_SHUTDOWN),
+						errmsg("canceling the wait for synchronous replication and terminating connection due to administrator command"),
+						errdetail("The transaction has already committed locally, but might not have been replicated to the standby.")));
+				whereToSendOutput = DestNone;
+				SyncRepCancelWait();
+				break;
+			}
+
+			ereport(PANIC,
 					(errcode(ERRCODE_ADMIN_SHUTDOWN),
-					 errmsg("canceling the wait for synchronous replication and terminating connection due to administrator command"),
-					 errdetail("The transaction has already committed locally, but might not have been replicated to the standby.")));
-			whereToSendOutput = DestNone;
-			SyncRepCancelWait();
-			break;
+					errmsg("canceling the wait for synchronous replication and terminating connection due to administrator command"),
+					errdetail("The transaction has already committed locally, but might not have been replicated to the standby.")));
 		}
 
 		/*
@@ -268,11 +276,18 @@ SyncRepWaitForLSN(XLogRecPtr lsn, bool commit)
 		if (QueryCancelPending)
 		{
 			QueryCancelPending = false;
+			if (synchronous_commit_cancelation)
+			{
+				ereport(WARNING,
+						(errmsg("canceling wait for synchronous replication due to user request"),
+						 errdetail("The transaction has already committed locally, but might not have been replicated to the standby.")));
+				SyncRepCancelWait();
+				break;
+			}
+
 			ereport(WARNING,
-					(errmsg("canceling wait for synchronous replication due to user request"),
-					 errdetail("The transaction has already committed locally, but might not have been replicated to the standby.")));
-			SyncRepCancelWait();
-			break;
+					(errmsg("canceling wait for synchronous replication due requested, but cancelation is not allowed"),
+					 errdetail("The COMMIT record has already flushed to WAL locally and might not have been replicated to the standby. We must wait here.")));
 		}
 
 		/*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 4b4911d5ec..a279faaeb5 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1222,6 +1222,15 @@ static struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"synchronous_commit_cancelation", PGC_USERSET, WAL_SETTINGS,
+			gettext_noop("Allow to cancel waiting for replication of transaction commited localy."),
+			NULL
+		},
+		&synchronous_commit_cancelation,
+		true, NULL, NULL, NULL
+	},
+
 	{
 		{"log_checkpoints", PGC_SIGHUP, LOGGING_WHAT,
 			gettext_noop("Logs each checkpoint."),
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index 9d2899dea1..5443c25edf 100644
--- a/src/include/access/xact.h
+++ b/src/include/access/xact.h
@@ -81,6 +81,9 @@ typedef enum
 /* Synchronous commit level */
 extern int	synchronous_commit;
 
+/* Allow cancelation of queries waiting for sync replication but commited locally */
+extern bool synchronous_commit_cancelation;
+
 /*
  * Miscellaneous flag bits to record events which occur on the top level
  * transaction. These flags are only persisted in MyXactFlags and are intended

Reply via email to