On 25.12.2019 13:45, Andrey Borodin wrote:
25 дек. 2019 г., в 15:28, Maksim Milyutin <milyuti...@gmail.com> написал(а):
Synchronous replication
does not guarantee that a committed write is actually on any replica,
but it does in general guarantee that a commit has been replicated
before sending a response to the client. That's arguably more
important because the rest of what the application might depend on the
transaction completing and replicating successfully. I don't know of
cases other than cancellation in which a response is sent to the
client without replication when synchronous replication is enabled.
Yes, at query canceling (e.g. by timeout from client driver) client receives
response about completed transaction (though with warning which not all client
drivers can handle properly) and the guarantee about successfully replicated
transaction *violates*.
We obviously need a design discussion here to address the issue. But the
immediate question is should we add this topic to January CF items?
+1 on posting this topic to January CF.
Andrey, some fixes from me:
1) pulled out the cancelling of QueryCancelPending from internal branch
where synchronous_commit_cancelation is set so that to avoid dummy
iterations with printing message "canceling the wait for ..."
2) rewrote errdetail message under cancelling query: I hold in this case
we cannot assert that transaction committed locally because its changes
are not visible as yet so I propose to write about locally flushed
commit wal record.
Updated patch is attached.
--
Best regards,
Maksim Milyutin
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 5353b6ab0b..844ce36d79 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -80,6 +80,7 @@ bool DefaultXactDeferrable = false;
bool XactDeferrable;
int synchronous_commit = SYNCHRONOUS_COMMIT_ON;
+bool synchronous_commit_cancelation = true;
/*
* When running as a parallel worker, we place only a single
diff --git a/src/backend/replication/syncrep.c b/src/backend/replication/syncrep.c
index 7bf7a1b7f8..0a533d2d79 100644
--- a/src/backend/replication/syncrep.c
+++ b/src/backend/replication/syncrep.c
@@ -250,13 +250,21 @@ SyncRepWaitForLSN(XLogRecPtr lsn, bool commit)
*/
if (ProcDiePending)
{
- ereport(WARNING,
+ if (synchronous_commit_cancelation)
+ {
+ ereport(WARNING,
+ (errcode(ERRCODE_ADMIN_SHUTDOWN),
+ errmsg("canceling the wait for synchronous replication and terminating connection due to administrator command"),
+ errdetail("The transaction has already committed locally, but might not have been replicated to the standby.")));
+ whereToSendOutput = DestNone;
+ SyncRepCancelWait();
+ break;
+ }
+
+ ereport(PANIC,
(errcode(ERRCODE_ADMIN_SHUTDOWN),
- errmsg("canceling the wait for synchronous replication and terminating connection due to administrator command"),
- errdetail("The transaction has already committed locally, but might not have been replicated to the standby.")));
- whereToSendOutput = DestNone;
- SyncRepCancelWait();
- break;
+ errmsg("canceling the wait for synchronous replication and terminating connection due to administrator command"),
+ errdetail("The transaction has already committed locally, but might not have been replicated to the standby.")));
}
/*
@@ -268,11 +276,18 @@ SyncRepWaitForLSN(XLogRecPtr lsn, bool commit)
if (QueryCancelPending)
{
QueryCancelPending = false;
+ if (synchronous_commit_cancelation)
+ {
+ ereport(WARNING,
+ (errmsg("canceling wait for synchronous replication due to user request"),
+ errdetail("The transaction has already committed locally, but might not have been replicated to the standby.")));
+ SyncRepCancelWait();
+ break;
+ }
+
ereport(WARNING,
- (errmsg("canceling wait for synchronous replication due to user request"),
- errdetail("The transaction has already committed locally, but might not have been replicated to the standby.")));
- SyncRepCancelWait();
- break;
+ (errmsg("canceling wait for synchronous replication due requested, but cancelation is not allowed"),
+ errdetail("The COMMIT record has already flushed to WAL locally and might not have been replicated to the standby. We must wait here.")));
}
/*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 4b4911d5ec..a279faaeb5 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1222,6 +1222,15 @@ static struct config_bool ConfigureNamesBool[] =
NULL, NULL, NULL
},
+ {
+ {"synchronous_commit_cancelation", PGC_USERSET, WAL_SETTINGS,
+ gettext_noop("Allow to cancel waiting for replication of transaction commited localy."),
+ NULL
+ },
+ &synchronous_commit_cancelation,
+ true, NULL, NULL, NULL
+ },
+
{
{"log_checkpoints", PGC_SIGHUP, LOGGING_WHAT,
gettext_noop("Logs each checkpoint."),
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index 9d2899dea1..5443c25edf 100644
--- a/src/include/access/xact.h
+++ b/src/include/access/xact.h
@@ -81,6 +81,9 @@ typedef enum
/* Synchronous commit level */
extern int synchronous_commit;
+/* Allow cancelation of queries waiting for sync replication but commited locally */
+extern bool synchronous_commit_cancelation;
+
/*
* Miscellaneous flag bits to record events which occur on the top level
* transaction. These flags are only persisted in MyXactFlags and are intended