[HACKERS] WIP: long transactions on hot standby feedback replica / proof of concept

i . kartyshov Mon, 04 Sep 2017 04:05:24 -0700

Our clients complain about this issue and therefore I want to raise thediscussion and suggest several solutions to this problem:

I. Why does PG use Fatal when Error is enough to release lock that riseslock conflict?

"If (RecoveryConflictPending && DoingCommandRead)"

II. Do we really need to truncate the table on hot standby exactly atthe same time when truncate on master occurs?

In my case conflict happens when the autovacuum truncates table tbl1 onmaster while backend on replica is performing a long transactioninvolving the same table tbl1. This happens because truncate takes anAccessExclusiveLock. To tackle this issue we have several options:

1. We can postpone the truncate on the master until all the replicashave finished their transactions (in this case, feedback requests to themaster should be sent frequently)

Patch 1
vacuum_lazy_truncate.patch

2. Maybe there is an option somehow not to send AccessExclusiveLock andnot to truncate table on the replica right away. We could try to wait alittle and truncate tbl1 on replica again.

Here is a patch that makes replica skip truncate WAL record if sometransaction using the same table is already running on replica. And italso forces master to send truncate_wal to replica with actual tablelength every time autovacuum starts.

Patch 2
standby_truncate_skip_v1.patch

In this case the transaction which is running for several hours won’t beinterrupted because of truncate. Even if for some reason we haven’ttruncated this table tbl1 right away, nothing terrible will happen. Thenext truncate wal record will reduce the file size by the actual length(if some inserts or updates have been performed on master).

If you have any ideas or concerns about suggested solution I’ll be gladto hear them.

Thanks!

--
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 45b1859..bf73a1b 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -280,7 +280,8 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 	 * Optionally truncate the relation.
 	 */
 	if (should_attempt_truncation(vacrelstats))
-		lazy_truncate_heap(onerel, vacrelstats);
+		if (OldestXmin >= ShmemVariableCache->latestCompletedXid)
+			lazy_truncate_heap(onerel, vacrelstats);
 
 	/* Report that we are now doing final cleanup */
 	pgstat_progress_update_param(PROGRESS_VACUUM_PHASE,

diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 9a5fde0..a6dc369 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -31,6 +31,9 @@
 #include "storage/smgr.h"
 #include "utils/memutils.h"
 #include "utils/rel.h"
+#include "storage/lmgr.h"
+#include "storage/procarray.h"
+#include "access/transam.h"
 
 /*
  * We keep a list of all relations (represented as RelFileNode values)
@@ -495,9 +498,21 @@ smgr_redo(XLogReaderState *record)
 		xl_smgr_truncate *xlrec = (xl_smgr_truncate *) XLogRecGetData(record);
 		SMgrRelation reln;
 		Relation	rel;
+		LOCKTAG		locktag;
+		VirtualTransactionId *backends;
+		VirtualTransactionId *backends2;
 
 		reln = smgropen(xlrec->rnode, InvalidBackendId);
 
+		SET_LOCKTAG_RELATION(locktag, reln->smgr_rnode.node.dbNode,
+							 reln->smgr_rnode.node.relNode);
+
+		backends = GetLockConflicts(&locktag, AccessExclusiveLock);
+		backends2 = GetConflictingVirtualXIDs(InvalidTransactionId, InvalidOid);
+
+		if (!VirtualTransactionIdIsValid(*backends) && !VirtualTransactionIdIsValid(*backends2))
+		{
+
 		/*
 		 * Forcibly create relation if it doesn't exist (which suggests that
 		 * it was dropped somewhere later in the WAL sequence).  As in
@@ -542,6 +557,10 @@ smgr_redo(XLogReaderState *record)
 			visibilitymap_truncate(rel, xlrec->blkno);
 
 		FreeFakeRelcacheEntry(rel);
+		} else
+		{
+			XLogFlush(lsn);
+		}
 	}
 	else
 		elog(PANIC, "smgr_redo: unknown op code %u", info);
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 45b1859..fd91db0 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -61,6 +61,8 @@
 #include "utils/pg_rusage.h"
 #include "utils/timestamp.h"
 #include "utils/tqual.h"
+#include "catalog/storage_xlog.h"
+#include "access/rmgr.h"
 
 
 /*
@@ -317,6 +319,32 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 		new_rel_pages = vacrelstats->old_rel_pages;
 		new_rel_tuples = vacrelstats->old_rel_tuples;
 	}
+	else
+	{
+		if (RelationNeedsWAL(onerel))
+		{
+			/* Get the current relation length */
+			LockRelationForExtension(onerel, ExclusiveLock);
+
+			/*
+			 * Make an XLOG entry reporting the file truncation.
+			 */
+			XLogRecPtr	lsn;
+			xl_smgr_truncate xlrec;
+
+			xlrec.blkno = vacrelstats->rel_pages;
+			xlrec.rnode = onerel->rd_node;
+
+			XLogBeginInsert();
+			XLogRegisterData((char *) &xlrec, sizeof(xlrec));
+
+			lsn = XLogInsert(RM_SMGR_ID,
+							 XLOG_SMGR_TRUNCATE | XLR_SPECIAL_REL_UPDATE);
+			UnlockRelationForExtension(onerel, ExclusiveLock);
+			XLogFlush(lsn);
+		}
+
+	}
 
 	visibilitymap_count(onerel, &new_rel_allvisible, NULL);
 	if (new_rel_allvisible > new_rel_pages)
@@ -397,7 +425,6 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
 			appendStringInfo(&buf, _("avg read rate: %.3f MB/s, avg write rate: %.3f MB/s\n"),
 							 read_rate, write_rate);
 			appendStringInfo(&buf, _("system usage: %s"), pg_rusage_show(&ru0));
-
 			ereport(LOG,
 					(errmsg_internal("%s", buf.data)));
 			pfree(buf.data);
@@ -1820,7 +1847,7 @@ lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats)
 		vacrelstats->pages_removed += old_rel_pages - new_rel_pages;
 		vacrelstats->rel_pages = new_rel_pages;
 
-		ereport(elevel,
+		ereport(WARNING,
 				(errmsg("\"%s\": truncated %u to %u pages",
 						RelationGetRelationName(onerel),
 						old_rel_pages, new_rel_pages),
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 2b26173..f04888e 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -816,7 +816,7 @@ LockAcquireExtended(const LOCKTAG *locktag,
 		XLogStandbyInfoActive())
 	{
 		LogAccessExclusiveLockPrepare();
-		log_lock = true;
+//		log_lock = true;
 	}
 
 	/*

-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] WIP: long transactions on hot standby feedback replica / proof of concept

Reply via email to