Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

Christian Kruse Fri, 21 Feb 2014 03:27:13 -0800

Hi,

On 19.02.2014 08:11, Amit Kapila wrote:
> I have looked into this patch. Below are some points which I
> think should be improved in this patch:


Thank you for your review.

> 1. New context added by this patch is display at wrong place
> […]
> Do this patch expect to print the context at cancel request
> as well?
> I don't find it useful. I think the reason is that after setup of
> context, if any other error happens, it will display the newly setup
> context.

To be honest, I don't see a problem here. If you cancel the request in
most cases it is because it doesn't finish in an acceptable time. So
displaying the context seems logical to me.

> 2a. About the message:
> LOG:  process 1936 still waiting for ShareLock on transaction 842
> after 1014.000ms
> CONTEXT:  relation name: foo (OID 57496)
> tuple (ctid (0,1)): (1, 2, abc  )
> 
> Isn't it better to display database name as well, as if we see in
> one of related messages as below, it displays database name as well.
> 
> "LOG:  process 3012 still waiting for ExclusiveLock on tuple (0,1) of
> relation 57
> 499 of database 12045 after 1014.000 ms"

Maybe. But then we either have duplicate infos in some cases (e.g. on
a table lock) or we have to distinguish error cases and show distinct
contextes. And also getting the database name from a relation is not
really straight forward (according to Andres we would have to look at
rel->rd_node.dbNode) and since other commands dealing with names don't
do so, either, I think we should leave out the database name.

> 2b. I think there is no need to display *ctid* before tuple offset, it seems
> to be obvious as well as consistent with other similar message.

Ok, I'll drop it.

> 3. About callback I think rather than calling setup/tear down
> functions from multiple places, it will be better to have wrapper
> functions for XactTableLockWait() and MultiXactIdWait(). Just check
> in plpgsql_exec_function(), how the errorcallback is set, can we do
> something similar without to avoid palloc?

OK, Attached.

> 4. About the values of tuple
> I think it might be useful to display some unique part of tuple for
> debugging part, but what will we do incase there is no primary
> key on table, so may be we can display initial few columns (2 or 3)
> or just display tuple offset as is done in other similar message
> shown above.

Currently I simply display the whole tuple with truncating long
fields. This is plain easy, no distinction necessary. I did some code
reading and it seems somewhat complex to get the PK columns and it
seems that we need another lock for this, too – maybe not the best
idea when we are already in locking trouble, do you disagree?

Also showing just a few columns when no PK is available might not be
helpful since the tuple is not necessarily identified by the columns
showed. And since we drop the CTID there is no alternative solution to
identify the tuple.

IMHO simply displaying the whole tuple is the best solution in this
case, do you agree?

> More to the point, I have observed that in few cases
> displaying values of tuple can be confusing as well. For Example:
> […]
> Now here it will be bit confusing to display values with tuple, as
> in session-3 statement has asked to update value (3), but we have
> started waiting for value (4). Although it is right, but might not
> make much sense.

What do you suggest to avoid confusion? I can see what you are talking
about, but I'm not sure what to do about it. Keep in mind that this
patch should help debugging locking, so IMHO it is essential to be
able to identify a tuple and therefore knowing its values.

> Also after session-2 commits the transaction, session-3 will say
> acquired lock, however still it will not be able to update it.

To be honest, I don't think that this is a problem of the
patch. Concurrency is not easy and it might lead to confusing
situations. I don't think that we should drop the tuple values because
of this, since they provide useful and precious debugging information.

Attached you will find a new version of the patch, mainly using
wrapper functions for XactLockTableWait() and MultiXactIdWait().

Best regards,

-- 
 Christian Kruse               http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index a771ccb..71aaa93 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -107,6 +107,8 @@ static TransactionId MultiXactIdGetUpdateXid(TransactionId xmax,
 						uint16 t_infomask);
 static void MultiXactIdWait(MultiXactId multi, MultiXactStatus status,
 				int *remaining, uint16 infomask);
+static void MultiXactIdWaitWithErrorContext(Relation rel, HeapTuple tup, MultiXactId multi,
+					MultiXactStatus status, int *remaining, uint16 infomask);
 static bool ConditionalMultiXactIdWait(MultiXactId multi,
 						   MultiXactStatus status, int *remaining,
 						   uint16 infomask);
@@ -2703,8 +2705,8 @@ l1:
 		if (infomask & HEAP_XMAX_IS_MULTI)
 		{
 			/* wait for multixact */
-			MultiXactIdWait((MultiXactId) xwait, MultiXactStatusUpdate,
-							NULL, infomask);
+			MultiXactIdWaitWithErrorContext(relation, &tp, (MultiXactId) xwait,
+									  MultiXactStatusUpdate, NULL, infomask);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -2730,7 +2732,7 @@ l1:
 		else
 		{
 			/* wait for regular transaction to end */
-			XactLockTableWait(xwait);
+			XactLockTableWaitWithErrorCallback(relation, &tp, xwait);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3255,8 +3257,8 @@ l2:
 			int			remain;
 
 			/* wait for multixact */
-			MultiXactIdWait((MultiXactId) xwait, mxact_status, &remain,
-							infomask);
+			MultiXactIdWaitWithErrorContext(relation, &oldtup,
+					   (MultiXactId) xwait, mxact_status, &remain, infomask);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3330,7 +3332,7 @@ l2:
 			else
 			{
 				/* wait for regular transaction to end */
-				XactLockTableWait(xwait);
+				XactLockTableWaitWithErrorCallback(relation, &oldtup, xwait);
 				LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 				/*
@@ -4398,7 +4400,8 @@ l3:
 										RelationGetRelationName(relation))));
 				}
 				else
-					MultiXactIdWait((MultiXactId) xwait, status, NULL, infomask);
+					MultiXactIdWaitWithErrorContext(relation, tuple,
+								(MultiXactId) xwait, status, NULL, infomask);
 
 				/* if there are updates, follow the update chain */
 				if (follow_updates &&
@@ -4453,7 +4456,7 @@ l3:
 										RelationGetRelationName(relation))));
 				}
 				else
-					XactLockTableWait(xwait);
+					XactLockTableWaitWithErrorCallback(relation, tuple, xwait);
 
 				/* if there are updates, follow the update chain */
 				if (follow_updates &&
@@ -5140,7 +5143,7 @@ l4:
 					if (needwait)
 					{
 						LockBuffer(buf, BUFFER_LOCK_UNLOCK);
-						XactLockTableWait(members[i].xid);
+						XactLockTableWaitWithErrorCallback(rel, &mytup, members[i].xid);
 						pfree(members);
 						goto l4;
 					}
@@ -5200,7 +5203,7 @@ l4:
 				if (needwait)
 				{
 					LockBuffer(buf, BUFFER_LOCK_UNLOCK);
-					XactLockTableWait(rawxmax);
+					XactLockTableWaitWithErrorCallback(rel, &mytup, rawxmax);
 					goto l4;
 				}
 				if (res != HeapTupleMayBeUpdated)
@@ -6158,6 +6161,33 @@ MultiXactIdWait(MultiXactId multi, MultiXactStatus status,
 }
 
 /*
+ * MultiXactIdWaitWithErrorContext
+ *		Sets up the error context callback for reporting tuple
+ *		information and relation for a lock to aquire
+ *
+ * Use this instead of calling MultiXactIdWait()
+ */
+static void
+MultiXactIdWaitWithErrorContext(Relation rel, HeapTuple tup, MultiXactId multi,
+					 MultiXactStatus status, int *remaining, uint16 infomask)
+{
+	struct XactLockTableWaitLockInfo info;
+	ErrorContextCallback callback;
+
+	info.rel = rel;
+	info.tuple = tup;
+
+	callback.callback = XactLockTableWaitErrorContextCallback;
+	callback.arg = &info;
+	callback.previous = error_context_stack;
+	error_context_stack = &callback;
+
+	MultiXactIdWait(multi, status, remaining, infomask);
+
+	error_context_stack = callback.previous;
+}
+
+/*
  * ConditionalMultiXactIdWait
  *		As above, but only lock if we can get the lock without blocking.
  *
diff --git a/src/backend/access/nbtree/nbtinsert.c b/src/backend/access/nbtree/nbtinsert.c
index 9140749..c363c35 100644
--- a/src/backend/access/nbtree/nbtinsert.c
+++ b/src/backend/access/nbtree/nbtinsert.c
@@ -164,7 +164,7 @@ top:
 		{
 			/* Have to wait for the other guy ... */
 			_bt_relbuf(rel, buf);
-			XactLockTableWait(xwait);
+			XactLockTableWaitWithErrorCallback(rel, NULL, xwait);
 			/* start over... */
 			_bt_freestack(stack);
 			goto top;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index cebca95..58e4ddc 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -2295,7 +2295,8 @@ IndexBuildHeapScan(Relation heapRelation,
 							 * Must drop the lock on the buffer before we wait
 							 */
 							LockBuffer(scan->rs_cbuf, BUFFER_LOCK_UNLOCK);
-							XactLockTableWait(xwait);
+							XactLockTableWaitWithErrorCallback(heapRelation,
+														   heapTuple, xwait);
 							goto recheck;
 						}
 					}
@@ -2341,7 +2342,8 @@ IndexBuildHeapScan(Relation heapRelation,
 							 * Must drop the lock on the buffer before we wait
 							 */
 							LockBuffer(scan->rs_cbuf, BUFFER_LOCK_UNLOCK);
-							XactLockTableWait(xwait);
+							XactLockTableWaitWithErrorCallback(heapRelation,
+														   heapTuple, xwait);
 							goto recheck;
 						}
 
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 6b5f198..794e3d0 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -1981,7 +1981,8 @@ EvalPlanQualFetch(EState *estate, Relation relation, int lockmode,
 			if (TransactionIdIsValid(SnapshotDirty.xmax))
 			{
 				ReleaseBuffer(buffer);
-				XactLockTableWait(SnapshotDirty.xmax);
+				XactLockTableWaitWithErrorCallback(relation, &tuple,
+												   SnapshotDirty.xmax);
 				continue;		/* loop back to repeat heap_fetch */
 			}
 
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 46895b2..0ce7b85 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -1307,7 +1307,7 @@ retry:
 		if (TransactionIdIsValid(xwait))
 		{
 			index_endscan(index_scan);
-			XactLockTableWait(xwait);
+			XactLockTableWaitWithErrorCallback(heap, tup, xwait);
 			goto retry;
 		}
 
diff --git a/src/backend/storage/lmgr/lmgr.c b/src/backend/storage/lmgr/lmgr.c
index 4c61a6f..7f46686 100644
--- a/src/backend/storage/lmgr/lmgr.c
+++ b/src/backend/storage/lmgr/lmgr.c
@@ -18,12 +18,14 @@
 #include "access/subtrans.h"
 #include "access/transam.h"
 #include "access/xact.h"
+#include "access/htup_details.h"
 #include "catalog/catalog.h"
 #include "miscadmin.h"
 #include "storage/lmgr.h"
 #include "storage/procarray.h"
 #include "utils/inval.h"
-
+#include "utils/lsyscache.h"
+#include "mb/pg_wchar.h"
 
 /*
  * RelationInitLockInfo
@@ -534,6 +536,126 @@ ConditionalXactLockTableWait(TransactionId xid)
 }
 
 /*
+ * XactLockTableWaitErrorContextCallback
+ *		error context callback set up by
+ *		XactLockTableWaitWithErrorCallback. It reports some
+ *		tuple information and the relation of a lock to aquire
+ *
+ */
+void
+XactLockTableWaitErrorContextCallback(void *arg)
+{
+	StringInfoData buf;
+	int			i;
+	bool		write_comma = false;
+	int			maxfieldlen = 30;
+	struct XactLockTableWaitLockInfo *info = (struct XactLockTableWaitLockInfo *) arg;
+
+	if (info->rel != NULL)
+	{
+		errcontext("relation \"%s\".\"%s\"",
+				   get_namespace_name(RelationGetNamespace(info->rel)),
+				   RelationGetRelationName(info->rel));
+	}
+
+	if (info->tuple != NULL)
+	{
+		/*
+		 * Can't produce an error message including the tuple if we're in a
+		 * possibly aborted transaction state, database access might not be
+		 * safe.
+		 */
+		if (geterrlevel() >= ERROR)
+		{
+			errcontext("tuple ctid (%u,%u)",
+					   BlockIdGetBlockNumber(&(info->tuple->t_self.ip_blkid)),
+					   info->tuple->t_self.ip_posid);
+		}
+		else
+		{
+			TupleDesc	desc = RelationGetDescr(info->rel);
+			Datum	   *values = palloc(desc->natts * sizeof(*values));
+			bool	   *nulls = palloc(desc->natts * sizeof(*values));
+
+			heap_deform_tuple(info->tuple, desc, values, nulls);
+
+			initStringInfo(&buf);
+			appendStringInfoChar(&buf, '(');
+
+			for (i = 0; i < desc->natts; i++)
+			{
+				char	   *val;
+				int			vallen;
+
+				if (nulls[i])
+					val = "null";
+				else
+				{
+					Oid			foutoid;
+					bool		typisvarlena;
+
+					getTypeOutputInfo(desc->attrs[i]->atttypid,
+									  &foutoid, &typisvarlena);
+
+					val = OidOutputFunctionCall(foutoid, values[i]);
+				}
+
+				if (write_comma)
+					appendStringInfoString(&buf, ", ");
+				else
+					write_comma = true;
+
+				vallen = strlen(val);
+				if (vallen <= maxfieldlen)
+					appendStringInfoString(&buf, val);
+				else
+				{
+					vallen = pg_mbcliplen(val, vallen, maxfieldlen);
+					appendBinaryStringInfo(&buf, val, vallen);
+					appendStringInfoString(&buf, "...");
+				}
+			}
+
+			appendStringInfoChar(&buf, ')');
+
+			errcontext("tuple %s",
+					   BlockIdGetBlockNumber(&(info->tuple->t_self.ip_blkid)),
+					   info->tuple->t_self.ip_posid, buf.data);
+
+			pfree(buf.data);
+			pfree(values);
+			pfree(nulls);
+		}
+	}
+}
+
+/*
+ * XactLockTableWaitWithErrorCallback
+ *		Sets up the error context callback for reporting tuple
+ *		information and relation for a lock to aquire
+ *
+ * Use this instead of calling XactTableLockWait()
+ */
+void
+XactLockTableWaitWithErrorCallback(Relation rel, HeapTuple tuple, TransactionId xid)
+{
+	struct XactLockTableWaitLockInfo info;
+	ErrorContextCallback callback;
+
+	info.rel = rel;
+	info.tuple = tuple;
+
+	callback.callback = XactLockTableWaitErrorContextCallback;
+	callback.arg = &info;
+	callback.previous = error_context_stack;
+	error_context_stack = &callback;
+
+	XactLockTableWait(xid);
+
+	error_context_stack = callback.previous;
+}
+
+/*
  * WaitForLockersMultiple
  *		Wait until no transaction holds locks that conflict with the given
  *		locktags at the given lockmode.
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 8705586..6231db6 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -1216,6 +1216,23 @@ geterrposition(void)
 }
 
 /*
+ * geterrlevel --- return the currently set error level
+ *
+ * This is only intended for use in error callback subroutines, since there
+ * is no other place outside elog.c where the concept is meaningful.
+ */
+int
+geterrlevel(void)
+{
+	ErrorData  *edata = &errordata[errordata_stack_depth];
+
+	/* we don't bother incrementing recursion_depth */
+	CHECK_STACK_DEPTH();
+
+	return edata->elevel;
+}
+
+/*
  * getinternalerrposition --- same for internal error position
  *
  * This is only intended for use in error callback subroutines, since there
diff --git a/src/include/storage/lmgr.h b/src/include/storage/lmgr.h
index e74ae21..97248a7 100644
--- a/src/include/storage/lmgr.h
+++ b/src/include/storage/lmgr.h
@@ -18,6 +18,13 @@
 #include "storage/itemptr.h"
 #include "storage/lock.h"
 #include "utils/rel.h"
+#include "access/htup.h"
+
+struct XactLockTableWaitLockInfo
+{
+	Relation	rel;
+	HeapTuple	tuple;
+};
 
 
 extern void RelationInitLockInfo(Relation relation);
@@ -57,6 +64,10 @@ extern void XactLockTableDelete(TransactionId xid);
 extern void XactLockTableWait(TransactionId xid);
 extern bool ConditionalXactLockTableWait(TransactionId xid);
 
+extern void XactLockTableWaitWithErrorCallback(Relation rel, HeapTuple tuple,
+								   TransactionId xid);
+extern void XactLockTableWaitErrorContextCallback(void *arg);
+
 /* Lock VXIDs, specified by conflicting locktags */
 extern void WaitForLockers(LOCKTAG heaplocktag, LOCKMODE lockmode);
 extern void WaitForLockersMultiple(List *locktags, LOCKMODE lockmode);
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index d7916c2..04fd3a2 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -224,6 +224,7 @@ extern int	err_generic_string(int field, const char *str);
 
 extern int	geterrcode(void);
 extern int	geterrposition(void);
+extern int	geterrlevel(void);
 extern int	getinternalerrposition(void);

pgptyioK23dsY.pgp
Description: PGP signature

Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

Reply via email to