On Wed, Jun 2, 2021 at 11:25 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Tue, Jun 1, 2021 at 5:23 PM Dilip Kumar <dilipbal...@gmail.com> wrote:
> >
> > On Tue, Jun 1, 2021 at 12:25 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> >
> > > >
> > > > IMHO, as I stated earlier one way to fix this problem is that we add
> > > > the spec abort operation (DELETE + XLH_DELETE_IS_SUPER flag) to the
> > > > queue, maybe with action name
> > > > "REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT" and as part of processing
> > > > that just cleans up the toast and specinsert tuple and nothing else.
> > > > If we think that makes sense then I can work on that patch?
> > > >
> > >
> > > I think this should solve the problem but let's first try to see if we
> > > have any other problems. Because, say, if we don't have any other
> > > problem, then maybe removing Assert might work but I guess we still
> > > need to process the tuple to find that we don't need to assemble toast
> > > for it which again seems like overkill.
> >
> > Yeah, other operation will also fail, basically, if txn->toast_hash is
> > not NULL then we assume that we need to assemble the tuple using
> > toast, but if there is next insert in another relation and if that
> > relation doesn't have a toast table then it will fail with below
> > error.  And otherwise also, if it is the same relation, then the toast
> > chunks of previous tuple will be used for constructing this new tuple.
> >
>
> I think the same relation case might not create a problem because it
> won't find the entry for it in the toast_hash, so it will return from
> there but the other two problems will be there.

Right

So, one idea could be
> to just avoid those two cases (by simply adding return for those
> cases) and still we can rely on toast clean up on the next
> insert/update/delete. However, your approach looks more natural to me
> as that will allow us to clean up memory in all cases instead of
> waiting till the transaction end. So, I still vote for going with your
> patch's idea of cleaning at spec_abort but I am fine if you and others
> decide not to process spec_abort message. What do you think? Tomas, do
> you have any opinion on this matter?

I agree that processing with spec abort looks more natural and ideally
the current code expects it to be getting cleaned after the change,
that's the reason we have those assertions and errors.  OTOH I agree
that we can just return from those conditions because now we know that
with the current code those situations are possible.  My vote is with
handling the spec abort option (Option1) because that looks more
natural way of handling these issues and we also don't have to clean
up the hash in "ReorderBufferReturnTXN" if no followup change after
spec abort.  I am attaching the patches with both the approaches for
the reference.

Once we finalize on the approach then I will work on pending review
comments and also prepare the back branch patches.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From f607e140cae21183fea2fc226029d7673478509e Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Tue, 1 Jun 2021 19:53:47 +0530
Subject: [PATCH v2] Bug fix for speculative abort

If speculative insert has a toast table insert then if that tuple
is not confirmed then the toast hash is not cleaned and that is
creating various problem like a) memory leak b) next insert is
using these uncleaned toast data for its insertion and other
error and assersion failure.  So this patch handle that by
queuing the spec abort changes and cleaning up the toast hash
on spec abort.  Currently, in this patch we are queuing up all
the spec abort changes, but as an optimization we can avoid
queuing the spec abort for toast tables but for that we need to
log that as a flag in WAL.
---
 src/backend/replication/logical/decode.c        | 16 +++++++-----
 src/backend/replication/logical/reorderbuffer.c | 34 +++++++++++++++++++++++++
 src/include/replication/reorderbuffer.h         |  1 +
 3 files changed, 44 insertions(+), 7 deletions(-)

diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index 7067016..1a0d7dc 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -1040,19 +1040,21 @@ DecodeDelete(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 	if (target_node.dbNode != ctx->slot->data.database)
 		return;
 
+	/* output plugin doesn't look for this origin, no need to queue */
+	if (FilterByOrigin(ctx, XLogRecGetOrigin(r)))
+		return;
+
+	change = ReorderBufferGetChange(ctx->reorder);
+
 	/*
 	 * Super deletions are irrelevant for logical decoding, it's driven by the
 	 * confirmation records.
 	 */
 	if (xlrec->flags & XLH_DELETE_IS_SUPER)
-		return;
-
-	/* output plugin doesn't look for this origin, no need to queue */
-	if (FilterByOrigin(ctx, XLogRecGetOrigin(r)))
-		return;
+		change->action = REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT;
+	else
+		change->action = REORDER_BUFFER_CHANGE_DELETE;
 
-	change = ReorderBufferGetChange(ctx->reorder);
-	change->action = REORDER_BUFFER_CHANGE_DELETE;
 	change->origin_id = XLogRecGetOrigin(r);
 
 	memcpy(&change->data.tp.relnode, &target_node, sizeof(RelFileNode));
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 2d9e127..4940cf5 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -520,6 +520,7 @@ ReorderBufferReturnChange(ReorderBuffer *rb, ReorderBufferChange *change,
 			}
 			break;
 		case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM:
+		case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT:
 		case REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID:
 		case REORDER_BUFFER_CHANGE_INTERNAL_TUPLECID:
 			break;
@@ -2254,6 +2255,36 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
 					specinsert = change;
 					break;
 
+				case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT:
+
+					/*
+					 * Abort for speculative insertion arrived.  So cleanup the
+					 * specinsert tuple and toast hash.  If spec insert change
+					 * is NULL then do nothing, this is possible because we
+					 * have spec abort for each toast entry.  So we just have
+					 * to clean the specinsert and toast hash for the first
+					 * spec abort for the main table and for the remaining
+					 * entries we can just ignore.
+					 *
+					 * XXX For optimization, we may log a flag saying this is
+					 * a spec abort for the toast table and we can avoid queuing
+					 * that change.
+					 */
+					if (specinsert != NULL)
+					{
+						/* Clear the toast chunk */
+						Assert(change->data.tp.clear_toast_afterwards);
+						ReorderBufferToastReset(rb, txn);
+
+						/*
+						* If the speculative insertion was aborted, the record
+						* isn't needed anymore.
+						*/
+						ReorderBufferReturnChange(rb, specinsert, true);
+						specinsert = NULL;
+					}
+					break;
+
 				case REORDER_BUFFER_CHANGE_TRUNCATE:
 					{
 						int			i;
@@ -3754,6 +3785,7 @@ ReorderBufferSerializeChange(ReorderBuffer *rb, ReorderBufferTXN *txn,
 				break;
 			}
 		case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM:
+		case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT:
 		case REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID:
 		case REORDER_BUFFER_CHANGE_INTERNAL_TUPLECID:
 			/* ReorderBufferChange contains everything important */
@@ -4017,6 +4049,7 @@ ReorderBufferChangeSize(ReorderBufferChange *change)
 				break;
 			}
 		case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM:
+		case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT:
 		case REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID:
 		case REORDER_BUFFER_CHANGE_INTERNAL_TUPLECID:
 			/* ReorderBufferChange contains everything important */
@@ -4315,6 +4348,7 @@ ReorderBufferRestoreChange(ReorderBuffer *rb, ReorderBufferTXN *txn,
 				break;
 			}
 		case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM:
+		case REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT:
 		case REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID:
 		case REORDER_BUFFER_CHANGE_INTERNAL_TUPLECID:
 			break;
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 0c6e9d1..9ff0986 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -63,6 +63,7 @@ enum ReorderBufferChangeType
 	REORDER_BUFFER_CHANGE_INTERNAL_TUPLECID,
 	REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT,
 	REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
+	REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
 	REORDER_BUFFER_CHANGE_TRUNCATE
 };
 
-- 
1.8.3.1

From 68c672ec05b1867e3b170435f929321e43d05020 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Wed, 2 Jun 2021 11:16:34 +0530
Subject: [PATCH v3] Bug fix in case of leftover hash after spec abort

Basically, if the toast table insertion for spec insert followup
by spec abort then the toast hash was not cleaned up immeditely
and there was some assert and error based on that situation so
just ignore them.
---
 src/backend/replication/logical/reorderbuffer.c | 32 +++++++++++++++++--------
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 2d9e127..5c20e13 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -437,6 +437,9 @@ ReorderBufferReturnTXN(ReorderBuffer *rb, ReorderBufferTXN *txn)
 		txn->tuplecid_hash = NULL;
 	}
 
+	/* cleanup the toast hash */
+	ReorderBufferToastReset(rb, txn);
+
 	if (txn->invalidations)
 	{
 		pfree(txn->invalidations);
@@ -4583,6 +4586,25 @@ ReorderBufferToastReplace(ReorderBuffer *rb, ReorderBufferTXN *txn,
 		return;
 
 	/*
+	 * If the current change is not a INSERT or UPDATE then it might be the
+	 * leftover toast hash from previous spec insert without spec confirm.  So
+	 * we can just ignore it.
+	 */
+	if (change->data.tp.newtuple == NULL)
+		return;
+
+	desc = RelationGetDescr(relation);
+
+	/*
+	 * If the current relation doesn't have a toast relation then it might be
+	 * the leftover toast hash from previous spec insert without spec confirm.
+	 * So we can just ignore it.
+	 */
+	toast_rel = RelationIdGetRelation(relation->rd_rel->reltoastrelid);
+	if (!RelationIsValid(toast_rel))
+		return;
+
+	/*
 	 * We're going to modify the size of the change, so to make sure the
 	 * accounting is correct we'll make it look like we're removing the change
 	 * now (with the old size), and then re-add it at the end.
@@ -4591,16 +4613,6 @@ ReorderBufferToastReplace(ReorderBuffer *rb, ReorderBufferTXN *txn,
 
 	oldcontext = MemoryContextSwitchTo(rb->context);
 
-	/* we should only have toast tuples in an INSERT or UPDATE */
-	Assert(change->data.tp.newtuple);
-
-	desc = RelationGetDescr(relation);
-
-	toast_rel = RelationIdGetRelation(relation->rd_rel->reltoastrelid);
-	if (!RelationIsValid(toast_rel))
-		elog(ERROR, "could not open relation with OID %u",
-			 relation->rd_rel->reltoastrelid);
-
 	toast_desc = RelationGetDescr(toast_rel);
 
 	/* should we allocate from stack instead? */
-- 
1.8.3.1

Reply via email to