Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2021-11-23 Thread Bossart, Nathan
The archives seem unhappy with my attempt to revive this old thread,
so here is a link to it in case anyone is looking for more context:


https://www.postgresql.org/message-id/flat/3476708e-7919-20cb-ca45-6603470565f7%40amazon.com

Nathan



Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2021-11-23 Thread Mark Dilger



> On Nov 23, 2021, at 4:26 PM, Bossart, Nathan  wrote:
> 
> I've finally gotten started on this, and I've attached a work-in-
> progress patch to gather some early feedback.  I'm specifically
> wondering if there are other places it'd be good to check for these
> unsupported combinations and whether we should use the
> HEAP_XMAX_IS_LOCKED_ONLY macro or just look for the
> HEAP_XMAX_LOCK_ONLY bit. 

I have to wonder if, when corruption is reported for conditions like this:

+   if ((ctx->tuphdr->t_infomask & HEAP_XMAX_COMMITTED) &&
+   HEAP_XMAX_IS_LOCKED_ONLY(ctx->tuphdr->t_infomask))

if the first thing we're going to want to know is which branch of the 
HEAP_XMAX_IS_LOCKED_ONLY macro evaluated true?  Should we split this check to 
do each branch of the macro separately, such as:

if (ctx->tuphdr->t_infomask & HEAP_XMAX_COMMITTED)
{   
if (ctx->tuphdr->t_infomask & HEAP_XMAX_LOCK_ONLY)
... report something ...
else if ((ctx->tuphdr->t_infomask & (HEAP_XMAX_IS_MULTI | HEAP_LOCK_MASK)) 
== HEAP_XMAX_EXCL_LOCK)
... report a different thing ...
}

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2021-11-23 Thread Bossart, Nathan
On 11/23/21, 4:36 PM, "Mark Dilger"  wrote:
> I have to wonder if, when corruption is reported for conditions like this:
>
> +   if ((ctx->tuphdr->t_infomask & HEAP_XMAX_COMMITTED) &&
> +   HEAP_XMAX_IS_LOCKED_ONLY(ctx->tuphdr->t_infomask))
>
> if the first thing we're going to want to know is which branch of the 
> HEAP_XMAX_IS_LOCKED_ONLY macro evaluated true?  Should we split this check to 
> do each branch of the macro separately, such as:
>
> if (ctx->tuphdr->t_infomask & HEAP_XMAX_COMMITTED)
> {
> if (ctx->tuphdr->t_infomask & HEAP_XMAX_LOCK_ONLY)
> ... report something ...
> else if ((ctx->tuphdr->t_infomask & (HEAP_XMAX_IS_MULTI | 
> HEAP_LOCK_MASK)) == HEAP_XMAX_EXCL_LOCK)
> ... report a different thing ...
> }

This is a good point.  Right now, you'd have to manually inspect the
infomask field to determine the exact form of the invalid combination.
My only worry with this is that we'd want to make sure these checks
stayed consistent with the definition of HEAP_XMAX_IS_LOCKED_ONLY in
htup_details.h.  I'm guessing HEAP_XMAX_IS_LOCKED_ONLY is unlikely to
change all that often, though.

Nathan



Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2021-11-23 Thread Mark Dilger



> On Nov 23, 2021, at 4:51 PM, Bossart, Nathan  wrote:
> 
> This is a good point.  Right now, you'd have to manually inspect the
> infomask field to determine the exact form of the invalid combination.
> My only worry with this is that we'd want to make sure these checks
> stayed consistent with the definition of HEAP_XMAX_IS_LOCKED_ONLY in
> htup_details.h.  I'm guessing HEAP_XMAX_IS_LOCKED_ONLY is unlikely to
> change all that often, though.

I expect that your patch will contain some addition to the amcheck (or 
pg_amcheck) tests, so if we ever allow some currently disallowed bit 
combination, we'd be reminded of the need to update amcheck.  So I'm not too 
worried about that aspect of this.

I prefer not to have to get a page (or full file) from a customer when the 
check reports corruption.  Even assuming they are comfortable giving that, 
which they may not be, it is an extra round trip to the customer asking for 
stuff.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2021-11-24 Thread Bossart, Nathan
On 11/23/21, 4:59 PM, "Mark Dilger"  wrote:
>> On Nov 23, 2021, at 4:51 PM, Bossart, Nathan  wrote:
>>
>> This is a good point.  Right now, you'd have to manually inspect the
>> infomask field to determine the exact form of the invalid combination.
>> My only worry with this is that we'd want to make sure these checks
>> stayed consistent with the definition of HEAP_XMAX_IS_LOCKED_ONLY in
>> htup_details.h.  I'm guessing HEAP_XMAX_IS_LOCKED_ONLY is unlikely to
>> change all that often, though.
>
> I expect that your patch will contain some addition to the amcheck (or 
> pg_amcheck) tests, so if we ever allow some currently disallowed bit 
> combination, we'd be reminded of the need to update amcheck.  So I'm not too 
> worried about that aspect of this.
>
> I prefer not to have to get a page (or full file) from a customer when the 
> check reports corruption.  Even assuming they are comfortable giving that, 
> which they may not be, it is an extra round trip to the customer asking for 
> stuff.

Another option we might consider is only checking for the
HEAP_XMAX_LOCK_ONLY bit instead of everything in
HEAP_XMAX_IS_LOCKED_ONLY.  IIUC everything else is only expected to
happen for upgrades from v9.2 or earlier, so it might be pretty rare
at this point.  Otherwise, I'll extract the exact bit pattern for the
error message as you suggest.

Nathan



Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2021-11-25 Thread Mark Dilger



> On Nov 24, 2021, at 12:53 PM, Bossart, Nathan  wrote:
> 
> Another option we might consider is only checking for the
> HEAP_XMAX_LOCK_ONLY bit instead of everything in
> HEAP_XMAX_IS_LOCKED_ONLY.  IIUC everything else is only expected to
> happen for upgrades from v9.2 or earlier, so it might be pretty rare
> at this point.  Otherwise, I'll extract the exact bit pattern for the
> error message as you suggest.

I would prefer to detect and report any "can't happen" bit patterns without 
regard for how likely the pattern may be.  The difficulty is in proving that a 
bit pattern is disallowed.  Just because you can't find a code path in the 
current code base that would create a pattern doesn't mean it won't have 
legitimately been created by some past release or upgrade path.  As such, any 
prohibitions explicitly in the backend, such as Asserts around a condition, are 
really valuable.  You can know that the pattern is disallowed, since the server 
would Assert on it if encountered.

Aside from that, I don't really buy the argument that databases upgraded from 
v9.2 or earlier are rare.  Even if servers *running* v9.2 or earlier are (or 
become) rare, servers initialized that far back which have been upgraded one or 
more times since then may be common.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2021-11-29 Thread Bossart, Nathan
On 11/25/21, 9:16 AM, "Mark Dilger"  wrote:
>> On Nov 24, 2021, at 12:53 PM, Bossart, Nathan  wrote:
>>
>> Another option we might consider is only checking for the
>> HEAP_XMAX_LOCK_ONLY bit instead of everything in
>> HEAP_XMAX_IS_LOCKED_ONLY.  IIUC everything else is only expected to
>> happen for upgrades from v9.2 or earlier, so it might be pretty rare
>> at this point.  Otherwise, I'll extract the exact bit pattern for the
>> error message as you suggest.
>
>I would prefer to detect and report any "can't happen" bit patterns without 
>regard for how likely the pattern may be.  The difficulty is in proving that a 
>bit pattern is disallowed.  Just because you can't find a code path in the 
>current code base that would create a pattern doesn't mean it won't have 
>legitimately been created by some past release or upgrade path.  As such, any 
>prohibitions explicitly in the backend, such as Asserts around a condition, 
>are really valuable.  You can know that the pattern is disallowed, since the 
>server would Assert on it if encountered.
>
> Aside from that, I don't really buy the argument that databases upgraded from 
> v9.2 or earlier are rare.  Even if servers *running* v9.2 or earlier are (or 
> become) rare, servers initialized that far back which have been upgraded one 
> or more times since then may be common.

Okay, I'll do it that way in the next revision.

Nathan



Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2021-12-21 Thread Mark Dilger



> On Dec 1, 2021, at 10:59 AM, Bossart, Nathan  wrote:
> 
> here is a v3

It took a while for me to get to this

@@ -1304,33 +1370,46 @@ HeapTupleSatisfiesVacuumHorizon(HeapTuple htup, Buffer 
buffer, TransactionId *de

if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask))
{
+   if (unlikely(tuple->t_infomask & HEAP_XMAX_COMMITTED))
+   {
+   if (tuple->t_infomask & HEAP_XMAX_LOCK_ONLY)
+   ereport(ERROR,
+   (errcode(ERRCODE_DATA_CORRUPTED),
+errmsg_internal("found tuple with HEAP_XMAX_COMMITTED "
+"and HEAP_XMAX_LOCK_ONLY")));
+
+   /* pre-v9.3 lock-only bit pattern */
+   ereport(ERROR,
+   (errcode(ERRCODE_DATA_CORRUPTED),
+errmsg_internal("found tuple with HEAP_XMAX_COMMITTED and"
+"HEAP_XMAX_EXCL_LOCK set and "
+"HEAP_XMAX_IS_MULTI unset")));
+   }
+

I find this bit hard to understand.  Does the comment mean to suggest that the 
*upgrade* process should have eliminated all pre-v9.3 bit patterns, and 
therefore any such existing patterns are certainly corruption, or does it mean 
that data written by pre-v9.3 servers (and not subsequently updated) is defined 
as corrupt, or  ?

I am not complaining that the logic is wrong, just trying to wrap my head 
around what the comment means.


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2022-03-22 Thread Nathan Bossart
On Thu, Mar 17, 2022 at 04:45:28PM -0700, Nathan Bossart wrote:
> I think this one requires some more work, and it needn't be a priority for
> v15, so I've adjusted the commitfest entry to v16 and moved it to the next
> commitfest.

Here is a new patch.  The main differences from v3 are in
heapam_visibility.c.  Specifically, instead of trying to work the infomask
checks into the visibility logic, I added a new function that does a couple
of assertions.  This function is called at the beginning of each visibility
function.

What do folks think?  The options I've considered are 1) not adding any
such checks to heapam_visibility.c, 2) only adding assertions like the
attached patch, or 3) actually using elog(ERROR, ...) when the invalid bit
patterns are detected.  AFAICT (1) is more in line with existing invalid
bit patterns (e.g., XMAX_COMMITTED + XMAX_IS_MULTI).  There are a couple of
scattered assertions, but most code paths don't check for it.  (2) adds
additional checks, but only for --enable-cassert builds.  (3) would add
checks even for non-assert builds, but there would presumably be some
performance cost involved.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 2d6b372cf61782e0fd52590b57b1c914b0ed7a4c Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 22 Mar 2022 15:35:34 -0700
Subject: [PATCH v4 1/1] disallow XMAX_COMMITTED + XMAX_LOCK_ONLY

---
 contrib/amcheck/verify_heapam.c | 19 
 src/backend/access/heap/README.tuplock  |  2 +-
 src/backend/access/heap/heapam.c| 10 +++
 src/backend/access/heap/heapam_visibility.c | 97 ++---
 src/backend/access/heap/hio.c   |  2 +
 5 files changed, 96 insertions(+), 34 deletions(-)

diff --git a/contrib/amcheck/verify_heapam.c b/contrib/amcheck/verify_heapam.c
index e5f7355dcb..f30b9c234f 100644
--- a/contrib/amcheck/verify_heapam.c
+++ b/contrib/amcheck/verify_heapam.c
@@ -665,6 +665,25 @@ check_tuple_header(HeapCheckContext *ctx)
 		 */
 	}
 
+	if ((ctx->tuphdr->t_infomask & HEAP_XMAX_COMMITTED) &&
+		(ctx->tuphdr->t_infomask & HEAP_XMAX_LOCK_ONLY))
+	{
+		report_corruption(ctx,
+		  pstrdup("locked-only should not be marked committed"));
+
+		/* As above, do not skip further checks. */
+	}
+
+	/* also check for pre-v9.3 lock-only bit pattern */
+	if ((ctx->tuphdr->t_infomask & HEAP_XMAX_COMMITTED) &&
+		HEAP_XMAX_IS_LOCKED_ONLY(ctx->tuphdr->t_infomask))
+	{
+		report_corruption(ctx,
+		  pstrdup("tuple with HEAP_XMAX_EXCL_LOCK set and HEAP_XMAX_IS_MULTI unset should not be marked committed"));
+
+		/* As above, do not skip further checks. */
+	}
+
 	if (infomask & HEAP_HASNULL)
 		expected_hoff = MAXALIGN(SizeofHeapTupleHeader + BITMAPLEN(ctx->natts));
 	else
diff --git a/src/backend/access/heap/README.tuplock b/src/backend/access/heap/README.tuplock
index 6441e8baf0..4e565e60ab 100644
--- a/src/backend/access/heap/README.tuplock
+++ b/src/backend/access/heap/README.tuplock
@@ -152,4 +152,4 @@ The following infomask bits are applicable:
   whether the XMAX is a TransactionId or a MultiXactId.
 
 We currently never set the HEAP_XMAX_COMMITTED when the HEAP_XMAX_IS_MULTI bit
-is set.
+or the HEAP_XMAX_LOCK_ONLY bit is set.
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 3746336a09..3f0b4cd754 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -5151,6 +5151,16 @@ l5:
 		MultiXactStatus status;
 		MultiXactStatus new_status;
 
+		/*
+		 * Currently we don't allow XMAX_LOCK_ONLY and XMAX_COMMITTED to both be
+		 * set in a tuple header, so cross-check.
+		 *
+		 * Note that this also checks for the special locked-only bit pattern
+		 * that may be present on servers that were pg_upgraded from pre-v9.3
+		 * versions.
+		 */
+		Assert(!HEAP_XMAX_IS_LOCKED_ONLY(old_infomask));
+
 		if (old_infomask2 & HEAP_KEYS_UPDATED)
 			status = MultiXactStatusUpdate;
 		else
diff --git a/src/backend/access/heap/heapam_visibility.c b/src/backend/access/heap/heapam_visibility.c
index ff0b8a688d..7d6fb66b0d 100644
--- a/src/backend/access/heap/heapam_visibility.c
+++ b/src/backend/access/heap/heapam_visibility.c
@@ -78,6 +78,31 @@
 #include "utils/snapmgr.h"
 
 
+/*
+ * InfomaskAssertions()
+ *
+ * Checks for invalid infomask bit patterns.
+ */
+static inline void
+InfomaskAssertions(HeapTupleHeader tuple)
+{
+	/*
+	 * XMAX_COMMITTED and XMAX_LOCK_ONLY cannot both be set in a tuple header.
+	 *
+	 * Note that this also checks for the special locked-only bit pattern that
+	 * may be present on servers that were pg_upgraded from pre-v9.3 versions.
+	 */
+	Assert(!((tuple->t_infomask & HEAP_XMAX_COMMITTED) &&
+			 HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask)));
+
+	/*
+	 * XMAX_COMMITTED and XMAX_IS_MULTI cannot be be set in a tuple header.
+	 */
+	Assert(!((tuple->t_infomask & HEAP_XMAX_COMMITTED) &&
+			 (tuple->t_infomask & HEAP_XMAX_IS_MULTI)));
+}
+
+
 /*
  * SetHintBits()
  *
@@ -113,6 +138,8 @@ static inline v

Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2022-03-22 Thread Andres Freund
Hi,

On 2022-03-22 16:06:40 -0700, Nathan Bossart wrote:
> On Thu, Mar 17, 2022 at 04:45:28PM -0700, Nathan Bossart wrote:
> > I think this one requires some more work, and it needn't be a priority for
> > v15, so I've adjusted the commitfest entry to v16 and moved it to the next
> > commitfest.
> 
> Here is a new patch.  The main differences from v3 are in
> heapam_visibility.c.  Specifically, instead of trying to work the infomask
> checks into the visibility logic, I added a new function that does a couple
> of assertions.  This function is called at the beginning of each visibility
> function.
> 
> What do folks think?  The options I've considered are 1) not adding any
> such checks to heapam_visibility.c, 2) only adding assertions like the
> attached patch, or 3) actually using elog(ERROR, ...) when the invalid bit
> patterns are detected.  AFAICT (1) is more in line with existing invalid
> bit patterns (e.g., XMAX_COMMITTED + XMAX_IS_MULTI).  There are a couple of
> scattered assertions, but most code paths don't check for it.  (2) adds
> additional checks, but only for --enable-cassert builds.  (3) would add
> checks even for non-assert builds, but there would presumably be some
> performance cost involved.

> From 2d6b372cf61782e0fd52590b57b1c914b0ed7a4c Mon Sep 17 00:00:00 2001
> From: Nathan Bossart 
> Date: Tue, 22 Mar 2022 15:35:34 -0700
> Subject: [PATCH v4 1/1] disallow XMAX_COMMITTED + XMAX_LOCK_ONLY

Just skimming this thread quickly, I really have no idea what this is trying
to achieve and the commit message doesn't help either...  I didn't read the
referenced thread, but I shouldn't have to, to get a basic idea.

Greetings,

Andres Freund




Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2022-03-22 Thread Nathan Bossart
On Tue, Mar 22, 2022 at 04:13:47PM -0700, Andres Freund wrote:
> Just skimming this thread quickly, I really have no idea what this is trying
> to achieve and the commit message doesn't help either...  I didn't read the
> referenced thread, but I shouldn't have to, to get a basic idea.

Ah, my bad.  I should've made sure the context was carried over better.  I
updated the commit message with some basic information about the intent.
Please let me know if there is anything else that needs to be cleared up.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 739ec6e42183280d57d867157cfe1b37d127ca54 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 22 Mar 2022 15:35:34 -0700
Subject: [PATCH v5 1/1] Disallow setting XMAX_COMMITTED and XMAX_LOCK_ONLY
 together.

Even though Postgres doesn't set both the XMAX_COMMITTED and
XMAX_LOCK_ONLY infomask bits on the same tuple header, the heap
visibility logic still accepts it.  However, other functions like
compute_new_xmax_infomask() seem to assume that this bit pattern
is not possible.

This change marks this bit pattern as disallowed, removes the heap
visibility logic that handles it, and adds checks like those for
other disallowed infomask bit combinations (e.g., XMAX_COMMITTED
and XMAX_IS_MULTI).  Besides simplifying the heap visibility logic
a bit, this change aims to reduce ambiguity about the legal tuple
header states.

Note that this change also disallows XMAX_COMMITTED together with
the special pre-v9.3 locked-only bit pattern that
HEAP_XMAX_IS_LOCKED_ONLY checks for.  This locked-only bit pattern
may still be present on servers pg_upgraded from pre-v9.3 versions.
---
 contrib/amcheck/verify_heapam.c | 19 
 src/backend/access/heap/README.tuplock  |  2 +-
 src/backend/access/heap/heapam.c| 10 +++
 src/backend/access/heap/heapam_visibility.c | 97 ++---
 src/backend/access/heap/hio.c   |  2 +
 5 files changed, 96 insertions(+), 34 deletions(-)

diff --git a/contrib/amcheck/verify_heapam.c b/contrib/amcheck/verify_heapam.c
index e5f7355dcb..f30b9c234f 100644
--- a/contrib/amcheck/verify_heapam.c
+++ b/contrib/amcheck/verify_heapam.c
@@ -665,6 +665,25 @@ check_tuple_header(HeapCheckContext *ctx)
 		 */
 	}
 
+	if ((ctx->tuphdr->t_infomask & HEAP_XMAX_COMMITTED) &&
+		(ctx->tuphdr->t_infomask & HEAP_XMAX_LOCK_ONLY))
+	{
+		report_corruption(ctx,
+		  pstrdup("locked-only should not be marked committed"));
+
+		/* As above, do not skip further checks. */
+	}
+
+	/* also check for pre-v9.3 lock-only bit pattern */
+	if ((ctx->tuphdr->t_infomask & HEAP_XMAX_COMMITTED) &&
+		HEAP_XMAX_IS_LOCKED_ONLY(ctx->tuphdr->t_infomask))
+	{
+		report_corruption(ctx,
+		  pstrdup("tuple with HEAP_XMAX_EXCL_LOCK set and HEAP_XMAX_IS_MULTI unset should not be marked committed"));
+
+		/* As above, do not skip further checks. */
+	}
+
 	if (infomask & HEAP_HASNULL)
 		expected_hoff = MAXALIGN(SizeofHeapTupleHeader + BITMAPLEN(ctx->natts));
 	else
diff --git a/src/backend/access/heap/README.tuplock b/src/backend/access/heap/README.tuplock
index 6441e8baf0..4e565e60ab 100644
--- a/src/backend/access/heap/README.tuplock
+++ b/src/backend/access/heap/README.tuplock
@@ -152,4 +152,4 @@ The following infomask bits are applicable:
   whether the XMAX is a TransactionId or a MultiXactId.
 
 We currently never set the HEAP_XMAX_COMMITTED when the HEAP_XMAX_IS_MULTI bit
-is set.
+or the HEAP_XMAX_LOCK_ONLY bit is set.
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 3746336a09..3f0b4cd754 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -5151,6 +5151,16 @@ l5:
 		MultiXactStatus status;
 		MultiXactStatus new_status;
 
+		/*
+		 * Currently we don't allow XMAX_LOCK_ONLY and XMAX_COMMITTED to both be
+		 * set in a tuple header, so cross-check.
+		 *
+		 * Note that this also checks for the special locked-only bit pattern
+		 * that may be present on servers that were pg_upgraded from pre-v9.3
+		 * versions.
+		 */
+		Assert(!HEAP_XMAX_IS_LOCKED_ONLY(old_infomask));
+
 		if (old_infomask2 & HEAP_KEYS_UPDATED)
 			status = MultiXactStatusUpdate;
 		else
diff --git a/src/backend/access/heap/heapam_visibility.c b/src/backend/access/heap/heapam_visibility.c
index ff0b8a688d..7d6fb66b0d 100644
--- a/src/backend/access/heap/heapam_visibility.c
+++ b/src/backend/access/heap/heapam_visibility.c
@@ -78,6 +78,31 @@
 #include "utils/snapmgr.h"
 
 
+/*
+ * InfomaskAssertions()
+ *
+ * Checks for invalid infomask bit patterns.
+ */
+static inline void
+InfomaskAssertions(HeapTupleHeader tuple)
+{
+	/*
+	 * XMAX_COMMITTED and XMAX_LOCK_ONLY cannot both be set in a tuple header.
+	 *
+	 * Note that this also checks for the special locked-only bit pattern that
+	 * may be present on servers that were pg_upgraded from pre-v9.3 versions.
+	 */
+	Assert(!((tuple->t_infomask & HEAP_XMAX_COMMITTED) &&
+			 HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask

Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2022-10-20 Thread Corey Huinker
On Tue, Sep 20, 2022 at 2:32 PM Nathan Bossart 
wrote:

> Here is a rebased patch for cfbot.
>
>
>
Applies, passes make check world.

Patch is straightforward, but the previous code is less so. It purported to
set XMAX_COMMITTED _or_ XMAX_INVALID, but never seemed to un-set
XMAX_COMMITTED, was that the source of the double-setting?


Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2023-02-02 Thread Andres Freund
Hi,

On 2022-09-20 11:32:02 -0700, Nathan Bossart wrote:
> Note that this change also disallows XMAX_COMMITTED together with
> the special pre-v9.3 locked-only bit pattern that
> HEAP_XMAX_IS_LOCKED_ONLY checks for.  This locked-only bit pattern
> may still be present on servers pg_upgraded from pre-v9.3 versions.

Given that fact, that aspect at least seems to be not viable?

Greetings,

Andres Freund




Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2023-02-02 Thread Nathan Bossart
On Thu, Feb 02, 2023 at 06:59:51AM -0800, Andres Freund wrote:
> On 2022-09-20 11:32:02 -0700, Nathan Bossart wrote:
>> Note that this change also disallows XMAX_COMMITTED together with
>> the special pre-v9.3 locked-only bit pattern that
>> HEAP_XMAX_IS_LOCKED_ONLY checks for.  This locked-only bit pattern
>> may still be present on servers pg_upgraded from pre-v9.3 versions.
> 
> Given that fact, that aspect at least seems to be not viable?

AFAICT from looking at the v9.2 code, the same idea holds true for this
special bit pattern.  I only see HEAP_XMAX_INVALID set when one of the
infomask lock bits is set, and those bits now correspond to
HEAP_XMAX_LOCK_ONLY and HEAP_XMAX_EXCL_LOCK (which are both covered by the
HEAP_XMAX_IS_LOCKED_ONLY macro).  Of course, I could be missing something.
Do you think we should limit this to the v9.3+ bit pattern?

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2022-01-06 Thread Bossart, Nathan
On 12/21/21, 11:42 AM, "Mark Dilger"  wrote:
> +   /* pre-v9.3 lock-only bit pattern */
> +   ereport(ERROR,
> +   (errcode(ERRCODE_DATA_CORRUPTED),
> +errmsg_internal("found tuple with HEAP_XMAX_COMMITTED 
> and"
> +"HEAP_XMAX_EXCL_LOCK set and "
> +"HEAP_XMAX_IS_MULTI unset")));
> +   }
> +
>
> I find this bit hard to understand.  Does the comment mean to suggest that 
> the *upgrade* process should have eliminated all pre-v9.3 bit patterns, and 
> therefore any such existing patterns are certainly corruption, or does it 
> mean that data written by pre-v9.3 servers (and not subsequently updated) is 
> defined as corrupt, or  ?
>
> I am not complaining that the logic is wrong, just trying to wrap my head 
> around what the comment means.

This is just another way that a tuple may be marked locked-only, and
we want to explicitly disallow locked-only + xmax-committed.  This bit
pattern may be present on servers that were pg_upgraded from pre-v9.3
versions.  See commits 0ac5ad5 and 74ebba8 for more detail.

Nathan



Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2020-08-26 Thread Alvaro Herrera
On 2020-Aug-20, Jeremy Schneider wrote:

> While working with Nathan Bossart on an extension, we came across an
> interesting quirk and possible inconsistency in the PostgreSQL code
> around infomask flags.  I'd like to know if there's something I'm
> misunderstanding here or if this really is a correctness/robustness gap
> in the code.
> 
> At the root of it is the relationship between the XMAX_LOCK_ONLY and
> XMAX_COMMITTED infomask bits.

I spent a lot of time wondering about XMAX_COMMITTED.  It seemed to me
that it would make no sense to have xacts that were lock-only yet have
XMAX_COMMITTED set; so I looked hard to make sure no place would ever
cause such a situation to arise.  However, I still made my best effort
to make the code cope with such a combination correctly if it ever
showed up.

And I may have missed assumptions such as this one, even if they seem
obvious in retrospect, such as in compute_new_xmax_infomask (which, as
you'll notice, changed considerably from what was initially committed):

> But then there's one place in heapam.c where an assumption appears that
> this state will never happen - the compute_new_xmax_infomask() function:
> 
> https://github.com/postgres/postgres/blob/9168793d7275b4b318c153d607fba55d14098c19/src/backend/access/heap/heapam.c#L4800
> 
> else if (old_infomask & HEAP_XMAX_COMMITTED)
> {
> ...
> if (old_infomask2 & HEAP_KEYS_UPDATED)
> status = MultiXactStatusUpdate;
> else
> status = MultiXactStatusNoKeyUpdate;
> new_status = get_mxact_status_for_lock(mode, is_update);
> ...
> new_xmax = MultiXactIdCreate(xmax, status, add_to_xmax, new_status);
> 
> When that code sees XMAX_COMMITTED, it assumes the xmax can't possibly
> be LOCK_ONLY and it sets the status to something sufficiently high
> enough to guarantee that ISUPDATE_from_mxstatus() returns true. That
> means that when you try to update this tuple and
> compute_new_xmax_infomask() is called with an "update" status, two
> "update" statuses are sent to MultiXactIdCreate() which then fails
> (thank goodness) with the error "new multixact has more than one
> updating member".
> 
> https://github.com/postgres/postgres/blob/cd5e82256de5895595cdd99ecb03aea15b346f71/src/backend/access/transam/multixact.c#L784

Have you ever observed this error case hit?  I've never seen a report of
that.

> The UpdateXmaxHintBits() code to always set the INVALID bit wasn't in
> any patches on the mailing list but it was committed and it seems to
> have worked very well. As a result it seems nearly impossible to get
> into the state where you have both XMAX_LOCK_ONLY and XMAX_COMMITTED
> bits set; thus it seems we've avoided problems in
> compute_new_xmax_infomask().

Phew.

(I guess I should fully expect that somebody, given sufficient time,
would carefully inspect the committed code against submitted patches ...
especially given that I do such inspections myself.)

> But nonetheless it seems worth making the code more robust by having the
> compute_new_xmax_infomask() function handle this state correctly just as
> the visibility code does. It should only require a simple patch like
> this (credit to Nathan Bossart):
> 
> diff --git a/src/backend/access/heap/heapam.c
> b/src/backend/access/heap/heapam.c
> index d881f4cd46..371e3e4f61 100644
> --- a/src/backend/access/heap/heapam.c
> +++ b/src/backend/access/heap/heapam.c
> @@ -4695,7 +4695,9 @@ compute_new_xmax_infomask(TransactionId xmax,
> uint16 old_infomask,
>  l5:
> new_infomask = 0;
> new_infomask2 = 0;
> -   if (old_infomask & HEAP_XMAX_INVALID)
> +   if (old_infomask & HEAP_XMAX_INVALID ||
> +   (old_infomask & HEAP_XMAX_COMMITTED &&
> +HEAP_XMAX_IS_LOCKED_ONLY(old_infomask)))
> {
> /*
>  * No previous locker; we just insert our own TransactionId.

We could do this in stable branches, if there were any reports that
this inconsistency is happening in real world databases.

> Alternatively, if we don't want to take this approach, then I'd argue
> that we should update README.tuplock to explicitly state that
> XMAX_LOCK_ONLY and XMAX_COMMITTED are incompatible (just as it already
> states for HEAP_XMAX_IS_MULTI and HEAP_XMAX_COMMITTED) and clean up the
> code in heapam_visibility.c for consistency.

Yeah, I like this approach better for the master branch; not just clean
up as in remove the cases that handle it, but also actively elog(ERROR)
if the condition ever occurs (hopefully with some known way to fix the
problem; maybe by "WITH tup AS (DELETE FROM tab WHERE .. RETURNING *)
INSERT * INTO tab FROM tup" or similar.)

> Might be worth adding a note to README.tuplock either way about
> valid/invalid combinations of infomask flags. Might help avoid some
> confusion as people approach the code base.

Absolutely.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Suppor

Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2020-08-26 Thread Bossart, Nathan
On 8/26/20, 12:16 PM, "Alvaro Herrera"  wrote:
> On 2020-Aug-20, Jeremy Schneider wrote:
>> Alternatively, if we don't want to take this approach, then I'd argue
>> that we should update README.tuplock to explicitly state that
>> XMAX_LOCK_ONLY and XMAX_COMMITTED are incompatible (just as it already
>> states for HEAP_XMAX_IS_MULTI and HEAP_XMAX_COMMITTED) and clean up the
>> code in heapam_visibility.c for consistency.
>
> Yeah, I like this approach better for the master branch; not just clean
> up as in remove the cases that handle it, but also actively elog(ERROR)
> if the condition ever occurs (hopefully with some known way to fix the
> problem; maybe by "WITH tup AS (DELETE FROM tab WHERE .. RETURNING *)
> INSERT * INTO tab FROM tup" or similar.)

+1.  I wouldn't mind picking this up, but it might be some time before
I can get to it.

Nathan



Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2020-08-27 Thread Peter Geoghegan
On Wed, Aug 26, 2020 at 12:16 PM Alvaro Herrera
 wrote:
> We could do this in stable branches, if there were any reports that
> this inconsistency is happening in real world databases.

I hope that the new heapam amcheck stuff eventually leads to our
having total (or near total) certainty about what correct on-disk
states are possible, regardless of the exact pg_upgrade + minor
version paths. We should take a strict line on this stuff where
possible. If that turns out to be wrong in some detail, then it's
relatively easy to fix as a bug in amcheck itself.

There is a high cost to allowing ambiguity about what heapam states
are truly legal/possible. It makes future development projects harder.

-- 
Peter Geoghegan




Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2020-08-27 Thread Mark Dilger



> On Aug 27, 2020, at 4:47 PM, Peter Geoghegan  wrote:
> 
> On Wed, Aug 26, 2020 at 12:16 PM Alvaro Herrera
>  wrote:
>> We could do this in stable branches, if there were any reports that
>> this inconsistency is happening in real world databases.
> 
> I hope that the new heapam amcheck stuff eventually leads to our
> having total (or near total) certainty about what correct on-disk
> states are possible, regardless of the exact pg_upgrade + minor
> version paths. We should take a strict line on this stuff where
> possible. If that turns out to be wrong in some detail, then it's
> relatively easy to fix as a bug in amcheck itself.
> 
> There is a high cost to allowing ambiguity about what heapam states
> are truly legal/possible. It makes future development projects harder.

The amcheck patch has Asserts in hio.c that purport to disallow writing invalid 
header bits to disk.  The combination being discussed here is not disallowed, 
but if there is consensus that it is an illegal combination, we could certainly 
add it:

diff --git a/src/backend/access/heap/hio.c b/src/backend/access/heap/hio.c
index aa3f14c019..ca357410a2 100644
--- a/src/backend/access/heap/hio.c
+++ b/src/backend/access/heap/hio.c
@@ -47,6 +47,17 @@ RelationPutHeapTuple(Relation relation,
 */
Assert(!token || HeapTupleHeaderIsSpeculative(tuple->t_data));
 
+   /*
+* Do not allow tuples with invalid combinations of hint bits to be 
placed
+* on a page.  These combinations are detected as corruption by the
+* contrib/amcheck logic, so if you disable one or both of these
+* assertions, make corresponding changes there.
+*/
+   Assert(!((tuple->t_data->t_infomask & HEAP_XMAX_LOCK_ONLY) &&
+(tuple->t_data->t_infomask2 & HEAP_KEYS_UPDATED)));
+   Assert(!((tuple->t_data->t_infomask & HEAP_XMAX_COMMITTED) &&
+(tuple->t_data->t_infomask & HEAP_XMAX_IS_MULTI)));
+
/* Add the tuple to the page */
pageHeader = BufferGetPage(buffer);
 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2020-08-27 Thread Peter Geoghegan
On Thu, Aug 27, 2020 at 4:57 PM Mark Dilger
 wrote:
> The amcheck patch has Asserts in hio.c that purport to disallow writing 
> invalid header bits to disk.

Can it also be a runtime check for the verification process? I think
that we can easily afford to be fairly exhaustive about stuff like
this.

-- 
Peter Geoghegan




Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2020-08-27 Thread Mark Dilger



> On Aug 27, 2020, at 4:58 PM, Peter Geoghegan  wrote:
> 
> On Thu, Aug 27, 2020 at 4:57 PM Mark Dilger
>  wrote:
>> The amcheck patch has Asserts in hio.c that purport to disallow writing 
>> invalid header bits to disk.
> 
> Can it also be a runtime check for the verification process? I think
> that we can easily afford to be fairly exhaustive about stuff like
> this.

These two are both checked in verify_heapam.c.  The point is that the system 
will also refuse to write out pages that have this corruption.  The Asserts 
could be converted to panics or whatever, but that has other more serious 
consequences.  Did you have something else in mind?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2020-08-27 Thread Peter Geoghegan
On Thu, Aug 27, 2020 at 5:06 PM Mark Dilger
 wrote:
> These two are both checked in verify_heapam.c.  The point is that the system 
> will also refuse to write out pages that have this corruption.  The Asserts 
> could be converted to panics or whatever, but that has other more serious 
> consequences.  Did you have something else in mind?

Oh, I see -- you propose to add both an assert to hio.c, as well as a
check to amcheck itself. That seems like the right thing to do.

Thanks

-- 
Peter Geoghegan




Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2022-09-20 Thread Nathan Bossart
Here is a rebased patch for cfbot.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 9ae1f5409ddeee17b99a9565247da885cbb86be9 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 22 Mar 2022 15:35:34 -0700
Subject: [PATCH v6 1/1] Disallow setting XMAX_COMMITTED and XMAX_LOCK_ONLY
 together.

Even though Postgres doesn't set both the XMAX_COMMITTED and
XMAX_LOCK_ONLY infomask bits on the same tuple header, the heap
visibility logic still accepts it.  However, other functions like
compute_new_xmax_infomask() seem to assume that this bit pattern
is not possible.

This change marks this bit pattern as disallowed, removes the heap
visibility logic that handles it, and adds checks like those for
other disallowed infomask bit combinations (e.g., XMAX_COMMITTED
and XMAX_IS_MULTI).  Besides simplifying the heap visibility logic
a bit, this change aims to reduce ambiguity about the legal tuple
header states.

Note that this change also disallows XMAX_COMMITTED together with
the special pre-v9.3 locked-only bit pattern that
HEAP_XMAX_IS_LOCKED_ONLY checks for.  This locked-only bit pattern
may still be present on servers pg_upgraded from pre-v9.3 versions.
---
 contrib/amcheck/verify_heapam.c | 19 
 src/backend/access/heap/README.tuplock  |  2 +-
 src/backend/access/heap/heapam.c| 10 +++
 src/backend/access/heap/heapam_visibility.c | 97 ++---
 src/backend/access/heap/hio.c   |  2 +
 5 files changed, 96 insertions(+), 34 deletions(-)

diff --git a/contrib/amcheck/verify_heapam.c b/contrib/amcheck/verify_heapam.c
index d33f33f170..f74f88afc5 100644
--- a/contrib/amcheck/verify_heapam.c
+++ b/contrib/amcheck/verify_heapam.c
@@ -663,6 +663,25 @@ check_tuple_header(HeapCheckContext *ctx)
 		 */
 	}
 
+	if ((ctx->tuphdr->t_infomask & HEAP_XMAX_COMMITTED) &&
+		(ctx->tuphdr->t_infomask & HEAP_XMAX_LOCK_ONLY))
+	{
+		report_corruption(ctx,
+		  pstrdup("locked-only should not be marked committed"));
+
+		/* As above, do not skip further checks. */
+	}
+
+	/* also check for pre-v9.3 lock-only bit pattern */
+	if ((ctx->tuphdr->t_infomask & HEAP_XMAX_COMMITTED) &&
+		HEAP_XMAX_IS_LOCKED_ONLY(ctx->tuphdr->t_infomask))
+	{
+		report_corruption(ctx,
+		  pstrdup("tuple with HEAP_XMAX_EXCL_LOCK set and HEAP_XMAX_IS_MULTI unset should not be marked committed"));
+
+		/* As above, do not skip further checks. */
+	}
+
 	if (infomask & HEAP_HASNULL)
 		expected_hoff = MAXALIGN(SizeofHeapTupleHeader + BITMAPLEN(ctx->natts));
 	else
diff --git a/src/backend/access/heap/README.tuplock b/src/backend/access/heap/README.tuplock
index 6441e8baf0..4e565e60ab 100644
--- a/src/backend/access/heap/README.tuplock
+++ b/src/backend/access/heap/README.tuplock
@@ -152,4 +152,4 @@ The following infomask bits are applicable:
   whether the XMAX is a TransactionId or a MultiXactId.
 
 We currently never set the HEAP_XMAX_COMMITTED when the HEAP_XMAX_IS_MULTI bit
-is set.
+or the HEAP_XMAX_LOCK_ONLY bit is set.
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index eb811d751e..616df576c3 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -5115,6 +5115,16 @@ l5:
 		MultiXactStatus status;
 		MultiXactStatus new_status;
 
+		/*
+		 * Currently we don't allow XMAX_LOCK_ONLY and XMAX_COMMITTED to both be
+		 * set in a tuple header, so cross-check.
+		 *
+		 * Note that this also checks for the special locked-only bit pattern
+		 * that may be present on servers that were pg_upgraded from pre-v9.3
+		 * versions.
+		 */
+		Assert(!HEAP_XMAX_IS_LOCKED_ONLY(old_infomask));
+
 		if (old_infomask2 & HEAP_KEYS_UPDATED)
 			status = MultiXactStatusUpdate;
 		else
diff --git a/src/backend/access/heap/heapam_visibility.c b/src/backend/access/heap/heapam_visibility.c
index 6e33d1c881..e6ee8ff1fa 100644
--- a/src/backend/access/heap/heapam_visibility.c
+++ b/src/backend/access/heap/heapam_visibility.c
@@ -78,6 +78,31 @@
 #include "utils/snapmgr.h"
 
 
+/*
+ * InfomaskAssertions()
+ *
+ * Checks for invalid infomask bit patterns.
+ */
+static inline void
+InfomaskAssertions(HeapTupleHeader tuple)
+{
+	/*
+	 * XMAX_COMMITTED and XMAX_LOCK_ONLY cannot both be set in a tuple header.
+	 *
+	 * Note that this also checks for the special locked-only bit pattern that
+	 * may be present on servers that were pg_upgraded from pre-v9.3 versions.
+	 */
+	Assert(!((tuple->t_infomask & HEAP_XMAX_COMMITTED) &&
+			 HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask)));
+
+	/*
+	 * XMAX_COMMITTED and XMAX_IS_MULTI cannot be be set in a tuple header.
+	 */
+	Assert(!((tuple->t_infomask & HEAP_XMAX_COMMITTED) &&
+			 (tuple->t_infomask & HEAP_XMAX_IS_MULTI)));
+}
+
+
 /*
  * SetHintBits()
  *
@@ -113,6 +138,8 @@ static inline void
 SetHintBits(HeapTupleHeader tuple, Buffer buffer,
 			uint16 infomask, TransactionId xid)
 {
+	InfomaskAssertions(tuple);
+
 	if (TransactionIdIsValid(xid))
 	{
 		/* NB: xid must be known committed here! 

Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2022-03-17 Thread Nathan Bossart
I think this one requires some more work, and it needn't be a priority for
v15, so I've adjusted the commitfest entry to v16 and moved it to the next
commitfest.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com