Re: Why doesn't Vacuum FULL update the VM

2023-09-03 Thread Melanie Plageman
On Fri, Sep 1, 2023 at 8:38 PM Peter Geoghegan  wrote:
>
> On Fri, Sep 1, 2023 at 12:34 PM Melanie Plageman
>  wrote:
> > I don't see why the visibility map shouldn't be updated so that all of
> > the pages show all visible and all frozen for this relation after the
> > vacuum full.
>
> There was a similar issue with COPY FREEZE. It was fixed relatively
> recently -- see commit 7db0cd21.

Thanks for digging that up for me!

My first thought after looking a bit at the vacuum full/cluster code
is that we could add an all_visible flag to the RewriteState and set
it to false in heapam_relation_copy_for_cluster() in roughly the same
cases as heap_page_is_all_visible(), then, if rwstate->all_visible is
true in raw_heap_insert(), when we need to advance to the next block,
we set the page all visible and update the VM. Either way, we reset
all_visible to true since we are advancing to the next block.

I wrote a rough outline of that idea in the attached patches. It
doesn't emit WAL for the VM update or handle toast tables or anything
(it is just a rough sketch), but I just wondered if this was in the
right direction.

- Melanie
From eda40534e169e0bac0d11d89947130b44653b925 Mon Sep 17 00:00:00 2001
From: Melanie Plageman 
Date: Sun, 3 Sep 2023 16:17:28 -0400
Subject: [PATCH v0 3/3] set all_visible in VM vac full

---
 src/backend/access/heap/heapam_handler.c | 15 ++-
 src/backend/access/heap/rewriteheap.c| 16 
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 5a17112c91..a37be8cfef 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -791,6 +791,7 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 		HeapTuple	tuple;
 		Buffer		buf;
 		bool		isdead;
+		TransactionId xmin;
 
 		CHECK_FOR_INTERRUPTS();
 
@@ -853,10 +854,18 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 break;
 			case HEAPTUPLE_RECENTLY_DEAD:
 *tups_recently_dead += 1;
-/* fall through */
+isdead = false;
+rwstate->all_visible = false;
+break;
 			case HEAPTUPLE_LIVE:
 /* Live or recently dead, must copy it */
+xmin = HeapTupleHeaderGetXmin(tuple->t_data);
 isdead = false;
+if (!HeapTupleHeaderXminCommitted(tuple->t_data))
+	rwstate->all_visible = false;
+
+else if (!TransactionIdPrecedes(xmin, OldestXmin))
+	rwstate->all_visible = false;
 break;
 			case HEAPTUPLE_INSERT_IN_PROGRESS:
 
@@ -873,6 +882,7 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 	elog(WARNING, "concurrent insert in progress within table \"%s\"",
 		 RelationGetRelationName(OldHeap));
 /* treat as live */
+rwstate->all_visible = false;
 isdead = false;
 break;
 			case HEAPTUPLE_DELETE_IN_PROGRESS:
@@ -886,6 +896,7 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 		 RelationGetRelationName(OldHeap));
 /* treat as recently dead */
 *tups_recently_dead += 1;
+rwstate->all_visible = false;
 isdead = false;
 break;
 			default:
@@ -906,6 +917,8 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
 *tups_vacuumed += 1;
 *tups_recently_dead -= 1;
 			}
+			else
+rwstate->all_visible = false;
 			continue;
 		}
 
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 93e75dc7c2..b3d63200d4 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -111,6 +111,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "access/xloginsert.h"
+#include "access/visibilitymap.h"
 #include "catalog/catalog.h"
 #include "common/file_utils.h"
 #include "lib/ilist.h"
@@ -234,6 +235,7 @@ begin_heap_rewrite(Relation old_heap, Relation new_heap, TransactionId oldest_xm
 	state->rs_freeze_xid = freeze_xid;
 	state->rs_cutoff_multi = cutoff_multi;
 	state->rs_cxt = rw_cxt;
+	state->all_visible = true;
 
 	/* Initialize hash tables used to track update chains */
 	hash_ctl.keysize = sizeof(TidHashKey);
@@ -647,6 +649,8 @@ raw_heap_insert(RewriteState state, HeapTuple tup)
 			 * enforce saveFreeSpace unconditionally.
 			 */
 
+			if (state->all_visible)
+PageSetAllVisible(page);
 			/* XLOG stuff */
 			if (RelationNeedsWAL(state->rs_new_rel))
 log_newpage(&state->rs_new_rel->rd_locator,
@@ -655,6 +659,18 @@ raw_heap_insert(RewriteState state, HeapTuple tup)
 			page,
 			true);
 
+			if (state->all_visible)
+			{
+Buffer vmbuffer = InvalidBuffer;
+visibilitymap_pin(state->rs_new_rel, state->rs_blockno, &vmbuffer);
+visibilitymap_set_unbuffered(state->rs_new_rel, state->rs_blockno, vmbuffer,
+	VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN);
+if (BufferIsValid(vmbuffer))
+	ReleaseBuffer(vmbuffer);
+			}
+
+			state->all_visible = true;
+
 			/*
 			 * 

Re: Why doesn't Vacuum FULL update the VM

2023-09-01 Thread Peter Geoghegan
On Fri, Sep 1, 2023 at 12:34 PM Melanie Plageman
 wrote:
> I don't see why the visibility map shouldn't be updated so that all of
> the pages show all visible and all frozen for this relation after the
> vacuum full.

There was a similar issue with COPY FREEZE. It was fixed relatively
recently -- see commit 7db0cd21.

-- 
Peter Geoghegan




Re: Why doesn't Vacuum FULL update the VM

2023-09-01 Thread Vik Fearing

On 9/1/23 21:34, Melanie Plageman wrote:

Hi,

I noticed that VACUUM FULL actually does freeze the tuples in the
rewritten table (heap_freeze_tuple()) but then it doesn't mark them
all visible or all frozen in the visibility map. I don't understand
why. It seems like it would save us future work.


I have often wondered this as well, but obviously I haven't done 
anything about it.



I don't see why the visibility map shouldn't be updated so that all of
the pages show all visible and all frozen for this relation after the
vacuum full.


It cannot just blindly mark everything all visible and all frozen 
because it will copy over dead tuples that concurrent transactions are 
still allowed to see.

--
Vik Fearing





Why doesn't Vacuum FULL update the VM

2023-09-01 Thread Melanie Plageman
Hi,

I noticed that VACUUM FULL actually does freeze the tuples in the
rewritten table (heap_freeze_tuple()) but then it doesn't mark them
all visible or all frozen in the visibility map. I don't understand
why. It seems like it would save us future work.

Here is an example:

create extension pg_visibility;
drop table if exists foo;
create table foo(a int) with (autovacuum_enabled=false);
insert into foo select i%3 from generate_series(1,300)i;
update foo set a = 5 where a = 2;
select * from pg_visibility_map_summary('foo');
vacuum (verbose) foo;
select * from pg_visibility_map_summary('foo');
vacuum (full, verbose) foo;
select * from pg_visibility_map_summary('foo');

I don't see why the visibility map shouldn't be updated so that all of
the pages show all visible and all frozen for this relation after the
vacuum full.

- Melanie