On Thu, Mar 28, 2024 at 4:49 AM Heikki Linnakangas <hlinn...@iki.fi> wrote:
>
> On 28/03/2024 03:53, Melanie Plageman wrote:
> > On Thu, Mar 28, 2024 at 01:04:04AM +0200, Heikki Linnakangas wrote:
> >> One change with this is that live_tuples and many of the other fields are
> >> now again updated, even if the caller doesn't need them. It was hard to 
> >> skip
> >> them in a way that would save any cycles, with the other refactorings.
> >
> > I am worried we are writing checks that are going to have to come out of
> > SELECT queries' bank accounts, but I'll do some benchmarking when we're
> > all done with major refactoring.
>
> Sounds good, thanks.

Actually, after having looked at it again, there really are only a few
more increments of various counters, the memory consumed by revisit,
and the additional setting of offsets in marked. I think a few
carefully constructed queries which do on-access pruning could test
the impact of this (as opposed to a bigger benchmarking endeavor).

I also wonder if there would be any actual impact of marking the
various record_*() routines inline.

> >> @@ -728,10 +832,12 @@ htsv_get_valid_status(int status)
> >>    * DEAD, our visibility test is just too coarse to detect it.
> >>    *
> >>    * In general, pruning must never leave behind a DEAD tuple that still 
> >> has
> >> - * tuple storage.  VACUUM isn't prepared to deal with that case.  That's 
> >> why
> >> + * tuple storage.  VACUUM isn't prepared to deal with that case (FIXME: 
> >> it no longer cares, right?).
> >> + * That's why
> >>    * VACUUM prunes the same heap page a second time (without dropping its 
> >> lock
> >>    * in the interim) when it sees a newly DEAD tuple that we initially saw 
> >> as
> >> - * in-progress.  Retrying pruning like this can only happen when an 
> >> inserting
> >> + * in-progress (FIXME: Really? Does it still do that?).
> >
> > Yea, I think all that is no longer true. I missed this comment back
> > then.
>
> Committed a patch to remove it.
>
> >> @@ -981,7 +1069,18 @@ heap_prune_chain(Buffer buffer, OffsetNumber 
> >> rootoffnum,
> >>               * RECENTLY_DEAD tuples just in case there's a DEAD one after 
> >> them;
> >>               * but we can't advance past anything else.  We have to make 
> >> sure that
> >>               * we don't miss any DEAD tuples, since DEAD tuples that 
> >> still have
> >> -             * tuple storage after pruning will confuse VACUUM.
> >> +             * tuple storage after pruning will confuse VACUUM (FIXME: 
> >> not anymore
> >> +             * I think?).
> >
> > Meaning, it won't confuse vacuum anymore or there won't be DEAD tuples
> > with storage after pruning anymore?
>
> I meant that it won't confuse VACUUM anymore. lazy_scan_prune() doesn't
> loop through the items on the page checking their visibility anymore.
>
> Hmm, one confusion remains though: In the 2nd phase of vacuum, we remove
> all the dead line pointers that we have now removed from the indexes.
> When we do that, we assume them all to be dead line pointers, without
> storage, rather than normal tuples that happen to be HEAPTUPLE_DEAD. So
> it's important that if pruning would leave behind HEAPTUPLE_DEAD tuples,
> they are not included in 'deadoffsets'.

It seems like master only adds items it is marking LP_DEAD to
deadoffsets. And things marked LP_DEAD have lp_len set to 0.

> In any case, let's just make sure that pruning doesn't leave
> HEAPTUPLE_DEAD tuples. There's no reason it should.

Maybe worth adding an assert to

static void
heap_prune_record_unchanged_lp_dead(ItemId itemid, PruneState
*prstate, OffsetNumber offnum)
{
...
    Assert(!ItemIdHasStorage(itemid));
    prstate->deadoffsets[prstate->lpdead_items++] = offnum;
}

By the way, I wasn't quite sure about the purpose of
heap_prune_record_unchanged_lp_dead(). It is called in
heap_prune_chain() in a place where we didn't add things to
deadoffsets before, no?

        /*
         * Likewise, a dead line pointer can't be part of the chain. (We
         * already eliminated the case of dead root tuple outside this
         * function.)
         */
        if (ItemIdIsDead(lp))
        {
            /*
             * If the caller set PRUNE_DO_MARK_UNUSED_NOW, we can set dead
             * line pointers LP_UNUSED now.
             */
            if (unlikely(prstate->actions & PRUNE_DO_MARK_UNUSED_NOW))
                heap_prune_record_unused(prstate, offnum, false);
            else
                heap_prune_record_unchanged_lp_dead(lp, prstate, offnum);
            break;
        }

> >> @@ -1224,10 +1327,21 @@ heap_prune_record_live_or_recently_dead(Page page, 
> >> PruneState *prstate, OffsetNu
> >>       * ensure the math works out. The assumption that the transaction will
> >>       * commit and update the counters after we report is a bit shaky; but 
> >> it
> >>       * is what acquire_sample_rows() does, so we do the same to be 
> >> consistent.
> >> +     *
> >> +     * HEAPTUPLE_DEAD are handled by the other heap_prune_record_*()
> >> +     * subroutines.  They don't count dead items like 
> >> acquire_sample_rows()
> >> +     * does, because we assume that all dead items will become LP_UNUSED
> >> +     * before VACUUM finishes.  This difference is only superficial.  
> >> VACUUM
> >> +     * effectively agrees with ANALYZE about DEAD items, in the end.  
> >> VACUUM
> >> +     * won't remember LP_DEAD items, but only because they're not 
> >> supposed to
> >> +     * be left behind when it is done. (Cases where we bypass index 
> >> vacuuming
> >> +     * will violate this optimistic assumption, but the overall impact of 
> >> that
> >> +     * should be negligible.) FIXME: I don't understand that last 
> >> sentence in
> >> +     * parens. It was copied from elsewhere.
> >
> > If we bypass index vacuuming, there will be LP_DEAD items left behind
> > when we are done because we didn't do index vacuuming and then reaping
> > of those dead items. All of these comments are kind of a copypasta,
> > though.
>
> Ah, gotcha, makes sense now. I didn't remember that we sometimes by pass
> index vacuuming if there are very few dead items. I thought this talked
> about the case that there are no indexes, but that case is OK.

These comments could use another pass. I had added some extra
(probably redundant) content when I thought I was refactoring it a
certain way and then changed my mind.

Attached is a diff with some ideas I had for a bit of code simplification.

Are you working on cleaning this patch up or should I pick it up?

I wonder if it makes sense to move some of the changes to
heap_prune_chain() with setting things in marked/revisited to the
start of the patch set (to be committed first).

- Melanie
diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index a8ed11a1858..2f477aa43b1 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -143,7 +143,7 @@ static void heap_prune_record_dead_or_unused(PruneState *prstate, OffsetNumber o
 static void heap_prune_record_unused(PruneState *prstate, OffsetNumber offnum, bool was_normal);
 
 static void heap_prune_record_unchanged(Page page, PruneState *prstate, OffsetNumber offnum);
-static void heap_prune_record_unchanged_lp_dead(PruneState *prstate, OffsetNumber offnum);
+static void heap_prune_record_unchanged_lp_dead(ItemId itemid, PruneState *prstate, OffsetNumber offnum);
 static void heap_prune_record_unchanged_lp_redirect(PruneState *prstate, OffsetNumber offnum);
 
 static void page_verify_redirects(Page page);
@@ -766,7 +766,8 @@ heap_page_prune_and_freeze(Relation relation, Buffer buffer,
 	 * we can advance relfrozenxid and relminmxid to the values in
 	 * pagefrz->FreezePageRelfrozenXid and pagefrz->FreezePageRelminMxid.
 	 * MFIXME: which one should be pick if presult->nfrozen == 0 and
-	 * presult->all_frozen = True.
+	 * presult->all_frozen = True. MTODO: see Peter's response here
+	 * https://www.postgresql.org/message-id/CAH2-Wz%3DLmOs%3DiJ%3DFfCERnma0q7QjaNSnCgWEp7zOK7hD24YC_w%40mail.gmail.com
 	 */
 	if (new_relfrozen_xid)
 	{
@@ -868,9 +869,12 @@ heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum,
 	OffsetNumber latestdead = InvalidOffsetNumber,
 				maxoff = PageGetMaxOffsetNumber(dp),
 				offnum;
+
+	/* TODO: while maybe self-explanatory, I would prefer if chainitems and */
+	/* nchain had a comment up here */
 	OffsetNumber chainitems[MaxHeapTuplesPerPage];
-	int			nchain = 0,
-				i;
+	int			nchain = 0;
+	int			i = 0;
 
 	rootlp = PageGetItemId(dp, rootoffnum);
 
@@ -943,6 +947,7 @@ heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum,
 			 */
 			prstate->revisit[prstate->nrevisit++] = rootoffnum;
 
+			/* TODO: I don't like this comment now */
 			/* Nothing more to do */
 			return;
 		}
@@ -1020,7 +1025,7 @@ heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum,
 			if (unlikely(prstate->actions & PRUNE_DO_MARK_UNUSED_NOW))
 				heap_prune_record_unused(prstate, offnum, false);
 			else
-				heap_prune_record_unchanged_lp_dead(prstate, offnum);
+				heap_prune_record_unchanged_lp_dead(lp, prstate, offnum);
 			break;
 		}
 
@@ -1044,6 +1049,7 @@ heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum,
 		 */
 		tupdead = recent_dead = false;
 
+		/* TODO: maybe this should just be an if statement now */
 		switch (htsv_get_valid_status(prstate->htsv[offnum]))
 		{
 			case HEAPTUPLE_DEAD:
@@ -1132,42 +1138,34 @@ heap_prune_chain(Buffer buffer, OffsetNumber rootoffnum,
 		 * dead and the root line pointer can be marked dead.  Otherwise just
 		 * redirect the root to the correct chain member.
 		 */
-		if (i >= nchain)
-			heap_prune_record_dead_or_unused(prstate, rootoffnum, ItemIdIsNormal(rootlp));
-		else
-		{
+		if (i < nchain)
 			heap_prune_record_redirect(prstate, rootoffnum, chainitems[i], ItemIdIsNormal(rootlp));
-
-			/* the rest of tuples in the chain are normal, unchanged tuples */
-			for (; i < nchain; i++)
-				heap_prune_record_unchanged(dp, prstate, chainitems[i]);
-		}
+		else
+			heap_prune_record_dead_or_unused(prstate, rootoffnum, ItemIdIsNormal(rootlp));
 	}
-	else
+	else if ((i = ItemIdIsRedirected(rootlp)))
 	{
-		i = 0;
-		if (ItemIdIsRedirected(rootlp))
+		if (i < nchain)
 		{
-			if (nchain < 2)
-			{
-				/*
-				 * We found a redirect item that doesn't point to a valid
-				 * follow-on item.  This can happen if the loop in
-				 * heap_page_prune_and_freeze() caused us to visit the dead
-				 * successor of a redirect item before visiting the redirect
-				 * item.  We can clean up by setting the redirect item to DEAD
-				 * state or LP_UNUSED if the caller indicated.
-				 */
-				heap_prune_record_dead_or_unused(prstate, rootoffnum, false);
-			}
-			else
-				heap_prune_record_unchanged_lp_redirect(prstate, rootoffnum);
-			i++;
+			heap_prune_record_unchanged_lp_redirect(prstate, rootoffnum);
+		}
+		else
+		{
+			/*
+			 * We found a redirect item that doesn't point to a valid
+			 * follow-on item.  This can happen if the loop in
+			 * heap_page_prune_and_freeze() caused us to visit the dead
+			 * successor of a redirect item before visiting the redirect item.
+			 * We can clean up by setting the redirect item to DEAD state or
+			 * LP_UNUSED if the caller indicated.
+			 */
+			heap_prune_record_dead_or_unused(prstate, rootoffnum, false);
 		}
-		/* the rest of tuples in the chain are normal, unchanged tuples */
-		for (; i < nchain; i++)
-			heap_prune_record_unchanged(dp, prstate, chainitems[i]);
 	}
+
+	/* the rest of tuples in the chain are normal, unchanged tuples */
+	for (; i < nchain; i++)
+		heap_prune_record_unchanged(dp, prstate, chainitems[i]);
 }
 
 /* Record lowest soon-prunable XID */
@@ -1486,7 +1484,7 @@ heap_prune_record_unchanged(Page page, PruneState *prstate, OffsetNumber offnum)
  * Record line pointer that was already LP_DEAD and is left unchanged.
  */
 static void
-heap_prune_record_unchanged_lp_dead(PruneState *prstate, OffsetNumber offnum)
+heap_prune_record_unchanged_lp_dead(ItemId itemid, PruneState *prstate, OffsetNumber offnum)
 {
 	/*
 	 * Deliberately don't set hastup for LP_DEAD items.  We make the soft
@@ -1506,6 +1504,7 @@ heap_prune_record_unchanged_lp_dead(PruneState *prstate, OffsetNumber offnum)
 	Assert(!prstate->marked[offnum]);
 	prstate->marked[offnum] = true;
 
+	Assert(!ItemIdHasStorage(itemid));
 	prstate->deadoffsets[prstate->lpdead_items++] = offnum;
 }
 

Reply via email to