On Mon, Mar 26, 2018 at 2:46 PM, Claudio Freire <klaussfre...@gmail.com> wrote:
> On Mon, Mar 26, 2018 at 11:26 AM, Claudio Freire <klaussfre...@gmail.com> 
> wrote:
>> On Mon, Mar 26, 2018 at 11:19 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>>> Claudio Freire <klaussfre...@gmail.com> writes:
>>>> On Sat, Mar 24, 2018 at 4:17 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>>>>> I hadn't paid any attention to this patch previously, so maybe I'm
>>>>> missing something ... but this sure seems like a very bizarre approach
>>>>> to the problem.  If the idea is to fix the FSM's upper levels after
>>>>> vacuuming a known sub-range of the table, why would you not proceed
>>>>> by teaching FreeSpaceMapVacuum to recurse only within that sub-range
>>>>> of page numbers?  This setup with a threshold seems entirely Rube
>>>>> Goldbergian.  It's dependent on a magic threshold number that you can
>>>>> only select by trial and error, and it's inevitably going to spend time
>>>>> updating segments of the FSM tree that have nothing to do with the part
>>>>> that's been vacuumed.
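
For anyone following along: the sub-range recursion Tom describes might
look roughly like the sketch below. subtree_first_blk() is a hypothetical
helper returning the first heap block covered by a child subtree; none of
this is from the attached patches.

static void
fsm_vacuum_page_range(Relation rel, FSMAddress addr,
                      BlockNumber start, BlockNumber end)
{
    uint16      slot;

    if (addr.level == FSM_BOTTOM_LEVEL)
        return;                 /* leaf FSM pages have no children */

    for (slot = 0; slot < SlotsPerFSMPage; slot++)
    {
        /* heap blocks covered by this child subtree: [first, next) */
        BlockNumber first = subtree_first_blk(addr, slot);
        BlockNumber next = subtree_first_blk(addr, slot + 1);

        if (next <= start || first >= end)
            continue;           /* subtree entirely outside the range */

        fsm_vacuum_page_range(rel, fsm_get_child(addr, slot), start, end);
    }

    /* ...then refresh this page's slots from its children's root values */
}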
>>>
>>>> Well, the point is to not only update the range we know we've
>>>> vacuumed, but also to finish the updates done by a potential
>>>> previously cancelled autovacuum.
>>>
>>> I think that's not an important consideration, or at least would stop
>>> being one after a patch like this.  The reason it's a problem now is
>>> precisely that we don't try to vacuum the FSM till the very end; if
>>> we did FSM cleanup every X pages --- in particular, before not after
>>> the final relation-truncation attempt --- there wouldn't be risk of
>>> skipping so much FSM work that we need to worry about forcing the
>>> issue just in case there'd been a prior cancellation.
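
Schematically, that would put something like this inside vacuum's main
block loop (the identifiers here are illustrative, not existing code;
FreeSpaceMapVacuumRange stands for the range-restricted FSM vacuum
discussed above):

    if (blkno - next_fsm_block_to_vacuum >= VACUUM_FSM_EVERY_PAGES)
    {
        /* expose the free space recorded so far in the upper FSM levels */
        FreeSpaceMapVacuumRange(onerel, next_fsm_block_to_vacuum, blkno);
        next_fsm_block_to_vacuum = blkno;
    }

...plus one more such call covering the tail of the heap, issued before
the final relation-truncation attempt.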
>>
>> I'm thinking that, in conjunction with the high maintenance_work_mem
>> (MWM) patch for vacuum, it could still happen that a considerable
>> amount of vacuuming work is left unexposed in the FSM upon cancellation.
>>
>> The "bizarre" approach provides some relief.
>>
>> I'll see about implementing the FSM range-vacuuming operation for
>> non-initial runs; there seems to be consensus that it's a good idea.
>>
>> But I still think the partial run at the very beginning is useful.
>
> Attached are the patches, rebased and modified as discussed:
>
> 1 no longer does tree pruning; it just vacuums a range of the FSM.
>
> 2 reintroduces tree pruning for the initial FSM vacuum.
>
> 3 and 4 remain as they were, but rebased.

Sorry, ignore patch number 3 in the earlier mail; I selected the wrong
patch to attach.

Attached now is the correct patch 3.
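
To make the bug patch 3 fixes concrete: each FSM page is a small max-heap,
and what must be passed up to the parent level is the page's root value
(its maximum), not the value that was just set. Below is a toy,
self-contained illustration of the difference; it is plain C modelling a
single page, not PostgreSQL code:

#include <stdio.h>

#define LEAVES 4
static unsigned char nodes[2 * LEAVES - 1];     /* nodes[0] is the root */

/*
 * Set a leaf slot and bubble the change up within the page, in the
 * spirit of fsm_set_avail (toy version, not the real function).
 */
static void
set_avail(int slot, unsigned char value)
{
    int         node = (LEAVES - 1) + slot;

    nodes[node] = value;
    while (node > 0)
    {
        int         parent = (node - 1) / 2;
        int         l = 2 * parent + 1;
        int         r = 2 * parent + 2;
        unsigned char max = nodes[l] > nodes[r] ? nodes[l] : nodes[r];

        if (nodes[parent] == max)
            break;              /* nothing above changes either */
        nodes[parent] = max;
        node = parent;
    }
}

int
main(void)
{
    /* some other heap page on this FSM page already advertises 230 */
    set_avail(1, 230);

    /* relation extension now records 200 for our newly added page */
    set_avail(0, 200);

    printf("value just set:  200 (what the old code bubbled up)\n");
    printf("page root (max): %u (what the parent level should get)\n",
           (unsigned) nodes[0]);
    return 0;
}

In the patch itself, fsm_set_and_get_maxval reads that root value while
still holding the buffer lock, so fsm_update_recursive can pass the
correct value upward without visiting the page a second time.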
From c069f7590fd0ff24aa77d8025eda7e17527bca71 Mon Sep 17 00:00:00 2001
From: Claudio Freire <klaussfre...@gmail.com>
Date: Mon, 26 Feb 2018 13:23:59 -0300
Subject: [PATCH 3/4] FSM: Fix relation extension FSM update

When updating the FSM recursively during relation extension,
the update could overwrite higher-level entries with an incorrect
value, because it always bubbled up the newly set leaf value
instead of the page's root value, which is what fsm_set_avail
computes as it bubbles the change up within the page.

Whenever the root held a higher value than the one just set by
fsm_update_recursive, the upper levels thus ended up with a value
that understates the free space actually available below them.

This rarely mattered, since the value set during relation
extension is usually high and unlikely to be beaten by pre-existing
values on the same page, but the possibility existed nonetheless,
especially during bulk loads.
---
 src/backend/storage/freespace/freespace.c | 36 ++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/src/backend/storage/freespace/freespace.c b/src/backend/storage/freespace/freespace.c
index 2f48092..80f5e80 100644
--- a/src/backend/storage/freespace/freespace.c
+++ b/src/backend/storage/freespace/freespace.c
@@ -107,6 +107,8 @@ static Size fsm_space_cat_to_avail(uint8 cat);
 /* workhorse functions for various operations */
 static int fsm_set_and_search(Relation rel, FSMAddress addr, uint16 slot,
 				   uint8 newValue, uint8 minValue);
+static uint8 fsm_set_and_get_maxval(Relation rel, FSMAddress addr, uint16 slot,
+									uint8 newValue);
 static BlockNumber fsm_search(Relation rel, uint8 min_cat);
 static uint8 fsm_vacuum_page(Relation rel, FSMAddress addr, uint8 threshold, bool *eof,
 							 BlockNumber start, BlockNumber end);
@@ -719,6 +721,33 @@ fsm_set_and_search(Relation rel, FSMAddress addr, uint16 slot,
 }
 
 /*
+ * Set value in given FSM page and slot.
+ *
+ * The resulting value at the page's root node (its maximum) is returned.
+ */
+static uint8
+fsm_set_and_get_maxval(Relation rel, FSMAddress addr, uint16 slot, uint8 newValue)
+{
+	Buffer		buf;
+	Page		page;
+	uint8		newmax = 0;
+
+	buf = fsm_readbuf(rel, addr, true);
+	LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+
+	page = BufferGetPage(buf);
+
+	if (fsm_set_avail(page, slot, newValue))
+		MarkBufferDirtyHint(buf, false);
+
+	newmax = fsm_get_avail(page, 0);
+
+	UnlockReleaseBuffer(buf);
+
+	return newmax;
+}
+
+/*
  * Search the tree for a heap page with at least min_cat of free space
  */
 static BlockNumber
@@ -953,6 +982,11 @@ fsm_update_recursive(Relation rel, FSMAddress addr, uint8 new_cat)
 	 * information in that.
 	 */
 	parent = fsm_get_parent(addr, &parentslot);
-	fsm_set_and_search(rel, parent, parentslot, new_cat, 0);
+	new_cat = fsm_set_and_get_maxval(rel, parent, parentslot, new_cat);
+
+	/*
+	 * Bubble up not the value we just set, but the value now in the root
+	 * node of the just-updated page, which is the page's maximum.
+	 */
 	fsm_update_recursive(rel, parent, new_cat);
 }
-- 
1.8.4.5
