On Fri, 21 Aug 2020 at 22:25, Ashutosh Sharma <[email protected]> wrote:
>
> Hi Masahiko-san,
>
> Please find the updated patch with the following new changes:
>
Thank you for updating the patch!
> 1) It adds the code changes in heap_force_kill function to clear an
> all-visible bit on the visibility map corresponding to the page that
> is marked all-visible. Along the way it also clears PD_ALL_VISIBLE
> flag on the page header.
I think we need to clear all visibility map bits by using
VISIBILITYMAP_VALID_BITS. Otherwise, the page has all-frozen bit but
not all-visible bit, which is not a valid state.
>
> 2) It adds the code changes in heap_force_freeze function to reset the
> ctid value in a tuple header if it is corrupted.
>
> 3) It adds several notes and examples in the documentation stating
> when and how we need to use the functions provided by this module.
>
> Please have a look and let me know for any other concern.
>
And here are small comments on the heap_surgery.c:
+ /*
+ * Get the offset numbers from the tids belonging to one particular page
+ * and process them one by one.
+ */
+ blkno = tids_same_page_fetch_offnums(tids, ntids, &next_start_ptr,
+ offnos);
+
+ /* Calculate the number of offsets stored in offnos array. */
+ noffs = next_start_ptr - curr_start_ptr;
+
+ /*
+ * Update the current start pointer so that next time when
+ * tids_same_page_fetch_offnums() is called, we can calculate the number
+ * of offsets present in the offnos array.
+ */
+ curr_start_ptr = next_start_ptr;
+
+ /* Check whether the block number is valid. */
+ if (blkno >= nblocks)
+ {
+ ereport(NOTICE,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("skipping block %u for relation \"%s\"
because the block number is out of range",
+ blkno, RelationGetRelationName(rel))));
+ continue;
+ }
+
+ CHECK_FOR_INTERRUPTS();
I guess it would be better to call CHECK_FOR_INTERRUPTS() at the top
of the do loop for safety. I think it's unlikely to happen but the
user might mistakenly specify a lot of wrong block numbers.
----
+ offnos = (OffsetNumber *) palloc(ntids * sizeof(OffsetNumber));
+ noffs = curr_start_ptr = next_start_ptr = 0;
+ nblocks = RelationGetNumberOfBlocks(rel);
+
+ do
+ {
(snip)
+
+ /*
+ * Get the offset numbers from the tids belonging to one particular page
+ * and process them one by one.
+ */
+ blkno = tids_same_page_fetch_offnums(tids, ntids, &next_start_ptr,
+ offnos);
+
+ /* Calculate the number of offsets stored in offnos array. */
+ noffs = next_start_ptr - curr_start_ptr;
+
(snip)
+ /* No ereport(ERROR) from here until all the changes are logged. */
+ START_CRIT_SECTION();
+
+ for (i = 0; i < noffs; i++)
You copy all offset numbers belonging to the same page to palloc'd
array, offnos, and iterate it while processing the tuples. I might be
missing something but I think we can do that without allocating the
space for offset numbers. Is there any reason for this? I guess we can
do that by just iterating the sorted tids array.
Regards,
--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services