On Thu, Jul 2, 2015 at 1:06 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
> On Thu, Jul 2, 2015 at 12:13 AM, Sawada Masahiko <sawada.m...@gmail.com> wrote:
>> On Thu, May 28, 2015 at 11:34 AM, Sawada Masahiko <sawada.m...@gmail.com> wrote:
>>> On Thu, Apr 30, 2015 at 8:07 PM, Sawada Masahiko <sawada.m...@gmail.com> wrote:
>>>> On Fri, Apr 24, 2015 at 11:21 AM, Sawada Masahiko <sawada.m...@gmail.com> wrote:
>>>>> On Fri, Apr 24, 2015 at 1:31 AM, Jim Nasby <jim.na...@bluetreble.com> wrote:
>>>>>> On 4/23/15 11:06 AM, Petr Jelinek wrote:
>>>>>>> On 23/04/15 17:45, Bruce Momjian wrote:
>>>>>>>> On Thu, Apr 23, 2015 at 09:45:38AM -0400, Robert Haas wrote:
>>>>>>>> Agreed, no extra file, and the same write volume as currently. It would
>>>>>>>> also match pg_clog, which uses two bits per transaction --- maybe we can
>>>>>>>> reuse some of that code.
>>>>>>>
>>>>>>> Yeah, this approach seems promising. We probably can't reuse code from
>>>>>>> clog because the usage pattern is different (key for clog is xid, while
>>>>>>> for visibility/freeze map ctid is used). But visibility map storage
>>>>>>> layer is pretty simple so it should be easy to extend it for this use.
>>>>>>
>>>>>> Actually, there may be some bit manipulation functions we could reuse;
>>>>>> things like efficiently counting how many things in a byte are set.
>>>>>> Probably doesn't make sense to fully refactor it, but at least CLOG is
>>>>>> a good source for cut/paste/whack.
>>>>>
>>>>> I agree with adding a bit that indicates corresponding page is
>>>>> all-frozen into VM, just like CLOG.
>>>>> I'll change the patch as second version patch.
>>>>
>>>> The second patch is attached.
>>>>
>>>> In second patch, I added a bit that indicates all tuples in page are
>>>> completely frozen into visibility map.
>>>> The visibility map became a bitmap with two bit per heap page:
>>>> all-visible and all-frozen.
>>>> The logics around vacuum, insert/update/delete heap are almost same as
>>>> previous version.
>>>>
>>>> This patch lack some point: documentation, comment in source code,
>>>> etc, so it's WIP patch yet,
>>>> but I think that it's enough to discuss about this.
>>>
>>> The previous patch is no longer applied cleanly to HEAD.
>>> The attached v2 patch is latest version.
>>>
>>> Please review it.
>>
>> Attached new rebased version patch.
>> Please give me comments!
>
> Now we should review your design and approach rather than code,
> but since I got an assertion error while trying the patch, I report it.
>
> "initdb -D test -k" caused the following assertion failure.
>
> vacuuming database template1 ... TRAP:
> FailedAssertion("!((((PageHeader) (heapPage))->pd_flags & 0x0004))",
> File: "visibilitymap.c", Line: 328)
> sh: line 1: 83785 Abort trap: 6
> "/dav/000_add_frozen_bit_into_visibilitymap_v3/bin/postgres" --single
> -F -O -c search_path=pg_catalog -c exit_on_error=true template1 > /dev/null
> child process exited with exit code 134
> initdb: removing data directory "test"
Thank you for the bug report and the comments. A fixed version is attached, and the source code comments are updated as well. Please review it.

Let me explain again what this patch does and its current design.

- An additional bit in the visibility map.
I added an additional bit, the all-frozen bit, to the visibility map; it indicates whether all tuples on the corresponding heap page are frozen. The structure is similar to CLOG, so the size of the VM is now twice what it is today. The heap page header can also have the PD_ALL_FROZEN flag set, alongside PD_ALL_VISIBLE.

- Setting and clearing the all-frozen bit.
Update, delete and insert (including multi-insert) operations clear the bit for that page and clear the page header flag at the same time. Only vacuum can set the bit, and only when all tuples on the page are frozen.

- Anti-wraparound vacuum.
Today we have to scan the whole table for an anti-wraparound (XID) vacuum, which is quite expensive because of the disk I/O. The main benefit of this proposal is to reduce, or avoid, that extremely large amount of I/O even when an anti-wraparound vacuum is executed. I added such logic to lazy_scan_heap() as an experiment.

There were several other ideas in the previous discussion, such as a read-only table or a separate frozen map, but the advantage of this direction is that we don't need an additional heap file and can reuse the mature VM mechanism.

Regards,

--
Sawada Masahiko
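(For readers following the layout change: below is a minimal standalone sketch, not code from the patch, of the two-bit-per-heap-block addressing that visibilitymap.c uses after this patch. MAPSIZE is simplified here; in the real code it excludes the page header, and set/clear/test also handle pinning, locking and WAL.)

#include <stdint.h>
#include <stdio.h>

#define VISIBILITYMAP_ALL_VISIBLE   0x01
#define VISIBILITYMAP_ALL_FROZEN    0x02

#define BITS_PER_HEAPBLOCK   2                          /* all-visible + all-frozen */
#define HEAPBLOCKS_PER_BYTE  (8 / BITS_PER_HEAPBLOCK)   /* = 4 */
#define MAPSIZE              8192                       /* simplified: usable bytes per VM page */
#define HEAPBLOCKS_PER_PAGE  (MAPSIZE * HEAPBLOCKS_PER_BYTE)

#define HEAPBLK_TO_MAPBYTE(x)  (((x) % HEAPBLOCKS_PER_PAGE) / HEAPBLOCKS_PER_BYTE)
#define HEAPBLK_TO_MAPBIT(x)   ((x) % HEAPBLOCKS_PER_BYTE)

/* Set the requested flag bits for one heap block within a (fake) VM page. */
static void
vm_set_bits(uint8_t *map, uint32_t heapBlk, uint8_t flags)
{
    uint32_t mapByte = HEAPBLK_TO_MAPBYTE(heapBlk);
    uint32_t mapBit  = HEAPBLK_TO_MAPBIT(heapBlk);

    map[mapByte] |= (uint8_t) (flags << (BITS_PER_HEAPBLOCK * mapBit));
}

/* Return nonzero if any of the requested flag bits are set for the block. */
static int
vm_test_bits(const uint8_t *map, uint32_t heapBlk, uint8_t flags)
{
    uint32_t mapByte = HEAPBLK_TO_MAPBYTE(heapBlk);
    uint32_t mapBit  = HEAPBLK_TO_MAPBIT(heapBlk);

    return (map[mapByte] & (uint8_t) (flags << (BITS_PER_HEAPBLOCK * mapBit))) != 0;
}

int
main(void)
{
    uint8_t map[MAPSIZE] = {0};

    /* Mark heap block 5 all-visible, then additionally all-frozen. */
    vm_set_bits(map, 5, VISIBILITYMAP_ALL_VISIBLE);
    vm_set_bits(map, 5, VISIBILITYMAP_ALL_FROZEN);

    printf("block 5 all-visible: %d\n", vm_test_bits(map, 5, VISIBILITYMAP_ALL_VISIBLE));
    printf("block 5 all-frozen:  %d\n", vm_test_bits(map, 5, VISIBILITYMAP_ALL_FROZEN));
    printf("block 6 all-frozen:  %d\n", vm_test_bits(map, 6, VISIBILITYMAP_ALL_FROZEN));
    return 0;
}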
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index 86a2e6b..835d714 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -88,7 +88,8 @@ static HeapTuple heap_prepare_insert(Relation relation, HeapTuple tup, static XLogRecPtr log_heap_update(Relation reln, Buffer oldbuf, Buffer newbuf, HeapTuple oldtup, HeapTuple newtup, HeapTuple old_key_tup, - bool all_visible_cleared, bool new_all_visible_cleared); + bool all_visible_cleared, bool new_all_visible_cleared, + bool all_frozen_cleared, bool new_all_frozen_cleared); static void HeapSatisfiesHOTandKeyUpdate(Relation relation, Bitmapset *hot_attrs, Bitmapset *key_attrs, Bitmapset *id_attrs, @@ -2107,7 +2108,8 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid, HeapTuple heaptup; Buffer buffer; Buffer vmbuffer = InvalidBuffer; - bool all_visible_cleared = false; + bool all_visible_cleared = false, + all_frozen_cleared = false; /* * Fill in tuple header fields, assign an OID, and toast the tuple if @@ -2131,8 +2133,9 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid, CheckForSerializableConflictIn(relation, NULL, InvalidBuffer); /* - * Find buffer to insert this tuple into. If the page is all visible, - * this will also pin the requisite visibility map page. + * Find buffer to insert this tuple into. If the page is all visible + * of all frozen, this will also pin the requisite visibility map and + * frozen map page. */ buffer = RelationGetBufferForTuple(relation, heaptup->t_len, InvalidBuffer, options, bistate, @@ -2150,7 +2153,16 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid, PageClearAllVisible(BufferGetPage(buffer)); visibilitymap_clear(relation, ItemPointerGetBlockNumber(&(heaptup->t_self)), - vmbuffer); + vmbuffer, VISIBILITYMAP_ALL_VISIBLE); + } + + if (PageIsAllFrozen(BufferGetPage(buffer))) + { + all_frozen_cleared = true; + PageClearAllFrozen(BufferGetPage(buffer)); + visibilitymap_clear(relation, + ItemPointerGetBlockNumber(&(heaptup->t_self)), + vmbuffer, VISIBILITYMAP_ALL_FROZEN); } /* @@ -2199,6 +2211,8 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid, xlrec.flags = 0; if (all_visible_cleared) xlrec.flags |= XLH_INSERT_ALL_VISIBLE_CLEARED; + if (all_frozen_cleared) + xlrec.flags |= XLH_INSERT_ALL_FROZEN_CLEARED; if (options & HEAP_INSERT_SPECULATIVE) xlrec.flags |= XLH_INSERT_IS_SPECULATIVE; Assert(ItemPointerGetBlockNumber(&heaptup->t_self) == BufferGetBlockNumber(buffer)); @@ -2406,7 +2420,8 @@ heap_multi_insert(Relation relation, HeapTuple *tuples, int ntuples, { Buffer buffer; Buffer vmbuffer = InvalidBuffer; - bool all_visible_cleared = false; + bool all_visible_cleared = false, + all_frozen_cleared = false; int nthispage; CHECK_FOR_INTERRUPTS(); @@ -2451,7 +2466,16 @@ heap_multi_insert(Relation relation, HeapTuple *tuples, int ntuples, PageClearAllVisible(page); visibilitymap_clear(relation, BufferGetBlockNumber(buffer), - vmbuffer); + vmbuffer, VISIBILITYMAP_ALL_VISIBLE); + } + + if (PageIsAllFrozen(page)) + { + all_frozen_cleared = true; + PageClearAllFrozen(page); + visibilitymap_clear(relation, + BufferGetBlockNumber(buffer), + vmbuffer, VISIBILITYMAP_ALL_FROZEN); } /* @@ -2496,6 +2520,8 @@ heap_multi_insert(Relation relation, HeapTuple *tuples, int ntuples, tupledata = scratchptr; xlrec->flags = all_visible_cleared ? 
XLH_INSERT_ALL_VISIBLE_CLEARED : 0; + if (all_frozen_cleared) + xlrec->flags |= XLH_INSERT_ALL_FROZEN_CLEARED; xlrec->ntuples = nthispage; /* @@ -2698,7 +2724,8 @@ heap_delete(Relation relation, ItemPointer tid, new_infomask2; bool have_tuple_lock = false; bool iscombo; - bool all_visible_cleared = false; + bool all_visible_cleared = false, + all_frozen_cleared = false; HeapTuple old_key_tuple = NULL; /* replica identity of the tuple */ bool old_key_copied = false; @@ -2724,18 +2751,19 @@ heap_delete(Relation relation, ItemPointer tid, * in the middle of changing this, so we'll need to recheck after we have * the lock. */ - if (PageIsAllVisible(page)) + if (PageIsAllVisible(page) || PageIsAllFrozen(page)) visibilitymap_pin(relation, block, &vmbuffer); LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE); /* * If we didn't pin the visibility map page and the page has become all - * visible while we were busy locking the buffer, we'll have to unlock and - * re-lock, to avoid holding the buffer lock across an I/O. That's a bit - * unfortunate, but hopefully shouldn't happen often. + * visible or all frozen while we were busy locking the buffer, we'll + * have to unlock and re-lock, to avoid holding the buffer lock across an + * I/O. That's a bit unfortunate, but hopefully shouldn't happen often. */ - if (vmbuffer == InvalidBuffer && PageIsAllVisible(page)) + if (vmbuffer == InvalidBuffer && + (PageIsAllVisible(page) || PageIsAllFrozen(page))) { LockBuffer(buffer, BUFFER_LOCK_UNLOCK); visibilitymap_pin(relation, block, &vmbuffer); @@ -2925,12 +2953,22 @@ l1: */ PageSetPrunable(page, xid); + /* clear PD_ALL_VISIBLE flags */ if (PageIsAllVisible(page)) { all_visible_cleared = true; PageClearAllVisible(page); visibilitymap_clear(relation, BufferGetBlockNumber(buffer), - vmbuffer); + vmbuffer, VISIBILITYMAP_ALL_VISIBLE); + } + + /* clear PD_ALL_FROZEN flags */ + if (PageIsAllFrozen(page)) + { + all_frozen_cleared = true; + PageClearAllFrozen(page); + visibilitymap_clear(relation, BufferGetBlockNumber(buffer), + vmbuffer, VISIBILITYMAP_ALL_FROZEN); } /* store transaction information of xact deleting the tuple */ @@ -2962,6 +3000,8 @@ l1: log_heap_new_cid(relation, &tp); xlrec.flags = all_visible_cleared ? XLH_DELETE_ALL_VISIBLE_CLEARED : 0; + if (all_frozen_cleared) + xlrec.flags |= XLH_DELETE_ALL_FROZEN_CLEARED; xlrec.infobits_set = compute_infobits(tp.t_data->t_infomask, tp.t_data->t_infomask2); xlrec.offnum = ItemPointerGetOffsetNumber(&tp.t_self); @@ -3159,6 +3199,8 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup, bool key_intact; bool all_visible_cleared = false; bool all_visible_cleared_new = false; + bool all_frozen_cleared = false; + bool all_frozen_cleared_new = false; bool checked_lockers; bool locker_remains; TransactionId xmax_new_tuple, @@ -3202,12 +3244,12 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup, page = BufferGetPage(buffer); /* - * Before locking the buffer, pin the visibility map page if it appears to - * be necessary. Since we haven't got the lock yet, someone else might be + * Before locking the buffer, pin the visibility map if it appears to be + * necessary. Since we haven't got the lock yet, someone else might be * in the middle of changing this, so we'll need to recheck after we have * the lock. 
*/ - if (PageIsAllVisible(page)) + if (PageIsAllVisible(page) || PageIsAllFrozen(page)) visibilitymap_pin(relation, block, &vmbuffer); LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE); @@ -3490,21 +3532,23 @@ l2: UnlockTupleTuplock(relation, &(oldtup.t_self), *lockmode); if (vmbuffer != InvalidBuffer) ReleaseBuffer(vmbuffer); + bms_free(hot_attrs); bms_free(key_attrs); return result; } /* - * If we didn't pin the visibility map page and the page has become all - * visible while we were busy locking the buffer, or during some - * subsequent window during which we had it unlocked, we'll have to unlock - * and re-lock, to avoid holding the buffer lock across an I/O. That's a - * bit unfortunate, especially since we'll now have to recheck whether the - * tuple has been locked or updated under us, but hopefully it won't - * happen very often. + * If we didn't pin the visibility map page and the page has + * become all visible(and frozen) while we were busy locking the buffer, + * or during some subsequent window during which we had it unlocked, + * we'll have to unlock and re-lock, to avoid holding the buffer lock + * across an I/O. That's a bit unfortunate, especially since we'll now + * have to recheck whether the tuple has been locked or updated under us, + * but hopefully it won't happen very often. */ - if (vmbuffer == InvalidBuffer && PageIsAllVisible(page)) + if (vmbuffer == InvalidBuffer && + (PageIsAllVisible(page) || PageIsAllFrozen(page))) { LockBuffer(buffer, BUFFER_LOCK_UNLOCK); visibilitymap_pin(relation, block, &vmbuffer); @@ -3803,14 +3847,30 @@ l2: all_visible_cleared = true; PageClearAllVisible(BufferGetPage(buffer)); visibilitymap_clear(relation, BufferGetBlockNumber(buffer), - vmbuffer); + vmbuffer, VISIBILITYMAP_ALL_VISIBLE); } if (newbuf != buffer && PageIsAllVisible(BufferGetPage(newbuf))) { all_visible_cleared_new = true; PageClearAllVisible(BufferGetPage(newbuf)); visibilitymap_clear(relation, BufferGetBlockNumber(newbuf), - vmbuffer_new); + vmbuffer_new, VISIBILITYMAP_ALL_VISIBLE); + } + + /* clear PD_ALL_FROZEN flags */ + if (PageIsAllFrozen(BufferGetPage(buffer))) + { + all_frozen_cleared = true; + PageClearAllFrozen(BufferGetPage(buffer)); + visibilitymap_clear(relation, BufferGetBlockNumber(buffer), + vmbuffer, VISIBILITYMAP_ALL_FROZEN); + } + if (newbuf != buffer && PageIsAllFrozen(BufferGetPage(newbuf))) + { + all_frozen_cleared_new = true; + PageClearAllFrozen(BufferGetPage(newbuf)); + visibilitymap_clear(relation, BufferGetBlockNumber(newbuf), + vmbuffer_new, VISIBILITYMAP_ALL_FROZEN); } if (newbuf != buffer) @@ -3836,7 +3896,9 @@ l2: newbuf, &oldtup, heaptup, old_key_tuple, all_visible_cleared, - all_visible_cleared_new); + all_visible_cleared_new, + all_frozen_cleared, + all_frozen_cleared_new); if (newbuf != buffer) { PageSetLSN(BufferGetPage(newbuf), recptr); @@ -6893,7 +6955,7 @@ log_heap_freeze(Relation reln, Buffer buffer, TransactionId cutoff_xid, */ XLogRecPtr log_heap_visible(RelFileNode rnode, Buffer heap_buffer, Buffer vm_buffer, - TransactionId cutoff_xid) + TransactionId cutoff_xid, uint8 vmflags) { xl_heap_visible xlrec; XLogRecPtr recptr; @@ -6903,6 +6965,7 @@ log_heap_visible(RelFileNode rnode, Buffer heap_buffer, Buffer vm_buffer, Assert(BufferIsValid(vm_buffer)); xlrec.cutoff_xid = cutoff_xid; + xlrec.flags = vmflags; XLogBeginInsert(); XLogRegisterData((char *) &xlrec, SizeOfHeapVisible); @@ -6926,7 +6989,8 @@ static XLogRecPtr log_heap_update(Relation reln, Buffer oldbuf, Buffer newbuf, HeapTuple oldtup, HeapTuple newtup, HeapTuple old_key_tuple, 
- bool all_visible_cleared, bool new_all_visible_cleared) + bool all_visible_cleared, bool new_all_visible_cleared, + bool all_frozen_cleared, bool new_all_frozen_cleared) { xl_heap_update xlrec; xl_heap_header xlhdr; @@ -7009,6 +7073,10 @@ log_heap_update(Relation reln, Buffer oldbuf, xlrec.flags |= XLH_UPDATE_OLD_ALL_VISIBLE_CLEARED; if (new_all_visible_cleared) xlrec.flags |= XLH_UPDATE_NEW_ALL_VISIBLE_CLEARED; + if (all_frozen_cleared) + xlrec.flags |= XLH_UPDATE_OLD_ALL_FROZEN_CLEARED; + if (new_all_frozen_cleared) + xlrec.flags |= XLH_UPDATE_NEW_ALL_FROZEN_CLEARED; if (prefixlen > 0) xlrec.flags |= XLH_UPDATE_PREFIX_FROM_OLD; if (suffixlen > 0) @@ -7492,8 +7560,14 @@ heap_xlog_visible(XLogReaderState *record) * the subsequent update won't be replayed to clear the flag. */ page = BufferGetPage(buffer); - PageSetAllVisible(page); + + if (xlrec->flags & VISIBILITYMAP_ALL_VISIBLE) + PageSetAllVisible(page); + if (xlrec->flags & VISIBILITYMAP_ALL_FROZEN) + PageSetAllFrozen(page); + MarkBufferDirty(buffer); + } else if (action == BLK_RESTORED) { @@ -7544,7 +7618,7 @@ heap_xlog_visible(XLogReaderState *record) */ if (lsn > PageGetLSN(vmpage)) visibilitymap_set(reln, blkno, InvalidBuffer, lsn, vmbuffer, - xlrec->cutoff_xid); + xlrec->cutoff_xid, xlrec->flags); ReleaseBuffer(vmbuffer); FreeFakeRelcacheEntry(reln); @@ -7656,13 +7730,20 @@ heap_xlog_delete(XLogReaderState *record) * The visibility map may need to be fixed even if the heap page is * already up-to-date. */ - if (xlrec->flags & XLH_DELETE_ALL_VISIBLE_CLEARED) + if (xlrec->flags & (XLH_DELETE_ALL_VISIBLE_CLEARED | XLH_DELETE_ALL_FROZEN_CLEARED)) { Relation reln = CreateFakeRelcacheEntry(target_node); Buffer vmbuffer = InvalidBuffer; + uint8 flags = 0; + + /* set flags for either clear one flags or both */ + flags |= xlrec->flags & XLH_DELETE_ALL_VISIBLE_CLEARED ? + VISIBILITYMAP_ALL_VISIBLE : 0; + flags |= xlrec->flags & XLH_DELETE_ALL_FROZEN_CLEARED ? + VISIBILITYMAP_ALL_FROZEN : 0; visibilitymap_pin(reln, blkno, &vmbuffer); - visibilitymap_clear(reln, blkno, vmbuffer); + visibilitymap_clear(reln, blkno, vmbuffer, flags); ReleaseBuffer(vmbuffer); FreeFakeRelcacheEntry(reln); } @@ -7734,13 +7815,20 @@ heap_xlog_insert(XLogReaderState *record) * The visibility map may need to be fixed even if the heap page is * already up-to-date. */ - if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED) + if (xlrec->flags & (XLH_INSERT_ALL_VISIBLE_CLEARED | XLH_INSERT_ALL_FROZEN_CLEARED)) { Relation reln = CreateFakeRelcacheEntry(target_node); Buffer vmbuffer = InvalidBuffer; + uint8 flags = 0; + + /* set flags for either clear one flags or both */ + flags |= xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED ? + VISIBILITYMAP_ALL_VISIBLE : 0; + flags |= xlrec->flags & XLH_INSERT_ALL_FROZEN_CLEARED ? + VISIBILITYMAP_ALL_FROZEN : 0; visibilitymap_pin(reln, blkno, &vmbuffer); - visibilitymap_clear(reln, blkno, vmbuffer); + visibilitymap_clear(reln, blkno, vmbuffer, flags); ReleaseBuffer(vmbuffer); FreeFakeRelcacheEntry(reln); } @@ -7800,6 +7888,9 @@ heap_xlog_insert(XLogReaderState *record) if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED) PageClearAllVisible(page); + if (xlrec->flags & XLH_INSERT_ALL_FROZEN_CLEARED) + PageClearAllFrozen(page); + MarkBufferDirty(buffer); } if (BufferIsValid(buffer)) @@ -7854,13 +7945,20 @@ heap_xlog_multi_insert(XLogReaderState *record) * The visibility map may need to be fixed even if the heap page is * already up-to-date. 
*/ - if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED) + if (xlrec->flags & (XLH_INSERT_ALL_VISIBLE_CLEARED | XLH_INSERT_ALL_FROZEN_CLEARED)) { Relation reln = CreateFakeRelcacheEntry(rnode); Buffer vmbuffer = InvalidBuffer; + uint8 flags = 0; + + /* set flags for either clear one flags or both */ + flags |= xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED ? + VISIBILITYMAP_ALL_VISIBLE : 0; + flags |= xlrec->flags & XLH_INSERT_ALL_FROZEN_CLEARED ? + VISIBILITYMAP_ALL_FROZEN : 0; visibilitymap_pin(reln, blkno, &vmbuffer); - visibilitymap_clear(reln, blkno, vmbuffer); + visibilitymap_clear(reln, blkno, vmbuffer, flags); ReleaseBuffer(vmbuffer); FreeFakeRelcacheEntry(reln); } @@ -7938,6 +8036,8 @@ heap_xlog_multi_insert(XLogReaderState *record) if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED) PageClearAllVisible(page); + if (xlrec->flags & XLH_INSERT_ALL_FROZEN_CLEARED) + PageClearAllFrozen(page); MarkBufferDirty(buffer); } @@ -8009,13 +8109,20 @@ heap_xlog_update(XLogReaderState *record, bool hot_update) * The visibility map may need to be fixed even if the heap page is * already up-to-date. */ - if (xlrec->flags & XLH_UPDATE_OLD_ALL_VISIBLE_CLEARED) + if (xlrec->flags & (XLH_UPDATE_OLD_ALL_VISIBLE_CLEARED | XLH_UPDATE_OLD_ALL_FROZEN_CLEARED)) { Relation reln = CreateFakeRelcacheEntry(rnode); Buffer vmbuffer = InvalidBuffer; + uint8 flags = 0; + + /* set flags for either clear one flags or both */ + flags |= xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED ? + VISIBILITYMAP_ALL_VISIBLE : 0; + flags |= xlrec->flags & XLH_INSERT_ALL_FROZEN_CLEARED ? + VISIBILITYMAP_ALL_FROZEN : 0; visibilitymap_pin(reln, oldblk, &vmbuffer); - visibilitymap_clear(reln, oldblk, vmbuffer); + visibilitymap_clear(reln, oldblk, vmbuffer, flags); ReleaseBuffer(vmbuffer); FreeFakeRelcacheEntry(reln); } @@ -8066,6 +8173,8 @@ heap_xlog_update(XLogReaderState *record, bool hot_update) if (xlrec->flags & XLH_UPDATE_OLD_ALL_VISIBLE_CLEARED) PageClearAllVisible(page); + if (xlrec->flags & XLH_UPDATE_OLD_ALL_FROZEN_CLEARED) + PageClearAllFrozen(page); PageSetLSN(page, lsn); MarkBufferDirty(obuffer); @@ -8093,13 +8202,20 @@ heap_xlog_update(XLogReaderState *record, bool hot_update) * The visibility map may need to be fixed even if the heap page is * already up-to-date. */ - if (xlrec->flags & XLH_UPDATE_NEW_ALL_VISIBLE_CLEARED) + if (xlrec->flags & (XLH_UPDATE_NEW_ALL_VISIBLE_CLEARED | XLH_UPDATE_NEW_ALL_FROZEN_CLEARED)) { Relation reln = CreateFakeRelcacheEntry(rnode); Buffer vmbuffer = InvalidBuffer; + uint8 flags = 0; + + /* set flags for either clear one flags or both */ + flags |= xlrec->flags & XLH_UPDATE_NEW_ALL_VISIBLE_CLEARED ? + VISIBILITYMAP_ALL_VISIBLE : 0; + flags |= xlrec->flags & XLH_UPDATE_NEW_ALL_FROZEN_CLEARED ? 
+ VISIBILITYMAP_ALL_FROZEN : 0; visibilitymap_pin(reln, newblk, &vmbuffer); - visibilitymap_clear(reln, newblk, vmbuffer); + visibilitymap_clear(reln, newblk, vmbuffer, flags); ReleaseBuffer(vmbuffer); FreeFakeRelcacheEntry(reln); } @@ -8201,6 +8317,8 @@ heap_xlog_update(XLogReaderState *record, bool hot_update) if (xlrec->flags & XLH_UPDATE_NEW_ALL_VISIBLE_CLEARED) PageClearAllVisible(page); + if (xlrec->flags & XLH_UPDATE_NEW_ALL_FROZEN_CLEARED) + PageClearAllFrozen(page); freespace = PageGetHeapFreeSpace(page); /* needed to update FSM below */ diff --git a/src/backend/access/heap/hio.c b/src/backend/access/heap/hio.c index 6db73bf..4e19f9c 100644 --- a/src/backend/access/heap/hio.c +++ b/src/backend/access/heap/hio.c @@ -327,7 +327,8 @@ RelationGetBufferForTuple(Relation relation, Size len, { /* easy case */ buffer = ReadBufferBI(relation, targetBlock, bistate); - if (PageIsAllVisible(BufferGetPage(buffer))) + if (PageIsAllVisible(BufferGetPage(buffer)) || + PageIsAllFrozen(BufferGetPage(buffer))) visibilitymap_pin(relation, targetBlock, vmbuffer); LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE); } @@ -335,7 +336,8 @@ RelationGetBufferForTuple(Relation relation, Size len, { /* also easy case */ buffer = otherBuffer; - if (PageIsAllVisible(BufferGetPage(buffer))) + if (PageIsAllVisible(BufferGetPage(buffer)) || + PageIsAllFrozen(BufferGetPage(buffer))) visibilitymap_pin(relation, targetBlock, vmbuffer); LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE); } @@ -343,7 +345,8 @@ RelationGetBufferForTuple(Relation relation, Size len, { /* lock other buffer first */ buffer = ReadBuffer(relation, targetBlock); - if (PageIsAllVisible(BufferGetPage(buffer))) + if (PageIsAllVisible(BufferGetPage(buffer)) || + PageIsAllFrozen(BufferGetPage(buffer))) visibilitymap_pin(relation, targetBlock, vmbuffer); LockBuffer(otherBuffer, BUFFER_LOCK_EXCLUSIVE); LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE); @@ -352,7 +355,8 @@ RelationGetBufferForTuple(Relation relation, Size len, { /* lock target buffer first */ buffer = ReadBuffer(relation, targetBlock); - if (PageIsAllVisible(BufferGetPage(buffer))) + if (PageIsAllVisible(BufferGetPage(buffer)) || + PageIsAllFrozen(BufferGetPage(buffer))) visibilitymap_pin(relation, targetBlock, vmbuffer); LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE); LockBuffer(otherBuffer, BUFFER_LOCK_EXCLUSIVE); diff --git a/src/backend/access/heap/visibilitymap.c b/src/backend/access/heap/visibilitymap.c index 7c38772..cc2c274 100644 --- a/src/backend/access/heap/visibilitymap.c +++ b/src/backend/access/heap/visibilitymap.c @@ -21,11 +21,14 @@ * * NOTES * - * The visibility map is a bitmap with one bit per heap page. A set bit means - * that all tuples on the page are known visible to all transactions, and - * therefore the page doesn't need to be vacuumed. The map is conservative in - * the sense that we make sure that whenever a bit is set, we know the - * condition is true, but if a bit is not set, it might or might not be true. + * The visibility map is a bitmap with two bits (all-visible and all-frozen + * per heap page. A set all-visible bit means that all tuples on the page are + * known visible to all transactions, and therefore the page doesn't need to + * be vacuumed. A set all-frozen bit means that all tuples on the page are + * completely frozen, so the page doesn't need to be vacuumed even if whole + * table scanning vacuum is required. 
The map is conservative in the sense that + * we make sure that whenever a bit is set, we know the condition is true, + * but if a bit is not set, it might or might not be true. * * Clearing a visibility map bit is not separately WAL-logged. The callers * must make sure that whenever a bit is cleared, the bit is cleared on WAL @@ -33,21 +36,25 @@ * * When we *set* a visibility map during VACUUM, we must write WAL. This may * seem counterintuitive, since the bit is basically a hint: if it is clear, - * it may still be the case that every tuple on the page is visible to all - * transactions; we just don't know that for certain. The difficulty is that - * there are two bits which are typically set together: the PD_ALL_VISIBLE bit - * on the page itself, and the visibility map bit. If a crash occurs after the - * visibility map page makes it to disk and before the updated heap page makes - * it to disk, redo must set the bit on the heap page. Otherwise, the next - * insert, update, or delete on the heap page will fail to realize that the - * visibility map bit must be cleared, possibly causing index-only scans to - * return wrong answers. + * it may still be the case that every tuple on the page is visible or frozen + * to all transactions; we just don't know that for certain. The difficulty is + * that there are two bits which are typically set together: the PD_ALL_VISIBLE + * or PD_ALL_FROZEN bit on the page itself, and the visibility map bit. If a + * crash occurs after the visibility map page makes it to disk and before the + * updated heap page makes it to disk, redo must set the bit on the heap page. + * Otherwise, the next insert, update, or delete on the heap page will fail to + * realize that the visibility map bit must be cleared, possibly causing index-only + * scans to return wrong answers. * - * VACUUM will normally skip pages for which the visibility map bit is set; + * VACUUM will normally skip pages for which the visibility map either bit is set; * such pages can't contain any dead tuples and therefore don't need vacuuming. - * The visibility map is not used for anti-wraparound vacuums, because + * The visibility map is not used for anti-wraparound vacuums before 9.5, because * an anti-wraparound vacuum needs to freeze tuples and observe the latest xid * present in the table, even on pages that don't have any dead tuples. + * 9.6 or later, the visibility map has a additional bit which indicates all tuple + * on single page has been completely forzen, so the visibility map is also used for + * anti-wraparound vacuums. + * * * LOCKING * @@ -58,14 +65,14 @@ * section that logs the page modification. However, we don't want to hold * the buffer lock over any I/O that may be required to read in the visibility * map page. To avoid this, we examine the heap page before locking it; - * if the page-level PD_ALL_VISIBLE bit is set, we pin the visibility map - * bit. Then, we lock the buffer. But this creates a race condition: there - * is a possibility that in the time it takes to lock the buffer, the - * PD_ALL_VISIBLE bit gets set. If that happens, we have to unlock the - * buffer, pin the visibility map page, and relock the buffer. This shouldn't - * happen often, because only VACUUM currently sets visibility map bits, - * and the race will only occur if VACUUM processes a given page at almost - * exactly the same time that someone tries to further modify it. + * if the page-level PD_ALL_VISIBLE or PD_ALL_FROZEN bit is set, we pin the + * visibility map bit. Then, we lock the buffer. 
But this creates a race + * condition: there is a possibility that in the time it takes to lock the + * buffer, the PD_ALL_VISIBLE or PD_ALL_FROZEN bit gets set. If that happens, + * we have to unlock the buffer, pin the visibility map page, and relock the + * buffer. This shouldn't happen often, because only VACUUM currently sets + * visibility map bits, and the race will only occur if VACUUM processes a given + * page at almost exactly the same time that someone tries to further modify it. * * To set a bit, you need to hold a lock on the heap page. That prevents * the race condition where VACUUM sees that all tuples on the page are @@ -92,7 +99,7 @@ #include "utils/inval.h" -/*#define TRACE_VISIBILITYMAP */ +#define TRACE_VISIBILITYMAP /* * Size of the bitmap on each visibility map page, in bytes. There's no @@ -101,11 +108,14 @@ */ #define MAPSIZE (BLCKSZ - MAXALIGN(SizeOfPageHeaderData)) -/* Number of bits allocated for each heap block. */ -#define BITS_PER_HEAPBLOCK 1 +/* + * Number of bits allocated for each heap block. + * One for all-visible, other for all-frozen. +*/ +#define BITS_PER_HEAPBLOCK 2 /* Number of heap blocks we can represent in one byte. */ -#define HEAPBLOCKS_PER_BYTE 8 +#define HEAPBLOCKS_PER_BYTE 4 /* Number of heap blocks we can represent in one visibility map page. */ #define HEAPBLOCKS_PER_PAGE (MAPSIZE * HEAPBLOCKS_PER_BYTE) @@ -115,24 +125,42 @@ #define HEAPBLK_TO_MAPBYTE(x) (((x) % HEAPBLOCKS_PER_PAGE) / HEAPBLOCKS_PER_BYTE) #define HEAPBLK_TO_MAPBIT(x) ((x) % HEAPBLOCKS_PER_BYTE) -/* table for fast counting of set bits */ -static const uint8 number_of_ones[256] = { - 0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4, - 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, - 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, - 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, - 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, - 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, - 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, - 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, - 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, - 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, - 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, - 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, - 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, - 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, - 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, - 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8 +/* tables for fast counting of set bits for visible and freeze */ +static const uint8 number_of_ones_for_visible[256] = { + 0, 1, 0, 1, 1, 2, 1, 2, 0, 1, 0, 1, 1, 2, 1, 2, + 1, 2, 1, 2, 2, 3, 2, 3, 1, 2, 1, 2, 2, 3, 2, 3, + 0, 1, 0, 1, 1, 2, 1, 2, 0, 1, 0, 1, 1, 2, 1, 2, + 1, 2, 1, 2, 2, 3, 2, 3, 1, 2, 1, 2, 2, 3, 2, 3, + 1, 2, 1, 2, 2, 3, 2, 3, 1, 2, 1, 2, 2, 3, 2, 3, + 2, 3, 2, 3, 3, 4, 3, 4, 2, 3, 2, 3, 3, 4, 3, 4, + 1, 2, 1, 2, 2, 3, 2, 3, 1, 2, 1, 2, 2, 3, 2, 3, + 2, 3, 2, 3, 3, 4, 3, 4, 2, 3, 2, 3, 3, 4, 3, 4, + 0, 1, 0, 1, 1, 2, 1, 2, 0, 1, 0, 1, 1, 2, 1, 2, + 1, 2, 1, 2, 2, 3, 2, 3, 1, 2, 1, 2, 2, 3, 2, 3, + 0, 1, 0, 1, 1, 2, 1, 2, 0, 1, 0, 1, 1, 2, 1, 2, + 1, 2, 1, 2, 2, 3, 2, 3, 1, 2, 1, 2, 2, 3, 2, 3, + 1, 2, 1, 2, 2, 3, 2, 3, 1, 2, 1, 2, 2, 3, 2, 3, + 2, 3, 2, 3, 3, 4, 3, 4, 2, 3, 2, 3, 3, 4, 3, 4, + 1, 2, 1, 2, 2, 3, 2, 3, 1, 2, 1, 2, 2, 3, 2, 3, + 2, 3, 2, 3, 3, 4, 3, 4, 2, 3, 2, 3, 3, 4, 3, 4 +}; +static const uint8 number_of_ones_for_freeze[256] = { + 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 2, 2, 1, 1, 2, 2, + 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 2, 2, 1, 1, 2, 2, + 1, 1, 2, 2, 1, 1, 2, 2, 2, 2, 3, 3, 2, 2, 3, 3, 
+ 1, 1, 2, 2, 1, 1, 2, 2, 2, 2, 3, 3, 2, 2, 3, 3, + 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 2, 2, 1, 1, 2, 2, + 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 2, 2, 1, 1, 2, 2, + 1, 1, 2, 2, 1, 1, 2, 2, 2, 2, 3, 3, 2, 2, 3, 3, + 1, 1, 2, 2, 1, 1, 2, 2, 2, 2, 3, 3, 2, 2, 3, 3, + 1, 1, 2, 2, 1, 1, 2, 2, 2, 2, 3, 3, 2, 2, 3, 3, + 1, 1, 2, 2, 1, 1, 2, 2, 2, 2, 3, 3, 2, 2, 3, 3, + 2, 2, 3, 3, 2, 2, 3, 3, 3, 3, 4, 4, 3, 3, 4, 4, + 2, 2, 3, 3, 2, 2, 3, 3, 3, 3, 4, 4, 3, 3, 4, 4, + 1, 1, 2, 2, 1, 1, 2, 2, 2, 2, 3, 3, 2, 2, 3, 3, + 1, 1, 2, 2, 1, 1, 2, 2, 2, 2, 3, 3, 2, 2, 3, 3, + 2, 2, 3, 3, 2, 2, 3, 3, 3, 3, 4, 4, 3, 3, 4, 4, + 2, 2, 3, 3, 2, 2, 3, 3, 3, 3, 4, 4, 3, 3, 4, 4 }; /* prototypes for internal routines */ @@ -141,23 +169,23 @@ static void vm_extend(Relation rel, BlockNumber nvmblocks); /* - * visibilitymap_clear - clear a bit in visibility map + * visibilitymap_clear - clear bits in visibility map * * You must pass a buffer containing the correct map page to this function. * Call visibilitymap_pin first to pin the right one. This function doesn't do - * any I/O. + * any I/O. Caller must pass flags which indicates what flags we want to clear. */ void -visibilitymap_clear(Relation rel, BlockNumber heapBlk, Buffer buf) +visibilitymap_clear(Relation rel, BlockNumber heapBlk, Buffer buf, uint8 flags) { BlockNumber mapBlock = HEAPBLK_TO_MAPBLOCK(heapBlk); int mapByte = HEAPBLK_TO_MAPBYTE(heapBlk); int mapBit = HEAPBLK_TO_MAPBIT(heapBlk); - uint8 mask = 1 << mapBit; + uint8 mask = flags << (BITS_PER_HEAPBLOCK * mapBit); char *map; #ifdef TRACE_VISIBILITYMAP - elog(DEBUG1, "vm_clear %s %d", RelationGetRelationName(rel), heapBlk); + elog(DEBUG1, "vm_clear %s %d %u", RelationGetRelationName(rel), heapBlk, flags); #endif if (!BufferIsValid(buf) || BufferGetBlockNumber(buf) != mapBlock) @@ -225,7 +253,7 @@ visibilitymap_pin_ok(BlockNumber heapBlk, Buffer buf) } /* - * visibilitymap_set - set a bit on a previously pinned page + * visibilitymap_set - set bits on a previously pinned page * * recptr is the LSN of the XLOG record we're replaying, if we're in recovery, * or InvalidXLogRecPtr in normal running. The page LSN is advanced to the @@ -234,10 +262,11 @@ visibilitymap_pin_ok(BlockNumber heapBlk, Buffer buf) * marked all-visible; it is needed for Hot Standby, and can be * InvalidTransactionId if the page contains no tuples. * - * Caller is expected to set the heap page's PD_ALL_VISIBLE bit before calling - * this function. Except in recovery, caller should also pass the heap - * buffer. When checksums are enabled and we're not in recovery, we must add - * the heap buffer to the WAL chain to protect it from being torn. + * Caller is expected to set the heap page's PD_ALL_VISIBLE or PD_ALL_FROZEN + * bit before calling this function. Except in recovery, caller should also + * pass the heap buffer and flags which indicates what flag we want to set. + * When checksums are enabled and we're not in recovery, we must add the heap + * buffer to the WAL chain to protect it from being torn. * * You must pass a buffer containing the correct map page to this function. * Call visibilitymap_pin first to pin the right one. 
This function doesn't do @@ -245,7 +274,8 @@ visibilitymap_pin_ok(BlockNumber heapBlk, Buffer buf) */ void visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf, - XLogRecPtr recptr, Buffer vmBuf, TransactionId cutoff_xid) + XLogRecPtr recptr, Buffer vmBuf, TransactionId cutoff_xid, + uint8 flags) { BlockNumber mapBlock = HEAPBLK_TO_MAPBLOCK(heapBlk); uint32 mapByte = HEAPBLK_TO_MAPBYTE(heapBlk); @@ -254,7 +284,7 @@ visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf, char *map; #ifdef TRACE_VISIBILITYMAP - elog(DEBUG1, "vm_set %s %d", RelationGetRelationName(rel), heapBlk); + elog(DEBUG1, "vm_set %s %d %u", RelationGetRelationName(rel), heapBlk, flags); #endif Assert(InRecovery || XLogRecPtrIsInvalid(recptr)); @@ -272,11 +302,11 @@ visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf, map = PageGetContents(page); LockBuffer(vmBuf, BUFFER_LOCK_EXCLUSIVE); - if (!(map[mapByte] & (1 << mapBit))) + if (!(map[mapByte] & (flags << (BITS_PER_HEAPBLOCK * mapBit)))) { START_CRIT_SECTION(); - map[mapByte] |= (1 << mapBit); + map[mapByte] |= (flags << (BITS_PER_HEAPBLOCK * mapBit)); MarkBufferDirty(vmBuf); if (RelationNeedsWAL(rel)) @@ -285,7 +315,7 @@ visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf, { Assert(!InRecovery); recptr = log_heap_visible(rel->rd_node, heapBuf, vmBuf, - cutoff_xid); + cutoff_xid, flags); /* * If data checksums are enabled (or wal_log_hints=on), we @@ -295,11 +325,15 @@ visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf, { Page heapPage = BufferGetPage(heapBuf); - /* caller is expected to set PD_ALL_VISIBLE first */ - Assert(PageIsAllVisible(heapPage)); + /* + * caller is expected to set PD_ALL_VISIBLE or + * PD_ALL_FROZEN first. + */ + Assert(PageIsAllVisible(heapPage) || PageIsAllFrozen(heapPage)); PageSetLSN(heapPage, recptr); } } + PageSetLSN(page, recptr); } @@ -310,15 +344,16 @@ visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf, } /* - * visibilitymap_test - test if a bit is set + * visibilitymap_test - test if bits are set * - * Are all tuples on heapBlk visible to all, according to the visibility map? + * Are all tuples on heapBlk visible or frozen to all, according to the visibility map? * * On entry, *buf should be InvalidBuffer or a valid buffer returned by an * earlier call to visibilitymap_pin or visibilitymap_test on the same * relation. On return, *buf is a valid buffer with the map page containing * the bit for heapBlk, or InvalidBuffer. The caller is responsible for - * releasing *buf after it's done testing and setting bits. + * releasing *buf after it's done testing and setting bits, and must set flags + * which indicates what flag we want to test. * * NOTE: This function is typically called without a lock on the heap page, * so somebody else could change the bit just after we look at it. In fact, @@ -328,7 +363,7 @@ visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf, * all concurrency issues! 
*/ bool -visibilitymap_test(Relation rel, BlockNumber heapBlk, Buffer *buf) +visibilitymap_test(Relation rel, BlockNumber heapBlk, Buffer *buf, uint8 flags) { BlockNumber mapBlock = HEAPBLK_TO_MAPBLOCK(heapBlk); uint32 mapByte = HEAPBLK_TO_MAPBYTE(heapBlk); @@ -337,7 +372,7 @@ visibilitymap_test(Relation rel, BlockNumber heapBlk, Buffer *buf) char *map; #ifdef TRACE_VISIBILITYMAP - elog(DEBUG1, "vm_test %s %d", RelationGetRelationName(rel), heapBlk); + elog(DEBUG1, "vm_test %s %d %u", RelationGetRelationName(rel), heapBlk, flags); #endif /* Reuse the old pinned buffer if possible */ @@ -360,11 +395,11 @@ visibilitymap_test(Relation rel, BlockNumber heapBlk, Buffer *buf) map = PageGetContents(BufferGetPage(*buf)); /* - * A single-bit read is atomic. There could be memory-ordering effects + * A single or double bit read is atomic. There could be memory-ordering effects * here, but for performance reasons we make it the caller's job to worry * about that. */ - result = (map[mapByte] & (1 << mapBit)) ? true : false; + result = (map[mapByte] & (flags << (BITS_PER_HEAPBLOCK * mapBit))) ? true : false; return result; } @@ -374,10 +409,12 @@ visibilitymap_test(Relation rel, BlockNumber heapBlk, Buffer *buf) * * Note: we ignore the possibility of race conditions when the table is being * extended concurrently with the call. New pages added to the table aren't - * going to be marked all-visible, so they won't affect the result. + * going to be marked all-visible or all-frozen, so they won't affect the result. + * if for_visible is true, we count the number of all-visible flag. If false, + * we count the number of all-frozen flag. */ BlockNumber -visibilitymap_count(Relation rel) +visibilitymap_count(Relation rel, bool for_visible) { BlockNumber result = 0; BlockNumber mapBlock; @@ -406,7 +443,8 @@ visibilitymap_count(Relation rel) for (i = 0; i < MAPSIZE; i++) { - result += number_of_ones[map[i]]; + result += for_visible ? 
+ number_of_ones_for_visible[map[i]] : number_of_ones_for_freeze[map[i]]; } ReleaseBuffer(mapBuffer); diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c index 4246554..65753d9 100644 --- a/src/backend/catalog/index.c +++ b/src/backend/catalog/index.c @@ -1919,11 +1919,18 @@ index_update_stats(Relation rel, { BlockNumber relpages = RelationGetNumberOfBlocks(rel); BlockNumber relallvisible; + BlockNumber relallfrozen; if (rd_rel->relkind != RELKIND_INDEX) - relallvisible = visibilitymap_count(rel); + { + relallvisible = visibilitymap_count(rel, true); + relallfrozen = visibilitymap_count(rel, false); + } else /* don't bother for indexes */ + { relallvisible = 0; + relallfrozen = 0; + } if (rd_rel->relpages != (int32) relpages) { diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c index 861048f..1eaf2da 100644 --- a/src/backend/commands/analyze.c +++ b/src/backend/commands/analyze.c @@ -572,7 +572,8 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params, vac_update_relstats(onerel, relpages, totalrows, - visibilitymap_count(onerel), + visibilitymap_count(onerel, true), + visibilitymap_count(onerel, false), hasindex, InvalidTransactionId, InvalidMultiXactId, @@ -595,6 +596,7 @@ do_analyze_rel(Relation onerel, int options, VacuumParams *params, RelationGetNumberOfBlocks(Irel[ind]), totalindexrows, 0, + 0, false, InvalidTransactionId, InvalidMultiXactId, diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c index 7ab4874..d3725dd 100644 --- a/src/backend/commands/cluster.c +++ b/src/backend/commands/cluster.c @@ -22,6 +22,7 @@ #include "access/rewriteheap.h" #include "access/transam.h" #include "access/tuptoaster.h" +#include "access/visibilitymap.h" #include "access/xact.h" #include "access/xlog.h" #include "catalog/catalog.h" diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index baf66f1..d68c7c4 100644 --- a/src/backend/commands/vacuum.c +++ b/src/backend/commands/vacuum.c @@ -744,6 +744,7 @@ void vac_update_relstats(Relation relation, BlockNumber num_pages, double num_tuples, BlockNumber num_all_visible_pages, + BlockNumber num_all_frozen_pages, bool hasindex, TransactionId frozenxid, MultiXactId minmulti, bool in_outer_xact) @@ -781,6 +782,11 @@ vac_update_relstats(Relation relation, pgcform->relallvisible = (int32) num_all_visible_pages; dirty = true; } + if (pgcform->relallfrozen != (int32) num_all_frozen_pages) + { + pgcform->relallfrozen = (int32) num_all_frozen_pages; + dirty = true; + } /* Apply DDL updates, but not inside an outer transaction (see above) */ diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c index a01cfb4..e4e60eb 100644 --- a/src/backend/commands/vacuumlazy.c +++ b/src/backend/commands/vacuumlazy.c @@ -106,6 +106,8 @@ typedef struct LVRelStats BlockNumber rel_pages; /* total number of pages */ BlockNumber scanned_pages; /* number of pages we examined */ BlockNumber pinskipped_pages; /* # of pages we skipped due to a pin */ + BlockNumber vmskipped_pages; /* # of pages we skipped by all-frozen bit + of visibility map */ double scanned_tuples; /* counts only tuples on scanned pages */ double old_rel_tuples; /* previous value of pg_class.reltuples */ double new_rel_tuples; /* new estimated total # of tuples */ @@ -188,7 +190,8 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params, MultiXactId mxactFullScanLimit; BlockNumber new_rel_pages; double new_rel_tuples; - BlockNumber new_rel_allvisible; + BlockNumber 
new_rel_allvisible, + new_rel_allfrozen; double new_live_tuples; TransactionId new_frozen_xid; MultiXactId new_min_multi; @@ -222,6 +225,8 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params, * than or equal to the requested Xid full-table scan limit; or if the * table's minimum MultiXactId is older than or equal to the requested * mxid full-table scan limit. + * Even if scan_all is set so far, we could skip to scan some pages + * according by frozen map. */ scan_all = TransactionIdPrecedesOrEquals(onerel->rd_rel->relfrozenxid, xidFullScanLimit); @@ -253,14 +258,16 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params, * NB: We need to check this before truncating the relation, because that * will change ->rel_pages. */ - if (vacrelstats->scanned_pages < vacrelstats->rel_pages) + if ((vacrelstats->scanned_pages + vacrelstats->vmskipped_pages) + < vacrelstats->rel_pages) { - Assert(!scan_all); scanned_all = false; } else scanned_all = true; + scanned_all |= scan_all; + /* * Optionally truncate the relation. * @@ -301,10 +308,16 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params, new_rel_tuples = vacrelstats->old_rel_tuples; } - new_rel_allvisible = visibilitymap_count(onerel); + /* true means that count for all-visible */ + new_rel_allvisible = visibilitymap_count(onerel, true); if (new_rel_allvisible > new_rel_pages) new_rel_allvisible = new_rel_pages; + /* false means that count for all-frozen */ + new_rel_allfrozen = visibilitymap_count(onerel, false); + if (new_rel_allfrozen > new_rel_pages) + new_rel_allfrozen = new_rel_pages; + new_frozen_xid = scanned_all ? FreezeLimit : InvalidTransactionId; new_min_multi = scanned_all ? MultiXactCutoff : InvalidMultiXactId; @@ -312,6 +325,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params, new_rel_pages, new_rel_tuples, new_rel_allvisible, + new_rel_allfrozen, vacrelstats->hasindex, new_frozen_xid, new_min_multi, @@ -486,9 +500,13 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, * consecutive pages. Since we're reading sequentially, the OS should be * doing readahead for us, so there's no gain in skipping a page now and * then; that's likely to disable readahead and so be counterproductive. - * Also, skipping even a single page means that we can't update - * relfrozenxid, so we only want to do it if we can skip a goodly number - * of pages. + * Also, skipping even a single page accorind to all-visible bit of + * visibility map means that we can't update relfrozenxid, so we only want + * to do it if we can skip a goodly number. On the other hand, we count + * both how many pages we skipped according to all-frozen bit of visibility + * map and how many pages we freeze page, so we can update relfrozenxid if + * the sum of their is as many as tuples per page. + * XXX : We use only all-visible bit to determine skip page for now. 
* * Before entering the main loop, establish the invariant that * next_not_all_visible_block is the next block number >= blkno that's not @@ -515,7 +533,8 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, next_not_all_visible_block < nblocks; next_not_all_visible_block++) { - if (!visibilitymap_test(onerel, next_not_all_visible_block, &vmbuffer)) + if (!visibilitymap_test(onerel, next_not_all_visible_block, &vmbuffer, + VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN)) break; vacuum_delay_point(); } @@ -534,6 +553,8 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, hastup; int prev_dead_count; int nfrozen; + int already_nfrozen; /* # of tuples already frozen */ + int ntup_in_blk; /* # of tuples in single page */ Size freespace; bool all_visible_according_to_vm; bool all_visible; @@ -548,7 +569,8 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, next_not_all_visible_block++) { if (!visibilitymap_test(onerel, next_not_all_visible_block, - &vmbuffer)) + &vmbuffer, + VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN)) break; vacuum_delay_point(); } @@ -566,9 +588,20 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, } else { - /* Current block is all-visible */ + /* + * Current block is all-visible. + * If visibility map represents that it's all frozen, we can + * skip to vacuum page unconditionally. + */ + if (visibilitymap_test(onerel, blkno, &vmbuffer, VISIBILITYMAP_ALL_FROZEN)) + { + vacrelstats->vmskipped_pages++; + continue; + } + if (skipping_all_visible_blocks && !scan_all) continue; + all_visible_according_to_vm = true; } @@ -740,7 +773,8 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, PageSetAllVisible(page); visibilitymap_set(onerel, blkno, buf, InvalidXLogRecPtr, - vmbuffer, InvalidTransactionId); + vmbuffer, InvalidTransactionId, + VISIBILITYMAP_ALL_VISIBLE); END_CRIT_SECTION(); } @@ -764,6 +798,8 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, all_visible = true; has_dead_tuples = false; nfrozen = 0; + already_nfrozen = 0; + ntup_in_blk = 0; hastup = false; prev_dead_count = vacrelstats->num_dead_tuples; maxoff = PageGetMaxOffsetNumber(page); @@ -918,8 +954,13 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, else { num_tuples += 1; + ntup_in_blk += 1; hastup = true; + /* If current tuple is already frozen, count it up */ + if (HeapTupleHeaderXminFrozen(tuple.t_data)) + already_nfrozen += 1; + /* * Each non-removable tuple must be checked to see if it needs * freezing. Note we already have exclusive buffer lock. @@ -931,11 +972,12 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, } /* scan along page */ /* - * If we froze any tuples, mark the buffer dirty, and write a WAL - * record recording the changes. We must log the changes to be - * crash-safe against future truncation of CLOG. + * If we froze any tuples or any tuples are already frozen, + * mark the buffer dirty, and write a WAL record recording the changes. + * We must log the changes to be crash-safe against future truncation + * of CLOG. */ - if (nfrozen > 0) + if (nfrozen > 0 || already_nfrozen > 0) { START_CRIT_SECTION(); @@ -953,8 +995,20 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, heap_execute_freeze_tuple(htup, &frozen[i]); } + /* + * As a result of scanning a page, we ensure that all tuples + * are completely frozen. Set VISIBILITYMAP_ALL_FROZEN bit on + * visibility map and PD_ALL_FROZEN flag on page. 
+ */ + if (ntup_in_blk == (nfrozen + already_nfrozen)) + { + PageSetAllFrozen(page); + visibilitymap_set(onerel, blkno, buf, InvalidXLogRecPtr, vmbuffer, + InvalidTransactionId, VISIBILITYMAP_ALL_FROZEN); + } + /* Now WAL-log freezing if necessary */ - if (RelationNeedsWAL(onerel)) + if (nfrozen > 0 && RelationNeedsWAL(onerel)) { XLogRecPtr recptr; @@ -1007,7 +1061,7 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, PageSetAllVisible(page); MarkBufferDirty(buf); visibilitymap_set(onerel, blkno, buf, InvalidXLogRecPtr, - vmbuffer, visibility_cutoff_xid); + vmbuffer, visibility_cutoff_xid, VISIBILITYMAP_ALL_VISIBLE); } /* @@ -1018,11 +1072,11 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, * that something bad has happened. */ else if (all_visible_according_to_vm && !PageIsAllVisible(page) - && visibilitymap_test(onerel, blkno, &vmbuffer)) + && visibilitymap_test(onerel, blkno, &vmbuffer, VISIBILITYMAP_ALL_VISIBLE)) { elog(WARNING, "page is not marked all-visible but visibility map bit is set in relation \"%s\" page %u", relname, blkno); - visibilitymap_clear(onerel, blkno, vmbuffer); + visibilitymap_clear(onerel, blkno, vmbuffer, VISIBILITYMAP_ALL_VISIBLE); } /* @@ -1044,7 +1098,7 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, relname, blkno); PageClearAllVisible(page); MarkBufferDirty(buf); - visibilitymap_clear(onerel, blkno, vmbuffer); + visibilitymap_clear(onerel, blkno, vmbuffer, VISIBILITYMAP_ALL_VISIBLE); } UnlockReleaseBuffer(buf); @@ -1078,7 +1132,7 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, num_tuples); /* - * Release any remaining pin on visibility map page. + * Release any remaining pin on visibility map and frozen map page. */ if (BufferIsValid(vmbuffer)) { @@ -1285,11 +1339,11 @@ lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer, * flag is now set, also set the VM bit. */ if (PageIsAllVisible(page) && - !visibilitymap_test(onerel, blkno, vmbuffer)) + !visibilitymap_test(onerel, blkno, vmbuffer, VISIBILITYMAP_ALL_VISIBLE)) { Assert(BufferIsValid(*vmbuffer)); visibilitymap_set(onerel, blkno, buffer, InvalidXLogRecPtr, *vmbuffer, - visibility_cutoff_xid); + visibility_cutoff_xid, VISIBILITYMAP_ALL_VISIBLE); } return tupindex; @@ -1408,6 +1462,7 @@ lazy_cleanup_index(Relation indrel, stats->num_pages, stats->num_index_tuples, 0, + 0, false, InvalidTransactionId, InvalidMultiXactId, diff --git a/src/backend/executor/nodeIndexonlyscan.c b/src/backend/executor/nodeIndexonlyscan.c index 9f54c46..08df289 100644 --- a/src/backend/executor/nodeIndexonlyscan.c +++ b/src/backend/executor/nodeIndexonlyscan.c @@ -116,7 +116,7 @@ IndexOnlyNext(IndexOnlyScanState *node) */ if (!visibilitymap_test(scandesc->heapRelation, ItemPointerGetBlockNumber(tid), - &node->ioss_VMBuffer)) + &node->ioss_VMBuffer, VISIBILITYMAP_ALL_VISIBLE)) { /* * Rats, we have to visit the heap to check visibility. 
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c index 874ca6a..376841a 100644 --- a/src/backend/executor/nodeModifyTable.c +++ b/src/backend/executor/nodeModifyTable.c @@ -127,7 +127,7 @@ ExecCheckPlanOutput(Relation resultRel, List *targetList) if (attno != resultDesc->natts) ereport(ERROR, (errcode(ERRCODE_DATATYPE_MISMATCH), - errmsg("table row type and query-specified row type do not match"), + errmsg("table row type and query-specified row type do not match"), errdetail("Query has too few columns."))); } diff --git a/src/include/access/heapam_xlog.h b/src/include/access/heapam_xlog.h index caa0f14..d2f083b 100644 --- a/src/include/access/heapam_xlog.h +++ b/src/include/access/heapam_xlog.h @@ -64,9 +64,10 @@ */ /* PD_ALL_VISIBLE was cleared */ #define XLH_INSERT_ALL_VISIBLE_CLEARED (1<<0) -#define XLH_INSERT_LAST_IN_MULTI (1<<1) -#define XLH_INSERT_IS_SPECULATIVE (1<<2) -#define XLH_INSERT_CONTAINS_NEW_TUPLE (1<<3) +#define XLH_INSERT_ALL_FROZEN_CLEARED (1<<1) +#define XLH_INSERT_LAST_IN_MULTI (1<<2) +#define XLH_INSERT_IS_SPECULATIVE (1<<3) +#define XLH_INSERT_CONTAINS_NEW_TUPLE (1<<4) /* * xl_heap_update flag values, 8 bits are available. @@ -75,11 +76,15 @@ #define XLH_UPDATE_OLD_ALL_VISIBLE_CLEARED (1<<0) /* PD_ALL_VISIBLE was cleared in the 2nd page */ #define XLH_UPDATE_NEW_ALL_VISIBLE_CLEARED (1<<1) -#define XLH_UPDATE_CONTAINS_OLD_TUPLE (1<<2) -#define XLH_UPDATE_CONTAINS_OLD_KEY (1<<3) -#define XLH_UPDATE_CONTAINS_NEW_TUPLE (1<<4) -#define XLH_UPDATE_PREFIX_FROM_OLD (1<<5) -#define XLH_UPDATE_SUFFIX_FROM_OLD (1<<6) +/* PD_FROZEN_VISIBLE was cleared */ +#define XLH_UPDATE_OLD_ALL_FROZEN_CLEARED (1<<2) +/* PD_FROZEN_VISIBLE was cleared in the 2nd page */ +#define XLH_UPDATE_NEW_ALL_FROZEN_CLEARED (1<<3) +#define XLH_UPDATE_CONTAINS_OLD_TUPLE (1<<4) +#define XLH_UPDATE_CONTAINS_OLD_KEY (1<<5) +#define XLH_UPDATE_CONTAINS_NEW_TUPLE (1<<6) +#define XLH_UPDATE_PREFIX_FROM_OLD (1<<7) +#define XLH_UPDATE_SUFFIX_FROM_OLD (1<<8) /* convenience macro for checking whether any form of old tuple was logged */ #define XLH_UPDATE_CONTAINS_OLD \ @@ -90,9 +95,10 @@ */ /* PD_ALL_VISIBLE was cleared */ #define XLH_DELETE_ALL_VISIBLE_CLEARED (1<<0) -#define XLH_DELETE_CONTAINS_OLD_TUPLE (1<<1) -#define XLH_DELETE_CONTAINS_OLD_KEY (1<<2) -#define XLH_DELETE_IS_SUPER (1<<3) +#define XLH_DELETE_ALL_FROZEN_CLEARED (1<<1) +#define XLH_DELETE_CONTAINS_OLD_TUPLE (1<<2) +#define XLH_DELETE_CONTAINS_OLD_KEY (1<<3) +#define XLH_DELETE_IS_SUPER (1<<4) /* convenience macro for checking whether any form of old tuple was logged */ #define XLH_DELETE_CONTAINS_OLD \ @@ -320,9 +326,10 @@ typedef struct xl_heap_freeze_page typedef struct xl_heap_visible { TransactionId cutoff_xid; + uint8 flags; } xl_heap_visible; -#define SizeOfHeapVisible (offsetof(xl_heap_visible, cutoff_xid) + sizeof(TransactionId)) +#define SizeOfHeapVisible (offsetof(xl_heap_visible, flags) + sizeof(uint8)) typedef struct xl_heap_new_cid { @@ -382,6 +389,8 @@ extern XLogRecPtr log_heap_clean(Relation reln, Buffer buffer, extern XLogRecPtr log_heap_freeze(Relation reln, Buffer buffer, TransactionId cutoff_xid, xl_heap_freeze_tuple *tuples, int ntuples); +extern XLogRecPtr log_heap_frozenmap(RelFileNode rnode, Buffer heap_buffer, + Buffer fm_buffer); extern bool heap_prepare_freeze_tuple(HeapTupleHeader tuple, TransactionId cutoff_xid, TransactionId cutoff_multi, @@ -389,6 +398,6 @@ extern bool heap_prepare_freeze_tuple(HeapTupleHeader tuple, extern void heap_execute_freeze_tuple(HeapTupleHeader 
tuple, xl_heap_freeze_tuple *xlrec_tp); extern XLogRecPtr log_heap_visible(RelFileNode rnode, Buffer heap_buffer, - Buffer vm_buffer, TransactionId cutoff_xid); + Buffer vm_buffer, TransactionId cutoff_xid, uint8 flags); #endif /* HEAPAM_XLOG_H */ diff --git a/src/include/access/visibilitymap.h b/src/include/access/visibilitymap.h index 0c0e0ef..53d8103 100644 --- a/src/include/access/visibilitymap.h +++ b/src/include/access/visibilitymap.h @@ -19,15 +19,21 @@ #include "storage/buf.h" #include "utils/relcache.h" +/* Flags for bit map */ +#define VISIBILITYMAP_ALL_VISIBLE 0x01 +#define VISIBILITYMAP_ALL_FROZEN 0x02 + extern void visibilitymap_clear(Relation rel, BlockNumber heapBlk, - Buffer vmbuf); + Buffer vmbuf, uint8 flags); extern void visibilitymap_pin(Relation rel, BlockNumber heapBlk, Buffer *vmbuf); extern bool visibilitymap_pin_ok(BlockNumber heapBlk, Buffer vmbuf); extern void visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf, - XLogRecPtr recptr, Buffer vmBuf, TransactionId cutoff_xid); -extern bool visibilitymap_test(Relation rel, BlockNumber heapBlk, Buffer *vmbuf); -extern BlockNumber visibilitymap_count(Relation rel); + XLogRecPtr recptr, Buffer vmBuf, TransactionId cutoff_xid, + uint8 flags); +extern bool visibilitymap_test(Relation rel, BlockNumber heapBlk, Buffer *vmbuf, + uint8 flags); +extern BlockNumber visibilitymap_count(Relation rel, bool for_visible); extern void visibilitymap_truncate(Relation rel, BlockNumber nheapblocks); #endif /* VISIBILITYMAP_H */ diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h index e526cd9..ea0f7c1 100644 --- a/src/include/catalog/pg_class.h +++ b/src/include/catalog/pg_class.h @@ -47,6 +47,8 @@ CATALOG(pg_class,1259) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83) BKI_SCHEMA_MACRO float4 reltuples; /* # of tuples (not always up-to-date) */ int32 relallvisible; /* # of all-visible blocks (not always * up-to-date) */ + int32 relallfrozen; /* # of all-frozen blocks (not always + up-to-date) */ Oid reltoastrelid; /* OID of toast table; 0 if none */ bool relhasindex; /* T if has (or has had) any indexes */ bool relisshared; /* T if shared across databases */ @@ -95,7 +97,7 @@ typedef FormData_pg_class *Form_pg_class; * ---------------- */ -#define Natts_pg_class 30 +#define Natts_pg_class 31 #define Anum_pg_class_relname 1 #define Anum_pg_class_relnamespace 2 #define Anum_pg_class_reltype 3 @@ -107,25 +109,26 @@ typedef FormData_pg_class *Form_pg_class; #define Anum_pg_class_relpages 9 #define Anum_pg_class_reltuples 10 #define Anum_pg_class_relallvisible 11 -#define Anum_pg_class_reltoastrelid 12 -#define Anum_pg_class_relhasindex 13 -#define Anum_pg_class_relisshared 14 -#define Anum_pg_class_relpersistence 15 -#define Anum_pg_class_relkind 16 -#define Anum_pg_class_relnatts 17 -#define Anum_pg_class_relchecks 18 -#define Anum_pg_class_relhasoids 19 -#define Anum_pg_class_relhaspkey 20 -#define Anum_pg_class_relhasrules 21 -#define Anum_pg_class_relhastriggers 22 -#define Anum_pg_class_relhassubclass 23 -#define Anum_pg_class_relrowsecurity 24 -#define Anum_pg_class_relispopulated 25 -#define Anum_pg_class_relreplident 26 -#define Anum_pg_class_relfrozenxid 27 -#define Anum_pg_class_relminmxid 28 -#define Anum_pg_class_relacl 29 -#define Anum_pg_class_reloptions 30 +#define Anum_pg_class_relallfrozen 12 +#define Anum_pg_class_reltoastrelid 13 +#define Anum_pg_class_relhasindex 14 +#define Anum_pg_class_relisshared 15 +#define Anum_pg_class_relpersistence 16 +#define Anum_pg_class_relkind 17 +#define 
Anum_pg_class_relnatts 18 +#define Anum_pg_class_relchecks 19 +#define Anum_pg_class_relhasoids 20 +#define Anum_pg_class_relhaspkey 21 +#define Anum_pg_class_relhasrules 22 +#define Anum_pg_class_relhastriggers 23 +#define Anum_pg_class_relhassubclass 24 +#define Anum_pg_class_relrowsecurity 25 +#define Anum_pg_class_relispopulated 26 +#define Anum_pg_class_relreplident 27 +#define Anum_pg_class_relfrozenxid 28 +#define Anum_pg_class_relminmxid 29 +#define Anum_pg_class_relacl 30 +#define Anum_pg_class_reloptions 31 /* ---------------- * initial contents of pg_class @@ -140,13 +143,13 @@ typedef FormData_pg_class *Form_pg_class; * Note: "3" in the relfrozenxid column stands for FirstNormalTransactionId; * similarly, "1" in relminmxid stands for FirstMultiXactId */ -DATA(insert OID = 1247 ( pg_type PGNSP 71 0 PGUID 0 0 0 0 0 0 0 f f p r 30 0 t f f f f f t n 3 1 _null_ _null_ )); +DATA(insert OID = 1247 ( pg_type PGNSP 71 0 PGUID 0 0 0 0 0 0 0 0 f f p r 30 0 t f f f f f t n 3 1 _null_ _null_ )); DESCR(""); -DATA(insert OID = 1249 ( pg_attribute PGNSP 75 0 PGUID 0 0 0 0 0 0 0 f f p r 21 0 f f f f f f t n 3 1 _null_ _null_ )); +DATA(insert OID = 1249 ( pg_attribute PGNSP 75 0 PGUID 0 0 0 0 0 0 0 0 f f p r 21 0 f f f f f f t n 3 1 _null_ _null_ )); DESCR(""); -DATA(insert OID = 1255 ( pg_proc PGNSP 81 0 PGUID 0 0 0 0 0 0 0 f f p r 28 0 t f f f f f t n 3 1 _null_ _null_ )); +DATA(insert OID = 1255 ( pg_proc PGNSP 81 0 PGUID 0 0 0 0 0 0 0 0 f f p r 28 0 t f f f f f t n 3 1 _null_ _null_ )); DESCR(""); -DATA(insert OID = 1259 ( pg_class PGNSP 83 0 PGUID 0 0 0 0 0 0 0 f f p r 30 0 t f f f f f t n 3 1 _null_ _null_ )); +DATA(insert OID = 1259 ( pg_class PGNSP 83 0 PGUID 0 0 0 0 0 0 0 0 f f p r 31 0 t f f f f f t n 3 1 _null_ _null_ )); DESCR(""); diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h index e3a31af..d2bae2d 100644 --- a/src/include/commands/vacuum.h +++ b/src/include/commands/vacuum.h @@ -172,6 +172,7 @@ extern void vac_update_relstats(Relation relation, BlockNumber num_pages, double num_tuples, BlockNumber num_all_visible_pages, + BlockNumber num_all_frozen_pages, bool hasindex, TransactionId frozenxid, MultiXactId minmulti, diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h index a2f78ee..7bf2718 100644 --- a/src/include/storage/bufpage.h +++ b/src/include/storage/bufpage.h @@ -178,8 +178,10 @@ typedef PageHeaderData *PageHeader; * tuple? */ #define PD_ALL_VISIBLE 0x0004 /* all tuples on page are visible to * everyone */ +#define PD_ALL_FROZEN 0x0008 /* all tuples on page are completely + frozen */ -#define PD_VALID_FLAG_BITS 0x0007 /* OR of all valid pd_flags bits */ +#define PD_VALID_FLAG_BITS 0x000F /* OR of all valid pd_flags bits */ /* * Page layout version number 0 is for pre-7.3 Postgres releases. @@ -369,6 +371,13 @@ typedef PageHeaderData *PageHeader; #define PageClearAllVisible(page) \ (((PageHeader) (page))->pd_flags &= ~PD_ALL_VISIBLE) +#define PageIsAllFrozen(page) \ + (((PageHeader) (page))->pd_flags & PD_ALL_FROZEN) +#define PageSetAllFrozen(page) \ + (((PageHeader) (page))->pd_flags |= PD_ALL_FROZEN) +#define PageClearAllFrozen(page) \ + (((PageHeader) (page))->pd_flags &= ~PD_ALL_FROZEN) + #define PageIsPrunable(page, oldestxmin) \ ( \ AssertMacro(TransactionIdIsNormal(oldestxmin)), \