On Wed, 2013-05-01 at 11:33 -0400, Robert Haas wrote:
> >> The only time the VM and the data page are out of sync during vacuum is
> >> after a crash, right? If that's the case, I didn't think it was a big
> >> deal to dirty one extra page (should be extremely rare). Am I missing
> >> something?
> >>
> >> The reason I removed that special case was just code
> >> complexity/readability. I tried preserving the previous behavior, and
> >> it's not so bad, but it seemed unnecessarily ugly for the benefit of a
> >> rare case.
> >
> > Well, I don't find it that ugly, but then again
> 
> ...I've done a fair amount of hacking on this code.  The scenario that
> I find problematic here is that you could easily lose a couple of
> visibility map pages in a crash.  Each one you lose, with this patch,
> potentially involves half a gigabyte of unnecessary disk writes, if
> the relation is being vacuumed at the time.  For the few extra lines
> of code it takes to protect against that scenario, I think we should
> protect against it.

I'm still unclear on how we'd lose so many VM bits at once, because they
are WAL-logged. I see how we can lose one if the heap page makes it to
disk before the WAL record setting the VM bit is flushed, but that loss
seems bounded to me.
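
For reference, the "half a gigabyte" figure above follows from the
default 8 kB block size and one visibility-map bit per heap page. A
quick standalone sketch of the arithmetic (illustrative C only, not
backend code):

#include <stdio.h>

int
main(void)
{
    /* assumes the default 8 kB BLCKSZ and one VM bit per heap page */
    const long  block_size = 8192;
    const long  heap_pages_per_vm_page = block_size * 8;  /* 8 bits/byte */
    const long  heap_bytes = heap_pages_per_vm_page * block_size;

    printf("one VM page covers %ld heap pages = %ld MB of heap\n",
           heap_pages_per_vm_page, heap_bytes / (1024 * 1024));
    return 0;
}

That prints 65536 heap pages and 512 MB: losing a single VM page puts
up to half a gigabyte of heap back in play for unnecessary re-dirtying
during the next vacuum.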

Regardless, you have a reasonable claim that my patch had effects that
were not necessary. I have attached a draft patch to remedy that. Only
rudimentary testing was done.

Regards,
        Jeff Davis

*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
***************
*** 5755,5762 **** log_heap_freeze(Relation reln, Buffer buffer,
   * corresponding visibility map block.	Both should have already been modified
   * and dirtied.
   *
!  * If checksums are enabled, we also add the heap_buffer to the chain to
!  * protect it from being torn.
   */
  XLogRecPtr
  log_heap_visible(RelFileNode rnode, Buffer heap_buffer, Buffer vm_buffer,
--- 5755,5762 ----
   * corresponding visibility map block.	Both should have already been modified
   * and dirtied.
   *
!  * If checksums are enabled and the heap buffer was changed (PD_ALL_VISIBLE
!  * set), we add it to the chain to protect it from being torn.
   */
  XLogRecPtr
  log_heap_visible(RelFileNode rnode, Buffer heap_buffer, Buffer vm_buffer,
***************
*** 5784,5790 **** log_heap_visible(RelFileNode rnode, Buffer heap_buffer, Buffer vm_buffer,
  	rdata[1].buffer_std = false;
  	rdata[1].next = NULL;
  
! 	if (DataChecksumsEnabled())
  	{
  		rdata[1].next = &(rdata[2]);
  
--- 5784,5790 ----
  	rdata[1].buffer_std = false;
  	rdata[1].next = NULL;
  
! 	if (DataChecksumsEnabled() && BufferIsValid(heap_buffer))
  	{
  		rdata[1].next = &(rdata[2]);
  
*** a/src/backend/access/heap/visibilitymap.c
--- b/src/backend/access/heap/visibilitymap.c
***************
*** 233,242 **** visibilitymap_pin_ok(BlockNumber heapBlk, Buffer buf)
   * marked all-visible; it is needed for Hot Standby, and can be
   * InvalidTransactionId if the page contains no tuples.
   *
!  * Caller is expected to set the heap page's PD_ALL_VISIBLE bit before calling
!  * this function. Except in recovery, caller should also pass the heap
!  * buffer. When checksums are enabled and we're not in recovery, we must add
!  * the heap buffer to the WAL chain to protect it from being torn.
   *
   * You must pass a buffer containing the correct map page to this function.
   * Call visibilitymap_pin first to pin the right one. This function doesn't do
--- 233,243 ----
   * marked all-visible; it is needed for Hot Standby, and can be
   * InvalidTransactionId if the page contains no tuples.
   *
!  * The caller should supply heapBuf if and only if it set PD_ALL_VISIBLE on
!  * the heap page and we're not in recovery.  In that case the caller should
!  * also have marked the heap page dirty.  When checksums are enabled and
!  * heapBuf is valid, this function will add heapBuf to the WAL chain to
!  * protect it from being torn.
   *
   * You must pass a buffer containing the correct map page to this function.
   * Call visibilitymap_pin first to pin the right one. This function doesn't do
***************
*** 257,263 **** visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf,
  #endif
  
  	Assert(InRecovery || XLogRecPtrIsInvalid(recptr));
- 	Assert(InRecovery || BufferIsValid(heapBuf));
  
  	/* Check that we have the right heap page pinned, if present */
  	if (BufferIsValid(heapBuf) && BufferGetBlockNumber(heapBuf) != heapBlk)
--- 258,263 ----
***************
*** 290,296 **** visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf,
  				 * If data checksums are enabled, we need to protect the heap
  				 * page from being torn.
  				 */
! 				if (DataChecksumsEnabled())
  				{
  					Page heapPage = BufferGetPage(heapBuf);
  
--- 290,296 ----
  				 * If data checksums are enabled, we need to protect the heap
  				 * page from being torn.
  				 */
! 				if (DataChecksumsEnabled() && BufferIsValid(heapBuf))
  				{
  					Page heapPage = BufferGetPage(heapBuf);
  
*** a/src/backend/commands/vacuumlazy.c
--- b/src/backend/commands/vacuumlazy.c
***************
*** 894,918 **** lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
  		freespace = PageGetHeapFreeSpace(page);
  
  		/* mark page all-visible, if appropriate */
! 		if (all_visible && !all_visible_according_to_vm)
  		{
! 			/*
! 			 * It should never be the case that the visibility map page is set
! 			 * while the page-level bit is clear, but the reverse is allowed
! 			 * (if checksums are not enabled).  Regardless, set both bits
! 			 * so that we get back in sync.
! 			 *
! 			 * NB: If the heap page is all-visible but the VM bit is not set,
! 			 * we don't need to dirty the heap page.  However, if checksums are
! 			 * enabled, we do need to make sure that the heap page is dirtied
! 			 * before passing it to visibilitymap_set(), because it may be
! 			 * logged.  Given that this situation should only happen in rare
! 			 * cases after a crash, it is not worth optimizing.
! 			 */
! 			PageSetAllVisible(page);
! 			MarkBufferDirty(buf);
! 			visibilitymap_set(onerel, blkno, buf, InvalidXLogRecPtr,
! 							  vmbuffer, visibility_cutoff_xid);
  		}
  
  		/*
--- 894,920 ----
  		freespace = PageGetHeapFreeSpace(page);
  
  		/* mark page all-visible, if appropriate */
! 		if (all_visible)
  		{
! 			if (!PageIsAllVisible(page))
! 			{
! 				PageSetAllVisible(page);
! 				MarkBufferDirty(buf);
! 				visibilitymap_set(onerel, blkno, buf, InvalidXLogRecPtr,
! 								  vmbuffer, visibility_cutoff_xid);
! 			}
! 			else if (!all_visible_according_to_vm)
! 			{
! 				/*
! 				 * It should never be the case that the visibility map page is
! 				 * set while the page-level bit is clear, but the reverse is
! 				 * allowed.  Set the visibility map bit as well so that we get
! 				 * back in sync.
! 				 */
! 				visibilitymap_set(onerel, blkno, InvalidBuffer,
! 								  InvalidXLogRecPtr, vmbuffer,
! 								  visibility_cutoff_xid);
! 			}
  		}
  
  		/*
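
To make the heapam.c hunk easier to read in isolation, here is a rough
standalone sketch of the rdata-chain logic (mock types stand in for the
real Buffer/XLogRecData declarations, so treat this as illustration,
not the actual backend code):

#include <stdbool.h>
#include <stdio.h>

/* Mock stand-ins for backend types, just to show how log_heap_visible
 * chains a third rdata element only when checksums are on and a heap
 * buffer was actually passed. */
typedef int Buffer;
#define InvalidBuffer 0
#define BufferIsValid(buf) ((buf) != InvalidBuffer)

typedef struct XLogRecData
{
    char       *data;
    int         len;
    Buffer      buffer;
    bool        buffer_std;
    struct XLogRecData *next;
} XLogRecData;

static bool checksums_enabled = true;  /* pretend DataChecksumsEnabled() */

static int
chain_length(Buffer heap_buffer, Buffer vm_buffer)
{
    XLogRecData rdata[3];

    /* rdata[0]: the xl_heap_visible record itself (payload omitted) */
    rdata[0].data = NULL;
    rdata[0].len = 0;
    rdata[0].buffer = InvalidBuffer;
    rdata[0].next = &rdata[1];

    /* rdata[1]: the visibility map buffer, always in the chain */
    rdata[1].data = NULL;
    rdata[1].len = 0;
    rdata[1].buffer = vm_buffer;
    rdata[1].buffer_std = false;
    rdata[1].next = NULL;

    /* rdata[2]: the heap buffer, chained in only when checksums need
     * torn-page protection AND the caller actually changed the page */
    if (checksums_enabled && BufferIsValid(heap_buffer))
    {
        rdata[1].next = &rdata[2];
        rdata[2].data = NULL;
        rdata[2].len = 0;
        rdata[2].buffer = heap_buffer;
        rdata[2].buffer_std = true;
        rdata[2].next = NULL;
    }

    return rdata[1].next ? 3 : 2;
}

int
main(void)
{
    printf("with heap buffer: %d elements\n", chain_length(1, 2));
    printf("with InvalidBuffer: %d elements\n",
           chain_length(InvalidBuffer, 2));
    return 0;
}

With the vacuumlazy.c change above, the InvalidBuffer case corresponds
to a heap page whose PD_ALL_VISIBLE bit was already set, so that page
is neither dirtied nor added to the WAL chain.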