On 15.08.2012 09:50, Heikki Linnakangas wrote:
On 15.08.2012 01:02, Zdeněk Jílovec wrote:
Hello,

I use PostgreSQL 9.2beta3 with PostGIS 2.0.1 and if I try create GIST
index
on column geometry(Point,2065) I get error:

test=> CREATE INDEX places_point ON places USING GIST(def_point);
ERROR: failed to re-find parent for block 18097

It works on 9.1

Hmm, I bet this is a bug in the new GiST buffering build code. There was
an earlier bug that led to "failed to re-find parent" errors that I
fixed back in May, but maybe I missed some corner case.

Zdeněk sent me the dump and instructions off-list, and I was able to reproduce and diagnose the bug. Many thanks for that! It was indeed a corner-case in the parent tracking logic.

During the build, we maintain a hash table of the parent of each page. The hash table is used to find the parent of a page, when a page is split and we have to insert the downlinks of the new pages to the parent. In a regular GiST insertion, we always descend the tree from the root to leaf, and we get the parent pointers from the stack. During a buffered build, we don't have such a stack available, because we can start the descend from a buffer in the middle of the tree. So we use the parent map instead.

However, the parent hash table does not track the immediate parents of leaf pages. That's not required, because even though we can begin the descend somewhere in the middle of the tree, when we descend to a leaf page we know the immediate parent where we came from. Not tracking the leaf level saves a considerable amount of memory.

But just before we descend to the leaf page, we check if the downlink needs to be adjusted to accommodate the new tuple, and replace it with an updated tuple if so. The bug arises when updating the downlink of the leaf splits the parent page, and the downlink is moved to a right sibling. When we then descend to the leaf page, the parent of the leaf page is incorrect, the real parent is somewhere to the right of where we think it is.

In a normal index insert that case is covered by the logic to move right if the downlink is not found on the expected page. In the buffering build, we don't do that because we think we know exactly what the parent of each page is.

I committed the attached patch to fix that. With the patch, when the downlink is updated in the parent page, the gistbufferinginserttuples() function returns the block where the updated tuple was placed, so that when we descend to the leaf, we know the parent of the leaf correctly.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
*** a/src/backend/access/gist/gist.c
--- b/src/backend/access/gist/gist.c
***************
*** 148,163 **** gistinsert(PG_FUNCTION_ARGS)
--- 148,169 ----
   * pages are released; note that new tuple(s) are *not* on the root page
   * but in one of the new child pages.
   *
+  * If 'newblkno' is not NULL, returns the block number of page the first
+  * new/updated tuple was inserted to. Usually it's the given page, but could
+  * be its right sibling if the page was split.
+  *
   * Returns 'true' if the page was split, 'false' otherwise.
   */
  bool
  gistplacetopage(Relation rel, Size freespace, GISTSTATE *giststate,
  				Buffer buffer,
  				IndexTuple *itup, int ntup, OffsetNumber oldoffnum,
+ 				BlockNumber *newblkno,
  				Buffer leftchildbuf,
  				List **splitinfo,
  				bool markfollowright)
  {
+ 	BlockNumber blkno = BufferGetBlockNumber(buffer);
  	Page		page = BufferGetPage(buffer);
  	bool		is_leaf = (GistPageIsLeaf(page)) ? true : false;
  	XLogRecPtr	recptr;
***************
*** 199,205 **** gistplacetopage(Relation rel, Size freespace, GISTSTATE *giststate,
  		BlockNumber oldrlink = InvalidBlockNumber;
  		GistNSN		oldnsn = 0;
  		SplitedPageLayout rootpg;
- 		BlockNumber blkno = BufferGetBlockNumber(buffer);
  		bool		is_rootsplit;
  
  		is_rootsplit = (blkno == GIST_ROOT_BLKNO);
--- 205,210 ----
***************
*** 319,327 **** gistplacetopage(Relation rel, Size freespace, GISTSTATE *giststate,
  
  			for (i = 0; i < ptr->block.num; i++)
  			{
! 				if (PageAddItem(ptr->page, (Item) data, IndexTupleSize((IndexTuple) data), i + FirstOffsetNumber, false, false) == InvalidOffsetNumber)
  					elog(ERROR, "failed to add item to index page in \"%s\"", RelationGetRelationName(rel));
! 				data += IndexTupleSize((IndexTuple) data);
  			}
  
  			/* Set up rightlinks */
--- 324,342 ----
  
  			for (i = 0; i < ptr->block.num; i++)
  			{
! 				IndexTuple	thistup = (IndexTuple) data;
! 
! 				if (PageAddItem(ptr->page, (Item) data, IndexTupleSize(thistup), i + FirstOffsetNumber, false, false) == InvalidOffsetNumber)
  					elog(ERROR, "failed to add item to index page in \"%s\"", RelationGetRelationName(rel));
! 
! 				/*
! 				 * If this is the first inserted/updated tuple, let the caller
! 				 * know which page it landed on.
! 				 */
! 				if (newblkno && ItemPointerEquals(&thistup->t_tid, &(*itup)->t_tid))
! 					*newblkno = ptr->block.blkno;
! 
! 				data += IndexTupleSize(thistup);
  			}
  
  			/* Set up rightlinks */
***************
*** 436,441 **** gistplacetopage(Relation rel, Size freespace, GISTSTATE *giststate,
--- 451,459 ----
  			recptr = GetXLogRecPtrForTemp();
  			PageSetLSN(page, recptr);
  		}
+ 
+ 		if (newblkno)
+ 			*newblkno = blkno;
  	}
  
  	/*
***************
*** 1074,1082 **** gistinserttuple(GISTInsertState *state, GISTInsertStack *stack,
   *	  is kept pinned.
   *	- Lock and pin on 'rightchild' are always released.
   *
!  * Returns 'true' if the page had to be split. Note that if the page had
!  * be split, the inserted/updated might've been inserted to a right sibling
!  * of stack->buffer instead of stack->buffer itself.
   */
  static bool
  gistinserttuples(GISTInsertState *state, GISTInsertStack *stack,
--- 1092,1100 ----
   *	  is kept pinned.
   *	- Lock and pin on 'rightchild' are always released.
   *
!  * Returns 'true' if the page had to be split. Note that if the page was
!  * split, the inserted/updated tuples might've been inserted to a right
!  * sibling of stack->buffer instead of stack->buffer itself.
   */
  static bool
  gistinserttuples(GISTInsertState *state, GISTInsertStack *stack,
***************
*** 1091,1097 **** gistinserttuples(GISTInsertState *state, GISTInsertStack *stack,
  	/* Insert the tuple(s) to the page, splitting the page if necessary */
  	is_split = gistplacetopage(state->r, state->freespace, giststate,
  							   stack->buffer,
! 							   tuples, ntup, oldoffnum,
  							   leftchild,
  							   &splitinfo,
  							   true);
--- 1109,1116 ----
  	/* Insert the tuple(s) to the page, splitting the page if necessary */
  	is_split = gistplacetopage(state->r, state->freespace, giststate,
  							   stack->buffer,
! 							   tuples, ntup,
! 							   oldoffnum, NULL,
  							   leftchild,
  							   &splitinfo,
  							   true);
*** a/src/backend/access/gist/gistbuild.c
--- b/src/backend/access/gist/gistbuild.c
***************
*** 85,91 **** static void gistBufferingBuildInsert(GISTBuildState *buildstate,
  						 IndexTuple itup);
  static bool gistProcessItup(GISTBuildState *buildstate, IndexTuple itup,
  				BlockNumber startblkno, int startlevel);
! static void gistbufferinginserttuples(GISTBuildState *buildstate,
  						  Buffer buffer, int level,
  						  IndexTuple *itup, int ntup, OffsetNumber oldoffnum,
  						  BlockNumber parentblk, OffsetNumber downlinkoffnum);
--- 85,91 ----
  						 IndexTuple itup);
  static bool gistProcessItup(GISTBuildState *buildstate, IndexTuple itup,
  				BlockNumber startblkno, int startlevel);
! static BlockNumber gistbufferinginserttuples(GISTBuildState *buildstate,
  						  Buffer buffer, int level,
  						  IndexTuple *itup, int ntup, OffsetNumber oldoffnum,
  						  BlockNumber parentblk, OffsetNumber downlinkoffnum);
***************
*** 621,629 **** gistProcessItup(GISTBuildState *buildstate, IndexTuple itup,
  		newtup = gistgetadjusted(indexrel, idxtuple, itup, giststate);
  		if (newtup)
  		{
! 			gistbufferinginserttuples(buildstate, buffer, level,
! 									  &newtup, 1, childoffnum,
! 									InvalidBlockNumber, InvalidOffsetNumber);
  			/* gistbufferinginserttuples() released the buffer */
  		}
  		else
--- 621,629 ----
  		newtup = gistgetadjusted(indexrel, idxtuple, itup, giststate);
  		if (newtup)
  		{
! 			blkno  = gistbufferinginserttuples(buildstate, buffer, level,
! 											   &newtup, 1, childoffnum,
! 									  InvalidBlockNumber, InvalidOffsetNumber);
  			/* gistbufferinginserttuples() released the buffer */
  		}
  		else
***************
*** 676,685 **** gistProcessItup(GISTBuildState *buildstate, IndexTuple itup,
   *
   * This is analogous with gistinserttuples() in the regular insertion code.
   *
   * Caller should hold a lock on 'buffer' on entry. This function will unlock
   * and unpin it.
   */
! static void
  gistbufferinginserttuples(GISTBuildState *buildstate, Buffer buffer, int level,
  						  IndexTuple *itup, int ntup, OffsetNumber oldoffnum,
  						  BlockNumber parentblk, OffsetNumber downlinkoffnum)
--- 676,689 ----
   *
   * This is analogous with gistinserttuples() in the regular insertion code.
   *
+  * Returns the block number of the page where the (first) new or updated tuple
+  * was inserted. Usually that's the original page, but might be a sibling page
+  * if the original page was split.
+  *
   * Caller should hold a lock on 'buffer' on entry. This function will unlock
   * and unpin it.
   */
! static BlockNumber
  gistbufferinginserttuples(GISTBuildState *buildstate, Buffer buffer, int level,
  						  IndexTuple *itup, int ntup, OffsetNumber oldoffnum,
  						  BlockNumber parentblk, OffsetNumber downlinkoffnum)
***************
*** 687,698 **** gistbufferinginserttuples(GISTBuildState *buildstate, Buffer buffer, int level,
  	GISTBuildBuffers *gfbb = buildstate->gfbb;
  	List	   *splitinfo;
  	bool		is_split;
  
  	is_split = gistplacetopage(buildstate->indexrel,
  							   buildstate->freespace,
  							   buildstate->giststate,
  							   buffer,
! 							   itup, ntup, oldoffnum,
  							   InvalidBuffer,
  							   &splitinfo,
  							   false);
--- 691,703 ----
  	GISTBuildBuffers *gfbb = buildstate->gfbb;
  	List	   *splitinfo;
  	bool		is_split;
+ 	BlockNumber	placed_to_blk = InvalidBlockNumber;
  
  	is_split = gistplacetopage(buildstate->indexrel,
  							   buildstate->freespace,
  							   buildstate->giststate,
  							   buffer,
! 							   itup, ntup, oldoffnum, &placed_to_blk,
  							   InvalidBuffer,
  							   &splitinfo,
  							   false);
***************
*** 823,828 **** gistbufferinginserttuples(GISTBuildState *buildstate, Buffer buffer, int level,
--- 828,835 ----
  	}
  	else
  		UnlockReleaseBuffer(buffer);
+ 
+ 	return placed_to_blk;
  }
  
  /*
*** a/src/include/access/gist_private.h
--- b/src/include/access/gist_private.h
***************
*** 430,436 **** typedef struct
  
  extern bool gistplacetopage(Relation rel, Size freespace, GISTSTATE *giststate,
  				Buffer buffer,
! 				IndexTuple *itup, int ntup, OffsetNumber oldoffnum,
  				Buffer leftchildbuf,
  				List **splitinfo,
  				bool markleftchild);
--- 430,437 ----
  
  extern bool gistplacetopage(Relation rel, Size freespace, GISTSTATE *giststate,
  				Buffer buffer,
! 				IndexTuple *itup, int ntup,
! 				OffsetNumber oldoffnum, BlockNumber *newblkno,
  				Buffer leftchildbuf,
  				List **splitinfo,
  				bool markleftchild);
-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Reply via email to