Tom Lane wrote:
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
Tom Lane wrote:
ISTM that we must fix the bgwriter so that ForgetDatabaseFsyncRequests
causes PendingUnlinkEntrys for the doomed DB to be thrown away too.

Because of the asynchronous nature of ForgetDatabaseFsyncRequests, this still isn't enough, I'm afraid.

Hm.  I notice that there is no bug on Windows because dropdb forces a
checkpoint before it starts to remove files.  Maybe the sanest solution
is to just do that on all platforms.

Yeah, I don't see any other simple solution.

Patch attached that does the three changes we've talked about:'
- make ForgetDatabaseFsyncRequests forget unlink requests as well
- make rmtree() not fail on ENOENT
- force checkpoint on in dropdb on all platforms

plus some comment changes reflecting what we now know. I will apply this tomorrow if there's no objections.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
Index: src/backend/commands/dbcommands.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/commands/dbcommands.c,v
retrieving revision 1.205
diff -c -r1.205 dbcommands.c
*** src/backend/commands/dbcommands.c	26 Mar 2008 21:10:37 -0000	1.205
--- src/backend/commands/dbcommands.c	17 Apr 2008 18:53:17 -0000
***************
*** 696,713 ****
  	pgstat_drop_database(db_id);
  
  	/*
! 	 * Tell bgwriter to forget any pending fsync requests for files in the
! 	 * database; else it'll fail at next checkpoint.
  	 */
  	ForgetDatabaseFsyncRequests(db_id);
  
  	/*
! 	 * On Windows, force a checkpoint so that the bgwriter doesn't hold any
! 	 * open files, which would cause rmdir() to fail.
  	 */
- #ifdef WIN32
  	RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE | CHECKPOINT_WAIT);
- #endif
  
  	/*
  	 * Remove all tablespace subdirs belonging to the database.
--- 696,714 ----
  	pgstat_drop_database(db_id);
  
  	/*
! 	 * Tell bgwriter to forget any pending fsync and unlink requests for files
! 	 * in the database; else it'll fail at next checkpoint, or worse it will
! 	 * delete files that belong to a newly created database with the same OID.
  	 */
  	ForgetDatabaseFsyncRequests(db_id);
  
  	/*
! 	 * Force a checkpoint to make sure the bgwriter has received the message
! 	 * sent by ForgetDatabaseFsyncRequests. On Windows, this also ensures that
! 	 * the bgwriter doesn't hold any open files, which would cause rmdir() to
! 	 * fail.
  	 */
  	RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE | CHECKPOINT_WAIT);
  
  	/*
  	 * Remove all tablespace subdirs belonging to the database.
Index: src/backend/storage/smgr/md.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/storage/smgr/md.c,v
retrieving revision 1.136
diff -c -r1.136 md.c
*** src/backend/storage/smgr/md.c	10 Mar 2008 20:06:27 -0000	1.136
--- src/backend/storage/smgr/md.c	17 Apr 2008 19:26:16 -0000
***************
*** 1196,1203 ****
  		if (unlink(path) < 0)
  		{
  			/*
! 			 * ENOENT shouldn't happen either, but it doesn't really matter
! 			 * because we would've deleted it now anyway.
  			 */
  			if (errno != ENOENT)
  				ereport(WARNING,
--- 1196,1206 ----
  		if (unlink(path) < 0)
  		{
  			/*
! 			 * There's a race condition, when the database is dropped at the
! 			 * same time that we process the pending unlink requests. If the
! 			 * DROP DATABASE deletes the file before we do, we will get ENOENT
! 			 * here. rmtree() also has to ignore ENOENT errors, to deal with
! 			 * the possibility that we delete the file first.
  			 */
  			if (errno != ENOENT)
  				ereport(WARNING,
***************
*** 1321,1327 ****
--- 1324,1334 ----
  		/* Remove any pending requests for the entire database */
  		HASH_SEQ_STATUS hstat;
  		PendingOperationEntry *entry;
+ 		ListCell   *cell, 
+ 				   *prev,
+ 				   *next;
  
+ 		/* Remove fsync requests */
  		hash_seq_init(&hstat, pendingOpsTable);
  		while ((entry = (PendingOperationEntry *) hash_seq_search(&hstat)) != NULL)
  		{
***************
*** 1331,1336 ****
--- 1338,1359 ----
  				entry->canceled = true;
  			}
  		}
+ 	
+ 		/* Remove unlink requests */
+ 		prev = NULL;
+ 		for (cell = list_head(pendingUnlinks); cell; cell = next)
+ 		{
+ 			PendingUnlinkEntry *entry = (PendingUnlinkEntry *) lfirst(cell);
+ 
+ 			next = lnext(cell);
+ 			if (entry->rnode.dbNode == rnode.dbNode) 
+ 			{
+ 				pendingUnlinks = list_delete_cell(pendingUnlinks, cell, prev);
+ 				pfree(entry);
+ 			}
+ 			else
+ 				prev = cell;
+ 		}
  	}
  	else if (segno == UNLINK_RELATION_REQUEST)
  	{
***************
*** 1386,1392 ****
  }
  
  /*
!  * ForgetRelationFsyncRequests -- ensure any fsyncs for a rel are forgotten
   */
  void
  ForgetRelationFsyncRequests(RelFileNode rnode)
--- 1409,1415 ----
  }
  
  /*
!  * ForgetRelationFsyncRequests -- forget any fsyncs for a rel
   */
  void
  ForgetRelationFsyncRequests(RelFileNode rnode)
***************
*** 1419,1425 ****
  }
  
  /*
!  * ForgetDatabaseFsyncRequests -- ensure any fsyncs for a DB are forgotten
   */
  void
  ForgetDatabaseFsyncRequests(Oid dbid)
--- 1442,1448 ----
  }
  
  /*
!  * ForgetDatabaseFsyncRequests -- forget any fsyncs and unlinks for a DB
   */
  void
  ForgetDatabaseFsyncRequests(Oid dbid)
Index: src/port/dirmod.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/port/dirmod.c,v
retrieving revision 1.53
diff -c -r1.53 dirmod.c
*** src/port/dirmod.c	11 Apr 2008 23:53:00 -0000	1.53
--- src/port/dirmod.c	17 Apr 2008 19:08:15 -0000
***************
*** 406,413 ****
  	{
  		snprintf(filepath, MAXPGPATH, "%s/%s", path, *filename);
  
  		if (lstat(filepath, &statbuf) != 0)
! 			goto report_and_fail;
  
  		if (S_ISDIR(statbuf.st_mode))
  		{
--- 406,429 ----
  	{
  		snprintf(filepath, MAXPGPATH, "%s/%s", path, *filename);
  
+ 		/*
+ 		 * It's ok if the file is not there anymore; we were just about to
+ 		 * delete it anyway.
+ 		 *
+ 		 * This is not an academic possibility. One scenario where this
+ 		 * happens is when bgwriter has a pending unlink request for a file
+ 		 * in a database that's being dropped. In dropdb(), we call
+ 		 * ForgetDatabaseFsyncRequests() to flush out any such pending unlink
+ 		 * requests, but because that's asynchronous, it's not guaranteed
+ 		 * that the bgwriter receives the message in time.
+ 		 */
  		if (lstat(filepath, &statbuf) != 0)
! 		{
! 			if (errno != ENOENT)
! 				goto report_and_fail;
! 			else
! 				continue;
! 		}
  
  		if (S_ISDIR(statbuf.st_mode))
  		{
***************
*** 422,428 ****
  		else
  		{
  			if (unlink(filepath) != 0)
! 				goto report_and_fail;
  		}
  	}
  
--- 438,447 ----
  		else
  		{
  			if (unlink(filepath) != 0)
! 			{
! 				if (errno != ENOENT)
! 					goto report_and_fail;
! 			}
  		}
  	}
  
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to