On Sun, 2009-12-27 at 20:11 +0100, Andres Freund wrote:
> On Tuesday 22 December 2009 11:42:30 Simon Riggs wrote:
> > On Tue, 2009-12-22 at 03:19 +0100, Andres Freund wrote:
> > > On Monday 21 December 2009 16:48:52 Simon Riggs wrote:
> > > > Giving the drop database a snapshot is not the answer. I expect Andres
> > > > to be able to fix this with a simple patch that would not effect the
> > > > case of normal running.
> > >
> > > Actually its less simply than I had thought at first - I don't think the
> > > code ever handled that correctly.
> > > I might be wrong there, my knowledge of the involved code is a bit
> > > sparse... The whole conflict resolution builds on the concept of waiting
> > > for an VXid, but an idle backend does not have a valid vxid. Thats
> > > correct, right?
> > I don't see any mileage in making Startup process wait for an idle
> > session, so no real reason to wait for others either.
> So here is a small patch implementing that behaviour.
On further testing, I received a re-connection from an automatic session
retry. That shouldn't have happened, but it indicates the need for
locking around the conflict handler. I had understood that to be placed
elsewhere but that seems wrong now.
This is a low priority item, so looking for a quick fix to allow time on
other areas.
Any objections?
--
Simon Riggs www.2ndQuadrant.com
*** a/src/backend/commands/dbcommands.c
--- b/src/backend/commands/dbcommands.c
***************
*** 1944,1950 **** dbase_redo(XLogRecPtr lsn, XLogRecord *record)
--- 1944,1958 ----
dst_path = GetDatabasePath(xlrec->db_id, xlrec->tablespace_id);
if (InHotStandby)
+ {
+ /*
+ * Lock database while we resolve conflicts to ensure that InitPostgres()
+ * cannot fully re-execute concurrently. This avoids backends re-connecting
+ * automatically to same database, which can happen in some cases.
+ */
+ LockSharedObjectForSession(DatabaseRelationId, xlrec->db_id, 0, AccessExclusiveLock);
ResolveRecoveryConflictWithDatabase(xlrec->db_id);
+ }
/* Drop pages for this database that are in the shared buffer cache */
DropDatabaseBuffers(xlrec->db_id);
***************
*** 1960,1965 **** dbase_redo(XLogRecPtr lsn, XLogRecord *record)
--- 1968,1984 ----
ereport(WARNING,
(errmsg("some useless files may be left behind in old database directory \"%s\"",
dst_path)));
+
+ if (InHotStandby)
+ {
+ /*
+ * Release locks prior to commit. XX There is a race condition here that may allow
+ * backends to reconnect, but the window for this is small because the gap between
+ * here and commit is mostly fairly small and it is unlikely that people will be
+ * dropping databases that we are trying to connect to anyway.
+ */
+ UnlockSharedObjectForSession(DatabaseRelationId, xlrec->db_id, 0, AccessExclusiveLock);
+ }
}
else
elog(PANIC, "dbase_redo: unknown op code %u", info);
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers