On Wed, Sep 27, 2006 at 04:13:34PM -0400, Tom Lane wrote:
> Jon Lapham <[EMAIL PROTECTED]> writes in pgsql-general:
> > If I run...
> > sleep 3; echo starting; createdb bar
> > ...and power off the VM while the "createdb bar" is running.
>
> > Upon restart, about 50% of the time I can reproduce the following error
> > message:
>
> > [EMAIL PROTECTED] ~]$ psql bar
> > psql: FATAL: database "bar" does not exist
> > [EMAIL PROTECTED] ~]$ createdb bar
> > createdb: database creation failed: ERROR: could not create directory
> > "base/65536": File exists
>
> What apparently is happening here is that the same OID has been assigned
> to the new database both times. Even though the createdb didn't
> complete, the directory it started to build is there and so there's a
> filename collision.
>
> > So, running "createdb bar" a second time works.
>
> Yeah, because the OID counter has been advanced, and so the second
> createdb uses a nonconflicting OID.
>
> In theory this scenario should not happen, because a crash-and-restart
> is supposed to guarantee that the OID counter comes up at or beyond
> where it was before the crash.
>
> After thinking about it for awhile, I believe the problem is that
> CREATE DATABASE is breaking the "WAL rule": it's allowing a data change
> (specifically, creation of the new DB subdirectory) to hit disk without
> having guaranteed that associated WAL entries were flushed first.
> Specifically, if we generated an XLOG_NEXTOID WAL entry to record the
> consumption of an OID for the database, there isn't anything ensuring
> that record gets to disk before the mkdir occurs. (ie, the comment in
> XLogPutNextOid is correct as far as it goes, but it fails to account
> for outside-the-database effects such as creation of a directory named
> after the OID.) Hence after restart the OID counter might not get
> advanced as far as it should have been.
>
> We could fix this two different ways:
>
> 1. Put an XLogFlush into createdb() somewhere between making the
> pg_database entry and starting to create subdirectories.
>
> 2. Check for conflicting database directories while assigning the OID,
> comparable to what GetNewRelFileNode() does for table files.
>
> #2 has some appeal because it could deal with random junk in
> $PGDATA/base regardless of how the junk got there. However, to do that
> in a really bulletproof way we'd have to check all the tablespace
> directories too, and that's starting to get a tad tedious for something
> that shouldn't happen anyway.
>
> So I'm leaning to #1 as a suitably low-effort fix. Thoughts?
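
To make the ordering problem concrete, here is a toy model of the scenario
quoted above. It is only a sketch, not PostgreSQL code: ToyCluster,
create_database, flush_wal, crash_and_restart and collides_after_crash are
all hypothetical names, the "disk" is a Python set, and the real OID counter
logging is more involved than one record per OID. It shows only that an
unflushed counter record plus an already-created directory gives the
"File exists" collision after restart, and that flushing before the mkdir
(the idea behind fix #1) avoids it.

```python
class ToyCluster:
    """Toy model (hypothetical): an OID counter protected by a write-ahead log."""

    def __init__(self, first_oid=65536):
        self.next_oid = first_oid        # in-memory counter, lost on crash
        self.wal_buffer = []             # WAL records not yet on disk
        self.wal_on_disk = [first_oid]   # durable records: "next OID is at least N"
        self.dirs_on_disk = set()        # stands in for $PGDATA/base

    def flush_wal(self):
        # Move buffered records to durable storage (stand-in for an XLogFlush).
        self.wal_on_disk.extend(self.wal_buffer)
        self.wal_buffer.clear()

    def create_database(self, flush_before_mkdir):
        oid = self.next_oid
        self.next_oid += 1
        # Record the consumed OID (stand-in for an XLOG_NEXTOID record).
        self.wal_buffer.append(self.next_oid)
        if flush_before_mkdir:
            self.flush_wal()
        path = f"base/{oid}"
        if path in self.dirs_on_disk:
            raise FileExistsError(path)
        # The mkdir is modelled as hitting disk immediately.
        self.dirs_on_disk.add(path)
        return oid

    def crash_and_restart(self):
        # Unflushed WAL is lost; the counter is rebuilt from durable records only.
        self.wal_buffer.clear()
        self.next_oid = max(self.wal_on_disk)


def collides_after_crash(flush_before_mkdir):
    """Run createdb, crash right after the mkdir, restart, run createdb again."""
    cluster = ToyCluster()
    cluster.create_database(flush_before_mkdir)
    cluster.crash_and_restart()
    try:
        cluster.create_database(flush_before_mkdir)
    except FileExistsError:
        return True
    return False
```

In this model collides_after_crash(False) hits the collision and
collides_after_crash(True) does not. The model always loses the unflushed
record, whereas the report above saw the failure only about half the time.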

It'd be nice to clean things up, but I understand the reluctance to do
so. Maybe a good compromise would be to warn about files that are
present in $PGDATA but don't show up in any catalogs. Then again, if
we're doing that, we could probably just nuke 'em...
-- 
Jim Nasby [EMAIL PROTECTED]
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
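
A rough sketch of the "warn about files that don't show up in any catalog"
idea, restricted to database directories. Assumptions: find_orphan_dirs is a
hypothetical helper, known_oids would have to come from the catalogs (for
example the OIDs listed in pg_database), and only one directory is scanned,
so the tablespace directories mentioned above are not covered.

```python
import os


def find_orphan_dirs(base_dir, known_oids):
    """Return numeric subdirectory names of base_dir whose OID is not in known_oids.

    Hypothetical helper: it only reports candidates and never deletes anything.
    """
    orphans = []
    for name in sorted(os.listdir(base_dir)):
        full_path = os.path.join(base_dir, name)
        # Database directories are named after the database OID, so skip
        # anything that is not a plain ASCII-digit directory name.
        if not (name.isascii() and name.isdigit() and os.path.isdir(full_path)):
            continue
        if int(name) not in known_oids:
            orphans.append(name)
    return orphans
```

Reporting rather than removing keeps the sketch on the "warn" side of the
suggestion. Anything that actually deletes would need to be sure no
in-progress CREATE DATABASE owns the directory.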