On Mon, Aug 27, 2012 at 07:39:35PM -0400, Tom Lane wrote:
> Alvaro Herrera <alvhe...@2ndquadrant.com> writes:
> > How about having it sleep for a short while, then try again?
>
> I could get behind that, but I don't think the delay should be more than
> 100ms or so. It's important for the postmaster to acquire the lock (or
> not) pretty quickly, or pg_ctl is going to get confused. If we keep it
> short, we can also dispense with the log spam you were suggesting.
>
> (Actually, I wonder if this type of scenario isn't going to confuse
> pg_ctl already --- it might think the lockfile belongs to the postmaster
> *it* started, not some pre-existing one. Does that matter?)

I took Alvaro's approach of a sleep. The file test was already in a
loop that went 100 times. Basically, if the lock file exists, this
postmaster isn't going to succeed, so I figured there is no reason to
rush the testing. I gave it 5 tries with one second between attempts.
Either the file is being populated, or it is stale and empty.

I checked pg_ctl and it has a default wait of 60 seconds, so 5 seconds
to exit out of the postmaster should be fine. Patch attached.

FYI, I noticed we have a similar 5-second creation time requirement in
pg_ctl:

    /*
     * The postmaster should create postmaster.pid very soon after being
     * started.  If it's not there after we've waited 5 or more seconds,
     * assume startup failed and give up waiting.  (Note this covers both
     * cases where the pidfile was never created, and where it was created
     * and then removed during postmaster exit.)  Also, if there *is* a
     * file there but it appears stale, issue a suitable warning and give
     * up waiting.
     */
    if (i >= 5)

That check is for the case where the file has an old pid, rather than
being empty.

FYI, I fixed the filename problem Tom found.

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
new file mode 100644
index 775d71f..0309494
*** a/src/backend/utils/init/miscinit.c
--- b/src/backend/utils/init/miscinit.c
*************** CreateLockFile(const char *filename, boo
*** 766,771 ****
--- 766,793 ----
  							filename)));
  		close(fd);
  
+ 		if (len == 0)
+ 		{
+ 			/*
+ 			 * An empty lock file exists; either it is from another postmaster
+ 			 * that is still starting up, or left from a crash.  Check for
+ 			 * five seconds, then if it is still empty, it must be from a crash,
+ 			 * so fail and recommend lock file removal.
+ 			 */
+ 			if (ntries < 5)
+ 			{
+ 				sleep(1);
+ 				continue;
+ 			}
+ 			else
+ 				ereport(FATAL,
+ 						(errcode(ERRCODE_LOCK_FILE_EXISTS),
+ 						 errmsg("lock file \"%s\" is empty", filename),
+ 						 errhint(
+ 				"Empty lock file probably left from operating system crash during\n"
+ 				"database startup; file deletion suggested.")));
+ 		}
+ 
  		buffer[len] = '\0';
  		encoded_pid = atoi(buffer);
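
For anyone who wants to try the retry behavior outside the backend, here is a rough standalone sketch of the same idea: re-read a lock file a bounded number of times and treat it as stale only if it stays empty. It is not the patch itself. The function name probe_lock_file(), the LockFileStatus codes, and the max_tries/delay_secs parameters are all made up for illustration, and it uses plain POSIX calls rather than ereport(). Note that max_tries here counts total reads, which is slightly different from the ntries test in the patch.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; not part of PostgreSQL. */
typedef enum
{
	LOCKFILE_MISSING,			/* file does not exist */
	LOCKFILE_HAS_DATA,			/* file contains at least one byte */
	LOCKFILE_EMPTY,				/* still empty after all tries: likely stale */
	LOCKFILE_ERROR				/* open() or read() failed some other way */
} LockFileStatus;

/*
 * Read the lock file at "path" into "buf" (NUL-terminated on success).
 * If the file exists but is empty, sleep "delay_secs" and try again, for
 * at most "max_tries" reads in total.  An empty file may simply belong to
 * another process that has created it but not yet written its contents.
 */
LockFileStatus
probe_lock_file(const char *path, int max_tries, unsigned int delay_secs,
				char *buf, size_t bufsize)
{
	int			ntries;

	if (bufsize == 0)
		return LOCKFILE_ERROR;

	for (ntries = 0;; ntries++)
	{
		int			fd;
		ssize_t		len;

		fd = open(path, O_RDONLY);
		if (fd < 0)
			return (errno == ENOENT) ? LOCKFILE_MISSING : LOCKFILE_ERROR;

		len = read(fd, buf, bufsize - 1);
		close(fd);
		if (len < 0)
			return LOCKFILE_ERROR;

		if (len > 0)
		{
			buf[len] = '\0';
			return LOCKFILE_HAS_DATA;
		}

		/* Empty file: give the other process time to fill it in. */
		if (ntries + 1 >= max_tries)
			return LOCKFILE_EMPTY;
		sleep(delay_secs);
	}
}
```

Calling probe_lock_file(path, 5, 1, buf, sizeof(buf)) gives roughly the timing discussed above: a caller that gets LOCKFILE_EMPTY back would report the file as stale and suggest removing it, while LOCKFILE_HAS_DATA means the usual pid parsing can go ahead.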