Excerpts from Robert Haas's message of Mon Aug 27 18:02:06 -0400 2012:
> On Mon, Aug 27, 2012 at 4:29 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> > Bruce Momjian <br...@momjian.us> writes:
> >> I have developed the attached patch to report a zero-length file, as you
> >> suggested.
> >
> > DIRECTORY_LOCK_FILE is entirely incorrect there.
> >
> > Taking a step back, I don't think this message is much better than the
> > existing behavior of reporting "bogus data".  Either way, it's not
> > obvious to typical users what the problem is or what to do about it.
> > If we're going to emit a special message I think it should be more user
> > friendly than this.
> >
> > Perhaps something like:
> >
> >     FATAL: lock file "foo" is empty
> >     HINT: This may mean that another postmaster was starting at the
> >     same time.  If not, remove the lock file and try again.
>
> The problem with this is that it gives the customer only one remedy,
> which they will (if experience is any guide) try whether it is
> actually correct to do so or not.

How about having it sleep for a short while, then try again?  I would
expect that it would cause the second postmaster to fail during the
second try, which is okay because the first one is then operational.
The problem, of course, is how long to sleep so that this doesn't fail
when load is high enough that the first postmaster still hasn't written
the file after the sleep.

Maybe

LOG:  lock file "foo" is empty, sleeping to retry
-- sleep 100ms and recheck
LOG:  lock file "foo" is empty, sleeping to retry
-- sleep, dunno, 1s, recheck
LOG:  lock file "foo" is empty, sleeping to retry
-- sleep maybe 5s? recheck
FATAL:  lock file "foo" is empty
HINT:  Is another postmaster running on data directory "bar"?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
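
To make the retry idea concrete, here is a rough sketch of the
check/sleep/recheck loop. It is Python for brevity, not the C the
postmaster is written in, and it is not taken from any patch. The
function name, the exception class, and the injectable sleep/log
parameters are all made up for illustration. The 100ms/1s/5s delays are
only the tentative numbers floated above.

```python
import time

# Tentative delays from the proposal: 100 ms, then 1 s, then 5 s.
RETRY_DELAYS = (0.1, 1.0, 5.0)


class LockFileEmptyError(Exception):
    """Hypothetical stand-in for the final FATAL report."""


def read_lock_file_with_retry(path, delays=RETRY_DELAYS,
                              sleep=time.sleep, log=print):
    """Return the lock file contents, retrying while the file is empty.

    The file is checked once up front and once after each delay, so with
    three delays there are four checks in total before giving up.
    """
    # None marks the last attempt, after which we stop sleeping.
    for delay in tuple(delays) + (None,):
        with open(path, "r") as f:
            contents = f.read()
        if contents:
            # Another postmaster finished writing the file in the meantime.
            return contents
        if delay is None:
            break
        log('LOG:  lock file "%s" is empty, sleeping to retry' % path)
        sleep(delay)
    raise LockFileEmptyError(
        'FATAL:  lock file "%s" is empty' % path
    )
```

A caller would treat a non-empty result as "another postmaster got there
first" and fall through to the usual lock file checks. Only the
exception path would produce the FATAL plus HINT. This does nothing
about the open question of how long is long enough under heavy load. It
only shows the shape of the loop.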