Unfortunately the original mail doesn't seem to have made it to my mailbox, so tacking on to this reply.
On Wed, Apr 17, 2013 at 01:31:00AM +0400, Alexander Petrov wrote:
> Hello Michael, hello devs

> Don't know why upstart's finite machine ignores errors in post-start

What's happening here is that the main process is started, upstart considers
it "running", and then launches the post-start script which then blocks
indefinitely; because the post-start script is blocking, upstart never has a
chance to handle the exit of the main process and trigger a respawn.

The best way to avoid this problem is to avoid polling in the post-start
script /at all/.

> 2) upstart kills all processes in process group so you can run postress in
> background and check its status

This definitely doesn't solve the problem of the post-start script never
finishing.

> 2013/4/16 Michael Barrett <[email protected]>

> > Hi, I'm working on converting my postgresql package to using upstart.
> > While doing so, I found that postgres takes a few moments after starting
> > to actually start accepting connections. I decided to use a post-start job
> > to test whether the postgres server was up and available before letting
> > upstart believe the service was ready.

> > Currently my upstart job looks like this: http://dpaste.com/1060864/

Essentially, your post-start script is a workaround for postgres not using
any of the "standard" ways of indicating service readiness that upstart
supports natively. That's unfortunate; it would be nice if postgresql
supported daemonization, but we probably don't want to patch upstream to
implement this.

So a post-start script is a reasonable workaround - the main problem is that
the script you're using here never terminates if the main script exits
without ever successfully answering on the socket. Alexander's proposal
suffers from the same problem.

What you want to do here is make sure you detect when the upstart job's
target has changed from start to stop. So this should do the trick
(untested):

  post-start script
      while ! su -c "psql -c 'select 1'" postgres
      do
          status | grep -q 'stop/' && exit 1
          sleep 1
      done
  end script

Note that if the main process "starts" and continues running but never
accepts connections, the post-start script will still hang. This is
probably ok; at least you'll be able to manually stop the job if necessary.

> > The issue I'm running into now is that whenever there is an issue with
> > the postgres server that causes it to crash, the post-start job hangs
> > indefinitely (as you'd expect from the loop). I can't stop the job or
> > restart it once I fix the issue. The only way I've been able to fix it
> > is to bring up postgres manually, which allows the post-start to finish.

> > I've tried putting a maximum # of retries in the post-start script, then
> > doing an exit 1 when it exceeds that amount, but that results in the
> > 'start postgresql' command exiting successfully, which causes issues in
> > other places in my application (because postgres isn't actually
> > running).

The 'start' command is defined as returning success if the *dbus request*
succeeds. For a tool that provides the interface you're after, see
'service': i.e., 'service postgresql start' will return 0 if the service is
started successfully, and non-zero if not.

(FWIW, 'service' is not part of upstart but is a standard interface provided
on Ubuntu, with an interface derived from a tool of the same name on Red
Hat.)

Hope that helps,
-- 
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                   to set it on, and I can move the world.
Ubuntu Developer                                    http://www.debian.org/
[email protected]                                     [email protected]
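
P.S. If the case where the main process stays up but never accepts
connections also matters, the post-start loop suggested above could be
combined with the retry limit Michael mentioned. What follows is only a
sketch and just as untested against a real upstart job: wait_ready, its
three arguments and the POLL_INTERVAL variable are names invented for this
example, not anything upstart provides. It returns 0 once the probe
succeeds, 1 if the status output contains 'stop/', and 2 if the retry
budget runs out.

```shell
#!/bin/sh
# Hypothetical helper (not part of upstart): poll a readiness probe, but
# give up when the job is being stopped or the retry budget is exhausted.
#
#   wait_ready PROBE_CMD STATUS_CMD MAX_TRIES
#     PROBE_CMD   command string that succeeds once the service answers
#     STATUS_CMD  command string that prints the job's status line
#     MAX_TRIES   maximum number of probe attempts
wait_ready() {
    probe=$1
    status_cmd=$2
    tries=$3
    while [ "$tries" -gt 0 ]; do
        # The service answered: report success.
        if sh -c "$probe" >/dev/null 2>&1; then
            return 0
        fi
        # The job's goal has changed to stop: no point in waiting longer.
        if sh -c "$status_cmd" 2>/dev/null | grep -q 'stop/'; then
            return 1
        fi
        tries=$((tries - 1))
        # POLL_INTERVAL is an assumed knob; it defaults to one second.
        sleep "${POLL_INTERVAL:-1}"
    done
    # Retry budget exhausted without the probe ever succeeding.
    return 2
}
```

Inside a post-start script this might be called as

  wait_ready "su -c \"psql -c 'select 1'\" postgres" "status" 30 || exit 1

where the limit of 30 attempts is an arbitrary value chosen for
illustration.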
-- upstart-devel mailing list [email protected] Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/upstart-devel
