init script behaviour
Any opinions on this? I've had a query. What should "service start" do for a daemon - or more specifically, when should it return?

There is inconsistency amongst current init scripts, with two general approaches:

1) "fire and forget": start the daemon, return immediately

2) "stop and wait": start the daemon, and wait, either:
   a) a short fixed period of time, or
   b) in a loop until the pidfile appears, with some maximum wait time

A notable implication of (1) is that running e.g. "service xxx status" (or stop etc) may not immediately succeed after a start, nor may the service be immediately usable directly after a "start" returns.

(2b) may have surprising failure cases of an init script waiting a long time to return - dirsrv will wait up to ten minutes, which seems rather extreme. (2a) may be unreliable, being dependent on timing/machine speed.

I found at least one init script which also has this "stop and wait" behaviour for stop (mysqld).

I'd instinctively prefer (1) from a "do one thing and do it well" perspective; (2) starts down the road of a better/more complex form of service-monitoring/management and ends up doing it really badly in messy sh script in N places.

(A logical extension of (2) would be to require not merely that the pidfile exists, but that the service is accepting connections on TCP port N, before returning from the init script "start" invocation.)

Thoughts?

Regards, Joe

--
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel
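For readers who have not seen approach (2b) in practice, a minimal sketch of the bounded wait loop follows. The function name wait_for_pidfile, its arguments, and the one-second poll interval are assumptions made for illustration; this is not taken from dirsrv, mysqld, or any other init script mentioned in the thread.

```shell
#!/bin/sh
# Sketch of approach (2b): poll until a pidfile appears, giving up after
# a maximum number of seconds. All names here are hypothetical.
wait_for_pidfile() {
    pidfile=$1
    max_wait=$2          # upper bound on the wait, in seconds
    waited=0
    while [ "$waited" -lt "$max_wait" ]; do
        # -s is true only if the file exists and is non-empty
        if [ -s "$pidfile" ]; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    # One last check after the final sleep; its status is the return value
    [ -s "$pidfile" ]
}
```

A start action could call something like "wait_for_pidfile /var/run/xxx.pid 10" after launching the daemon (the path and the 10-second limit are placeholders). Note that this only shows the pidfile was written, not that the service is usable, which is the limitation raised in the message.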
Re: init script behaviour
On 06/15/2010 03:08 PM, Joe Orton wrote:
> Any opinions on this? I've had a query. What should "service start" do for a daemon - or more specifically, when should it return?
>
> There is inconsistency amongst current init scripts, with two general approaches:
>
> 1) "fire and forget": start the daemon, return immediately

which might give false positives, as in "service started (I've seen the OK on screen!) but not running"

> 2) "stop and wait": start the daemon, and wait, either:
>    a) a short fixed period of time, or
>    b) in a loop until the pidfile appears, with some maximum wait time

which might give false positives, as in "service started (also known as pidfile exists but the process is dead) but not running"

> A notable implication of (1) is that running e.g. "service xxx status" (or stop etc) may not immediately succeed after a start, nor may the service be immediately usable directly after a "start" returns.
>
> (2b) may have surprising failure cases of an init script waiting a long time to return - dirsrv will wait up to ten minutes, which seems rather extreme. (2a) may be unreliable, being dependent on timing/machine speed.
>
> I found at least one init script which also has this "stop and wait" behaviour for stop (mysqld).
>
> I'd instinctively prefer (1) from a "do one thing and do it well" perspective; (2) starts down the road of a better/more complex form of service-monitoring/management and ends up doing it really badly in messy sh script in N places.
>
> (A logical extension of (2) would be to require not merely that the pidfile exists, but that the service is accepting connections on TCP port N, before returning from the init script "start" invocation.)
>
> Thoughts?

Well, I'd say it depends on how we define the "start" part: fire and forget, start and make sure it was started, or start and make sure it is running.
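The second false positive above (pidfile exists but the process is dead) can be told apart from a live daemon with a check along these lines. This is only a sketch: pid_is_alive is a hypothetical helper name, and it assumes the pidfile holds a single numeric PID on its first line.

```shell
#!/bin/sh
# Sketch: succeed only if the pidfile names a process that currently exists.
# pid_is_alive is a hypothetical helper, not part of any init library.
pid_is_alive() {
    pidfile=$1
    # A missing or empty pidfile means there is nothing to check
    [ -s "$pidfile" ] || return 1
    pid=$(head -n 1 "$pidfile")
    # Reject anything that is not a plain decimal number
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    # kill -0 delivers no signal; it only tests whether the process can be
    # signalled. For an unprivileged caller it also fails on processes
    # owned by other users, so this is most meaningful when run as root.
    kill -0 "$pid" 2>/dev/null
}
```

Even this only establishes that some process with that PID exists; it says nothing about whether the process is the daemon in question or whether it is serving requests.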
Re: init script behaviour
On Tue, Jun 15, 2010 at 8:08 AM, Joe Orton jor...@redhat.com wrote:
> I'd instinctively prefer (1) from a "do one thing and do it well" perspective; (2) starts down the road of a better/more complex form of service-monitoring/management and ends up doing it really badly in messy sh script in N places.

Absolutely. The core OS doesn't need to come with a half-assed reimplementation of Nagios. "service foo status" should be just "is the pid running", and that's how things will be with systemd as I understand it.
Re: init script behaviour
On Tue, Jun 15, 2010 at 03:30:05PM +0300, Manuel Wolfshant wrote:
> On 06/15/2010 03:08 PM, Joe Orton wrote:
> *snip*
>> Thoughts?
> Well, I'd say it depends on how we define the "start" part: fire and forget, start and make sure it was started, or start and make sure it is running.

I'd say fire and forget or something close for most sysv initscripts. If you want to do better you need a modern tool like systemd/upstart/etc. Trying to do it better in bash just makes for piles of ugly, and the weird failure modes and corner cases will usually end up being worse than the problem.

--CJD
Re: init script behaviour
Once upon a time, Casey Dahlin cdah...@redhat.com said:
> I'd say fire and forget or something close for most sysv initscripts. If you want to do better you need a modern tool like systemd/upstart/etc. Trying to do it better in bash just makes for piles of ugly, and the weird failure modes and corner cases will usually end up being worse than the problem.

A well-behaved daemon should be doing all the checking possible before forking to go into the background. The init scripts do check the exit code, so configuration errors, failure to bind to sockets, etc. should be (and in most cases are) caught that way.

--
Chris Adams cmad...@hiwaay.net
Systems and Network Administrator - HiWAAY Internet Services
I don't speak for anybody but myself - that's enough trouble.
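The exit-code check described above amounts to something like the following sketch. It assumes the daemon's launcher does its validation before forking and returns non-zero on failure; start_daemon and the message format are hypothetical and are not the helper functions shipped with Fedora's initscripts.

```shell
#!/bin/sh
# Sketch: run a daemon's launcher and report based on its exit status.
# start_daemon is a hypothetical wrapper used only for illustration.
start_daemon() {
    name=$1
    shift
    # "$@" is the launcher command line. A well-behaved daemon exits
    # non-zero here if its config is bad or it cannot bind its sockets.
    if "$@"; then
        echo "Starting $name: OK"
        return 0
    else
        rc=$?    # exit status of the launcher that just failed
        echo "Starting $name: FAILED (exit status $rc)" >&2
        return "$rc"
    fi
}
```

This stays within the "fire and forget" approach: the script returns as soon as the launcher does, and the quality of the result depends entirely on how much the daemon verifies before it backgrounds itself.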