init script behaviour

2010-06-15 Thread Joe Orton
Any opinions on this?  I've had a query.

What should "service xxx start" do for a daemon - or more specifically, 
when should it return?  Current init scripts are inconsistent, and follow 
one of two general approaches:

1) fire and forget: start the daemon, return immediately

2) stop and wait: start the daemon, and wait, either:
 a) a short fixed period of time, or
 b) in a loop until the pidfile appears, with some maximum wait time
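
As a rough illustration of approach (2b), the loop might look like the sketch below. The function name wait_for_pidfile, its arguments, and the 10-second default are assumptions made for this example; they are not taken from any particular init script. Approach (1) would simply skip the call and return right after launching the daemon.

```sh
#!/bin/sh
# Hypothetical helper: wait up to $2 seconds (default 10) for the
# pidfile $1 to appear. Returns 0 once the file exists and is
# non-empty, 1 if the maximum wait time is reached first.
wait_for_pidfile() {
    pidfile=$1
    max_wait=${2:-10}
    waited=0
    while [ "$waited" -lt "$max_wait" ]; do
        [ -s "$pidfile" ] && return 0
        sleep 1
        waited=$((waited + 1))
    done
    # Final check, so a file written during the last sleep is not missed.
    [ -s "$pidfile" ]
}
```

A start function would launch the daemon and then call something like `wait_for_pidfile /var/run/xxx.pid 30` (path and limit are placeholders), reporting failure if the call returns non-zero.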

Notable implication of (1) is that running e.g. "service xxx status" (or 
"stop" etc.) may not immediately succeed after a start, nor may the 
service be immediately usable directly after a start returns.

(2b) may have surprising failure cases of an init script waiting a long 
time to return - dirsrv will wait up to ten minutes, which seems rather 
extreme.

(2a) may be unreliable, being dependent on timing/machine speed.

I found at least one init script which also has this stop-and-wait 
behaviour for stop (mysqld).

I'd instinctively prefer (1) from a "do one thing and do it well" 
perspective; (2) starts down the road of a better/more complex form of 
service-monitoring/management and ends up doing it really badly in messy 
sh script in N places.

(A logical extension of (2) would be to require not merely that the 
pidfile exists, but that the service is accepting connections on TCP 
port N, before returning from the init script start invocation)
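
A sketch of what that extension could look like is below. It assumes bash is installed, because the /dev/tcp pseudo-device is a bash feature and not part of POSIX sh; the function name wait_for_port and its arguments are hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: wait up to $3 seconds (default 10) for something
# to accept TCP connections on host $1, port $2.
# Assumption: bash is available; /dev/tcp is handled by bash itself.
wait_for_port() {
    host=$1
    port=$2
    max_wait=${3:-10}
    waited=0
    while [ "$waited" -lt "$max_wait" ]; do
        # The redirection succeeds only if the connection is accepted.
        # A filtered (not refused) port may block longer than one second.
        if bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}
```

Note that this only shows the port is open, not that the service behind it is answering requests correctly, which is where the comparison with a full monitoring tool starts to apply.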

Thoughts?

Regards, Joe
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel


Re: init script behaviour

2010-06-15 Thread Manuel Wolfshant
On 06/15/2010 03:08 PM, Joe Orton wrote:
> Any opinions on this?  I've had a query.
>
> What should "service xxx start" do for a daemon - or more specifically,
> when should it return?  There is inconsistency amongst different current
> init scripts, two general approaches:
>
> 1) fire and forget: start the daemon, return immediately

which might give false positives, as in "service started" (I've seen the 
OK on screen!) but not running

> 2) stop and wait: start the daemon, and wait, either:
>   a) a short fixed period of time, or
>   b) in a loop until the pidfile appears, with some maximum wait time

which might give false positives, as in "service started" (also known as 
"pidfile exists but the process is dead") but not running
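
One way to tell these cases apart is to check that the pid recorded in the pidfile still refers to a live process. The sketch below is only an illustration; pid_is_alive is a hypothetical name, and the check cannot detect a pid that has been reused by an unrelated process.

```sh
#!/bin/sh
# Hypothetical helper: succeed only if pidfile $1 exists, contains a
# numeric pid on its first line, and that pid can be signalled.
pid_is_alive() {
    pidfile=$1
    [ -s "$pidfile" ] || return 1
    pid=$(head -n 1 "$pidfile")
    case $pid in
        ''|*[!0-9]*) return 1 ;;   # empty or not purely numeric
    esac
    # Signal 0 sends nothing; it only tests whether the process can be
    # signalled. It also fails for a live process owned by another user
    # unless the caller is root, as init scripts normally are.
    kill -0 "$pid" 2>/dev/null
}
```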

> Notable implication of (1) is that running e.g. "service xxx status" (or
> "stop" etc.) may not immediately succeed after a start, nor may the
> service be immediately usable directly after a start returns.
>
> (2b) may have surprising failure cases of an init script waiting a long
> time to return - dirsrv will wait up to ten minutes, which seems rather
> extreme.
>
> (2a) may be unreliable, being dependent on timing/machine speed.
>
> I found at least one init script which also has this stop-and-wait
> behaviour for stop (mysqld).
>
> I'd instinctively prefer (1) from a "do one thing and do it well"
> perspective; (2) starts down the road of a better/more complex form of
> service-monitoring/management and ends up doing it really badly in messy
> sh script in N places.
>
> (A logical extension of (2) would be to require not merely that the
> pidfile exists, but that the service is accepting connections on TCP
> port N, before returning from the init script start invocation)
>
> Thoughts?
Well, I'd say it depends on how we define the "start" part: "fire and 
forget", "start and make sure it was started", or "start and make sure 
it is running".



Re: init script behaviour

2010-06-15 Thread Colin Walters
On Tue, Jun 15, 2010 at 8:08 AM, Joe Orton jor...@redhat.com wrote:

> I'd instinctively prefer (1) from a "do one thing and do it well"
> perspective; (2) starts down the road of a better/more complex form of
> service-monitoring/management and ends up doing it really badly in messy
> sh script in N places.

Absolutely.  The core OS doesn't need to come with a half-assed
reimplementation of Nagios.  "service foo status" should be just "is
the pid running", and that's how things will be with Systemd as I
understand it.


Re: init script behaviour

2010-06-15 Thread Casey Dahlin
On Tue, Jun 15, 2010 at 03:30:05PM +0300, Manuel Wolfshant wrote:
> On 06/15/2010 03:08 PM, Joe Orton wrote:
*snip*
> > Thoughts?
> Well, I'd say it depends on how we define the "start" part: "fire and
> forget", "start and make sure it was started", or "start and make sure
> it is running".

I'd say "fire and forget" or something close for most sysv initscripts. If
you want to do better you need a modern tool like systemd/upstart/etc.
Trying to do it better in bash just makes for piles of ugly, and the
weird failure modes and corner cases will usually end up being worse
than the problem.

--CJD


Re: init script behaviour

2010-06-15 Thread Chris Adams
Once upon a time, Casey Dahlin cdah...@redhat.com said:
> I'd say "fire and forget" or something close for most sysv initscripts. If
> you want to do better you need a modern tool like systemd/upstart/etc.
> Trying to do it better in bash just makes for piles of ugly, and the
> weird failure modes and corner cases will usually end up being worse
> than the problem.

A well-behaved daemon should be doing all the checking possible before
forking to go into the background.  The init scripts do check the exit
code, so configuration errors, failure to bind to sockets, etc. should
be (and in most cases are) caught that way.
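
A minimal sketch of that pattern is below. It assumes the daemon validates its configuration and binds its sockets before forking, and that its foreground process exits non-zero if any of that fails; start_daemon is a hypothetical name used only for this example.

```sh
#!/bin/sh
# Hypothetical start helper: run the daemon command given as arguments
# and report OK or FAILED based on the exit code of the foreground
# process. Anything that goes wrong after the fork is not seen here.
start_daemon() {
    printf 'Starting %s: ' "$1"
    if "$@"; then
        echo OK
        return 0
    else
        rc=$?
        echo FAILED
        return "$rc"
    fi
}
```

With this approach the init script stays "fire and forget" and still catches the early failures, without any waiting loop of its own.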

-- 
Chris Adams cmad...@hiwaay.net
Systems and Network Administrator - HiWAAY Internet Services
I don't speak for anybody but myself - that's enough trouble.