On Wed, 2006-03-08 at 09:32 +0100, Thomas Mueller wrote:
> Hi Geo,
> 
> > But even working init.d scripts and working daemons can be killed or
> > stopped for lots of reasons, and unless you duplicate the work of init,
> > or manually intervene, they won't start back up.
> 
> I read that argument several times now but I still think init doesn't
> help here.
> 
> I have to monitor my processes - it doesn't matter how I start them.
> What if my webserver doesn't answer requests but is still running? Init
> doesn't care.
> That's why I use monit [1] and let it check if the process is running
> and if it serves a specific HTML, PHP, CGI page. If any test fails I
> always try to kill all the processes and restart them.
> With init I could only leave out the 'restart' part.

So using inittab _does_ help.

Init can help in another way: it can keep monit running. After all, what
happens if monit is killed?

Nonetheless, "hung" processes are another deal: "hung" processes are the
result of broken code, or a broken design.

The problem I've been pointing out is that processes can be killed for
any reason- and no amount of defensive programming in the program itself
can protect against those kills.

The only process guaranteed to be running is pid=1, aka init.

Fortunately, init can monitor processes and restart them if they die for
any reason.
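
To make "restart them if they die" concrete, here is a rough sketch of
the loop that a respawn entry in /etc/inittab asks init to run for you.
It is not init's code, only an illustration: the function name respawn
and the max_runs cap are made up for this mail, and the cap exists only
so the example terminates (init keeps going, and throttles entries that
respawn too fast).

```python
import subprocess
import sys
import time


def respawn(argv, max_runs=3, delay=0.0):
    """Start argv, wait for it to exit, and start it again.

    Returns the exit status of each run. A real supervisor would loop
    forever; max_runs is a hypothetical cap for demonstration only.
    """
    statuses = []
    for _ in range(max_runs):
        child = subprocess.Popen(argv)
        # wait() returns however the child ended: a normal exit code,
        # or a negative number if a signal killed it.
        statuses.append(child.wait())
        time.sleep(delay)  # small pause so a crash loop does not spin
    return statuses


if __name__ == "__main__":
    # A stand-in "daemon" that dies immediately with status 3.
    print(respawn([sys.executable, "-c", "import sys; sys.exit(3)"], max_runs=2))
```

An init.d script starts the daemon once and walks away; the point of
the loop is that somebody is still waiting on the child when it dies.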


> Additionally, monit watches the resources a process uses and can warn or
> restart it based on the resource usage.

So can setrlimit/ulimit. With ulimit, the process gets killed, and init
restarts it.

I also advocate enforced limits. Enforced limits are very good. I think
daemons should call setrlimit on themselves whenever possible (if started
as root), or else have a ulimit line in their startup script.
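
As a minimal sketch of a daemon limiting itself, assuming a Unix system
and Python's standard resource module (which wraps getrlimit/setrlimit),
something like the following could run early in startup. The helper name
set_soft_limit and the 3600-second figure are placeholders, not a
recommendation.

```python
import resource


def set_soft_limit(which, wanted):
    """Set the soft limit for resource `which`, capped at the hard limit.

    Returns the soft limit actually in force afterwards.
    """
    soft, hard = resource.getrlimit(which)
    if hard != resource.RLIM_INFINITY:
        # The soft limit may never exceed the hard limit.
        wanted = min(wanted, hard)
    resource.setrlimit(which, (wanted, hard))
    return resource.getrlimit(which)[0]


if __name__ == "__main__":
    # Hypothetical policy: this daemon may use at most one hour of CPU.
    print(set_soft_limit(resource.RLIMIT_CPU, 3600))
```

The hard limit is left alone here, so the sketch only changes the soft
limit and never tries to raise the hard one.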

Sometimes deadlock (a kind of hang) can be handled using ulimit -x,
other times (spins) the hangs can be handled using ulimit -t.
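
For the spin case, a small sketch of what ulimit -t buys you, again
assuming Unix and Python's resource module: the child gets a CPU-time
limit before exec, spins, and the kernel kills it with SIGXCPU. The name
run_with_cpu_limit is invented for the example.

```python
import resource
import signal
import subprocess
import sys


def run_with_cpu_limit(argv, cpu_seconds):
    """Run argv with a soft CPU-time limit and return its exit status."""

    def limit_cpu():
        # Runs in the child between fork and exec, which is roughly
        # what a `ulimit -t` line in a startup script does.
        _, hard = resource.getrlimit(resource.RLIMIT_CPU)
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, hard))

    return subprocess.run(argv, preexec_fn=limit_cpu).returncode


if __name__ == "__main__":
    # A stand-in for a daemon stuck in a busy loop.
    status = run_with_cpu_limit([sys.executable, "-c", "while True: pass"], 1)
    # A negative status means "killed by that signal number".
    print(status == -signal.SIGXCPU)
```

Whoever is waiting on the child (init, in my argument) sees it die and
can start a fresh copy.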

Other times still, programs like monit can be helpful in monitoring
usability of a service, but since the usability is directly impacted by
software bugs within the service provider, I wonder how long people want
to run such software...

... but I suppose that's a different discussion :)


> [1] http://www.tildeslash.com/monit/
> [2] http://www.rsbac.org/

These are good things to know about.

I don't expect init to replace _every_ monitoring and system management
function (I'm not that ludicrous), but I'm specifically looking for ways
in which it cannot replace init.d.

As a side note, do you have any links to how to get rsbac to enforce
limits- or rather, what other kinds of limits it can enforce? I found
some links on the site referring to RES and doing essentially a sitewide
per-user or per-grandfather-pid rlimit, but are there others? Examples?

-- 
Internet Connection High Quality Web Hosting
http://www.internetconnection.net/
