Re: [Haskell-cafe] Quick Angel User's Survey

Alexander V Vershilov Sat, 14 Sep 2013 13:42:30 -0700

Hello, Michael.

I'm a potential Angel user, and when I have time I'd like to add the
option of using Angel as a supervisor for OpenRC services.


Common practice is:

Send SIGTERM a couple of times, then SIGQUIT a couple of times, then
SIGKILL. You need to wait for some time between each action. If your
program is the parent of the service, it is easy to wait for the child's
death; otherwise you need prctl PR_{SET,GET}_CHILD_SUBREAPER [1] in order
to wait for the service correctly. Alternatively, you can send signal 0 to
check whether the process is still alive; this is not reliable, but it is
portable.
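
The escalation described above can be sketched roughly as follows. This is
only an illustration in Python, not Angel's code: the names is_alive and
stop_with_escalation and the parameters attempts, wait and poll are made up
for the example. The liveness check is the signal 0 probe, so it shares its
weaknesses: it cannot tell a zombie from a running process, and it can be
fooled if the PID is reused.

```python
import os
import signal
import time


def is_alive(pid):
    """Probe with signal 0: nothing is delivered, the kernel only checks the PID."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no process with this PID
    except PermissionError:
        return True   # the process exists but belongs to another user
    return True


def stop_with_escalation(pid, attempts=2, wait=1.0, poll=0.05):
    """Send SIGTERM, then SIGQUIT (each up to `attempts` times), then SIGKILL.

    After each signal, poll for up to `wait` seconds. Returns True as soon as
    the process is gone, False if it is still there after SIGKILL.
    """
    plan = [signal.SIGTERM] * attempts + [signal.SIGQUIT] * attempts
    plan.append(signal.SIGKILL)
    for sig in plan:
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            return True  # it went away before we could signal it
        deadline = time.monotonic() + wait
        while time.monotonic() < deadline:
            if not is_alive(pid):
                return True
            time.sleep(poll)
    return False
```

If the supervisor is the parent, it still has to reap the child (for example
with waitpid) before the probe reports the process as gone.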
As an additional possible solution (which may lead to problems), you can
traverse the service's process tree and kill the processes starting with
the leaves.
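
A minimal sketch of the leaves-first ordering, again in Python and purely
illustrative. It assumes a snapshot mapping each PID to its parent PID
(parent_of, a hypothetical name); how that snapshot is obtained, for example
from the output of ps, is left out. The snapshot can be stale by the time the
signals are sent, which is one of the ways this approach can go wrong.

```python
def kill_order(root, parent_of):
    """Return the PIDs in the subtree of `root`, descendants before ancestors.

    `parent_of` maps pid -> parent pid. The result is a post-order walk, so
    every process appears after all of its children and `root` comes last.
    """
    children = {}
    for pid, ppid in parent_of.items():
        children.setdefault(ppid, []).append(pid)

    order = []
    seen = set()

    def visit(pid):
        if pid in seen:  # guard against a malformed snapshot with cycles
            return
        seen.add(pid)
        for child in sorted(children.get(pid, [])):
            visit(child)
        order.append(pid)

    visit(root)
    return order
```

For a snapshot {100: 1, 101: 100, 102: 100, 103: 101}, kill_order(100, ...)
gives [103, 101, 102, 100], so the service itself is signalled last.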

In any case, the ability to override how a service is killed is badly
needed, as most supervision systems take a very limited approach to it.

[1] https://lkml.org/lkml/2013/1/10/521



On 14 September 2013 23:20, Michael Xavier <mich...@michaelxavier.net> wrote:

> Hey Cafe,
>
> I am the maintainer of Angel, the process monitoring daemon. Angel's job
> is to start a configured set of processes and restart them when they go
> away. I was responding to a ticket and realized that the correct
> functionality is not obvious in one case, so I figured I'd ask the
> stakeholders: people who use Angel. From what I know, most people who use
> Angel are Haskellers so this seemed like the place.
>
> When Angel is terminated, it tries to cleanly shut down any processes it
> is monitoring. It also shuts down processes that it spawned when they are
> removed from the config and the config is reloaded via the HUP signal. It
> uses terminateProcess from System.Process which sends a SIGTERM to the
> program on *nix systems.
>
> The trouble is that SIGTERM can be intercepted and a process can still
> fail to shut down. Currently Angel issues the SIGTERM and hopes for the
> best. It also cleans pidfiles if there were any, which may send a
> misleading message. There are a couple of routes I could take:
>
> 1. Leave it how it is. Leave it to the user to make sure stubborn
> processes go away. I don't like this solution so much as it makes Angel
> harder to reason about from a user's perspective.
> 2. Send a TERM signal, then wait for a certain number of seconds, then
> send an uninterruptible signal like SIGKILL.
>
> There are some caveats with #2. I think I'd prefer the timeout to be
> configurable per-process. I think I'd also prefer that if no timeout is
> specified, we assume the user does not want us to use a SIGKILL. SIGKILL
> can be very dangerous for some processes like databases. I want explicit
> user permission to do something like this. If Angel generated a pidfile for
> the process, it should only be cleaned if Angel can confirm the process
> is dead. Otherwise it should be left so the user can handle it.
>
> So the real question: is the extra burden of an optional configuration
> flag per process worth this feature? Are my assumptions about path #2
> reasonable?
>
> Thanks for your feedback!
>
> --
> Michael Xavier
> http://www.michaelxavier.net
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
>


-- 
Alexander

