> Not sure if this is the proper forum for this question. If not, redirect me.
> A large process may take a relatively long time to complete dumping core
> after it catches a signal. Its parent process meanwhile is blissfully
> unaware of the fact because the SIGCHLD doesn't get delivered until the
> dumping activity is complete and the child is ready to get reaped.

> So the question is: is there a means by which it can be unambiguously
> determined that the process is currently dumping core? I've looked at proc(4)
> and it's vague on the topic as far as I can tell. For example, lwpstatus:pr_cursig
> is described as the "current" signal, but the elaboration says it is the
> "next" signal to be delivered, as in the future. In the case I'm asking
> about the signal has already been delivered and the resulting action was to
> begin dumping core.

If you really need to do this, you can look at psinfo_t.pr_flag (NOT the same
as pr_flags elsewhere) to get the private kernel flags and look for the SDOCORE
bit defined in sys/proc.h.  But this is a Private interface, and isn't part of
the stable set of /proc bits.
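
As a rough illustration of the check described above, here is a minimal
sketch in C. The helper name psinfo_flag_set() is made up for this example,
and it assumes pr_flag is the first int in the psinfo file; on a real system
you would more likely read a whole psinfo_t (from <procfs.h>) and test its
pr_flag member. The mask is passed in by the caller so that the sketch does
not hard-code a value for SDOCORE.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: return 1 if any bit in `mask` is set in the pr_flag
 * word of the psinfo file at `path`, 0 if not, and -1 if the file cannot
 * be read.
 *
 * Assumption: pr_flag is the first int of psinfo_t, so only the leading
 * sizeof (int) bytes of the file are read.
 */
int
psinfo_flag_set(const char *path, unsigned int mask)
{
	int flag;
	int fd;
	ssize_t n;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return (-1);

	n = read(fd, &flag, sizeof (flag));
	(void) close(fd);

	/* A short read means we did not get a complete flag word. */
	if (n != (ssize_t)sizeof (flag))
		return (-1);

	return (((unsigned int)flag & mask) != 0);
}
```

A caller would pass the psinfo path of the child and the private bit, for
example psinfo_flag_set("/proc/<pid>/psinfo", SDOCORE) with SDOCORE taken
from <sys/proc.h>. Since the flag is Private, treat the result as a hint
that can change between releases.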

We really need to make a stable interface for this, either by adding another
contract(4) event to indicate the start of a core dump, or a full-fledged
/proc pr_flags bit to correspond to this state.

Regardless of the interface, one thing you need to be careful of here is that
while the process is dumping core, it still has file descriptors open (because
we may end up putting info about them into the core file).  Thus if you take
action on a failing process before the core finishes, and your action is to,
say, respawn something which tries to bind to the same socket or whatever,
your respawned process may end up failing until after the core dump completes.
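
One way a respawned process might cope with this is to retry the bind for a
while instead of giving up on the first failure. The sketch below is only an
illustration of that idea, not something the /proc interfaces provide; the
helper name bind_with_retry() and its attempts/delay parameters are made up
for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to bind `fd` to `addr`, retrying while the
 * address is still in use (for instance because a dying predecessor still
 * holds its descriptors open during a core dump).
 *
 * Returns 0 on success, or -1 with errno set. Any error other than
 * EADDRINUSE is reported immediately without retrying.
 */
int
bind_with_retry(int fd, const struct sockaddr *addr, socklen_t len,
    int attempts, unsigned int delay_sec)
{
	int i;

	assert(addr != NULL);

	for (i = 0; i < attempts; i++) {
		if (bind(fd, addr, len) == 0)
			return (0);
		if (errno != EADDRINUSE)
			return (-1);
		/* Do not sleep after the final failed attempt. */
		if (i + 1 < attempts)
			(void) sleep(delay_sec);
	}

	/* All attempts used up (or none were allowed). */
	errno = EADDRINUSE;
	return (-1);
}
```

How many attempts and how long a delay make sense depends on how large the
dumping process is, so both are left to the caller.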

-Mike

-- 
Mike Shapiro, Solaris Kernel Development. blogs.sun.com/mws/
