[observability-discuss] Semantics of "ptime -p"

Chad Mynhier Sat, 29 Sep 2007 11:02:31 -0400

I'm trying to implement the "ptime -p <pid>" part of
http://bugs.opensolaris.org/view_bug.do?bug_id=4532599.  (I'm aware
that the bug/RFE is marked "6-Fix understood", but the responsible
engineer is no longer at Sun, so I don't know what progress might be
made on it.)


What should the semantics of "ptime -p <pid>" be?  I see two options:
- ptime should report usage after the specified process has finished.
- ptime should report statistics for the process up to that point
(i.e., ptime returns almost immediately).

I can see both being useful, and I've implemented versions of both.
Unfortunately, I can't figure out how to implement the first without
doing a pr_waitid() on the parent process, which hangs the parent
process.

Here are the options I can think of:

- Sit in a loop calling pr_waitid(... WNOHANG ...) on the parent.
This seems very inefficient/expensive, and there's a race condition
where the parent could reap the child on its own.
- Instruct the parent to ignore SIGCLD, somehow capture all SIGCLDs
destined for the parent, and pass through those that we don't care
about (telling the parent to stop ignoring SIGCLD long enough for us
to pass one through, then re-ignoring it).  I don't know that this is
even feasible, but it still has a race condition problem, and it
doesn't seem like a good approach in general.
- Adopt the process (change the parent process to be ptime), call a
regular waitid() on it (with WNOWAIT), grab statistics, give the
process back to its original parent, then send the original parent a
SIGCLD.  This seems the most feasible, modulo the adoption/giveback
methods.
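
To make the waitid() flags in these options concrete, here is a minimal
sketch using plain POSIX calls on a process's own child. It is not the
ptime implementation: it does not use pr_waitid() or libproc, and it
does not show adoption or giveback. The function names
poll_until_exited() and peek_then_reap() are made up for illustration.
The first polls with WNOHANG as in the first option; WNOWAIT, as in the
third option, leaves the child waitable so that a later waitid()
(standing in for the original parent's own reap) still succeeds.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Poll with WNOHANG until child `pid` has exited. WNOWAIT leaves the
 * child in a waitable state, so this only peeks at the exit status.
 * Returns the number of polls made, or -1 on error.
 */
int poll_until_exited(pid_t pid, siginfo_t *info)
{
    struct timespec pause = { 0, 1000000 };   /* 1 ms between polls */
    int polls = 0;

    for (;;) {
        /* Zero the struct so si_pid == 0 reliably means "not yet". */
        memset(info, 0, sizeof(*info));
        if (waitid(P_PID, (id_t)pid, info,
                   WEXITED | WNOHANG | WNOWAIT) != 0)
            return -1;
        polls++;
        if (info->si_pid == pid)
            return polls;
        nanosleep(&pause, NULL);
    }
}

/*
 * Fork a child that exits with `code` after a short delay, peek at its
 * status by polling, then reap it with an ordinary blocking waitid().
 * Returns the exit status seen by the final reap, or -1 on error or if
 * the peek and the reap disagree.
 */
int peek_then_reap(int code)
{
    struct timespec delay = { 0, 20000000 };  /* 20 ms */
    siginfo_t peek, reap;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        nanosleep(&delay, NULL);
        _exit(code);
    }

    if (poll_until_exited(pid, &peek) < 0)
        return -1;

    /* The child was not consumed by the peek, so this still works. */
    if (waitid(P_PID, (id_t)pid, &reap, WEXITED) != 0)
        return -1;

    if (peek.si_code != CLD_EXITED || reap.si_code != CLD_EXITED)
        return -1;
    if (peek.si_status != reap.si_status)
        return -1;
    return reap.si_status;
}
```

Calling peek_then_reap(7) should return 7 when SIGCLD has its default
disposition. The polling loop also shows the cost noted above: the
waiter wakes up repeatedly, and in the real case nothing stops the
actual parent from reaping the child between two polls.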

So, is this version of "ptime -p" semantics feasible?  Or should the
semantics be the second version ("snapshot" semantics)?  I don't have
any visibility into the RFE other than the brief description.

Thanks,
Chad Mynhier
