I'm sorry for the delay. I'm picking up with responding to the parts of
your review that I did not include in my first reply. I appreciate very
much the discussion you've had with Oleg about the issues that I did not
address myself. I look forward to your replies to my comments in that
first reply, which we have yet to see.
Seems inconsistent on u32 vs enum utrace_resume_action.
Why force enum utrace_resume_action into a u32?
The first argument to almost all callbacks (all the ones made when the
task is alive) is called action and it's consistent that its low bits
contain an enum utrace_resume_action. The argument is u32 action in
the report_signal and report_syscall_entry callbacks, where it combines
an enum utrace_resume_action with an enum utrace_{signal,syscall}_action
(respectively). It would indeed be more consistent to use u32 action
throughout, but it seemed nicer not to have engine writers always
writing utrace_resume_action(action) for all the cases where there are
no other bits in there.
The return value is a mirror of the u32 action argument. In many
calls, it has only an enum utrace_resume_action in it. But in some it
combines that with another enum utrace_*_action. There I went for
consistency (and less typing) in the return type, since it doesn't make
any difference to how you have to write the code in your callback
function. The main reason I used u32 at all instead of unsigned int
is just its shortness for less typing.
I don't really mind changing these details to whatever people think is
best. The other people writing code to use the utrace API may have more
opinions than I do. I guess it could even be OK to make the return
value and action argument always a plain enum utrace_resume_action,
and use a second in/out enum utrace_{signal,syscall}_action *
parameter in those particular calls. But that does put some more
register pressure and loads/stores into this API.
Seems inconsistent in the bitfield type, also it feels like that 3 the 7
and the enum should be more tightly integrated, maybe add:
UTRACE_RESUME_MAX
#define UTRACE_RESUME_BITS (ilog2(UTRACE_RESUME_MAX))
#define UTRACE_RESUME_MASK ((1 UTRACE_RESUME_BITS) - 1)
Yes, that's a good cleanup. Thanks.
(ilog2(7) is 2, so ilog2() + 1 is what you meant.)
+static struct utrace_engine *matching_engine(
[...]
The function does a search, suggesting the function name ought to have
something like find or search in it.
Ok, I'll make it find_matching_engine.
Quite gross this.. can't we key off the
tracehoook_report_clone_complete() and use a wakeup there?
Yes, we intended to clean this up at some point later. But doing that
is just a distraction and complication right now so we put it off. For
this case, however, I suppose we could just punt for the initial version.
This code exists to support the special semantics of calling
utrace_attach_task() from inside the parent's report_clone() callback.
That guarantees the parent that it wins any race with any third thread
calling utrace_attach_task(). This guarantees it will be first attacher
in the callback order, but the general subject of callback order is
already something we think we will want to revisit in the future after
we have more experience with lots of different engines trying to work
together. It's most meaningful when using the UTRACE_ATTACH_EXCLUSIVE
flag--then you can use UTRACE_ATTACH_EXCLUSIVE|UTRACE_ATTACH_MATCH_OPS
to synchronize with other threads trying to attach the same kind of
engine to a task, and give special priority in that to the engine that
attaches in report_clone() from tracing the parent.
Really I came up with this special semantics originally just with ptrace
in mind. ptrace is an engine that wants to exclude other tracer threads
attaching asynchronously with the same kind of engine, and that wants to
give special priority on a child to the parent's tracer (to implement
PTRACE_O_TRACECLONE et al). However, Oleg's current ptrace code still
relies on the old hard-coded locking kludges to exclude the separate
attachers and to magically privilege the auto-attach from the parent's
tracer. So we are not relying on this special semantics yet.
We could just punt utrace_attach_delay() and remove the associated
documentation about the special rights of report_clone() calling
utrace_attach_task(). If we decide it helps clean things up later when
we finish more cleanup of the ptrace world, then we can add the fancy
semantics back in later.
Does this really need the inline?
It has one caller and that call is unconditional.
Asymmetric locking like this is really better not done, and looking at
the callsites its really no bother to clean that up, arguably even makes
them saner.
By assymetric you mean that utrace_reap releases a lock (as the
__releases annotation indicates). As should be obvious from the code, the
unlock is done before the loop that does -report_reap callbacks and
utrace_engine_put() (which can make