>   * Create another global variable utrace_possible_flags. Each bit
>     is set only if there is either a global tracer for the event,
>     or at least one tracer in the system (keep a global counter).
>   * Always check utrace_possible_flags first, and if it is set
>     (thus requesting the slow path anyway), only then check the
>     per-thread and global flags.

That seems like a bad trade-off.  The common case to optimize is that this
event is not going to be traced.  If someone somewhere is running strace
on their programs, my task should not go through any slow paths just
because of them.  That's a degradation from today's performance with plain
old ptrace.

Two negative tests on the fast path in the common case is surely better
than my task taking a slow false positive on what should be its fast path,
just because someone else somewhere ran "strace -f sleep 999999 &".
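
To make the comparison concrete, here is a minimal user-space sketch of the
two checks.  It is only an illustration under my reading of the proposal: the
struct, the EV_SYSCALL bit, and both function names are hypothetical
stand-ins, not actual utrace code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical event bit, for illustration only. */
#define EV_SYSCALL 0x1UL

/* Hypothetical stand-in for the per-thread state. */
struct task {
	unsigned long utrace_flags;	/* events traced on this thread */
};

static unsigned long global_utrace_flags;	/* events with a global tracer */
static unsigned long utrace_possible_flags;	/* proposed: any tracer anywhere */

/* Two negative tests: only this task's flags and the global flags matter. */
static bool slow_path_two_tests(const struct task *t, unsigned long event)
{
	return (t->utrace_flags & event) || (global_utrace_flags & event);
}

/* Proposed gate (as I read it): one system-wide word decides. */
static bool slow_path_possible_flags(unsigned long event)
{
	return (utrace_possible_flags & event) != 0;
}
```

With one unrelated traced task in the system, utrace_possible_flags has the
bit set, so an untraced task passes the proposed gate and pays for the
further checks, while the two-test version returns false for it immediately.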

If it comes down to "exactly the current check only is acceptable cost",
then the opposite direction is what makes sense to me.  That is, have
global tracing go do:
        task->utrace_flags |= global_utrace_flags;
on every task whenever a new bit is set in global_utrace_flags.  (Then
there can be some lazy fixup for stale task->utrace_flags values after
global_utrace_flags has bits cleared.  It's essentially the same plan
as for setting TIF_SYSCALL_TRACE for global syscall tracing.)
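
A rough user-space sketch of that plan, with the lazy fixup done when a task
reaches the slow path.  Everything except utrace_flags and
global_utrace_flags is a hypothetical name chosen for the example (own_flags,
the function names), and the locking and task-list walking a real kernel
would need are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

struct task {
	unsigned long utrace_flags;	/* the only word the fast path tests */
	unsigned long own_flags;	/* assumed: bits wanted by this task's own tracers */
};

static unsigned long global_utrace_flags;

/* Setting new global bits: push them into every task's fast-path word. */
static void global_trace_enable(struct task *tasks, size_t n, unsigned long bits)
{
	global_utrace_flags |= bits;
	for (size_t i = 0; i < n; i++)
		tasks[i].utrace_flags |= global_utrace_flags;
}

/* Clearing global bits: tasks are left with stale bits for now. */
static void global_trace_disable(unsigned long bits)
{
	global_utrace_flags &= ~bits;
}

/*
 * Slow path: recompute what is really wanted, drop stale bits (the lazy
 * fixup), and report whether the event still needs handling.
 */
static int slow_path(struct task *t, unsigned long event)
{
	unsigned long wanted = t->own_flags | global_utrace_flags;

	t->utrace_flags = wanted;
	return (wanted & event) != 0;
}
```

The fast path stays exactly the current single test of task->utrace_flags.
A task carrying a stale bit takes the slow path at most once for that event
after the global bit is cleared, and then its flags are correct again.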


Thanks,
Roland
