> This kind of interface would be nice to have in utrace only if it were
> significantly cheaper than doing what we do now: potentially attaching
> utrace-engines to each thread -- or (in the near future, systemtap
> bug# 6445) to subtrees of the process hierarchy.

The overhead (memory + setup/teardown cost) is per-thread X per-tracer.
We'd have to measure what it is in practice. I'd guess the memory won't
be an issue unless you were really milking the system for performance.
I'd guess the first issue will be big chunks of slow at script
setup/teardown when there are lots of threads on the system.

The main feature of global tracing is that it avoids this overhead. It
goes without saying that you could always just trace every thread
individually and produce the same result at high level. The other
feature is its simplicity. The "baseline" work to do global tracing via
by-thread is not entirely trivial, as David will attest.

For subtrees, there wouldn't any time soon be an option other than
global or by-each-thread. In the long run, there might be some new
optimizations for using utrace to treat many threads all the same.
Whatever comes along to benefit that case, I don't think it will
constitute an argument either for or against global tracing.

> (An extra chunk of work per clone() may well be cheaper than extra work
> at every system call.)

I assume what you mean here is for global syscall tracing. There is no
such trade-off. With vanilla utrace, you always do both. With global
tracing, you still always do the latter.

> > Systemtap doesn't currently change outcomes in a callback, so reason
> > c. doesn't apply much. [...]
>
> Actually, this is the main reasons that utrace-level support sounds
> interesting to me. We have had requests for exposing some thread
> control primitives to systemtap probe handlers - to block/resume, send
> signals, that sort of stuff. *If* going through utrace (as opposed to
> a separate API) would make this smoother and compose better (should
> e.g. there be different systemtap scripts fighting over the threads),
> that could be worthwhile.

We'd have to discuss concrete scenarios to get entirely clear on this.
But offhand those sound like things that make sense to do with vanilla
utrace on individual threads. That is, blocking a thread implies that
you maintain per-thread state, as opposed to just a per-event
consideration of the thread on hand. (Also, for blocking specifically,
utrace is the only kosher way to go about it--anything else fails badly
at playing nicely with other tracing and debugging facilities.)

So to me this says you just need whatever global tracing facility you're
using to have a good place to make utrace setup calls when you discover
you want to do this sort of thing. That's a feature that utrace global
tracing clearly has. But given a particular scenario and a given other
means of getting its necessary event hooks, that other means might well
be fine in this regard too. To know, we'd have to get concrete about
each of the specific tracepoints you would use instead.

Thanks,
Roland