That's an entirely different sort of motivation than any of the ones I've had. :-) In fact, I might not have avoided siglock synchronization in the utrace "extension event" facility I've described for future work, had this not come to my attention. So that is good to consider.
It might be worthwhile to pin down more definitively how things are working in your case. I don't think we can be sure right away what its primary issue is. If siglock serialization is the problem, then the issue is not just the use of a SIGTRAP signal for a machine event.
The main means of getting some notification work done at a safe place is setting TIF_SIGPENDING so that we'll get into utrace_get_signal. Since that flag is overloaded for "check the signal queues" as well as "check in with utrace", setting it now means that the thread will always be obligated to take and release the siglock at least once before it can return to user mode. So e.g. threads responding to a request to become quiescent, even when report_quiesce clears the action flag and they never need to sleep, all hit the siglock bottleneck.
Untying from TIF_SIGPENDING is something I expect eventually to fall out of the development path. (It relates to the "soft-quiesce" feature.) But I'd expected that would probably come after the extension events. Anyway, more to think about.
Thanks,
Roland