On Thu, 7 Nov 2024 at 16:45, Mathieu Desnoyers <[email protected]> wrote:
>
> On 2024-11-07 07:25, Marco Elver wrote:
> > prctl() is a complex syscall which multiplexes its functionality based
> > on a large set of PR_* options. Currently we count 64 such options. The
> > return value of unknown options is -EINVAL, and doesn't distinguish from
> > known options that were passed invalid args that also return -EINVAL.
> >
> > To understand if programs are attempting to use prctl() options not yet
> > available on the running kernel, provide the task_prctl_unknown
> > tracepoint.
> >
> > Note, this tracepoint is in an unlikely cold path, and would therefore
> > be suitable for continuous monitoring (e.g. via perf_event_open).
> >
> > While the above is likely the simplest usecase, additionally this
> > tracepoint can help unlock some testing scenarios (where probing
> > sys_enter or sys_exit causes undesirable performance overheads):
> >
> > a. unprivileged triggering of a test module: test modules may register a
> >    probe to be called back on task_prctl_unknown, and pick a very large
> >    unknown prctl() option upon which they perform a test function for an
> >    unprivileged user;
> >
> > b. unprivileged triggering of an eBPF program function: similar
> >    as idea (a).
> >
> > Example trace_pipe output:
> >
> > test-484 [000] ..... 631.748104: task_prctl_unknown: comm=test
> > option=1234 arg2=101 arg3=102 arg4=103 arg5=104
> >
>
> My concern is that we start adding tons of special-case
> tracepoints to the implementation of system calls which
> are redundant with the sys_enter/exit tracepoints.
>
> Why favor this approach rather than hooking on sys_enter/exit ?

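Hooking sys_enter/exit is __extremely__ expensive when deployed at scale,
because the probe then runs on every syscall rather than only in the
unlikely cold path. See the note in the commit description above.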
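
For illustration only (this is not part of the patch), a minimal userspace
sketch of the ambiguity the commit description mentions. The helper name
prctl_errno() is hypothetical, and option 1234 is assumed to be unknown to
the running kernel, as in the example trace output. PR_SET_DUMPABLE is a
known option that rejects any arg2 other than 0 or 1.

```c
#include <errno.h>
#include <sys/prctl.h>

/*
 * Hypothetical helper: call prctl() with the given option and arg2
 * (remaining args zero) and return the resulting errno, or 0 on success.
 */
static int prctl_errno(int option, unsigned long arg2)
{
	errno = 0;
	if (prctl(option, arg2, 0UL, 0UL, 0UL) == -1)
		return errno;
	return 0;
}

/*
 * Returns 1 if an unknown option (1234, assumed unused) and a known
 * option with an invalid argument (PR_SET_DUMPABLE with arg2=1234)
 * both fail with EINVAL, i.e. userspace cannot tell them apart.
 */
static int prctl_einval_is_ambiguous(void)
{
	int unknown = prctl_errno(1234, 0UL);
	int bad_arg = prctl_errno(PR_SET_DUMPABLE, 1234UL);

	return unknown == EINVAL && bad_arg == EINVAL;
}
```

Both calls are expected to fail with EINVAL, so the return value alone does
not tell userspace (or a sys_exit probe) whether the option was unknown;
only the first call would reach the proposed tracepoint.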
