On Fri, Apr 16, 2021 at 11:29:30AM +0200, Peter Zijlstra wrote:
> > So I think we've had proposals for being able to close fds in the past;
> > while preserving groups etc. We've always pushed back on that because of
> > the resource limit issue. By having each counter be a filedesc we get a
> > natural limit on the amount of resources you can consume. And in that
> > respect, having to use 400k fds is things working as designed.
> >
> > Anyway, there might be a way around this..

So how about we flip the whole thing sideways: instead of doing one
event for multiple cgroups, do an event for multiple CPUs.

Basically, allow:

	perf_event_open(.pid=fd, cpu=-1, .flag=PID_CGROUP);

Which would have the kernel create nr_cpus events [the corollary is that
we'd probably also allow: (.pid=-1, cpu=-1) ].

Output could be done by adding FORMAT_PERCPU, which takes the current
read() format and writes a copy for each CPU event. (p)read(v)() could
be used to explode or partial read that.

This gets rid of the nasty variadic nature of the
'get-me-these-n-cgroups', while still getting rid of the n*m fd issue
you're facing.
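
To make the FORMAT_PERCPU idea a bit more concrete, here is a rough
userspace sketch of how such a buffer might be consumed. None of this
exists: FORMAT_PERCPU is only proposed above, and the layout assumed here
(nr_cpus back-to-back copies of the existing non-group read() record,
indexed by CPU number) is a guess. struct read_record and the percpu_*()
helpers are hypothetical names made up for illustration; the record fields
match what read() returns today with PERF_FORMAT_TOTAL_TIME_ENABLED |
PERF_FORMAT_TOTAL_TIME_RUNNING set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * One event's record as read() lays it out today when read_format is
 * PERF_FORMAT_TOTAL_TIME_ENABLED | PERF_FORMAT_TOTAL_TIME_RUNNING
 * (no PERF_FORMAT_GROUP, no PERF_FORMAT_ID).
 */
struct read_record {
	uint64_t value;
	uint64_t time_enabled;
	uint64_t time_running;
};

/*
 * ASSUMPTION: a FORMAT_PERCPU buffer is nr_cpus such records in CPU
 * order. This is the byte offset of one CPU's copy, i.e. the offset one
 * would pass to pread() for a partial read.
 */
size_t percpu_offset(unsigned int cpu)
{
	return (size_t)cpu * sizeof(struct read_record);
}

/*
 * Copy a single CPU's record out of a full per-CPU buffer.
 * Returns 0 on success, -1 if the buffer does not cover that CPU.
 */
int percpu_get(const void *buf, size_t len, unsigned int cpu,
	       struct read_record *out)
{
	size_t off = percpu_offset(cpu);

	assert(buf && out);
	if (off + sizeof(*out) > len)
		return -1;
	memcpy(out, (const char *)buf + off, sizeof(*out));
	return 0;
}

/* Sum the counter value over every complete record in the buffer. */
uint64_t percpu_total(const void *buf, size_t len)
{
	size_t n = len / sizeof(struct read_record);
	uint64_t total = 0;

	assert(buf);
	for (size_t i = 0; i < n; i++) {
		struct read_record r;

		memcpy(&r, (const char *)buf + i * sizeof(r), sizeof(r));
		total += r.value;
	}
	return total;
}
```

Under that assumed layout, a full read() of nr_cpus * sizeof(struct
read_record) bytes would be the 'explode' case, and a pread() at
percpu_offset(cpu) would fetch one CPU's copy through the same single fd.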