On Wed, Sep 04, 2019 at 03:37:11PM +0100, Qais Yousef wrote:
> I managed to hook into sched_switch to get the nr_running of cfs tasks via
> eBPF.
>
> ```
> int on_switch(struct sched_switch_args *args) {
>     struct task_struct *prev = (struct task_struct *)bpf_get_current_task();
>     struct cgroup *prev_cgroup =
>         prev->cgroups->subsys[cpuset_cgrp_id]->cgroup;
>     const char *prev_cgroup_name = prev_cgroup->kn->name;
>
>     if (prev_cgroup->kn->parent) {
>         bpf_trace_printk("sched_switch_ext: nr_running=%d prev_cgroup=%s\\n",
>                          prev->se.cfs_rq->nr_running,
>                          prev_cgroup_name);
>     } else {
>         bpf_trace_printk("sched_switch_ext: nr_running=%d prev_cgroup=/\\n",
>                          prev->se.cfs_rq->nr_running);
>     }
>     return 0;
> };
> ```
>
> You can do something similar by attaching to the sched_switch tracepoint from
> a module and a create a new event to get the nr_running.
>
> Now this is not as accurate as your proposed new tracepoint in terms where you
> sample nr_running, but should be good enough?

The above is after deactivate() and gives an up-to-date count for decrements. Attach something to trace_sched_wakeup() to get the increment update.
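
To spell out why one hook is not enough, here is a rough user-space model of the two sample points. It is only a sketch and not kernel code: struct toy_rq and both toy_*() helpers are made-up names, and the model assumes, as described above, that the sched_switch hook runs after the dequeue and that a wakeup bumps the count without necessarily being followed by a switch.

```c
/* Toy user-space model of the two sample points; all names are hypothetical. */
struct toy_rq {
    unsigned int nr_running;     /* true number of runnable tasks */
    unsigned int seen_by_switch; /* last value sampled by the switch hook */
    unsigned int seen_by_wakeup; /* last value sampled by the wakeup hook */
};

/* A task blocks: it is dequeued first, then the switch hook samples. */
static void toy_block_and_switch(struct toy_rq *rq)
{
    if (rq->nr_running > 0)
        rq->nr_running--;                /* stands in for deactivate() */
    rq->seen_by_switch = rq->nr_running; /* stands in for the sched_switch hook */
}

/* A task wakes: it is enqueued, then only the wakeup hook samples. */
static void toy_wakeup(struct toy_rq *rq)
{
    rq->nr_running++;                    /* stands in for the enqueue on wakeup */
    rq->seen_by_wakeup = rq->nr_running; /* stands in for a sched_wakeup hook */
}
```

Starting from two runnable tasks, one toy_block_and_switch() followed by two toy_wakeup() calls leaves the switch-side sample at 1 while the real count is 3. Only the wakeup-side sample tracks the increments, which is the reason for the second hook.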