On Mon, Feb 6, 2017 at 3:27 PM, Luck, Tony <tony.l...@intel.com> wrote:
>> cgroup mode gives a per-CPU breakdown of event and running time, the
>> tool aggregates it into running time vs event count. Both per-cpu
>> breakdown and the aggregate are useful.
>>
>> Piggy-backing on perf's cgroup mode would give us all the above for free.
>
> Do you have some sample output from a perf run on a cgroup measuring a
> "normal" event showing what you get?
# perf stat -I 1000 -e cycles -a -C 0-1 -A -x, -G /
     1.000116648,CPU0,20677864,,cycles,/
     1.000169948,CPU1,24760887,,cycles,/
     2.000453849,CPU0,36120862,,cycles,/
     2.000480259,CPU1,12535575,,cycles,/
     3.000664762,CPU0,7564504,,cycles,/
     3.000692552,CPU1,7307480,,cycles,/

>
> I think that requires that we still go through perf ->start() and ->stop()
> functions to know how much time we spent running. I thought we were looking
> at bundling the RMID updates into the same spot in sched() where we switch
> the CLOSID. More or less at the "start" point, but there is no "stop". If we
> are switching between runnable processes, it amounts to pretty much the same
> thing ... except we bill to someone all the time instead of having a gap in
> the context switch where we stopped billing to the old task and haven't
> started billing to the new one yet.

Another problem is that it would require a perf event to be active all the
time for the timing measurements to be consistent with the RMID measurements.
The only sane option I can come up with is to do timing in RDT the way perf
cgroup does it: keep a per-cpu time that is advanced by the local clock's
delta. A reader can then add up the times for all CPUs in cpu_mask.
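To make that concrete, here is a rough user-space model of the idea (a sketch
only; the struct and function names are made up, and the real accounting would
hang off the sched switch path rather than being called by hand):

/*
 * Model of the per-cpu timing idea: each CPU keeps a running-time
 * accumulator plus the timestamp of its last update (similar in spirit
 * to perf_cgroup_info).  The switch path folds the local clock delta
 * into the accumulator while the monitored group is on the CPU (the
 * on/off-cpu gating is omitted here for brevity); a reader sums the
 * accumulators for all CPUs in the mask.
 */
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define NR_CPUS 4

struct rdt_time_info {
	uint64_t time;		/* accumulated "running" time, ns */
	uint64_t timestamp;	/* local clock at last update, ns */
};

static struct rdt_time_info cpu_time[NR_CPUS];

static uint64_t local_clock_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
}

/* Would be called from the per-cpu switch path. */
static void rdt_update_time(int cpu)
{
	uint64_t now = local_clock_ns();

	cpu_time[cpu].time += now - cpu_time[cpu].timestamp;
	cpu_time[cpu].timestamp = now;
}

/* Reader: sum the per-cpu accumulators for the CPUs in the mask. */
static uint64_t rdt_read_time(uint64_t cpu_mask)
{
	uint64_t total = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_mask & (1ull << cpu))
			total += cpu_time[cpu].time;
	return total;
}

int main(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		cpu_time[cpu].timestamp = local_clock_ns();

	/* Pretend CPUs 0 and 1 each took a switch after doing some work. */
	rdt_update_time(0);
	rdt_update_time(1);

	printf("total time on CPUs 0-1: %llu ns\n",
	       (unsigned long long)rdt_read_time(0x3));
	return 0;
}

The reader side never needs a ->stop(); it just sums whatever the per-cpu
accumulators hold at read time.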