On 6/20/2019 8:50 AM, Peter Zijlstra wrote:
On Mon, Jun 17, 2019 at 09:41:37PM +0800, Zhang Rui wrote:

After S3 suspend/resume, "perf stat -I 1000 -e power/energy-pkg/ -a"
reports an insane value for the very first sampling period after resume
as shown below.

     19.278989977               2.16 Joules power/energy-pkg/
     20.279373569               1.96 Joules power/energy-pkg/
     21.279765481               2.09 Joules power/energy-pkg/
     22.280305420               2.10 Joules power/energy-pkg/
     25.504782277   4,294,966,686.01 Joules power/energy-pkg/
     26.505114993               3.58 Joules power/energy-pkg/
     27.505471758               1.66 Joules power/energy-pkg/

Fix this by resetting the counter right after resume.

Cute...


+#ifdef CONFIG_PM
+
+static int perf_rapl_suspend(void)
+{
+       int i;
+
+       get_online_cpus();
+       for (i = 0; i < rapl_pmus->maxpkg; i++)
+               rapl_pmu_update_all(rapl_pmus->pmus[i]);
+       put_online_cpus();
+       return 0;
+}
+
+static void perf_rapl_resume(void)
+{
+       int i;
+
+       get_online_cpus();
+       for (i = 0; i < rapl_pmus->maxpkg; i++)
+               rapl_pmu_restart_all(rapl_pmus->pmus[i]);
+       put_online_cpus();
+}

What's the reason for that get/put_online_cpus() here ?


It looks like syscore_* functions are executed with one CPU on-line.
If so, they may not be the right place for the rapl callback.

Rapl is per socket. The driver manipulates the registers on the first CPU of each socket. I think we need to update/restart the counters on all sockets. That's the reason I add get/put_online_cpus() in the original patch.

Besides, I think we also need to call rapl_pmu_restart/update_all() on the target cpu.


Thanks,
Kan

Reply via email to