** Description changed:

- When thermald updates /sys/devices/virtual/powercap/intel-rapl/intel-
- rapl:0/constraint_0_power_limit_uw the kernel is emitting the following
- message:
+ [SRU Justification][Trusty][Wily]
+
+ thermald is triggering the kernel to SPAM the kernel log with frequent "package locked by BIOS, monitoring only" messages.
+
+ [Fix]
+ This issue is fixed with the following upstream commits:
+
+ f1a77c5f3b936ba8a7a63d587a803641974f8e62 ("thd_cdev_rapl: stop writing
+ to sysfs if the write fails (LP: #1543046)")
+
+ 833245725494eb26a1c61ca6f1a9db90599ae71b ("Initialize bios_locked to
+ false")
+
+ These two fixes have been shown to work on Xenial and apply cleanly to
+ Trusty and Wily versions of thermald. The risk of regression is low
+ since these fixes add extra sanity checking to the code rather than
+ completely new functionality, plus they are upstream commits that have
+ been available in Xenial for some time now.
+
+ [Testcase]
+ Run thermald on a system that exposes /sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/constraint_0_power_limit_uw and where the BIOS has this feature locked; the kernel emits this message every time thermald accesses this /sys file.
+
+ With the fix, this message only appears once, and no more spamming
+ occurs thereafter.
+
+ [Regression Potential]
+ Minimal. The fixes are upstream and have been tested in Xenial for quite a while. The fixes patch cleanly to Trusty and Wily and result in the same upstream code, so the code paths are identical to that of Xenial's thermald.
+
+ ----------------------------------------
+
+
+ When thermald updates /sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/constraint_0_power_limit_uw the kernel is emitting the following message:

  [38458.753468] powercap intel-rapl:0: package locked by BIOS, monitoring only
  [38637.993447] powercap intel-rapl:0: package locked by BIOS, monitoring only
  [38674.154336] powercap intel-rapl:0: package locked by BIOS, monitoring only
  [38691.500619] powercap intel-rapl:0: package locked by BIOS, monitoring only

  This message comes from set_power_limit() in
  drivers/powercap/intel_rapl.c because the domain is locked by the BIOS.

  Writing to this interface fails with an error:

  open("/sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/constraint_0_power_limit_uw", O_WRONLY) = 3
  write(3, "35000000", 8) = -1 ENODATA (No data available)

  so in theory thermald should be seeing this failed write and handling
  it appropriately.

  cthd_sysfs_cdev_rapl::set_curr_state() and
  cthd_sysfs_cdev_rapl::set_curr_state_raw() in src/thd_cdev_rapl.cpp
  perform the update and they do check whether the sysfs write fails:

-         if (cdev_sysfs.write(tc_state_dev.str(), state_str.str()) < 0)
-                 curr_state = (state == 0) ? 0 : max_state;
+         if (cdev_sysfs.write(tc_state_dev.str(), state_str.str()) < 0)
+                 curr_state = (state == 0) ? 0 : max_state;

  however, I believe they should check errno for the failed write and
  disable the rapl interface if we get -ENODATA on this interface to avoid
  repeated failures and hence repeated spamming of kernel messages
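
The behaviour suggested above (notice the ENODATA failure once, then stop writing to the locked interface) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the upstream patch: RaplLimitWriter, WriteFn and set_power_limit_uw() are hypothetical names, and the write callback is assumed to return 0 on success or a negative errno value on failure. Only the bios_locked flag name is borrowed from the commit title quoted above.

```cpp
#include <cerrno>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for a sysfs write helper (not thermald's real API).
// Assumed contract: returns 0 on success, or a negative errno value on failure.
using WriteFn =
    std::function<int(const std::string &path, const std::string &value)>;

// Hypothetical wrapper that remembers when the BIOS has locked the RAPL
// domain, so that the kernel is not asked (and does not log) again.
class RaplLimitWriter {
public:
    RaplLimitWriter(std::string path, WriteFn write_fn)
        : path_(std::move(path)), write_fn_(std::move(write_fn)) {}

    // Returns true if the limit was written. After the first -ENODATA
    // failure the domain is treated as locked and no more writes are tried.
    bool set_power_limit_uw(unsigned long long limit_uw) {
        if (bios_locked_)
            return false;  // already known to be locked, skip the write

        const int ret = write_fn_(path_, std::to_string(limit_uw));
        if (ret == -ENODATA) {
            bios_locked_ = true;  // remember the lock; stop writing from now on
            return false;
        }
        return ret >= 0;
    }

    bool bios_locked() const { return bios_locked_; }

private:
    std::string path_;
    WriteFn write_fn_;
    bool bios_locked_ = false;  // explicitly initialised to false
};
```

With a wrapper along these lines, a caller that keeps requesting the same limit on a BIOS-locked package would reach the kernel only once, so "package locked by BIOS, monitoring only" would be expected to appear a single time instead of on every update.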
** Description changed:

  [SRU Justification][Trusty][Wily]

- thermald is triggering the kernel to SPAM the kernel log with frequent "package locked by BIOS, monitoring only" messages.
-
+ thermald is triggering the kernel to SPAM the kernel log with frequent
+ "package locked by BIOS, monitoring only" messages.
+
  [Fix]
  This issue is fixed with the following upstream commits:

  f1a77c5f3b936ba8a7a63d587a803641974f8e62 ("thd_cdev_rapl: stop writing
  to sysfs if the write fails (LP: #1543046)")

  833245725494eb26a1c61ca6f1a9db90599ae71b ("Initialize bios_locked to
  false")

  These two fixes have been shown to work on Xenial and apply cleanly to
  Trusty and Wily versions of thermald. The risk of regression is low
  since these fixes add extra sanity checking to the code rather than
  completely new functionality, plus they are upstream commits that have
  been available in Xenial for some time now.

  [Testcase]
  Run thermald on a system that exposes /sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/constraint_0_power_limit_uw and where the BIOS has this feature locked; the kernel emits this message every time thermald accesses this /sys file.

  With the fix, this message only appears once, and no more spamming
  occurs thereafter.

  [Regression Potential]
  Minimal. The fixes are upstream and have been tested in Xenial for quite a while. The fixes patch cleanly to Trusty and Wily and result in the same upstream code, so the code paths are identical to that of Xenial's thermald.

  ----------------------------------------

-
- When thermald updates /sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/constraint_0_power_limit_uw the kernel is emitting the following message:
+ When thermald updates /sys/devices/virtual/powercap/intel-rapl/intel-
+ rapl:0/constraint_0_power_limit_uw the kernel is emitting the following
+ message:

  [38458.753468] powercap intel-rapl:0: package locked by BIOS, monitoring only
  [38637.993447] powercap intel-rapl:0: package locked by BIOS, monitoring only
  [38674.154336] powercap intel-rapl:0: package locked by BIOS, monitoring only
  [38691.500619] powercap intel-rapl:0: package locked by BIOS, monitoring only

  This message comes from set_power_limit() in
  drivers/powercap/intel_rapl.c because the domain is locked by the BIOS.

  Writing to this interface fails with an error:

  open("/sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/constraint_0_power_limit_uw", O_WRONLY) = 3
  write(3, "35000000", 8) = -1 ENODATA (No data available)

  so in theory thermald should be seeing this failed write and handling
  it appropriately.

  cthd_sysfs_cdev_rapl::set_curr_state() and
  cthd_sysfs_cdev_rapl::set_curr_state_raw() in src/thd_cdev_rapl.cpp
  perform the update and they do check whether the sysfs write fails:

          if (cdev_sysfs.write(tc_state_dev.str(), state_str.str()) < 0)
                  curr_state = (state == 0) ? 0 : max_state;

  however, I believe they should check errno for the failed write and
  disable the rapl interface if we get -ENODATA on this interface to avoid
  repeated failures and hence repeated spamming of kernel messages

--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1543046

Title:
  thermald spamming kernel log when updating powercap RAPL powerlimit

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/thermald/+bug/1543046/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs