This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

** Tags added: verification-needed-focal

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1920944

Title:
  kvm: properly tear down PV features on hibernate

Status in linux package in Ubuntu:
  Incomplete
Status in linux-aws package in Ubuntu:
  New
Status in linux source package in Focal:
  Fix Committed
Status in linux-aws source package in Focal:
  Fix Committed
Status in linux source package in Groovy:
  Fix Committed
Status in linux-aws source package in Groovy:
  New
Status in linux source package in Hirsute:
  Fix Committed
Status in linux-aws source package in Hirsute:
  New

Bug description:
  [Impact]

  In LP: #1918694 we applied a fix and a workaround to solve the hibernation issues on c5.18xlarge.

  The workaround was in the form of a SAUCE patch:

    "UBUNTU: SAUCE: aws: kvm: double the size of hv_clock_boot"

  It looks like we can replace this workaround with a proper fix, by applying this patch:

    http://next.patchew.org/Linux/20210414123544.1060604-1-vkuzn...@redhat.com/

  This is required because various PV features (Async PF, PV EOI, steal time) work through memory shared with the hypervisor. When we restore from hibernation we must properly tear down all these features, to make sure the hypervisor doesn't write to stale locations after we jump to the previously hibernated kernel.

  For this reason it is safe to apply this patch set to all the generic kernels as well, not just to AWS.

  [Test plan]

  This can be easily tested on AWS (but it should also be reproducible by hibernating any KVM instance with multiple CPUs).
  Create a c5.18xlarge instance, run the memory stress test script (the same test script that we are using to stress test hibernation), trigger the hibernate event, then trigger the resume event. Repeat a couple of times and the problem is very likely to happen.

  [Fix]

  On the AWS kernel replace "UBUNTU: SAUCE: aws: kvm: double the size of hv_clock_boot" with:

    http://next.patchew.org/Linux/20210414123544.1060604-1-vkuzn...@redhat.com/

  For the other kernels, simply apply this patch set.

  The fix has been tested extensively in the AWS infrastructure with positive results.

  [Regression potential]

  The new code introduced by the fix can also be executed when a CPU is put offline, so we may see regressions in KVM CPU hot-plugging.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1920944/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
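
As an illustration of the [Impact] section of the bug description, here is a toy Python model of why PV features have to be torn down before jumping to the hibernated kernel. It is only a sketch and not kernel code: the class ToyHypervisor, the functions, the feature names as strings, and the addresses are all hypothetical and chosen for illustration. It also leaves out the step where the restored kernel re-enables the features at its own addresses.

```python
# Toy model only: not kernel code. All names and addresses are hypothetical.
FEATURES = ("async_pf", "pv_eoi", "steal_time")


class ToyHypervisor:
    """Remembers one guest address per PV feature and writes to it."""

    def __init__(self):
        self.registered = {}

    def enable(self, feature, addr):
        self.registered[feature] = addr

    def disable(self, feature):
        self.registered.pop(feature, None)

    def write_updates(self, memory):
        # The hypervisor writes wherever the guest last told it to.
        for feature, addr in self.registered.items():
            memory[addr] = "hv:" + feature


def resume_from_hibernation(teardown):
    hv = ToyHypervisor()

    # The kernel that boots to perform the restore enables the PV
    # features at addresses inside its own memory.
    boot_addrs = {f: 0x1000 + i for i, f in enumerate(FEATURES)}
    for feature, addr in boot_addrs.items():
        hv.enable(feature, addr)

    # With the fix, every feature is torn down before the jump.
    if teardown:
        for feature in FEATURES:
            hv.disable(feature)

    # After the jump, the same addresses hold the restored kernel's data.
    memory = {addr: "restored-data" for addr in boot_addrs.values()}
    hv.write_updates(memory)
    return memory


def corrupted(memory):
    """Return the addresses the hypervisor overwrote after the jump."""
    return sorted(a for a, v in memory.items() if v != "restored-data")


print(corrupted(resume_from_hibernation(teardown=False)))  # [4096, 4097, 4098]
print(corrupted(resume_from_hibernation(teardown=True)))   # []
```

Without the teardown step the toy hypervisor still holds the stale addresses and overwrites the restored data; with it, nothing is written.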