Hi Derek,

I have also been picking at this issue for the past few months and hit the same wall as you: the -EBUSY return from the callback. Your message does give me a new idea for where to look for the root cause, though, since it seems I have not tracked all of the possible callbacks.

I'm not a kernel developer either, and debugging this S3 issue is troublesome for me too, to say the least. If anyone on this mailing list has an idea how to take the debugging further or, even better, how to solve the issue, please share it.
On a side note, please add Vincent to the conversation; that might speed things up. Vincent may be able to bring in others who know what is going on with this issue.

Thanks,
Paweł

-----Original Message-----
From: Derek Dolney <z...@posteo.net>
Sent: Thursday, May 12, 2022 6:20 PM
To: Łukasz Hawryłko <luk...@hawrylko.pl>; tboot-devel@lists.sourceforge.net
Subject: Re: [tboot-devel] suspend problem since kernel 5.15

I have been working on this as best I can. However, I confess that I am not a kernel developer and have no real understanding of the tboot internals. Nevertheless, here is a brief update. Anyone, please feel free to share ideas on how to move toward a resolution.

I got a desktop machine with RS-232 serial output running tboot and reproduced the suspend problem on it. With this setup I can collect kernel printk output and also CPU hotplug (cpuhp) tracing output. I have also, thankfully, had quite a bit of help from Vincent Donnefort, who wrote the cpuhp changes (the commit I posted) that exposed the issue. He has been very helpful; let me try to tell you what we have figured out.

On suspend, I get into the tboot callback:

    static int tboot_dying_cpu(unsigned int cpu)
    {
        atomic_inc(&ap_wfs_count);
        if (num_online_cpus() == 1) {
            if (tboot_wait_for_aps(atomic_read(&ap_wfs_count)))
                return -EBUSY;
        }
        return 0;
    }

but tboot_wait_for_aps times out for me, so the callback returns -EBUSY. The problem is that there is no rollback mechanism in place at this point in the cpuhp sequence. In other words, from the cpuhp point of view there is no way to handle a failure of the tboot callback at all.

Besides that, we don't know what a sensible reaction to -EBUSY would be. What does it mean that tboot is busy, and what should be done about it? Please help us to understand.
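To make the failure path easier to reason about, here is a minimal userspace sketch of the callback logic quoted above. It is not the kernel code: every `model_*` name, `MODEL_MAX_POLLS`, and the plain variables standing in for `num_online_cpus()` and for the number of APs that tboot sees as parked are assumptions made for illustration, and the real wait helper may behave differently.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
#define MODEL_MAX_POLLS 1000        /* assumed stand-in for the real timeout */

static atomic_int ap_wfs_count;     /* CPUs that have entered the dying callback */
static int model_online_cpus;       /* stand-in for num_online_cpus() */
static int model_parked_aps;        /* stand-in for the count of parked APs */

/* Put the model into a known state before each experiment. */
static void model_reset(int online_cpus, int parked_aps)
{
    atomic_store(&ap_wfs_count, 0);
    model_online_cpus = online_cpus;
    model_parked_aps = parked_aps;
}

/* Poll until the parked count matches; nonzero means the wait timed out. */
static int model_wait_for_aps(int expected)
{
    for (int i = 0; i < MODEL_MAX_POLLS; i++) {
        if (model_parked_aps == expected)
            return 0;
    }
    return 1;
}

/* Same control flow as the quoted tboot_dying_cpu(). */
static int model_dying_cpu(void)
{
    atomic_fetch_add(&ap_wfs_count, 1);
    if (model_online_cpus == 1) {
        if (model_wait_for_aps(atomic_load(&ap_wfs_count)))
            return -EBUSY;
    }
    return 0;
}
```

In this model, `model_reset(1, 0)` followed by `model_dying_cpu()` reproduces the case described above: the last online CPU waits, the parked count never matches, and the callback returns -EBUSY. With more than one CPU still online the wait is skipped and the callback returns 0.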
_______________________________________________
tboot-devel mailing list
tboot-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tboot-devel

---------------------------------------------------------------------
Intel Technology Poland sp. z o.o.
ul. Slowackiego 173 | 80-298 Gdansk | Sad Rejonowy Gdansk Polnoc | VII Wydzial Gospodarczy Krajowego Rejestru Sadowego - KRS 101882 | NIP 957-07-52-316 | Kapital zakladowy 200.000 PLN.

This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). If you are not the intended recipient, please contact the sender and delete all copies; any review or distribution by others is strictly prohibited.