Hi Derek,

I wish to gladly inform you that I've fixed the bug preventing suspend with 
Tboot.

It was in the piece of code that you've highlighted - the atomic increment on 
ap_wfs_count variable need have been moved into the else clause of the 
following if statement, something like this:

static int tboot_dying_cpu(unsigned int cpu) {
        if (num_online_cpus() == 1) {
                if (tboot_wait_for_aps(atomic_read(&ap_wfs_count)))
                        return -EBUSY;
        } else
                atomic_inc(&ap_wfs_count);
        return 0;
}

Otherwise, when the procedure shutdown all APs, BSP was also counted in as one 
of them, thus creating a mismatch between what kernel had counted and what 
TBOOT reported back to it.
This caused a kernel panic, which froze the machine while putting it into 
suspend.

I've sent a version of TBOOT Live Image equipped with this patch of the kernel 
to our validation teams and I'm awaiting their response after the weekend.
If I get positive results from their tests I will try and have this patch 
attached to the official kernel repository and featured in versions 5.19 and up.

Kind regards,
Paweł





-----Original Message-----
From: Randzio, Pawel 
Sent: Friday, May 13, 2022 12:12 PM
To: 'Derek Dolney' <z...@posteo.net>; Łukasz Hawryłko <luk...@hawrylko.pl>
Cc: tboot-devel@lists.sourceforge.net
Subject: RE: [tboot-devel] suspend problem since kernel 5.15

Hi Derek,

I've also been picking at that issue for the past few months and reached the 
same wall as you have with the -EBUSY callback return, although your message 
kind of gives me a new idea for where to look for the root cause as it seems I 
have not tracked all possible callbacks.
I'm not a kernel developer either and honestly debugging that S3 issue is 
troublesome to me too, to say the least. If anyone on this mailing list has any 
idea how to further the debugging or even better - solve this issue, please 
feel free to share ideas.

On a side note, please add Vincent into the communication, that might speed up 
the process.
Vincent may add others that could know what might be going on with that issue.

Thanks,
Paweł

-----Original Message-----
From: Derek Dolney <z...@posteo.net> 
Sent: Thursday, May 12, 2022 6:20 PM
To: Łukasz Hawryłko <luk...@hawrylko.pl>; tboot-devel@lists.sourceforge.net
Subject: Re: [tboot-devel] suspend problem since kernel 5.15

I have been working on this as best I can. However, I confess that I am not a 
kernel developer and have really no understanding of these tboot internals. 
Nevertheless here is a brief update. Please anyone feel free to share any ideas 
how to move forward to some resolution.

I got a desktop machine with rs232 serial output running tboot and reproduced 
the suspend problem that way and with this setup I can collect kernel printk 
and also cpu hotplug (cpuhp) tracing output. I have also thankfully got quite a 
bit of help from Vincent Donnefort who wrote the cpuhp changes (the commit I 
posted) that have exposed the issue. He has been very helpful, let me try to 
tell you what we have figured out.

On suspend, I get into the tboot callback:

static int tboot_dying_cpu(unsigned int cpu) {
        atomic_inc(&ap_wfs_count);
        if (num_online_cpus() == 1) {
                if (tboot_wait_for_aps(atomic_read(&ap_wfs_count)))
                        return -EBUSY;
        }
        return 0;
}

but the tboot_wait_for_aps times out for me so the callback returns EBUSY. The 
problem with that happening is that there is not a rollback mechanism in place 
at this point in the cpuhp sequence. So I mean from cpuhp point of view, there 
is not even a mechanism to handle the tboot callback failure. Besides that, we 
don't know what could be a sensible thing to do in the case of EBUSY. What does 
it mean tboot is busy and what should be done about it? Please help us to 
understand.


_______________________________________________
tboot-devel mailing list
tboot-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tboot-devel
---------------------------------------------------------------------
Intel Technology Poland sp. z o.o.
ul. Slowackiego 173 | 80-298 Gdansk | Sad Rejonowy Gdansk Polnoc | VII Wydzial 
Gospodarczy Krajowego Rejestru Sadowego - KRS 101882 | NIP 957-07-52-316 | 
Kapital zakladowy 200.000 PLN.
Ta wiadomosc wraz z zalacznikami jest przeznaczona dla okreslonego adresata i 
moze zawierac informacje poufne. W razie przypadkowego otrzymania tej 
wiadomosci, prosimy o powiadomienie nadawcy oraz trwale jej usuniecie; 
jakiekolwiek przegladanie lub rozpowszechnianie jest zabronione.
This e-mail and any attachments may contain confidential material for the sole 
use of the intended recipient(s). If you are not the intended recipient, please 
contact the sender and delete all copies; any review or distribution by others 
is strictly prohibited.



_______________________________________________
tboot-devel mailing list
tboot-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tboot-devel

Reply via email to