On 5/6/23 01:27, Michael Brown wrote: > On 05/05/2023 19:56, Laszlo Ersek wrote: >> I don't like the patch. For two reasons: >> >> (1) It papers over the actual issue. The problem should be fixed where >> it is, if possible. > > Agreed, but (as you have shown in > https://bugzilla.redhat.com/show_bug.cgi?id=2189136) the bug lies in > Windows code rather than in EDK2 code. If the goal is to allow these > buggy Windows builds to still be used with OVMF, then the only option is > to paper over the issue. We should do this only if it can be proven > safe to do so, of course. > >> (2) With the patch applied, NestedInterruptRaiseTPL() can return >> TPL_HIGH_LEVEL (as "InterruptedTPL"). Consequently, >> TimerInterruptHandler() [OvmfPkg/LocalApicTimerDxe/LocalApicTimerDxe.c] >> may pass TPL_HIGH_LEVEL back to NestedInterruptRestoreTPL(), as >> "InterruptedTPL". >> >> I believe that this in turn may invalidate at least one comment in >> NestedInterruptRestoreTPL(): >> >> // >> // Call RestoreTPL() to allow event notifications to be >> // dispatched. This will implicitly re-enable interrupts. >> // >> gBS->RestoreTPL (InterruptedTPL); >> >> Restoring TPL_HIGH_LEVEL does not re-enable interrupts -- nominally >> anyways. > > I agree that the comment is invalidated, but as far as I can tell the > logic remains safe. > > I will put together a patch to update the comments in > NestedInterruptTplLib to address the possibility of an interrupt > occurring (illegally) at TPL_HIGH_LEVEL. > >> (a) Make LocalApicTimerDxe Xen-specific again. It's only the OVMF Xen >> platform that really *needs* NestedInterruptTplLib. (Don't get me wrong: >> NestedInterruptTplLib is technically correct in all circumstances, but >> in practice it happens to be too strict.) >> >> (b) For the non-Xen OVMF platforms, re-create a LocalApicTimerDxe >> variant that effectively has commits a086f4a63bc0 and a24fbd606125 >> reverted. (We should keep 9bf473da4c1d.) This returns us to >> pre-239b50a86370 status -- that is, a timer interrupt handler that (a) >> does not try to be smart about nested interrupts, therefore one that is >> much simpler, and (b) is more tolerant of the Windows / cdboot.efi spec >> violation, (c) is vulnerable to the timer interrupt storm seen on Xen, >> but will never run on Xen. (Only the OVMF Xen platform is supposed to be >> launched on Xen.) > > I'm less keen on this because it reduces the runtime exposure of a very > complex piece of code, and will effectively cause that code to become > unmaintained. > > It's also satisfying (to me) that NestedInterruptTplLib provides a > provable upper bound on stack consumption due to interrupts, which can't > be guaranteed by the simpler pre-239b50a86370 scheme. > > Could we defer judgement until after I've fully reasoned through (and > documented) how NestedInterruptTplLib will work in the presence of > interrupts occurring at TPL_HIGH_LEVEL?
Sure, absolutely! As I wrote elsewhere: if you can revalidate the code with a new, less strict set of invariants, and update the comments, I think that would be the perfect workaround. Thank you, Laszlo -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#104251): https://edk2.groups.io/g/devel/message/104251 Mute This Topic: https://groups.io/mt/98656860/21656 Group Owner: devel+ow...@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/leave/9847357/21656/1706620634/xyzzy [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-