On Fri, 6 Mar 2026 13:54:15 +0100 Christian König <[email protected]> wrote:
> On 3/6/26 13:36, Philipp Stanner wrote:
> >>>> (which
> >>>> is the thing that would be attached to the HW ringbuf. The
> >>>> reason is: we don't want to leave unsignalled fences behind,
> >>>>
> >>>
> >>> Not only do we not "want to", we actually *cannot*. We have to
> >>> make sure all fences are signaled because only this way the C
> >>> backend plus RCU can protect also the Rust code against UAF.
> >>>
> >>>> and if the HW ring is
> >>>> gone, there's nothing that can signal it. Mind explaining why
> >>>> you think this shouldn't be done, because I originally
> >>>> interpreted your suggestion as exactly the opposite.
> >>>
> >>> I also don't get it. All fences must always get signaled, that's
> >>> one of the most fundamental fence rules. Thus, if the last
> >>> accessor to a fence drops, you do want to signal it with
> >>> -ECANCELED
> >>
> >> All fences must always signal because the HW operation must always
> >> complete or be terminated by a timeout.
> >>
> >> If a fence signals only because it runs out of scope, then that
> >> means you have a huge potential for data corruption, and that
> >> is even worse than not signaling a fence.
> >>
> >> In other words, not signaling a fence can leave the system in a
> >> deadlock state, but signaling it incorrectly usually results in
> >> random data corruption.
> >
> > It all stands and falls with the question whether a fence can drop
> > by accident in Rust, or if it will only ever drop when the hw-ring
> > is closed.
> >
> > What do you believe is the right thing to do when a driver unloads?
>
> Do a dma_fence_wait() to make sure that all HW operations have
> completed and all fences signaled.
>
> > Ideally we could design it in a way that the driver closes its
> > rings, the pending fences drop and get signaled with ECANCELED.
>
> No, exactly that is a really bad idea.
>
> Just do it the other way around: use the dma_fence to wait for the HW
> operation to be completed.
But in practice you don't just wait for the HW to finish, most of the
time. You instruct the HW to stop processing stuff, and then wait for
it to acknowledge that it indeed stopped. And the HWRing object will
only be dropped when that happens. There's of course a fallback for the
case where the STOP operation fails (with reset, etc.), which would
just delay the drop of the HWRing. But the point is: when the HWRing is
dropped, you should be guaranteed that the HW is no longer executing
any of the possibly pending jobs.

Now, of course you can decide that it's the driver's responsibility to
walk the list of jobs that were still pending after a STOP has been
acked and manually signal all their fences, or you can assume that the
HWRing being dropped is what provides this guarantee. And because the
HWRing embeds your DmaFenceCtx, which auto-signals on drop, you don't
have to do anything.

> Then wait for an RCU grace period to make sure that nobody is still
> inside your DMA fence ops.
>
> And then you can continue with unloading the module.
>
> > Your concern seems to be a driver by accident dropping a fence
> > while the hardware is still processing the associated job.
> >
> > (how's that dangerous, though? Shouldn't parties waiting for the
> > fence detect the error? ECANCELED ⇒ you must not access the
> > associated memory)
>
> The dma_fence is the SW object which represents the HW operation.
>
> And that HW operation is doing DMA, e.g. accessing and potentially
> writing into memory. That's where the name Direct Memory Access comes
> from.
>
> So when that is messed up, the memory which gets written to is
> potentially re-used, with the absolutely dire consequences we have
> seen so many times.

But then, I'd argue that the HWRing (and the underlying FenceCtx
keeping track of fences emitted on this ring) should live at least as
long as the HW is assumed to be running commands.
That's IMHO the guarantee we need to provide: don't drop your HWRing
object until you're sure the HW has stopped pulling stuff from it. I
can think of the following two cases:

1. The HW has a way to prematurely stop what it is currently
   executing. First thing we do is ensure the HW can't pull anything
   new, then we issue a STOP and wait for an ACK. When this ACK is
   received, we proceed with the rest of the cleanup. If the ACK
   doesn't come in time, we flag the HWRing unusable, schedule a
   reset, and wait for this reset to be effective before dropping the
   HWRing.

2. The HW can't stop what it already started. We make sure nothing new
   can be pushed to the HWRing, and we wait for the in-flight ops to
   land. If that takes too long, the timeout handler takes over,
   schedules a reset, and drops the faulty HWRing only after the reset
   is effective.

> Keep in mind that this framework is not only used by GPUs, where at
> least modern ones have VM protection, but also old ones and stuff
> like V4L, where such things are just not present in any way.

I'm not questioning the fact that signaling fences prematurely can lead
to terrible security holes, which are worse than deadlocks. I'm
questioning whether this "don't-signal-before-HW-is-done" guarantee
needs to be provided at the fence level, rather than at the fence
emitter level.
