On 3/9/26 16:06, Boris Brezillon wrote:
>>>>
>>>> Just do it the other way around, use the dma_fence to wait for the
>>>> HW operation to be completed.
>>>
>>> But in practice you don't just wait for the HW to finish most of the
>>> time. You instruct the HW to stop processing stuff, and then wait
>>> for it to acknowledge that it indeed stopped.
>>
>> And how does the HW acknowledged that it has indeed stopped? Maybe by
>> sending an interrupt which signals a DMA-fence?
>
> Yes, it's likely something like a _STATUS register update reflecting
> the new HW state, plus an interrupt to wake the CPU up. The decision to
> poll the status register or go the async-way is up to the driver.

That's exactly the bad idea we have iterated over so many times. Ideally
such stuff should *not* be up to the driver but enforced by the kernel.

>>
>> The point here is that all acknowledgement from the HW that a DMA
>> operation was indeed stopped, independent if it's the normal
>> operation completed use case or if it's the I have aborted use case,
>> *must* always take the same HW and SW path.
>>
>> It is *not* sufficient that you do something like busy waiting for a
>> bit in a register to flip in the abortion path and for a DMA memory
>> write in the normal completion path.
>
> I'm assuming the DMA_OP_COMPLETE is also a register update of some
> sort. But let's assume it's not, then sure, we need to make sure the
> operation is either complete (event received through the IRQ handler),
> or the DMA engine is fully stopped. Doesn't really matter which path is
> doing this check, as long as it's done.

Well, it is massively important to get that right, otherwise you end up
with random memory corruption. And drivers notoriously get that handling
wrong, resulting in much worse issues than a simple UAF.

>>
>> That's why MMU/VM inside a device is usually not sufficient to
>> prevent freed memory from being written to. You need an IOMMU for
>> that, e.g. close to the CPU/memory and without caches behind the HW
>> path.
>
> Either that, or you need a way to preempt DMA engine operations and
> have them cancelled before they make it to the bus, plus wait for the
> non-cancellable ones. And it doesn't really matter how the HW works,
> because my point is not that we need to enforce how the SW can ensure
> the HW is done processing the stuff (that's very HW specific),

At least for PCIe devices that is pretty standardized. You need
something which is ordered with respect to your DMA transactions, so you
end up with either a write or an interrupt from the HW side.

How it is implemented in the end (32-bit vs 64-bit fences, writes vs
interrupts, etc.) is HW specific, but that is actually only a really
minor part of the handling.

The problem is that what you describe above with "DMA_OP_COMPLETE is
also a register update of some sort" is exactly what we have seen before
as not working, because MMIO register reads from the CPU side are not
necessarily ordered with device writes.

> just
> that there needs to be a way to do this SW <-> HW synchronization, and
> it's the driver responsibility to ensure that ultimately. My second
> point was that, once the HW block is considered idle, there might be
> operations that were never dequeued because they were cancelled before
> the HW got to it, and for those, we'll never get HW events. We just have
> to walk the list and manually signal fences. That's the step I was
> suggesting to automate through the auto-signal-on-drop approach, but we
> can automate it through an explicit DriverFenceTimeline::cancel_all()
> method, I guess.

I strongly suggest at least documenting what is known to work and what
is not.

Regards,
Christian.
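For illustration only, here is a rough userspace model of the explicit
cancel_all() step discussed above. Everything in it is an assumption
made up for this sketch: the Fence and FenceState types, the emit() and
signal_up_to() helpers, and the shape of DriverFenceTimeline itself.
None of it is the kernel's dma_fence API or any proposed Rust
abstraction. It only tries to show the intended split: fences the HW
acknowledged are signaled from one acknowledgement path, and whatever is
still queued once the HW is known to be idle is walked and signaled as
cancelled.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

/// State of a fence in this model (hypothetical, not a kernel type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FenceState {
    Pending,
    Signaled,
    Cancelled,
}

/// Hypothetical stand-in for a driver fence; not the kernel's dma_fence.
#[derive(Debug)]
pub struct Fence {
    pub seqno: u64,
    state: Mutex<FenceState>,
}

impl Fence {
    pub fn state(&self) -> FenceState {
        *self.state.lock().unwrap()
    }

    // A fence leaves Pending exactly once; later attempts are ignored.
    fn complete(&self, new: FenceState) {
        let mut s = self.state.lock().unwrap();
        if *s == FenceState::Pending {
            *s = new;
        }
    }
}

/// Hypothetical timeline: fences are queued in submission order.
pub struct DriverFenceTimeline {
    next_seqno: u64,
    pending: VecDeque<Arc<Fence>>,
}

impl DriverFenceTimeline {
    pub fn new() -> Self {
        Self { next_seqno: 1, pending: VecDeque::new() }
    }

    /// Create a fence for a newly queued operation.
    pub fn emit(&mut self) -> Arc<Fence> {
        let fence = Arc::new(Fence {
            seqno: self.next_seqno,
            state: Mutex::new(FenceState::Pending),
        });
        self.next_seqno += 1;
        self.pending.push_back(Arc::clone(&fence));
        fence
    }

    /// Models the single HW acknowledgement path: signal every fence up
    /// to the last seqno the HW reported. Returns how many were signaled.
    pub fn signal_up_to(&mut self, hw_seqno: u64) -> usize {
        let mut signaled = 0;
        while self.pending.front().map_or(false, |f| f.seqno <= hw_seqno) {
            if let Some(fence) = self.pending.pop_front() {
                fence.complete(FenceState::Signaled);
                signaled += 1;
            }
        }
        signaled
    }

    /// Must only be called once the HW is known to be idle: signals all
    /// fences that will never get a HW event as cancelled.
    pub fn cancel_all(&mut self) -> usize {
        let mut cancelled = 0;
        while let Some(fence) = self.pending.pop_front() {
            fence.complete(FenceState::Cancelled);
            cancelled += 1;
        }
        cancelled
    }
}
```

The model deliberately leaves out the hard part debated in this thread,
namely how the driver establishes that the HW really is idle before
cancel_all() may run.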
