On Fri, 6 Mar 2026 12:03:19 +0100 Christian König <[email protected]> wrote:
> On 3/6/26 11:37, Boris Brezillon wrote: > > On Fri, 6 Mar 2026 10:58:07 +0100 > > Christian König <[email protected]> wrote: > > > >> On 3/6/26 10:46, Boris Brezillon wrote: > >>> On Fri, 6 Mar 2026 09:10:52 +0100 > >>> Christian König <[email protected]> wrote: > >>>> Well as I wrote above you either have super reliable locking in > >>>> your signaling path or you will need that for error handling. > >>> > >>> Not really. With rust's ownership model, you can make it so only > >>> one thread gets to own the DriverFence (the signal-able fence > >>> object), and the DriverFence::signal() method consumes this > >>> object. This implies that only one path gets to signal the > >>> DriverFence, and after that it vanishes, so no one else can > >>> signal it anymore. Just to clarify, by vanishes, I mean that the > >>> signal-able view disappears, but the observable object (Fence) > >>> can stay around, so it can be monitored (and only monitored) by > >>> others. With this model, it doesn't matter that _set_error() is > >>> set under a dma_fence locked section or not, because the > >>> concurrency is addressed at a higher level. > >> > >> That whole approach won't work. You have at least the IRQ handler > >> which signals completion and the timeout handler which signals > >> completion with an error. > > > > From a pure rust standpoint, and assuming both path (IRQ handler and > > timeout handler) are written in rust, the compiler won't let you > > signal concurrently if we design the thing properly, that's what > > I'm trying to say. Just to be clear, it doesn't mean you can't have > > one worker (in a workqueue context) that can signal a fence and an > > IRQ handler that can signal the same fence. It just means that rust > > won't let you do that unless you have proper locking in place, and > > rust will also guarantee you won't be able to signal a fence that > > has already been signaled, because as soon as it's signaled, the > > signal-able fence should be consumed. > > Ah got it! I've worked a lot with OCaml in the past which has some > similarities, but doesn't push things that far. > > >> > >> We have documented that this handling is mandatory for DMA-fences > >> since so many driver implementations got it wrong. > > > > Again, I'm just talking about the rust implementation we're aiming > > for. If you start mixing C and rust in the same driver, you're back > > to the original problem you described. > > The key point is the Rust implementation should not repeat the > mistakes we made in the C implementation. > > For example blocking that multiple threads can't signal a DMA-fence > is completely irrelevant. From a correctness standpoint, I think it's important to ensure no more than one thread gets to signal the object. > > What we need to guarantee is correct timeout handling and that > DMA-fence can only signal from something delivered from a HW event, > e.g. a HW interrupt or interrupt worker or similar. We've mostly focused on coming up with a solution that would annotate signaling paths in an automated way, and making sure dma_fence_signal() is never called outside of a non-annotated path: - creation of DmaFenceWorkqueue/DmaFence[Delayed]Work that guarantees all works are executed in a dma_fence_signalling_{begin,end}() section, so we can properly detect deadlocks (through lockdep) - creation of a DmaFenceIrqHandler for the same reason - we'll need variants for each new deferred mechanism drivers might want to use (kthread_worker?) But there's currently no restriction on calling dma_fence_signal() in a user thread context (IOCTL()). I guess that shouldn't be too hard to add (is_user_task() to the rescue). > > A DMA-fence should *never* signal because of an IOCTL Okay, that's understandable. > or because some > object runs out of scope. E.g. when you cleanup a HW ring buffer, FW > queue, etc... We were actually going in the opposite direction: auto-signal(ECANCELED) on DriverFenceTimeline object destruction (which is the thing that would be attached to the HW ringbuf. The reason is: we don't want to leave unsignalled fences behind, and if the HW ring is gone, there's nothing that can signal it. Mind explaining why you think this shouldn't be done, because I originally interpreted your suggestion as exactly the opposite.
