Re: dma_fence: force users to take the lock manually

Boris Brezillon Fri, 06 Mar 2026 03:24:29 -0800

On Fri, 6 Mar 2026 12:03:19 +0100
Christian König <[email protected]> wrote:


> On 3/6/26 11:37, Boris Brezillon wrote:
> > On Fri, 6 Mar 2026 10:58:07 +0100
> > Christian König <[email protected]> wrote:
> >   
> >> On 3/6/26 10:46, Boris Brezillon wrote:  
> >>> On Fri, 6 Mar 2026 09:10:52 +0100
> >>> Christian König <[email protected]> wrote:    
> >>>> Well as I wrote above you either have super reliable locking in
> >>>> your signaling path or you will need that for error handling.    
> >>>
> >>> Not really. With rust's ownership model, you can make it so only
> >>> one thread gets to own the DriverFence (the signal-able fence
> >>> object), and the DriverFence::signal() method consumes this
> >>> object. This implies that only one path gets to signal the
> >>> DriverFence, and after that it vanishes, so no one else can
> >>> signal it anymore. Just to clarify, by vanishes, I mean that the
> >>> signal-able view disappears, but the observable object (Fence)
> >>> can stay around, so it can be monitored (and only monitored) by
> >>> others. With this model, it doesn't matter that _set_error() is
> >>> set under a dma_fence locked section or not, because the
> >>> concurrency is addressed at a higher level.    
> >>
> >> That whole approach won't work. You have at least the IRQ handler
> >> which signals completion and the timeout handler which signals
> >> completion with an error.  
> > 
> > From a pure rust standpoint, and assuming both path (IRQ handler and
> > timeout handler) are written in rust, the compiler won't let you
> > signal concurrently if we design the thing properly, that's what
> > I'm trying to say. Just to be clear, it doesn't mean you can't have
> > one worker (in a workqueue context) that can signal a fence and an
> > IRQ handler that can signal the same fence. It just means that rust
> > won't let you do that unless you have proper locking in place, and
> > rust will also guarantee you won't be able to signal a fence that
> > has already been signaled, because as soon as it's signaled, the
> > signal-able fence should be consumed.  
> 
> Ah got it! I've worked a lot with OCaml in the past which has some
> similarities, but doesn't push things that far.
> 
> >>
> >> We have documented that this handling is mandatory for DMA-fences
> >> since so many driver implementations got it wrong.  
> > 
> > Again, I'm just talking about the rust implementation we're aiming
> > for. If you start mixing C and rust in the same driver, you're back
> > to the original problem you described.  
> 
> The key point is the Rust implementation should not repeat the
> mistakes we made in the C implementation.
> 
> For example blocking that multiple threads can't signal a DMA-fence
> is completely irrelevant.

From a correctness standpoint, I think it's important to ensure no more
than one thread gets to signal the object.

> 
> What we need to guarantee is correct timeout handling and that
> DMA-fence can only signal from something delivered from a HW event,
> e.g. a HW interrupt or interrupt worker or similar.

We've mostly focused on coming up with a solution that would annotate
signaling paths in an automated way, and making sure dma_fence_signal()
is never called outside of a non-annotated path:
- creation of DmaFenceWorkqueue/DmaFence[Delayed]Work that guarantees
  all works are executed in a dma_fence_signalling_{begin,end}()
  section, so we can properly detect deadlocks (through lockdep)
- creation of a DmaFenceIrqHandler for the same reason
- we'll need variants for each new deferred mechanism drivers might
  want to use (kthread_worker?)

But there's currently no restriction on calling dma_fence_signal() in a
user thread context (IOCTL()). I guess that shouldn't be too hard to
add (is_user_task() to the rescue).

> 
> A DMA-fence should *never* signal because of an IOCTL

Okay, that's understandable.

> or because some
> object runs out of scope. E.g. when you cleanup a HW ring buffer, FW
> queue, etc...

We were actually going in the opposite direction:
auto-signal(ECANCELED) on DriverFenceTimeline object destruction (which
is the thing that would be attached to the HW ringbuf. The reason is:
we don't want to leave unsignalled fences behind, and if the HW ring is
gone, there's nothing that can signal it. Mind explaining why you think
this shouldn't be done, because I originally interpreted your
suggestion as exactly the opposite.

Re: dma_fence: force users to take the lock manually

Reply via email to