On Fri, 2026-03-06 at 12:24 +0100, Boris Brezillon wrote:
> On Fri, 6 Mar 2026 12:03:19 +0100
> Christian König <[email protected]> wrote:
>
> > On 3/6/26 11:37, Boris Brezillon wrote:
> > > On Fri, 6 Mar 2026 10:58:07 +0100
> > > Christian König <[email protected]> wrote:
> > >
> > > > On 3/6/26 10:46, Boris Brezillon wrote:
> > > > > On Fri, 6 Mar 2026 09:10:52 +0100
> > > > > Christian König <[email protected]> wrote:
> > > > > > Well as I wrote above you either have super reliable locking in
> > > > > > your signaling path or you will need that for error handling.
> > > > >
> > > > > Not really. With rust's ownership model, you can make it so only
> > > > > one thread gets to own the DriverFence (the signal-able fence
> > > > > object), and the DriverFence::signal() method consumes this
> > > > > object. This implies that only one path gets to signal the
> > > > > DriverFence, and after that it vanishes, so no one else can
> > > > > signal it anymore. Just to clarify, by vanishes, I mean that the
> > > > > signal-able view disappears, but the observable object (Fence)
> > > > > can stay around, so it can be monitored (and only monitored) by
> > > > > others. With this model, it doesn't matter that _set_error() is
> > > > > set under a dma_fence locked section or not, because the
> > > > > concurrency is addressed at a higher level.
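A minimal sketch of the consume-on-signal model described above (all names are illustrative stand-ins, not the actual kernel bindings): the observable `Fence` can only be queried, while owning the `DriverFence` is the one-shot capability to signal, enforced by `signal(self)` taking ownership.

```rust
use std::sync::{Arc, Mutex};

// Observable view: can be monitored, never signaled.
#[derive(Default)]
pub struct Fence {
    signaled: Mutex<Option<i32>>, // Some(err) once signaled, 0 = success
}

impl Fence {
    pub fn is_signaled(&self) -> bool {
        self.signaled.lock().unwrap().is_some()
    }
}

// Signal-able view: owning it is the capability to signal, exactly once.
pub struct DriverFence {
    fence: Arc<Fence>,
}

impl DriverFence {
    pub fn new() -> (Self, Arc<Fence>) {
        let fence = Arc::new(Fence::default());
        (DriverFence { fence: fence.clone() }, fence)
    }

    // Consumes `self`: after this call the signal-able view is gone, so a
    // second signal attempt is rejected at compile time.
    pub fn signal(self, error: i32) {
        *self.fence.signaled.lock().unwrap() = Some(error);
    }
}
```

After `signal()` the observable `Arc<Fence>` can stay around for monitoring, but no code path can signal it again.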
> > > >
> > > > That whole approach won't work. You have at least the IRQ handler
> > > > which signals completion and the timeout handler which signals
> > > > completion with an error.
> > >
> > > From a pure rust standpoint, and assuming both paths (IRQ handler and
> > > timeout handler) are written in rust, the compiler won't let you
> > > signal concurrently if we design the thing properly, that's what
> > > I'm trying to say. Just to be clear, it doesn't mean you can't have
> > > one worker (in a workqueue context) that can signal a fence and an
> > > IRQ handler that can signal the same fence. It just means that rust
> > > won't let you do that unless you have proper locking in place, and
> > > rust will also guarantee you won't be able to signal a fence that
> > > has already been signaled, because as soon as it's signaled, the
> > > signal-able fence should be consumed.
> >
> > Ah got it! I've worked a lot with OCaml in the past which has some
> > similarities, but doesn't push things that far.
> >
> > > >
> > > > We have documented that this handling is mandatory for DMA-fences
> > > > since so many driver implementations got it wrong.
> > >
> > > Again, I'm just talking about the rust implementation we're aiming
> > > for. If you start mixing C and rust in the same driver, you're back
> > > to the original problem you described.
> >
> > The key point is the Rust implementation should not repeat the
> > mistakes we made in the C implementation.
> >
> > For example, preventing multiple threads from signaling a DMA-fence
> > is completely irrelevant.
>
> From a correctness standpoint, I think it's important to ensure no more
> than one thread gets to signal the object.
If you have two paths that can signal a fence, then in Rust you will
effectively need yet another lock for the fence, and likely some
mechanism for revoking access.
I would at least consider whether it isn't much easier to have the
signalling function ignore multiple signal attempts.
AFAIU, in Rust we originally ended up with signal() consuming the fence
because of the UAF problem with data: T.
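To illustrate the two-path case (hypothetical names throughout): the IRQ path and the timeout path share the one signal-able fence through a mutex-guarded `Option`, and whoever `take()`s it first wins; the loser's attempt is silently ignored, which is the "ignore multiple signal attempts" behaviour.

```rust
use std::sync::{Arc, Mutex};

pub struct DriverFence; // stand-in for the real signal-able fence object

impl DriverFence {
    // Consumes self, matching the consume-on-signal model.
    pub fn signal(self, _error: i32) {}
}

// The "yet another lock": both paths hold a clone of this.
pub type SharedFence = Arc<Mutex<Option<DriverFence>>>;

// Returns true if this caller actually signaled the fence, false if some
// other path already did (the attempt is ignored, not an error).
pub fn try_signal(slot: &SharedFence, error: i32) -> bool {
    match slot.lock().unwrap().take() {
        Some(fence) => {
            fence.signal(error);
            true
        }
        None => false,
    }
}
```

Taking the fence out of the `Option` also acts as the revocation mechanism: once one path has consumed it, the shared slot is empty for everyone else.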
>
> >
> > What we need to guarantee is correct timeout handling and that
> > DMA-fence can only signal from something delivered from a HW event,
> > e.g. a HW interrupt or interrupt worker or similar.
>
> We've mostly focused on coming up with a solution that would annotate
> signaling paths in an automated way, and making sure dma_fence_signal()
> is never called outside of an annotated path:
> - creation of DmaFenceWorkqueue/DmaFence[Delayed]Work that guarantees
> all works are executed in a dma_fence_signalling_{begin,end}()
> section, so we can properly detect deadlocks (through lockdep)
> - creation of a DmaFenceIrqHandler for the same reason
> - we'll need variants for each new deferred mechanism drivers might
> want to use (kthread_worker?)
>
> But there's currently no restriction on calling dma_fence_signal() in a
> user thread context (IOCTL()). I guess that shouldn't be too hard to
> add (is_user_task() to the rescue).
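The annotated-workqueue idea could be mocked in plain Rust roughly as below. This is a userspace sketch with a thread-local flag standing in for the real lockdep annotations (dma_fence_begin_signalling()/dma_fence_end_signalling()); the wrapper names are hypothetical.

```rust
use std::cell::Cell;

thread_local! {
    // Mock of the dma-fence signalling annotation state; in the kernel this
    // is tracked by lockdep, not a boolean.
    static IN_SIGNALLING_SECTION: Cell<bool> = Cell::new(false);
}

pub fn in_signalling_section() -> bool {
    IN_SIGNALLING_SECTION.with(|f| f.get())
}

// Hypothetical DmaFenceWork-style wrapper: every queued closure runs inside
// a signalling section, so the annotation can never be forgotten by the
// driver author.
pub fn run_as_dma_fence_work<F: FnOnce()>(work: F) {
    IN_SIGNALLING_SECTION.with(|f| f.set(true));
    work();
    IN_SIGNALLING_SECTION.with(|f| f.set(false));
}
```

The point is structural: drivers never call begin/end themselves, they only get to run code that the workqueue/IRQ wrapper has already placed inside the section.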
>
> >
> > A DMA-fence should *never* signal because of an IOCTL
>
> Okay, that's understandable.
>
> > or because some
> > object runs out of scope. E.g. when you cleanup a HW ring buffer, FW
> > queue, etc...
>
> We were actually going in the opposite direction:
> auto-signal(ECANCELED) on DriverFenceTimeline object destruction (which
> is the thing that would be attached to the HW ringbuf). The reason is:
> we don't want to leave unsignalled fences behind,
>
Not only do we not "want to", we actually *cannot*. We have to make
sure all fences are signaled, because only this way can the C backend
plus RCU protect the Rust code against UAF as well.
> and if the HW ring is
> gone, there's nothing that can signal it. Mind explaining why you think
> this shouldn't be done? I originally interpreted your suggestion as
> exactly the opposite.
I also don't get it. All fences must always get signaled; that's one of
the most fundamental fence rules. Thus, if the last accessor to a fence
drops, you do want to signal it with -ECANCELED.
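A sketch of that auto-signal-on-teardown rule, with stand-in types (the real ones would be the Rust DMA-fence bindings): dropping the timeline, e.g. on HW ring teardown, signals every still-pending fence with -ECANCELED so no fence is ever left unsignaled.

```rust
use std::sync::{Arc, Mutex};

const ECANCELED: i32 = 125; // Linux errno value

// Minimal observable fence stand-in: Some(err) once signaled.
#[derive(Default)]
pub struct Fence {
    pub error: Mutex<Option<i32>>,
}

// Hypothetical timeline attached to the HW ring buffer.
pub struct DriverFenceTimeline {
    pending: Vec<Arc<Fence>>,
}

impl DriverFenceTimeline {
    pub fn new() -> Self {
        Self { pending: Vec::new() }
    }

    pub fn new_fence(&mut self) -> Arc<Fence> {
        let f = Arc::new(Fence::default());
        self.pending.push(f.clone());
        f
    }
}

// Destruction of the timeline signals everything still pending with
// -ECANCELED, upholding the "all fences must signal" rule.
impl Drop for DriverFenceTimeline {
    fn drop(&mut self) {
        for f in self.pending.drain(..) {
            let mut err = f.error.lock().unwrap();
            if err.is_none() {
                *err = Some(-ECANCELED);
            }
        }
    }
}
```

Fences that the hardware already completed keep their original status; only the ones the dead ring can no longer signal get the -ECANCELED error.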
P.