On Fri, 2026-03-06 at 12:24 +0100, Boris Brezillon wrote:
> On Fri, 6 Mar 2026 12:03:19 +0100
> Christian König <[email protected]> wrote:
>
> > On 3/6/26 11:37, Boris Brezillon wrote:
> > > On Fri, 6 Mar 2026 10:58:07 +0100
> > > Christian König <[email protected]> wrote:
> > >
> > > > On 3/6/26 10:46, Boris Brezillon wrote:
> > > > > On Fri, 6 Mar 2026 09:10:52 +0100
> > > > > Christian König <[email protected]> wrote:
> > > > > > Well as I wrote above you either have super reliable locking in
> > > > > > your signaling path or you will need that for error handling.
> > > > >
> > > > > Not really. With rust's ownership model, you can make it so only
> > > > > one thread gets to own the DriverFence (the signal-able fence
> > > > > object), and the DriverFence::signal() method consumes this
> > > > > object. This implies that only one path gets to signal the
> > > > > DriverFence, and after that it vanishes, so no one else can
> > > > > signal it anymore. Just to clarify, by vanishes, I mean that the
> > > > > signal-able view disappears, but the observable object (Fence)
> > > > > can stay around, so it can be monitored (and only monitored) by
> > > > > others. With this model, it doesn't matter that _set_error() is
> > > > > set under a dma_fence locked section or not, because the
> > > > > concurrency is addressed at a higher level.
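A minimal sketch of the consume-on-signal model described above (all names are illustrative stand-ins, not the actual kernel bindings): the observable `Fence` can only be queried, while owning the `DriverFence` is the one-shot capability to signal, enforced by `signal(self)` taking ownership.

```rust
use std::sync::{Arc, Mutex};

// Observable view: can be monitored, never signaled.
#[derive(Default)]
pub struct Fence {
    signaled: Mutex<Option<i32>>, // Some(err) once signaled, 0 = success
}

impl Fence {
    pub fn is_signaled(&self) -> bool {
        self.signaled.lock().unwrap().is_some()
    }
}

// Signal-able view: owning it is the capability to signal, exactly once.
pub struct DriverFence {
    fence: Arc<Fence>,
}

impl DriverFence {
    pub fn new() -> (Self, Arc<Fence>) {
        let fence = Arc::new(Fence::default());
        (DriverFence { fence: fence.clone() }, fence)
    }

    // Consumes `self`: after this call the signal-able view is gone, so a
    // second signal attempt is rejected at compile time.
    pub fn signal(self, error: i32) {
        *self.fence.signaled.lock().unwrap() = Some(error);
    }
}
```

After `signal()` the observable `Arc<Fence>` can stay around for monitoring, but no code path can signal it again.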
> > > >
> > > > That whole approach won't work. You have at least the IRQ handler
> > > > which signals completion and the timeout handler which signals
> > > > completion with an error.
> > >
> > > From a pure rust standpoint, and assuming both paths (IRQ handler and
> > > timeout handler) are written in rust, the compiler won't let you
> > > signal concurrently if we design the thing properly, that's what
> > > I'm trying to say. Just to be clear, it doesn't mean you can't have
> > > one worker (in a workqueue context) that can signal a fence and an
> > > IRQ handler that can signal the same fence. It just means that rust
> > > won't let you do that unless you have proper locking in place, and
> > > rust will also guarantee you won't be able to signal a fence that
> > > has already been signaled, because as soon as it's signaled, the
> > > signal-able fence should be consumed.
> >
> > Ah got it! I've worked a lot with OCaml in the past which has some
> > similarities, but doesn't push things that far.
> >
> > > >
> > > > We have documented that this handling is mandatory for DMA-fences
> > > > since so many driver implementations got it wrong.
> > >
> > > Again, I'm just talking about the rust implementation we're aiming
> > > for. If you start mixing C and rust in the same driver, you're back
> > > to the original problem you described.
> >
> > The key point is the Rust implementation should not repeat the
> > mistakes we made in the C implementation.
> >
> > For example, preventing multiple threads from signaling a DMA-fence
> > is completely irrelevant.
>
> From a correctness standpoint, I think it's important to ensure no more
> than one thread gets to signal the object.
If you have two paths that can signal a fence, then in Rust you will
effectively need yet another lock for the fence, and likely some
mechanism for revoking access.
I would at least consider whether it isn't much easier to have the
signalling function ignore multiple signal attempts.
AFAIU, in Rust we originally ended up with signal() consuming the fence
because of the UAF problem with data: T.
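To illustrate the two-path case (hypothetical names throughout): the IRQ path and the timeout path share the one signal-able fence through a mutex-guarded `Option`, and whoever `take()`s it first wins; the loser's attempt is silently ignored, which is the "ignore multiple signal attempts" behaviour.

```rust
use std::sync::{Arc, Mutex};

pub struct DriverFence; // stand-in for the real signal-able fence object

impl DriverFence {
    // Consumes self, matching the consume-on-signal model.
    pub fn signal(self, _error: i32) {}
}

// The "yet another lock": both paths hold a clone of this.
pub type SharedFence = Arc<Mutex<Option<DriverFence>>>;

// Returns true if this caller actually signaled the fence, false if some
// other path already did (the attempt is ignored, not an error).
pub fn try_signal(slot: &SharedFence, error: i32) -> bool {
    match slot.lock().unwrap().take() {
        Some(fence) => {
            fence.signal(error);
            true
        }
        None => false,
    }
}
```

Taking the fence out of the `Option` also acts as the revocation mechanism: once one path has consumed it, the shared slot is empty for everyone else.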
>
> >
> > What we need to guarantee is correct timeout handling and that
> > DMA-fence can only signal from something delivered from a HW event,
> > e.g. a HW interrupt or interrupt worker or similar.
>
> We've mostly focused on coming up with a solution that would annotate
> signaling paths in an automated way, and making sure dma_fence_signal()
> is never called outside of an annotated path:
> - creation of DmaFenceWorkqueue/DmaFence[Delayed]Work that guarantees
> all works are executed in a dma_fence_signalling_{begin,end}()
> section, so we can properly detect deadlocks (through lockdep)
> - creation of a DmaFenceIrqHandler for the same reason
> - we'll need variants for each new deferred mechanism drivers might
> want to use (kthread_worker?)
>
> But there's currently no restriction on calling dma_fence_signal() in a
> user thread context (IOCTL()). I guess that shouldn't be too hard to
> add (is_user_task() to the rescue).
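The annotated-workqueue idea could be mocked in plain Rust roughly as below. This is a userspace sketch with a thread-local flag standing in for the real lockdep annotations (dma_fence_begin_signalling()/dma_fence_end_signalling()); the wrapper names are hypothetical.

```rust
use std::cell::Cell;

thread_local! {
    // Mock of the dma-fence signalling annotation state; in the kernel this
    // is tracked by lockdep, not a boolean.
    static IN_SIGNALLING_SECTION: Cell<bool> = Cell::new(false);
}

pub fn in_signalling_section() -> bool {
    IN_SIGNALLING_SECTION.with(|f| f.get())
}

// Hypothetical DmaFenceWork-style wrapper: every queued closure runs inside
// a signalling section, so the annotation can never be forgotten by the
// driver author.
pub fn run_as_dma_fence_work<F: FnOnce()>(work: F) {
    IN_SIGNALLING_SECTION.with(|f| f.set(true));
    work();
    IN_SIGNALLING_SECTION.with(|f| f.set(false));
}
```

The point is structural: drivers never call begin/end themselves, they only get to run code that the workqueue/IRQ wrapper has already placed inside the section.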
>
> >
> > A DMA-fence should *never* signal because of an IOCTL
>
> Okay, that's understandable.
>
> > or because some
> > object runs out of scope. E.g. when you cleanup a HW ring buffer, FW
> > queue, etc...
>
> We were actually going in the opposite direction:
> auto-signal(ECANCELED) on DriverFenceTimeline object destruction (which
> is the thing that would be attached to the HW ringbuf). The reason is:
> we don't want to leave unsignalled fences behind,
>
Not only do we not "want to", we actually *cannot*. We have to make
sure all fences are signaled, because only this way can the C backend
plus RCU protect the Rust code against UAF as well.
> and if the HW ring is
> gone, there's nothing that can signal it. Mind explaining why you think
> this shouldn't be done? I originally interpreted your suggestion as
> exactly the opposite.
I also don't get it. All fences must always get signaled; that's one of
the most fundamental fence rules. Thus, if the last accessor to a fence
drops, you do want to signal it with -ECANCELED.
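A sketch of that auto-signal-on-teardown rule, with stand-in types (the real ones would be the Rust DMA-fence bindings): dropping the timeline, e.g. on HW ring teardown, signals every still-pending fence with -ECANCELED so no fence is ever left unsignaled.

```rust
use std::sync::{Arc, Mutex};

const ECANCELED: i32 = 125; // Linux errno value

// Minimal observable fence stand-in: Some(err) once signaled.
#[derive(Default)]
pub struct Fence {
    pub error: Mutex<Option<i32>>,
}

// Hypothetical timeline attached to the HW ring buffer.
pub struct DriverFenceTimeline {
    pending: Vec<Arc<Fence>>,
}

impl DriverFenceTimeline {
    pub fn new() -> Self {
        Self { pending: Vec::new() }
    }

    pub fn new_fence(&mut self) -> Arc<Fence> {
        let f = Arc::new(Fence::default());
        self.pending.push(f.clone());
        f
    }
}

// Destruction of the timeline signals everything still pending with
// -ECANCELED, upholding the "all fences must signal" rule.
impl Drop for DriverFenceTimeline {
    fn drop(&mut self) {
        for f in self.pending.drain(..) {
            let mut err = f.error.lock().unwrap();
            if err.is_none() {
                *err = Some(-ECANCELED);
            }
        }
    }
}
```

Fences that the hardware already completed keep their original status; only the ones the dead ring can no longer signal get the -ECANCELED error.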
P.