On 2/10/26 14:49, Alice Ryhl wrote: > On Tue, Feb 10, 2026 at 02:26:31PM +0100, Boris Brezillon wrote: >> On Tue, 10 Feb 2026 13:15:31 +0000 >> Alice Ryhl <[email protected]> wrote: >> >>> On Tue, Feb 10, 2026 at 01:36:17PM +0100, Boris Brezillon wrote: >>>> On Tue, 10 Feb 2026 10:15:04 +0000 >>>> Alice Ryhl <[email protected]> wrote: >>>> >>>>> impl MustBeSignalled<'_> { >>>>> /// Drivers generally should not use this one. >>>>> fn i_promise_it_will_be_signalled(self) -> WillBeSignalled { ... } >>>>> >>>>> /// One way to ensure the fence has been signalled is to signal it. >>>>> fn signal_fence(self) -> WillBeSignalled { >>>>> self.fence.signal(); >>>>> self.i_promise_it_will_be_signalled() >>>>> } >>>>> >>>>> /// Another way to ensure the fence will be signalled is to spawn a >>>>> /// workqueue item that promises to signal it. >>>>> fn transfer_to_wq( >>>>> self, >>>>> wq: &Workqueue, >>>>> item: impl DmaFenceWorkItem, >>>>> ) -> WillBeSignalled { >>>>> // briefly obtain the lock class of the wq to indicate to >>>>> // lockdep that the signalling path "blocks" on arbitrary jobs >>>>> // from this wq completing >>>>> bindings::lock_acquire(&wq->key); >>>>> bindings::lock_release(&wq->key); >>>>> >>>>> // enqueue the job >>>>> wq.enqueue(item, wq); >>>>> >>>>> // The signature of DmaFenceWorkItem::run() promises to arrange >>>>> // for it to be signalled. >>>>> self.i_promise_it_will_be_signalled() >>>>> } >>>> >>>> I guess what's still missing is some sort of `transfer_to_hw()` >>>> function and way to flag the IRQ handler taking over the fence >>>> signaling token. >>> >>> Yes, transfer to hardware needs to be another piece of logic similar to >>> transfer to wq. And I imagine there are many ways such a transfer to >>> hardware could work. >>> >>> Unless you have a timeout on it, in which case the WillBeSignalled is >>> satisfied by the fact you have a timeout alone, and the signalling that >>> happens from the irq is just an opportunistic signal from outside the >>> dma fence signalling critical path. >> >> Yes and no. If it deadlocks in the completion WorkItem because of >> allocations (or any of the forbidden use cases), I think we want to >> catch that, because that's a sign fences are likely to end up with >> timeouts when they should have otherwise been signaled properly. >> >>> Well ... unless triggering timeouts can block on GFP_KERNEL >>> allocations... >> >> I mean, the timeout handler should also be considered a DMA-signalling >> path, and the same rules should apply to it. > > I guess that's fair. Even with a timeout you want both to be signalling > path. > > I guess more generally, if a fence is signalled by mechanism A or B, > whichever happens first, you have the choice between:
That doesn't happen in practice. For each fence you only have one signaling path you need to guarantee forward progress for. All other signaling paths are just opportunistically optimizations which *can* signal the fence, but there is no guarantee that they will. We used to have some exceptions to that, especially around aborting submissions, but those turned out to be a really bad idea as well. Thinking more about it you should probably enforce that there is only one signaling path for each fence signaling. Regards, Christian. > > 1. A in signalling path, B is not > 2. B in signalling path, A is not > 3. A and B both in signalling path > > But the downside of choosing (1.) or (2.) is that if you declare that > event B is not in the signalling path, then B can kmalloc(GFP_KERNEL), > which may deadlock on itself until event A happens, and if A is a > timeout that could be a long time, so this scenario is undesirable even > if technically it's not a deadlock because it eventually unblocks > itself. > > So we should choose option (3.) and declare that both timeout and hw irq > codepaths are signalling paths. > > Alice
