On Thu, Jul 22, 2021 at 2:28 AM Christian König
<ckoenig.leichtzumer...@gmail.com> wrote:
>
> Am 22.07.21 um 11:08 schrieb Daniel Vetter:
> > [SNIP]
> >> As far as I know wake_up_state() tries to run the thread on the CPU it
> >> was scheduled on last, while wait_event_* makes the thread run on the
> >> CPU that issues the wake by default.
> >>
> >> And yes I've also noticed this already and it was one of the reasons why
> >> I suggested to use a wait_queue instead of the hand-wired dma_fence_wait
> >> implementation.
> > The first versions had used wait_queue, but iirc we had some issues with
> > the callbacks and stuff and that was the reason for hand-rolling. Or
> > maybe it was the integration of the lockless fastpath for
> > dma_fence_is_signaled().
> >
> >> [SNIP]
> >> Well it would have been nicer if we used the existing infrastructure
> >> instead of re-inventing stuff for dma_fence, but that chance is long
> >> gone.
> >>
> >> And you don't need a dma_fence_context base class, but rather just a
> >> flag in the dma_fence_ops if you want to change the behavior.
> > If there's something broken we should just fix it, not force everyone to
> > set a random flag. dma_fences work like special wait_queues, so if we
> > differ then we should go back to that.
>
> Wait a second with that, this is not broken. It's just different behavior,
> and there are good arguments for both sides.
>
> (1) If a wait is short, you can have situations where you want to start
> the thread on its original CPU. This is because you can assume that the
> caches on that CPU are still hot, and heating up the caches on the local
> CPU would take longer than an inter-CPU interrupt.
>
> (2) But if the wait is long, it makes more sense to run the thread on the
> CPU where you noticed the wake-up event. This is because you can assume
> that the caches are cold anyway, and starting the thread on the current
> CPU (most likely from an interrupt handler) gives you the absolute best
> latency. In other words, you usually return from the interrupt handler
> and just directly switch to the now-runnable thread.
>
> I'm not sure all drivers want the same behavior. Rob here seems to prefer
> (2), but we have used (1) for dma_fence for a rather long time now, and it
> could be that some people start to complain if we switch unconditionally.
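(For context, the hand-rolled path being discussed is
dma_fence_default_wait() in drivers/dma-buf/dma-fence.c. Roughly, with
locking and some error handling trimmed, it looks like this:

#include <linux/dma-fence.h>
#include <linux/sched.h>
#include <linux/sched/signal.h>

struct default_wait_cb {
	struct dma_fence_cb base;
	struct task_struct *task;
};

static void dma_fence_default_wait_cb(struct dma_fence *fence,
				      struct dma_fence_cb *cb)
{
	struct default_wait_cb *wait =
		container_of(cb, struct default_wait_cb, base);

	/*
	 * Behavior (1): wake_up_state() lets the scheduler put the waiter
	 * back on the CPU it last ran on (cache-hot), instead of pulling
	 * it onto the CPU running this callback.
	 */
	wake_up_state(wait->task, TASK_NORMAL);
}

signed long dma_fence_default_wait(struct dma_fence *fence, bool intr,
				   signed long timeout)
{
	struct default_wait_cb cb = { .task = current };
	signed long ret = timeout ? timeout : 1;

	/*
	 * Hand-rolled wait: hook a private callback into the fence instead
	 * of sleeping on a wait_queue. (Really done under fence->lock.)
	 */
	cb.base.func = dma_fence_default_wait_cb;
	list_add(&cb.base.node, &fence->cb_list);

	while (!dma_fence_is_signaled(fence) && ret > 0) {
		__set_current_state(intr ? TASK_INTERRUPTIBLE
					 : TASK_UNINTERRUPTIBLE);
		ret = schedule_timeout(ret);
		if (ret > 0 && intr && signal_pending(current))
			ret = -ERESTARTSYS;
	}

	if (!list_empty(&cb.base.node))
		list_del(&cb.base.node);
	__set_current_state(TASK_RUNNING);

	return ret;
}

The wake_up_state() in the callback is what gives behavior (1); a
wait_queue based wait would go through the core waitqueue wake functions
instead.)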
Hmm, I wonder if it would make sense to have a dma_fence_wait() flag to
control the behavior, since it is maybe more about the waiter (and perhaps
how long the waiter expects to wait) than about the signaler..
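Just to sketch what I mean (entirely hypothetical, the flag and function
name are made up, and it only illustrates the interface):

#include <linux/bits.h>
#include <linux/dma-fence.h>

/*
 * Hypothetical waiter-side flag: the waiter, not the signaler, picks the
 * wakeup placement, since the waiter knows roughly how long it expects
 * to sleep.
 */
#define DMA_FENCE_WAIT_WAKE_LOCAL	BIT(0)	/* wake on the signaling CPU */

/* Hypothetical variant of the real dma_fence_wait_timeout(). */
signed long dma_fence_wait_timeout_flags(struct dma_fence *fence, bool intr,
					 signed long timeout,
					 unsigned long flags)
{
	/*
	 * A real implementation would stash @flags in the wait callback and
	 * pick between wake_up_state() (cache-hot, case 1) and waking the
	 * thread on the signaling CPU (low latency, case 2). For now just
	 * fall through to the existing behavior.
	 */
	return dma_fence_wait_timeout(fence, intr, timeout);
}

Then a latency-sensitive short wait could pass DMA_FENCE_WAIT_WAKE_LOCAL,
while the long-blocking callers keep the current default.

BR,
-R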