> Date: Sun, 23 Jul 2023 22:57:14 +0900
> From: PHO <p...@cielonegro.org>
> 
> On 7/23/23 21:24, Taylor R Campbell wrote:
> > Why can't the loop just use dma_fence_get_rcu and dma_fence_put?
> > 
> > Might need to release and reacquire the lock around dma_fence_put.
> 
> I tried but it didn't work, apparently because someone was holding a 
> pointer to a dying fence and waiting for it to be signaled. Using 
> dma_fence_get_rcu() meant we didn't signal such fences, and the kernel 
> hanged and eventually got killed by heartbeat().

What exactly did you try and what was the heartbeat panic stack trace?
Nothing here should wait for signalling in a way that trips the kernel
heartbeat monitor -- that should only happen if you hold onto a spin
lock for too long.

(That said: yay, the heartbeat monitor is catching what would
otherwise be a silent nonresponsive hang!)

Thinking about it some more, I see there is a snag with the loop and
dma_fence_put, since

(a) we can only use list_next(&fence->head) if we read it and then use
it under the lock, but

(b) we can only dma_fence_put outside the lock because the release
callback, vmw_fence_obj_destroy, takes the lock,

so there's no way to do the iteration safely as is with an extra
reference count acquire/release cycle.  Two options come to mind:

1. Move vmw_fence_obj_destroy to an `RCU' callback so that we can
   safely do dma_fence_put under the lock.

2. Create a list on the stack of signalled fences to put in
   __vmw_fences_update and put them after the loop, outside the lock:

        struct list_head to_put = LIST_HEAD_INIT(to_put);
...
        list_for_each_entry_safe(fence, next_fence, &fman->fence_list, head) {
                if (dma_fence_get_rcu(fence) != 0) {            /* (*) */
                        list_del_init(&fence->head);
                        continue;
                }
                if (seqno - fence->base.seqno < VMW_FENCE_WRAP) {
                        list_del_init(&fence->head);
                        list_add(&fence->head, &to_put);        /* (*) */
                        dma_fence_signal_locked(&fence->base);
                        ...
                }
                        break;
        }
...
        if (fence != NULL) {
                spin_unlock(&fman->lock);
                dma_fence_put(fence);
                spin_lock(&fman->lock);
        }
        while ((fence = list_first(&to_put)) != NULL) {
                list_del_init(&fence->head);
                spin_unlock(&fman->lock);
                dma_fence_put(fence);
                spin_lock(&fman->lock);
        }

> > If that doesn't work for some reason and we do have to add a special
> > case, I'd strongly prefer to add a new front end like
> > dma_fence_signal_vmwgfxspecialhack or something (other examples in the
> > tasklet API) that has the relaxed assertion instead of relaxing the
> > assertions for every other driver which isn't bonkers.
> 
> Alright. I'll do that.

This should be a last resort (but not as bad as relaxing the
assertions) if we can't figure out a better way to do it.

I'm not sure the patch above is going to be best (if it works), but
I'd like to see if there's a way to do this without abusing the API
semantics too much before giving in to the abuse.

Reply via email to