On Mon, 2025-09-08 at 17:16 +0200, Christian König wrote:
> Back to this topic again :/
>
> On 22.08.25 10:51, Thomas Hellström wrote:
> > > > We would still need some form of refcounting while waiting on
> > > > the struct completion, but if we restricted the TTM refcount
> > > > to *only* be used internally for that sole purpose, and also
> > > > replaced the final ttm_bo_put() with the ttm_bo_finalize()
> > > > that you suggest we wouldn't need to resurrect that refcount
> > > > since it wouldn't drop to zero until the object is ready for
> > > > final free.
> > > >
> > > > Ideas, comments?
> > >
> > > Ideally I think we would use the handle_count as backing store
> > > and the drm_gem_object->refcount as structure reference.
> > >
> > > But that means a massive rework of the GEM handling/drivers/TTM.
> > >
> > > Alternatively we could just grab a reference to an unsignaled
> > > fence when we encounter a dead BO on the LRU.
> > >
> > > What do you think of that idea?
> >
> > I think to be able to *guarantee* exhaustive eviction, we need
> > 1) all unfreed resources to sit on an LRU, and
> > 2) everything on the LRU needs to be able to have something to
> >    wait for.
>
> Yeah, completely agree.
>
> > A fence can't really guarantee 2), but it's close. There is a time
> > interval in between where the last fence signals and we take the
> > resource from the LRU and free it.
>
> Correct, yes.
>
> > A struct completion can be made to signal when the resource is
> > freed.
> > I think the locking restriction in the struct completion case (the
> > struct completion is likely waited for under a dma-resv), is that
> > nothing except the object destructor may take an individualized
> > resv of a zombie gem object whose refcount has gone to zero. The
> > destructor should use an asserted trylock only to make lockdep
> > happy. The struct completion also needs a refcount to avoid
> > destroying it while there are waiters.
>
> Exactly, that's the problem: as far as I can see we can't do that.
>
> See, imported dma_resv objects need to block waiting on the dma_resv
> lock in the destruction path. Otherwise we can't clean up their
> mappings any more.

Ugh, yeah that is a problem. Unless we make an exception for imported
dma-buf and push out the final dma_buf_unmap_attachment() until after
signalling the completion. We're not directly freeing any pages anyway
since that's done when evicting the exporter.

>
> > So what do you think about starting out with a fence, and if / when
> > that appears not to be sufficient, we have a backup plan to move to
> > a struct completion?
>
> Well we need to start somewhere, so grabbing an unsignaled dma_fence
> reference sounds like the best plan for now.

True. A good starting point. Although I have a feeling it will turn out
to be not fully sufficient.

Thanks,
Thomas

>
> Regards,
> Christian.
>
> >
> > Thomas
> >
> > >
> > > Regards,
> > > Christian.
