Luca Barbieri wrote:
>> Also note that the delayed delete list is not in fence order but in
>> deletion-time order, which perhaps gives room for more optimizations.
>>
> You are right.
> I think then that ttm_bo_delayed_delete may still need to be changed,
> because it stops when ttm_bo_cleanup_refs returns -EBUSY, which
> happens when a fence has not been reached.
> This means that a buffer will need to wait for all previously deleted
> buffers to become unused, even if it is unused itself.
> Is this acceptable?
>
Yes, I think it's acceptable if you view it in the context that the most
important buffer resources (GPU memory space and physical system memory)
are immediately reclaimable through the eviction and swapping mechanisms.

> What if we get rid of the delayed destroy list, and instead append
> buffers to be deleted to their fence object, and delete them when the
> fence is signaled?
>
> This also allows to do it more naturally, since the fence object can
> just keep a normal reference to the buffers it fences, and unreference
> them on expiration.
>
> Then there needs to be no special "delayed destruction" logic, and it
> would work as if the GPU were keeping a reference to the buffer
> itself, using fences as a proxy to have the CPU do that work for the
> GPU.
>
> Then the delayed work is no longer "periodically destroy buffers" but
> rather "periodically check if fences are expired", naturally stopping
> at the first unexpired one.
> Drivers that support IRQs on fences could also do the work in the
> interrupt handler/tasklet instead, avoid the delay jiffies magic
> number. This may need a NAPI-like interrupt mitigation middle layer
> for optimal results though.
>

Yes, I think that this way it should definitely be possible to find a
better solution. One should keep in mind, however, that we'll probably
not be able to destroy buffers from within an atomic context, which means
we have to schedule a workqueue to do that task. We had to do a similar
thing in the Poulsbo driver, and it turned out that we could save a
significant amount of CPU by using a delayed workqueue, collecting objects
and destroying them periodically.

/Thomas
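
For illustration only, the fence-held-reference scheme discussed above could be modelled roughly as follows. This is a minimal user-space sketch in C, not TTM code: every name in it (struct buffer, struct fence, fence_attach, retire_fences, and so on) is hypothetical, locking is omitted, and a buffer is assumed to sit on at most one fence at a time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; these are NOT the TTM structures. */
struct buffer {
    int refcount;
    bool destroyed;         /* flag instead of free(), so callers can inspect it */
    struct buffer *next;    /* link in the owning fence's buffer list */
};

struct fence {
    bool signaled;
    struct buffer *buffers; /* buffers this fence holds a reference on */
    struct fence *next;     /* fences are queued in submission order */
};

struct fence_queue {
    struct fence *head;
    struct fence *tail;
};

/* Drop one reference; the last one "destroys" the buffer. */
static void buffer_unref(struct buffer *bo)
{
    assert(bo->refcount > 0);
    if (--bo->refcount == 0)
        bo->destroyed = true; /* a real driver would release memory here */
}

/* The fence takes its own reference, acting as a proxy for the GPU. */
static void fence_attach(struct fence *f, struct buffer *bo)
{
    bo->refcount++;
    bo->next = f->buffers;
    f->buffers = bo;
}

/* Append a fence to the tail of the submission-ordered queue. */
static void queue_push(struct fence_queue *q, struct fence *f)
{
    f->next = NULL;
    if (q->tail)
        q->tail->next = f;
    else
        q->head = f;
    q->tail = f;
}

/*
 * Periodic worker, meant to run in process context: retire signaled
 * fences from the head of the queue and stop at the first unsignaled
 * one. Returns the number of buffers destroyed in this pass.
 */
static int retire_fences(struct fence_queue *q)
{
    int destroyed = 0;

    while (q->head && q->head->signaled) {
        struct fence *f = q->head;

        q->head = f->next;
        if (!q->head)
            q->tail = NULL;
        f->next = NULL;

        while (f->buffers) {
            struct buffer *bo = f->buffers;

            f->buffers = bo->next;
            bo->next = NULL;
            buffer_unref(bo);
            if (bo->destroyed)
                destroyed++;
        }
    }
    return destroyed;
}
```

In this model, dropping the last user reference never destroys a buffer that is still fenced; destruction happens when retire_fences() reaches the fence. In a kernel driver, retire_fences() would presumably be called from a delayed workqueue item rather than from the interrupt handler, in line with the atomic-context concern above.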