Re: [PATCH v6 06/18] drm/virtio: remove ttm calls from in virtio_gpu_object_{reserve, unreserve}
On Fri, Jul 5, 2019 at 1:53 AM Gerd Hoffmann wrote:
>
> On Thu, Jul 04, 2019 at 12:17:48PM -0700, Chia-I Wu wrote:
> > On Thu, Jul 4, 2019 at 4:10 AM Gerd Hoffmann wrote:
> > >
> > > Hi,
> > >
> > > > > - r = ttm_bo_reserve(&bo->tbo, true, false, NULL);
> > > > > + r = reservation_object_lock_interruptible(bo->gem_base.resv, NULL);
> > > > Can you elaborate a bit about how TTM keeps the BOs alive in, for
> > > > example, virtio_gpu_transfer_from_host_ioctl? In that function, only
> > > > three TTM functions are called: ttm_bo_reserve, ttm_bo_validate, and
> > > > ttm_bo_unreserve. I am curious how they keep the BO alive.
> > >
> > > It can't go away between reserve and unreserve, and I think it also
> > > can't be evicted then. Haven't checked how ttm implements that.
> > Hm, but the vbuf using the BO outlives the reserve/unreserve section.
> > The NO_EVICT flag applies only when the BO is still alive. Someone
> > needs to hold a reference to the BO to keep it alive, otherwise the BO
> > can go away before the vbuf is retired.
>
> Note that patches 14+15 rework virtio_gpu_transfer_*_ioctl to keep
> gem reference until the command is finished and patch 17 drops
> virtio_gpu_object_{reserve,unreserve} altogether.
>
> Maybe I should try to reorder the series, then squash 6+17 to reduce
> confusion. I suspect that'll cause quite a few conflicts though ...

This may be well-known and is what you meant by "the fence keeps the
bo alive", but I finally realized that ttm_bo_put delays the deletion
of a BO when it is busy.

In the current design, vbuf does not hold references to its BOs. Nor
do fences. It is possible for a BO to lose all its references and get
virtio_gpu_gem_free_object()ed while it is still busy.

The key is ttm_bo_put. ttm_bo_put calls ttm_bo_cleanup_refs_or_queue
to decide whether to delete the BO immediately (when the BO is already
idle) or to queue the BO to a delayed delete list (when the BO is
still busy). If a BO is queued to the delayed delete list,
ttm_bo_delayed_delete is called every 10ms (HZ/100 to be exact) to
scan through the list and delete idle BOs.

I wrote a simple test (attached) and added a bunch of printk's to
confirm this.

Anyway, I believe the culprit is patch 11, when we switch from
ttm_bo_put to drm_gem_shmem_free_object to free a BO whose last
reference is gone. The deletion becomes immediate after the switch.
We need to fix vbuf to refcount its BOs before we can do the switch.

>
> cheers,
> Gerd
>

/* gcc -std=c11 -D_GNU_SOURCE -o virtio-gpu-bo virtio-gpu-bo.c */

/* headers needed by the code below; the <drm/drm.h> path is an assumption */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <drm/drm.h>

#define PIPE_BUFFER 0
#define VIRGL_FORMAT_R8_UNORM 64
#define VIRGL_BIND_CONSTANT_BUFFER (1 << 6)

#define DRM_VIRTGPU_RESOURCE_CREATE 0x04
#define DRM_IOCTL_VIRTGPU_RESOURCE_CREATE \
        DRM_IOWR(DRM_COMMAND_BASE + DRM_VIRTGPU_RESOURCE_CREATE, \
                 struct drm_virtgpu_resource_create)
struct drm_virtgpu_resource_create {
        uint32_t target;
        uint32_t format;
        uint32_t bind;
        uint32_t width;
        uint32_t height;
        uint32_t depth;
        uint32_t array_size;
        uint32_t last_level;
        uint32_t nr_samples;
        uint32_t flags;
        uint32_t bo_handle;
        uint32_t res_handle;
        uint32_t size;
        uint32_t stride;
};

struct drm_virtgpu_3d_box {
        uint32_t x, y, z;
        uint32_t w, h, d;
};

#define DRM_VIRTGPU_TRANSFER_TO_HOST 0x07
#define DRM_IOCTL_VIRTGPU_TRANSFER_TO_HOST \
        DRM_IOWR(DRM_COMMAND_BASE + DRM_VIRTGPU_TRANSFER_TO_HOST, \
                 struct drm_virtgpu_3d_transfer_to_host)
struct drm_virtgpu_3d_transfer_to_host {
        uint32_t bo_handle;
        struct drm_virtgpu_3d_box box;
        uint32_t level;
        uint32_t offset;
};

/* create a 1-D buffer resource of `size` bytes and return its GEM handle */
static uint32_t
buffer_create(int fd, uint32_t size)
{
        struct drm_virtgpu_resource_create args = {
                .target = PIPE_BUFFER,
                .format = VIRGL_FORMAT_R8_UNORM,
                .bind = VIRGL_BIND_CONSTANT_BUFFER,
                .width = size,
                .height = 1,
                .depth = 1,
                .array_size = 1,
                .nr_samples = 1,
        };
        int ret = ioctl(fd, DRM_IOCTL_VIRTGPU_RESOURCE_CREATE, &args);
        assert(!ret);
        return args.bo_handle;
}

/* drop the userspace handle, which is the BO's last reference here */
static void
buffer_close(int fd, uint32_t bo)
{
        struct drm_gem_close args = {
                .handle = bo,
        };
        int ret = ioctl(fd, DRM_IOCTL_GEM_CLOSE, &args);
        assert(!ret);
}

/* queue a transfer of the whole buffer to the host */
static void
transfer_to_host(int fd, uint32_t bo, uint32_t size)
{
        struct drm_virtgpu_3d_transfer_to_host args = {
                .bo_handle = bo,
                .box.w = size,
                .box.h = 1,
                .box.d = 1,
        };
        int ret = ioctl(fd, DRM_IOCTL_VIRTGPU_TRANSFER_TO_HOST, &args);
        assert(!ret);
}

int
main(void)
{
        const uint32_t size = 1 * 1024 * 1024;
        int fd = open("/dev/dri/renderD128", O_RDWR);
        assert(fd >= 0);

        uint32_t bo = buffer_create(fd, size);

        printf("transfer and close the BO immediately...\n");
        transfer_to_host(fd, bo, size);
        buffer_close(fd, bo);

        printf("wait for 1
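
The delayed-delete behaviour described in the message above can be pictured with a small userspace model. This is only a sketch under stated assumptions: every name in it (model_bo, model_bo_put, model_delayed_delete) is hypothetical and none of it is TTM code. It only mirrors the decision the message attributes to ttm_bo_put: free an idle BO on the last put, park a busy one on a list that a periodic worker scans.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are TTM symbols. */
struct model_bo {
    int refcount;            /* references held by handles, ioctls, ... */
    bool busy;               /* true while the host still uses the BO */
    bool freed;              /* set once the backing storage is released */
    struct model_bo *next;   /* link in the delayed-delete list */
};

static struct model_bo *delayed_list;

/* Drop one reference. On the last put, free an idle BO immediately
 * and park a busy BO on the delayed-delete list instead. */
static void model_bo_put(struct model_bo *bo)
{
    assert(bo->refcount > 0);
    if (--bo->refcount > 0)
        return;
    if (!bo->busy) {
        bo->freed = true;
        return;
    }
    bo->next = delayed_list;
    delayed_list = bo;
}

/* Periodic worker: free every parked BO that has gone idle.
 * Returns the number of BOs freed in this pass. */
static int model_delayed_delete(void)
{
    int freed = 0;
    struct model_bo **link = &delayed_list;

    while (*link) {
        struct model_bo *bo = *link;

        if (bo->busy) {
            link = &bo->next;   /* still in use, look at it next pass */
            continue;
        }
        *link = bo->next;       /* unlink, then release */
        bo->next = NULL;
        bo->freed = true;
        freed++;
    }
    return freed;
}
```

In this model a busy BO whose last reference is dropped survives until a later model_delayed_delete() pass finds it idle. Replacing model_bo_put with an unconditional free loses that grace period, which is the concern raised about patch 11.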
Re: [PATCH v6 06/18] drm/virtio: remove ttm calls from in virtio_gpu_object_{reserve,unreserve}
On Thu, Jul 04, 2019 at 12:17:48PM -0700, Chia-I Wu wrote:
> On Thu, Jul 4, 2019 at 4:10 AM Gerd Hoffmann wrote:
> >
> > Hi,
> >
> > > > - r = ttm_bo_reserve(&bo->tbo, true, false, NULL);
> > > > + r = reservation_object_lock_interruptible(bo->gem_base.resv, NULL);
> > > Can you elaborate a bit about how TTM keeps the BOs alive in, for
> > > example, virtio_gpu_transfer_from_host_ioctl? In that function, only
> > > three TTM functions are called: ttm_bo_reserve, ttm_bo_validate, and
> > > ttm_bo_unreserve. I am curious how they keep the BO alive.
> >
> > It can't go away between reserve and unreserve, and I think it also
> > can't be evicted then. Haven't checked how ttm implements that.
> Hm, but the vbuf using the BO outlives the reserve/unreserve section.
> The NO_EVICT flag applies only when the BO is still alive. Someone
> needs to hold a reference to the BO to keep it alive, otherwise the BO
> can go away before the vbuf is retired.

Note that patches 14+15 rework virtio_gpu_transfer_*_ioctl to keep
gem reference until the command is finished and patch 17 drops
virtio_gpu_object_{reserve,unreserve} altogether.

Maybe I should try to reorder the series, then squash 6+17 to reduce
confusion. I suspect that'll cause quite a few conflicts though ...

cheers,
  Gerd

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v6 06/18] drm/virtio: remove ttm calls from in virtio_gpu_object_{reserve, unreserve}
On Thu, Jul 4, 2019 at 4:10 AM Gerd Hoffmann wrote:
>
> Hi,
>
> > > - r = ttm_bo_reserve(&bo->tbo, true, false, NULL);
> > > + r = reservation_object_lock_interruptible(bo->gem_base.resv, NULL);
> > Can you elaborate a bit about how TTM keeps the BOs alive in, for
> > example, virtio_gpu_transfer_from_host_ioctl? In that function, only
> > three TTM functions are called: ttm_bo_reserve, ttm_bo_validate, and
> > ttm_bo_unreserve. I am curious how they keep the BO alive.
>
> It can't go away between reserve and unreserve, and I think it also
> can't be evicted then. Haven't checked how ttm implements that.

Hm, but the vbuf using the BO outlives the reserve/unreserve section.
The NO_EVICT flag applies only when the BO is still alive. Someone
needs to hold a reference to the BO to keep it alive, otherwise the BO
can go away before the vbuf is retired.

I could be wrong, but on the other hand, it seems fine for a BO to go
away before the vbuf using it is retired. When that happens, the
driver emits a RESOURCE_UNREF vbuf, which is queued *after* the
original vbuf.

>
> cheers,
> Gerd
>
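
To make the "someone needs to hold a reference" option concrete, here is a minimal sketch in which the queued command itself takes a reference and drops it on retirement. All names (sketch_bo, sketch_vbuf, sketch_vbuf_queue, sketch_vbuf_retire) are hypothetical and are not virtio-gpu driver symbols. The free in sketch_bo_put is immediate, i.e. it assumes no delayed-delete safety net.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; these are not virtio-gpu driver symbols. */
struct sketch_bo {
    int refcount;
    bool freed;
};

struct sketch_vbuf {
    struct sketch_bo *bo;   /* BO referenced by the queued command */
};

static void sketch_bo_get(struct sketch_bo *bo)
{
    bo->refcount++;
}

static void sketch_bo_put(struct sketch_bo *bo)
{
    assert(bo->refcount > 0);
    if (--bo->refcount == 0)
        bo->freed = true;   /* immediate free, no delayed delete */
}

/* Queue a command: take a reference on behalf of the vbuf. */
static void sketch_vbuf_queue(struct sketch_vbuf *vbuf, struct sketch_bo *bo)
{
    sketch_bo_get(bo);
    vbuf->bo = bo;
}

/* The host finished the command: drop the vbuf's reference. */
static void sketch_vbuf_retire(struct sketch_vbuf *vbuf)
{
    sketch_bo_put(vbuf->bo);
    vbuf->bo = NULL;
}
```

With this shape, userspace closing its handle right after queuing a transfer no longer frees the BO. The free happens in sketch_vbuf_retire once the command completes.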
Re: [PATCH v6 06/18] drm/virtio: remove ttm calls from in virtio_gpu_object_{reserve,unreserve}
  Hi,

> > - r = ttm_bo_reserve(&bo->tbo, true, false, NULL);
> > + r = reservation_object_lock_interruptible(bo->gem_base.resv, NULL);
> Can you elaborate a bit about how TTM keeps the BOs alive in, for
> example, virtio_gpu_transfer_from_host_ioctl? In that function, only
> three TTM functions are called: ttm_bo_reserve, ttm_bo_validate, and
> ttm_bo_unreserve. I am curious how they keep the BO alive.

It can't go away between reserve and unreserve, and I think it also
can't be evicted then. Haven't checked how ttm implements that.

cheers,
  Gerd
Re: [PATCH v6 06/18] drm/virtio: remove ttm calls from in virtio_gpu_object_{reserve, unreserve}
On Tue, Jul 2, 2019 at 7:19 AM Gerd Hoffmann wrote:
>
> Call reservation_object_* directly instead
> of using ttm_bo_{reserve,unreserve}.
>
> v4: check for EINTR only.
> v3: check for EINTR too.
>
> Signed-off-by: Gerd Hoffmann
> Reviewed-by: Daniel Vetter
> ---
>  drivers/gpu/drm/virtio/virtgpu_drv.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h
> index 06cc0e961df6..07f6001ea91e 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_drv.h
> +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
> @@ -402,9 +402,9 @@ static inline int virtio_gpu_object_reserve(struct virtio_gpu_object *bo)
>  {
>         int r;
>
> -       r = ttm_bo_reserve(&bo->tbo, true, false, NULL);
> +       r = reservation_object_lock_interruptible(bo->gem_base.resv, NULL);

Can you elaborate a bit about how TTM keeps the BOs alive in, for
example, virtio_gpu_transfer_from_host_ioctl? In that function, only
three TTM functions are called: ttm_bo_reserve, ttm_bo_validate, and
ttm_bo_unreserve. I am curious how they keep the BO alive.

>         if (unlikely(r != 0)) {
> -               if (r != -ERESTARTSYS) {
> +               if (r != -EINTR) {
>                         struct virtio_gpu_device *qdev =
>                                 bo->gem_base.dev->dev_private;
>                         dev_err(qdev->dev, "%p reserve failed\n", bo);
> @@ -416,7 +416,7 @@ static inline int virtio_gpu_object_reserve(struct virtio_gpu_object *bo)
>
>  static inline void virtio_gpu_object_unreserve(struct virtio_gpu_object *bo)
>  {
> -       ttm_bo_unreserve(&bo->tbo);
> +       reservation_object_unlock(bo->gem_base.resv);
>  }
>
>  /* virgl debufs */
> --
> 2.18.1
>
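
The error handling in the quoted hunk (log a failure unless the lock attempt was merely interrupted) can be sketched in plain userspace C. This is an illustration only. sketch_reserve, sketch_lock_fn, sketch_error_logs and the three lock_* stubs are hypothetical names, the stubs stand in for an interruptible lock that returns 0 or a negative errno, and returning r to the caller is an assumption about the part of the function the hunk does not show.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for an interruptible lock; returns 0 or -errno. */
typedef int (*sketch_lock_fn)(void *ctx);

static int sketch_error_logs;   /* counts messages that were logged */

/* Mirror the shape of the patched helper: pass the error through,
 * but stay quiet when the lock was merely interrupted by a signal. */
static int sketch_reserve(sketch_lock_fn lock, void *ctx)
{
    int r = lock(ctx);

    if (r != 0 && r != -EINTR) {
        fprintf(stderr, "%p reserve failed\n", ctx);
        sketch_error_logs++;
    }
    return r;
}

/* Stub locks covering the three interesting outcomes. */
static int lock_ok(void *ctx)          { (void)ctx; return 0; }
static int lock_interrupted(void *ctx) { (void)ctx; return -EINTR; }
static int lock_deadlock(void *ctx)    { (void)ctx; return -EDEADLK; }
```

Calling sketch_reserve with lock_interrupted returns -EINTR without logging, while lock_deadlock returns -EDEADLK and bumps sketch_error_logs once.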