Re: [PATCH] drm/radeon/kms: Bailout of blit if error happen protect with mutex V2
On Mon, Jan 25, 2010 at 04:02:37PM +1000, Dave Airlie wrote: On Sat, Jan 23, 2010 at 1:56 AM, Jerome Glisse jgli...@redhat.com wrote: If an error happen in r600_blit_prepare_copy report it rather than WARNING and keeping execution. For instance if ib allocation failed we did just warn about but then latter tried to access NULL ib ptr causing oops. This patch also protect r600_copy_blit with a mutex as otherwise one process might overwrite blit temporary data with new one possibly leading to GPU lockup. Did you boot this to X on any relevant hw? it totally fails to start gdm here on my r600 card. I'm assuming because you now never pin the shader object at all from what I can see, since _startup gets called before blit_init. Please no more crap like this unless you are doing serious boot testing on it. radeon_bo_unref(rdev-r600_blit.shader_obj); + rdev-r600_blit.shader_obj = NULL; } And WTF? ^^^ you wrote radeon_bo_unref you should know what it does. Dave. Sorry i did V2 in hurry and didn't test it, here is V3 tested on R6XX R7XX. Cheers, Jerome From 57fecda40f7fdda5500a361b2df2ac047dca5aba Mon Sep 17 00:00:00 2001 From: Jerome Glisse jgli...@redhat.com Date: Fri, 22 Jan 2010 15:19:00 +0100 Subject: [PATCH] drm/radeon/kms: Bailout of blit if error happen protect with mutex V3 If an error happen in r600_blit_prepare_copy report it rather than WARNING and keeping execution. For instance if ib allocation failed we did just warn about but then latter tried to access NULL ib ptr causing oops. This patch also protect r600_copy_blit with a mutex as otherwise one process might overwrite blit temporary data with new one possibly leading to GPU lockup. Should partialy or totaly fix: https://bugzilla.redhat.com/show_bug.cgi?id=553279 V2 failing blit initialization is not fatal, fallback to memcpy when this happen V3 init blit before startup as we pin in startup, remove duplicate code (this one was actualy tested unlike V2) Signed-off-by: Jerome Glisse jgli...@redhat.com --- drivers/gpu/drm/radeon/r600.c | 54 drivers/gpu/drm/radeon/r600_blit_kms.c |7 +++- drivers/gpu/drm/radeon/radeon.h|1 + drivers/gpu/drm/radeon/rv770.c | 31 +- 4 files changed, 48 insertions(+), 45 deletions(-) diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index d0bd117..b833b4b 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -1788,23 +1788,24 @@ void r600_fence_ring_emit(struct radeon_device *rdev, radeon_ring_write(rdev, RB_INT_STAT); } -int r600_copy_dma(struct radeon_device *rdev, - uint64_t src_offset, - uint64_t dst_offset, - unsigned num_pages, - struct radeon_fence *fence) -{ - /* FIXME: implement */ - return 0; -} - int r600_copy_blit(struct radeon_device *rdev, uint64_t src_offset, uint64_t dst_offset, unsigned num_pages, struct radeon_fence *fence) { - r600_blit_prepare_copy(rdev, num_pages * RADEON_GPU_PAGE_SIZE); + int r; + + mutex_lock(rdev-r600_blit.mutex); + rdev-r600_blit.vb_ib = NULL; + r = r600_blit_prepare_copy(rdev, num_pages * RADEON_GPU_PAGE_SIZE); + if (r) { + if (rdev-r600_blit.vb_ib) + radeon_ib_free(rdev, rdev-r600_blit.vb_ib); + mutex_unlock(rdev-r600_blit.mutex); + return r; + } r600_kms_blit_copy(rdev, src_offset, dst_offset, num_pages * RADEON_GPU_PAGE_SIZE); r600_blit_done_copy(rdev, fence); + mutex_unlock(rdev-r600_blit.mutex); return 0; } @@ -1860,26 +1861,19 @@ int r600_startup(struct radeon_device *rdev) return r; } r600_gpu_init(rdev); - - if (!rdev-r600_blit.shader_obj) { - r = r600_blit_init(rdev); + /* pin copy shader into vram */ + if (rdev-r600_blit.shader_obj) { + r = radeon_bo_reserve(rdev-r600_blit.shader_obj, false); + if (unlikely(r != 0)) + return r; + r = radeon_bo_pin(rdev-r600_blit.shader_obj, RADEON_GEM_DOMAIN_VRAM, + rdev-r600_blit.shader_gpu_addr); + radeon_bo_unreserve(rdev-r600_blit.shader_obj); if (r) { - DRM_ERROR(radeon: failed blitter (%d).\n, r); + dev_err(rdev-dev, (%d) pin blit object failed\n, r); return r; } } - - r = radeon_bo_reserve(rdev-r600_blit.shader_obj, false); - if (unlikely(r != 0)) - return r; - r = radeon_bo_pin(rdev-r600_blit.shader_obj, RADEON_GEM_DOMAIN_VRAM, - rdev-r600_blit.shader_gpu_addr); - radeon_bo_unreserve(rdev-r600_blit.shader_obj); -
Re: [PATCH] drm/radeon/kms: Bailout of blit if error happen protect with mutex V2
On Sat, Jan 23, 2010 at 1:56 AM, Jerome Glisse jgli...@redhat.com wrote: If an error happen in r600_blit_prepare_copy report it rather than WARNING and keeping execution. For instance if ib allocation failed we did just warn about but then latter tried to access NULL ib ptr causing oops. This patch also protect r600_copy_blit with a mutex as otherwise one process might overwrite blit temporary data with new one possibly leading to GPU lockup. Did you boot this to X on any relevant hw? it totally fails to start gdm here on my r600 card. I'm assuming because you now never pin the shader object at all from what I can see, since _startup gets called before blit_init. Please no more crap like this unless you are doing serious boot testing on it. radeon_bo_unref(rdev-r600_blit.shader_obj); + rdev-r600_blit.shader_obj = NULL; } And WTF? ^^^ you wrote radeon_bo_unref you should know what it does. Dave. -- Throughout its 18-year history, RSA Conference consistently attracts the world's best and brightest in the field, creating opportunities for Conference attendees to learn about information security's most important issues through interactions with peers, luminaries and emerging and established companies. http://p.sf.net/sfu/rsaconf-dev2dev -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
[PATCH] drm/radeon/kms: Bailout of blit if error happen protect with mutex
If an error happen in r600_blit_prepare_copy report it rather than WARNING and keeping execution. For instance if ib allocation failed we did just warn about but then latter tried to access NULL ib ptr causing oops. This patch also protect r600_copy_blit with a mutex as otherwise one process might overwrite blit temporary data with new one possibly leading to GPU lockup. Should partialy or totaly fix: https://bugzilla.redhat.com/show_bug.cgi?id=553279 Signed-off-by: Jerome Glisse jgli...@redhat.com --- drivers/gpu/drm/radeon/r600.c | 39 ++- drivers/gpu/drm/radeon/r600_blit_kms.c |7 - drivers/gpu/drm/radeon/radeon.h|1 + drivers/gpu/drm/radeon/rv770.c | 16 + 4 files changed, 30 insertions(+), 33 deletions(-) diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index d0bd117..6e47e11 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -1788,23 +1788,24 @@ void r600_fence_ring_emit(struct radeon_device *rdev, radeon_ring_write(rdev, RB_INT_STAT); } -int r600_copy_dma(struct radeon_device *rdev, - uint64_t src_offset, - uint64_t dst_offset, - unsigned num_pages, - struct radeon_fence *fence) -{ - /* FIXME: implement */ - return 0; -} - int r600_copy_blit(struct radeon_device *rdev, uint64_t src_offset, uint64_t dst_offset, unsigned num_pages, struct radeon_fence *fence) { - r600_blit_prepare_copy(rdev, num_pages * RADEON_GPU_PAGE_SIZE); + int r; + + mutex_lock(rdev-r600_blit.mutex); + rdev-r600_blit.vb_ib = NULL; + r = r600_blit_prepare_copy(rdev, num_pages * RADEON_GPU_PAGE_SIZE); + if (r) { + if (rdev-r600_blit.vb_ib) + radeon_ib_free(rdev, rdev-r600_blit.vb_ib); + mutex_unlock(rdev-r600_blit.mutex); + return r; + } r600_kms_blit_copy(rdev, src_offset, dst_offset, num_pages * RADEON_GPU_PAGE_SIZE); r600_blit_done_copy(rdev, fence); + mutex_unlock(rdev-r600_blit.mutex); return 0; } @@ -1860,15 +1861,7 @@ int r600_startup(struct radeon_device *rdev) return r; } r600_gpu_init(rdev); - - if (!rdev-r600_blit.shader_obj) { - r = r600_blit_init(rdev); - if (r) { - DRM_ERROR(radeon: failed blitter (%d).\n, r); - return r; - } - } - + /* pin copy shader into vram */ r = radeon_bo_reserve(rdev-r600_blit.shader_obj, false); if (unlikely(r != 0)) return r; @@ -1879,7 +1872,6 @@ int r600_startup(struct radeon_device *rdev) dev_err(rdev-dev, (%d) pin blit object failed\n, r); return r; } - /* Enable IRQ */ r = r600_irq_init(rdev); if (r) { @@ -2052,6 +2044,11 @@ int r600_init(struct radeon_device *rdev) if (r) return r; + r = r600_blit_init(rdev); + if (r) { + dev_err(rdev-dev, failed blitter (%d)\n, r); + return r; + } rdev-accel_working = true; r = r600_startup(rdev); if (r) { diff --git a/drivers/gpu/drm/radeon/r600_blit_kms.c b/drivers/gpu/drm/radeon/r600_blit_kms.c index 2bedce4..af1c3ca 100644 --- a/drivers/gpu/drm/radeon/r600_blit_kms.c +++ b/drivers/gpu/drm/radeon/r600_blit_kms.c @@ -449,6 +449,7 @@ int r600_blit_init(struct radeon_device *rdev) u32 packet2s[16]; int num_packet2s = 0; + mutex_init(rdev-r600_blit.mutex); rdev-r600_blit.state_offset = 0; if (rdev-family = CHIP_RV770) @@ -557,7 +558,8 @@ int r600_blit_prepare_copy(struct radeon_device *rdev, int size_bytes) int dwords_per_loop = 76, num_loops; r = r600_vb_ib_get(rdev); - WARN_ON(r); + if (r) + return r; /* set_render_target emits 2 extra dwords on rv6xx */ if (rdev-family CHIP_R600 rdev-family CHIP_RV770) @@ -583,7 +585,8 @@ int r600_blit_prepare_copy(struct radeon_device *rdev, int size_bytes) ring_size += 5; /* done copy */ ring_size += 7; /* fence emit for done copy */ r = radeon_ring_lock(rdev, ring_size); - WARN_ON(r); + if (r) + return r; set_default_state(rdev); /* 14 */ set_shaders(rdev); /* 26 */ diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index f7df1a7..2d5f2bf 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -416,6 +416,7 @@ struct r600_ih { }; struct r600_blit { + struct mutexmutex; struct radeon_bo*shader_obj; u64 shader_gpu_addr; u32 vs_offset, ps_offset; diff --git a/drivers/gpu/drm/radeon/rv770.c
Re: [PATCH] drm/radeon/kms: Bailout of blit if error happen protect with mutex
On Fri, Jan 22, 2010 at 9:24 AM, Jerome Glisse jgli...@redhat.com wrote: If an error happen in r600_blit_prepare_copy report it rather than WARNING and keeping execution. For instance if ib allocation failed we did just warn about but then latter tried to access NULL ib ptr causing oops. This patch also protect r600_copy_blit with a mutex as otherwise one process might overwrite blit temporary data with new one possibly leading to GPU lockup. Should partialy or totaly fix: https://bugzilla.redhat.com/show_bug.cgi?id=553279 Signed-off-by: Jerome Glisse jgli...@redhat.com --- drivers/gpu/drm/radeon/r600.c | 39 ++- drivers/gpu/drm/radeon/r600_blit_kms.c | 7 - drivers/gpu/drm/radeon/radeon.h | 1 + drivers/gpu/drm/radeon/rv770.c | 16 + 4 files changed, 30 insertions(+), 33 deletions(-) diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index d0bd117..6e47e11 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -1788,23 +1788,24 @@ void r600_fence_ring_emit(struct radeon_device *rdev, radeon_ring_write(rdev, RB_INT_STAT); } -int r600_copy_dma(struct radeon_device *rdev, - uint64_t src_offset, - uint64_t dst_offset, - unsigned num_pages, - struct radeon_fence *fence) -{ - /* FIXME: implement */ - return 0; -} - int r600_copy_blit(struct radeon_device *rdev, uint64_t src_offset, uint64_t dst_offset, unsigned num_pages, struct radeon_fence *fence) { - r600_blit_prepare_copy(rdev, num_pages * RADEON_GPU_PAGE_SIZE); + int r; + + mutex_lock(rdev-r600_blit.mutex); + rdev-r600_blit.vb_ib = NULL; + r = r600_blit_prepare_copy(rdev, num_pages * RADEON_GPU_PAGE_SIZE); + if (r) { + if (rdev-r600_blit.vb_ib) + radeon_ib_free(rdev, rdev-r600_blit.vb_ib); + mutex_unlock(rdev-r600_blit.mutex); + return r; + } r600_kms_blit_copy(rdev, src_offset, dst_offset, num_pages * RADEON_GPU_PAGE_SIZE); r600_blit_done_copy(rdev, fence); + mutex_unlock(rdev-r600_blit.mutex); return 0; } @@ -1860,15 +1861,7 @@ int r600_startup(struct radeon_device *rdev) return r; } r600_gpu_init(rdev); - - if (!rdev-r600_blit.shader_obj) { - r = r600_blit_init(rdev); - if (r) { - DRM_ERROR(radeon: failed blitter (%d).\n, r); - return r; - } - } - + /* pin copy shader into vram */ r = radeon_bo_reserve(rdev-r600_blit.shader_obj, false); if (unlikely(r != 0)) return r; @@ -1879,7 +1872,6 @@ int r600_startup(struct radeon_device *rdev) dev_err(rdev-dev, (%d) pin blit object failed\n, r); return r; } - /* Enable IRQ */ r = r600_irq_init(rdev); if (r) { @@ -2052,6 +2044,11 @@ int r600_init(struct radeon_device *rdev) if (r) return r; + r = r600_blit_init(rdev); + if (r) { + dev_err(rdev-dev, failed blitter (%d)\n, r); + return r; + } rdev-accel_working = true; r = r600_startup(rdev); if (r) { diff --git a/drivers/gpu/drm/radeon/r600_blit_kms.c b/drivers/gpu/drm/radeon/r600_blit_kms.c index 2bedce4..af1c3ca 100644 --- a/drivers/gpu/drm/radeon/r600_blit_kms.c +++ b/drivers/gpu/drm/radeon/r600_blit_kms.c @@ -449,6 +449,7 @@ int r600_blit_init(struct radeon_device *rdev) u32 packet2s[16]; int num_packet2s = 0; + mutex_init(rdev-r600_blit.mutex); rdev-r600_blit.state_offset = 0; if (rdev-family = CHIP_RV770) @@ -557,7 +558,8 @@ int r600_blit_prepare_copy(struct radeon_device *rdev, int size_bytes) int dwords_per_loop = 76, num_loops; r = r600_vb_ib_get(rdev); - WARN_ON(r); + if (r) + return r; /* set_render_target emits 2 extra dwords on rv6xx */ if (rdev-family CHIP_R600 rdev-family CHIP_RV770) @@ -583,7 +585,8 @@ int r600_blit_prepare_copy(struct radeon_device *rdev, int size_bytes) ring_size += 5; /* done copy */ ring_size += 7; /* fence emit for done copy */ r = radeon_ring_lock(rdev, ring_size); - WARN_ON(r); + if (r) + return r; set_default_state(rdev); /* 14 */ set_shaders(rdev); /* 26 */ diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index f7df1a7..2d5f2bf 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -416,6 +416,7 @@ struct r600_ih { }; struct r600_blit { + struct mutex mutex;
[PATCH] drm/radeon/kms: Bailout of blit if error happen protect with mutex V2
If an error happen in r600_blit_prepare_copy report it rather than WARNING and keeping execution. For instance if ib allocation failed we did just warn about but then latter tried to access NULL ib ptr causing oops. This patch also protect r600_copy_blit with a mutex as otherwise one process might overwrite blit temporary data with new one possibly leading to GPU lockup. Should partialy or totaly fix: https://bugzilla.redhat.com/show_bug.cgi?id=553279 V2 failing blit initialization is not fatal, fallback to memcpy when this happen Signed-off-by: Jerome Glisse jgli...@redhat.com --- drivers/gpu/drm/radeon/r600.c | 54 drivers/gpu/drm/radeon/r600_blit_kms.c |8 +++- drivers/gpu/drm/radeon/radeon.h|1 + drivers/gpu/drm/radeon/rv770.c | 32 +-- 4 files changed, 49 insertions(+), 46 deletions(-) diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index d0bd117..670dbc2 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -1788,23 +1788,24 @@ void r600_fence_ring_emit(struct radeon_device *rdev, radeon_ring_write(rdev, RB_INT_STAT); } -int r600_copy_dma(struct radeon_device *rdev, - uint64_t src_offset, - uint64_t dst_offset, - unsigned num_pages, - struct radeon_fence *fence) -{ - /* FIXME: implement */ - return 0; -} - int r600_copy_blit(struct radeon_device *rdev, uint64_t src_offset, uint64_t dst_offset, unsigned num_pages, struct radeon_fence *fence) { - r600_blit_prepare_copy(rdev, num_pages * RADEON_GPU_PAGE_SIZE); + int r; + + mutex_lock(rdev-r600_blit.mutex); + rdev-r600_blit.vb_ib = NULL; + r = r600_blit_prepare_copy(rdev, num_pages * RADEON_GPU_PAGE_SIZE); + if (r) { + if (rdev-r600_blit.vb_ib) + radeon_ib_free(rdev, rdev-r600_blit.vb_ib); + mutex_unlock(rdev-r600_blit.mutex); + return r; + } r600_kms_blit_copy(rdev, src_offset, dst_offset, num_pages * RADEON_GPU_PAGE_SIZE); r600_blit_done_copy(rdev, fence); + mutex_unlock(rdev-r600_blit.mutex); return 0; } @@ -1860,26 +1861,19 @@ int r600_startup(struct radeon_device *rdev) return r; } r600_gpu_init(rdev); - - if (!rdev-r600_blit.shader_obj) { - r = r600_blit_init(rdev); + /* pin copy shader into vram */ + if (rdev-r600_blit.shader_obj) { + r = radeon_bo_reserve(rdev-r600_blit.shader_obj, false); + if (unlikely(r != 0)) + return r; + r = radeon_bo_pin(rdev-r600_blit.shader_obj, RADEON_GEM_DOMAIN_VRAM, + rdev-r600_blit.shader_gpu_addr); + radeon_bo_unreserve(rdev-r600_blit.shader_obj); if (r) { - DRM_ERROR(radeon: failed blitter (%d).\n, r); + dev_err(rdev-dev, (%d) pin blit object failed\n, r); return r; } } - - r = radeon_bo_reserve(rdev-r600_blit.shader_obj, false); - if (unlikely(r != 0)) - return r; - r = radeon_bo_pin(rdev-r600_blit.shader_obj, RADEON_GEM_DOMAIN_VRAM, - rdev-r600_blit.shader_gpu_addr); - radeon_bo_unreserve(rdev-r600_blit.shader_obj); - if (r) { - dev_err(rdev-dev, (%d) pin blit object failed\n, r); - return r; - } - /* Enable IRQ */ r = r600_irq_init(rdev); if (r) { @@ -2062,6 +2056,12 @@ int r600_init(struct radeon_device *rdev) rdev-accel_working = false; } if (rdev-accel_working) { + r = r600_blit_init(rdev); + if (r) { + r600_blit_fini(rdev); + rdev-asic-copy = NULL; + dev_warn(rdev-dev, failed blitter (%d) falling back to memcpy\n, r); + } r = radeon_ib_pool_init(rdev); if (r) { dev_err(rdev-dev, IB initialization failed (%d).\n, r); diff --git a/drivers/gpu/drm/radeon/r600_blit_kms.c b/drivers/gpu/drm/radeon/r600_blit_kms.c index 2bedce4..77e7665 100644 --- a/drivers/gpu/drm/radeon/r600_blit_kms.c +++ b/drivers/gpu/drm/radeon/r600_blit_kms.c @@ -449,6 +449,7 @@ int r600_blit_init(struct radeon_device *rdev) u32 packet2s[16]; int num_packet2s = 0; + mutex_init(rdev-r600_blit.mutex); rdev-r600_blit.state_offset = 0; if (rdev-family = CHIP_RV770) @@ -523,6 +524,7 @@ void r600_blit_fini(struct radeon_device *rdev) radeon_bo_unreserve(rdev-r600_blit.shader_obj); } radeon_bo_unref(rdev-r600_blit.shader_obj); + rdev-r600_blit.shader_obj = NULL; }