[PATCH 1/3 v2] drm/v3d: Take a lock across GPU scheduler job creation and queuing.
Between creation and queueing of a job, you need to prevent any other
job from being created and queued. Otherwise the scheduler's fences may
be signaled out of seqno order.

v2: move mutex unlock to the error label.

Signed-off-by: Eric Anholt
Fixes: 57692c94dcbe ("drm/v3d: Introduce a new DRM driver for Broadcom V3D V3.x+")
---
 drivers/gpu/drm/v3d/v3d_drv.h | 5 +++++
 drivers/gpu/drm/v3d/v3d_gem.c | 4 ++++
 2 files changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index a043ac3aae98..26005abd9c5d 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -85,6 +85,11 @@ struct v3d_dev {
 	 */
 	struct mutex reset_lock;
 
+	/* Lock taken when creating and pushing the GPU scheduler
+	 * jobs, to keep the sched-fence seqnos in order.
+	 */
+	struct mutex sched_lock;
+
 	struct {
 		u32 num_allocated;
 		u32 pages_allocated;
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index b513f9189caf..269fe16379c0 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -550,6 +550,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
 	if (ret)
 		goto fail;
 
+	mutex_lock(&v3d->sched_lock);
 	if (exec->bin.start != exec->bin.end) {
 		ret = drm_sched_job_init(&exec->bin.base,
 					 &v3d->queue[V3D_BIN].sched,
@@ -576,6 +577,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
 	kref_get(&exec->refcount); /* put by scheduler job completion */
 	drm_sched_entity_push_job(&exec->render.base,
 				  &v3d_priv->sched_entity[V3D_RENDER]);
+	mutex_unlock(&v3d->sched_lock);
 
 	v3d_attach_object_fences(exec);
@@ -594,6 +596,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
 	return 0;
 
 fail_unreserve:
+	mutex_unlock(&v3d->sched_lock);
 	v3d_unlock_bo_reservations(dev, exec, &acquire_ctx);
 fail:
 	v3d_exec_put(exec);
@@ -615,6 +618,7 @@ v3d_gem_init(struct drm_device *dev)
 	spin_lock_init(&v3d->job_lock);
 	mutex_init(&v3d->bo_lock);
 	mutex_init(&v3d->reset_lock);
+	mutex_init(&v3d->sched_lock);
 
 	/* Note: We don't allocate address 0. Various bits of HW
 	 * treat 0 as special, such as the occlusion query counters
-- 
2.17.0
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH xf86-video-amdgpu 0/7] Enabling Color Management - Round 3
On 2018-06-06 01:03 PM, Michel Dänzer wrote:
> On 2018-06-06 06:01 PM, Michel Dänzer wrote:
>> On 2018-06-01 06:03 PM, sunpeng...@amd.com wrote:
>>> From: "Leo (Sunpeng) Li"
>>>
>>> This ended up being different enough from v2 to warrant a new
>>> patchset. Per Michel's suggestions, there have been various
>>> optimizations and cleanups. Here's what's changed:
>>> [...]
>>
>> Sounds good. I'll be going through the patches in detail from now on,
>> but I don't know yet when I'll be able to finish the review.
>>
>> Meanwhile, heads up on two issues I discovered while smoke-testing
>> the series (which are sort of related, but occur even without this
>> series):
>>
>> Running Xorg in depth 30[0] results in completely wrong colours
>> (everything has a red tint) with current kernels. I think this is
>> because DC now preserves the gamma LUT values, but xf86-video-amdgpu
>> never sets them at depth 30, so the hardware is still using values
>> for 24-bit RGB.
>
> Actually, looks like I made a mistake in my testing before; this issue
> only occurs as of patch 6 of this series.

Hi Michel,

I'll look into this. Thanks for testing :)

Leo
Re: [PATCH xf86-video-amdgpu 0/7] Enabling Color Management - Round 3
On 2018-06-06 06:01 PM, Michel Dänzer wrote:
> On 2018-06-01 06:03 PM, sunpeng...@amd.com wrote:
>> From: "Leo (Sunpeng) Li"
>>
>> This ended up being different enough from v2 to warrant a new
>> patchset. Per Michel's suggestions, there have been various
>> optimizations and cleanups. Here's what's changed:
>> [...]
>
> Sounds good. I'll be going through the patches in detail from now on,
> but I don't know yet when I'll be able to finish the review.
>
> Meanwhile, heads up on two issues I discovered while smoke-testing the
> series (which are sort of related, but occur even without this
> series):
>
> Running Xorg in depth 30[0] results in completely wrong colours
> (everything has a red tint) with current kernels. I think this is
> because DC now preserves the gamma LUT values, but xf86-video-amdgpu
> never sets them at depth 30, so the hardware is still using values for
> 24-bit RGB.

Actually, looks like I made a mistake in my testing before; this issue
only occurs as of patch 6 of this series.

-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
Re: [PATCH xf86-video-amdgpu 0/7] Enabling Color Management - Round 3
Hi Leo,

On 2018-06-01 06:03 PM, sunpeng...@amd.com wrote:
> From: "Leo (Sunpeng) Li"
>
> This ended up being different enough from v2 to warrant a new patchset. Per
> Michel's suggestions, there have been various optimizations and cleanups.
> Here's what's changed:
>
> * Cache DRM color management property IDs at pre-init,
>   * instead of querying DRM each time we need to modify color properties.
>
> * Remove drmmode_update_cm_props().
>   * Update color properties in drmmode_output_get_property() instead.
>   * This also makes old calls to update_cm_props redundant.
>
> * Get rid of fake CRTCs.
>   * Previously, we were allocating a fake CRTC to configure color props on
>     outputs that don't have a CRTC.
>   * Instead, rr_configure_and_change_cm_property() can be easily modified to
>     accept NULL CRTCs.
>
> * Drop patches to persist color properties across DPMS events.
>   * Kernel driver should be patched instead:
>     https://lists.freedesktop.org/archives/amd-gfx/2018-May/022744.html
>   * Color props including legacy gamma now persist across crtc dpms.
>   * Non-legacy props now persist across output dpms and hotplug, as long
>     as the same CRTC remains attached to that output.
>
> And some smaller improvements:
>
> * Change CTM to be 32-bit format instead of 16-bit.
>   * This requires clients to ensure that each 32-bit element is padded to be
>     long-sized, since libXrandr parses 32-bit format as long-typed.
>
> * Optimized color management init during CRTC init.
>   * Query DRM once for the list of properties, instead of twice.

Sounds good. I'll be going through the patches in detail from now on,
but I don't know yet when I'll be able to finish the review.

Meanwhile, heads up on two issues I discovered while smoke-testing the
series (which are sort of related, but occur even without this series):

Running Xorg in depth 30[0] results in completely wrong colours
(everything has a red tint) with current kernels. I think this is
because DC now preserves the gamma LUT values, but xf86-video-amdgpu
never sets them at depth 30, so the hardware is still using values for
24-bit RGB.

Can you look into making xf86-video-amdgpu set the LUT values at depth
30 as well? Ideally in a way which doesn't require all patches in this
series, so that it could be backported for a 18.0.2 point release if
necessary. (Similarly for skipping drmmode_cursor_gamma when the kernel
supports the new colour management properties, to fix
https://bugs.freedesktop.org/106578)

Trying to run Xorg in depth 16 or 8[0] results in failure to set any mode:

[56.138] (EE) AMDGPU(0): failed to set mode: Invalid argument
[56.138] (WW) AMDGPU(0): Failed to set mode on CRTC 0
[56.172] (EE) AMDGPU(0): Failed to enable any CRTC

Works fine with amdgpu.dc=0. This has been broken at least since the
4.16 development cycle.

[0] You can change Xorg's colour depth either via -depth on its command
line, or via the xorg.conf screen section:

Section "Screen"
	Identifier "Screen0"
	DefaultDepth 30 # or 16 or 8
EndSection

-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
Re: [PATCH 1/5] dma_buf: remove device parameter from attach callback
Just a gentle ping.

Daniel, Chris and all the other usual suspects for infrastructure
stuff: What do you think about this? The cleanup patches are rather
obviously correct, but #3 could result in some fallout.

I really think it is the right thing in the long term.

Regards,
Christian.

Am 01.06.2018 um 14:00 schrieb Christian König:
> The device parameter is completely unused because it is available in
> the attachment structure as well.
>
> Signed-off-by: Christian König
> ---
>  drivers/dma-buf/dma-buf.c                             | 2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c             | 3 +--
>  drivers/gpu/drm/drm_prime.c                           | 3 +--
>  drivers/gpu/drm/udl/udl_dmabuf.c                      | 1 -
>  drivers/gpu/drm/vmwgfx/vmwgfx_prime.c                 | 1 -
>  drivers/media/common/videobuf2/videobuf2-dma-contig.c | 2 +-
>  drivers/media/common/videobuf2/videobuf2-dma-sg.c     | 2 +-
>  drivers/media/common/videobuf2/videobuf2-vmalloc.c    | 2 +-
>  include/drm/drm_prime.h                               | 2 +-
>  include/linux/dma-buf.h                               | 3 +--
>  10 files changed, 8 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index d78d5fc173dc..e99a8d19991b 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -568,7 +568,7 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
> 	mutex_lock(&dmabuf->lock);
>
> 	if (dmabuf->ops->attach) {
> -		ret = dmabuf->ops->attach(dmabuf, dev, attach);
> +		ret = dmabuf->ops->attach(dmabuf, attach);
> 		if (ret)
> 			goto err_attach;
> 	}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
> index 4683626b065f..f1500f1ec0f5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
> @@ -133,7 +133,6 @@ amdgpu_gem_prime_import_sg_table(struct drm_device *dev,
> }
>
> static int amdgpu_gem_map_attach(struct dma_buf *dma_buf,
> -				 struct device *target_dev,
> 				 struct dma_buf_attachment *attach)
> {
> 	struct drm_gem_object *obj = dma_buf->priv;
> @@ -141,7 +140,7 @@ static int amdgpu_gem_map_attach(struct dma_buf *dma_buf,
> 	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
> 	long r;
>
> -	r = drm_gem_map_attach(dma_buf, target_dev, attach);
> +	r = drm_gem_map_attach(dma_buf, attach);
> 	if (r)
> 		return r;
> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
> index 7856a9b3f8a8..4a3a232fea67 100644
> --- a/drivers/gpu/drm/drm_prime.c
> +++ b/drivers/gpu/drm/drm_prime.c
> @@ -186,7 +186,6 @@ static int drm_prime_lookup_buf_handle(struct drm_prime_file_private *prime_fpri
> /**
>  * drm_gem_map_attach - dma_buf attach implementation for GEM
>  * @dma_buf: buffer to attach device to
> - * @target_dev: not used
>  * @attach: buffer attachment data
>  *
>  * Allocates &drm_prime_attachment and calls &drm_driver.gem_prime_pin for
> @@ -195,7 +194,7 @@ static int drm_prime_lookup_buf_handle(struct drm_prime_file_private *prime_fpri
>  *
>  * Returns 0 on success, negative error code on failure.
>  */
> -int drm_gem_map_attach(struct dma_buf *dma_buf, struct device *target_dev,
> +int drm_gem_map_attach(struct dma_buf *dma_buf,
> 		       struct dma_buf_attachment *attach)
> {
> 	struct drm_prime_attachment *prime_attach;
> diff --git a/drivers/gpu/drm/udl/udl_dmabuf.c b/drivers/gpu/drm/udl/udl_dmabuf.c
> index 2867ed155ff6..5fdc8bdc2026 100644
> --- a/drivers/gpu/drm/udl/udl_dmabuf.c
> +++ b/drivers/gpu/drm/udl/udl_dmabuf.c
> @@ -29,7 +29,6 @@ struct udl_drm_dmabuf_attachment {
> };
>
> static int udl_attach_dma_buf(struct dma_buf *dmabuf,
> -			      struct device *dev,
> 			      struct dma_buf_attachment *attach)
> {
> 	struct udl_drm_dmabuf_attachment *udl_attach;
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_prime.c b/drivers/gpu/drm/vmwgfx/vmwgfx_prime.c
> index 0d42a46521fc..fbffb37ccf42 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_prime.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_prime.c
> @@ -40,7 +40,6 @@
>  */
>
> static int vmw_prime_map_attach(struct dma_buf *dma_buf,
> -				struct device *target_dev,
> 				struct dma_buf_attachment *attach)
> {
> 	return -ENOSYS;
> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> index f1178f6f434d..12d0072c52c2 100644
> --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> @@ -222,7 +222,7 @@ struct vb2_dc_attachment {
> 	enum dma_data_direction dma_dir;
> };
>
> -static int vb2_dc_dmabuf_ops_attach(struct dma_buf *dbuf, struct device *dev,
> +static int vb2_dc_dmabuf_ops_attach(struct dma_buf *dbuf,
Re: [PATCH 3/3] drm/amd/amdgpu: Fix NULL pointer OOPS during S3
NAK as well. mem->mm_node can't be NULL on a correctly allocated BO.

You are running into BO corruption here and trying to work around it by
mitigating the effect instead of fixing the root problem.

Regards,
Christian.

Am 06.06.2018 um 11:25 schrieb Pratik Vishwakarma:
> Fixes NULL pointer dereference in amdgpu_ttm_copy_mem_to_mem
>
> BUG: unable to handle kernel NULL pointer dereference at 0010
> IP: amdgpu_ttm_copy_mem_to_mem+0x85/0x40c
> Workqueue: events_unbound async_run_entry_fn
> Call Trace:
> [...]
>
> Signed-off-by: Pratik Vishwakarma
> [...]
Re: [PATCH 2/3] drm/amd/amdgpu: Fix crash in amdgpu_bo_reserve
NAK, when bo->tbo.resv is NULL then the BO is corrupted (or already
released).

Please find the root cause of that corruption or freed memory access
instead of adding such crude workarounds.

Regards,
Christian.

Am 06.06.2018 um 11:25 schrieb Pratik Vishwakarma:
> Fixes null pointer access in ww_mutex_lock where lock->base is NULL
>
> Crash dump is as follows:
> Call Trace:
> ww_mutex_lock+0x3a/0x8e
> amdgpu_bo_reserve+0x40/0x87
> amdgpu_device_suspend+0xf4/0x210
> [...]
>
> Signed-off-by: Pratik Vishwakarma
> [...]
[PATCH 2/3] drm/amd/amdgpu: Fix crash in amdgpu_bo_reserve
Fixes null pointer access in ww_mutex_lock where lock->base is NULL.

Crash dump is as follows:

Call Trace:
 ww_mutex_lock+0x3a/0x8e
 amdgpu_bo_reserve+0x40/0x87
 amdgpu_device_suspend+0xf4/0x210
 pci_pm_suspend+0x12a/0x1a5
 ? pci_dev_driver+0x36/0x36
 dpm_run_callback+0x59/0xbf
 __device_suspend+0x215/0x33f
 async_suspend+0x1f/0x5c
 async_run_entry_fn+0x3d/0xd2
 process_one_work+0x1b0/0x314
 worker_thread+0x1cb/0x2c1
 ? create_worker+0x1da/0x1da
 kthread+0x156/0x15e
 ? kthread_flush_work+0xea/0xea
 ret_from_fork+0x22/0x40

Signed-off-by: Pratik Vishwakarma
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index 7317480..c9df7ae 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -152,6 +152,8 @@ static inline int amdgpu_bo_reserve(struct amdgpu_bo *bo, bool no_intr)
 	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
 	int r;
 
+	if (&(bo->tbo.resv->lock) == NULL)
+		return -EINVAL;
 	r = ttm_bo_reserve(&bo->tbo, !no_intr, false, NULL);
 	if (unlikely(r != 0)) {
 		if (r != -ERESTARTSYS)
-- 
1.9.1
[PATCH 3/3] drm/amd/amdgpu: Fix NULL pointer OOPS during S3
Fixes NULL pointer dereference in amdgpu_ttm_copy_mem_to_mem.

BUG: unable to handle kernel NULL pointer dereference at 0010
IP: amdgpu_ttm_copy_mem_to_mem+0x85/0x40c
Workqueue: events_unbound async_run_entry_fn
Call Trace:
 ? _raw_spin_unlock+0xe/0x20
 ? ttm_check_swapping+0x4e/0x72
 ? ttm_mem_global_reserve.constprop.4+0xb1/0xc0
 amdgpu_move_blit+0x80/0xe2
 amdgpu_bo_move+0x114/0x155
 ttm_bo_handle_move_mem+0x1f7/0x34a
 ? ttm_bo_mem_space+0x162/0x38e
 ? dev_vprintk_emit+0x10a/0x1f2
 ttm_bo_evict+0x13e/0x2e9
 ? do_wait_for_common+0x151/0x187
 ttm_mem_evict_first+0x136/0x181
 ttm_bo_force_list_clean+0x78/0x10f
 amdgpu_device_suspend+0x167/0x210
 pci_pm_suspend+0x12a/0x1a5
 ? pci_dev_driver+0x36/0x36
 dpm_run_callback+0x59/0xbf
 __device_suspend+0x215/0x33f
 async_suspend+0x1f/0x5c
 async_run_entry_fn+0x3d/0xd2
 process_one_work+0x1b0/0x314
 worker_thread+0x1cb/0x2c1
 ? create_worker+0x1da/0x1da
 kthread+0x156/0x15e
 ? kthread_flush_work+0xea/0xea
 ret_from_fork+0x22/0x40

Signed-off-by: Pratik Vishwakarma
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 57d4da6..f9de429 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -414,12 +414,16 @@ int amdgpu_ttm_copy_mem_to_mem(struct amdgpu_device *adev,
 		return -EINVAL;
 	}
 
+	if (!src->mem->mm_node)
+		return -EINVAL;
 	src_mm = amdgpu_find_mm_node(src->mem, &src->offset);
 	src_node_start = amdgpu_mm_node_addr(src->bo, src_mm, src->mem) +
 			 src->offset;
 	src_node_size = (src_mm->size << PAGE_SHIFT) - src->offset;
 	src_page_offset = src_node_start & (PAGE_SIZE - 1);
 
+	if (!dst->mem->mm_node)
+		return -EINVAL;
 	dst_mm = amdgpu_find_mm_node(dst->mem, &dst->offset);
 	dst_node_start = amdgpu_mm_node_addr(dst->bo, dst_mm, dst->mem) +
 			 dst->offset;
-- 
1.9.1
[PATCH 0/3] Fix S3 entry crashes
This patch series resolves crashes observed during S3 entry.

First patch removes the cursor BO hack which causes crashes mentioned
in patches 2 and 3. Patches 2 and 3 add NULL checks to prevent crashing
the system.

Pratik Vishwakarma (3):
  drm/amd/display: Remove cursor hack for S3
  drm/amd/amdgpu: Fix crash in amdgpu_bo_reserve
  drm/amd/amdgpu: Fix NULL pointer OOPS during S3

 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h        |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c           | 17 +++++++++++++----
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 11 -----------
 3 files changed, 15 insertions(+), 15 deletions(-)
-- 
1.9.1
[PATCH 1/3] drm/amd/display: remove cursor hack for S3
cursor_bo operations cause a crash during S3 entry. On cursor movement
between displays during S3 cycles, the system crashes on S3 entry.
These crashes are no longer seen with this patch applied.

Also, as per the comment just above the code that this patch removes:
"IN 4.10 kernel this code should be removed and amdgpu_device_suspend
code touching fram buffers should be avoided for DC"

Signed-off-by: Pratik Vishwakarma
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index a2009d5..8c31abf 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -3138,17 +3138,6 @@ static int dm_plane_helper_prepare_fb(struct drm_plane *plane,
 		}
 	}
 
-	/* It's a hack for s3 since in 4.9 kernel filter out cursor buffer
-	 * prepare and cleanup in drm_atomic_helper_prepare_planes
-	 * and drm_atomic_helper_cleanup_planes because fb doens't in s3.
-	 * IN 4.10 kernel this code should be removed and amdgpu_device_suspend
-	 * code touching fram buffers should be avoided for DC.
-	 */
-	if (plane->type == DRM_PLANE_TYPE_CURSOR) {
-		struct amdgpu_crtc *acrtc = to_amdgpu_crtc(new_state->crtc);
-
-		acrtc->cursor_bo = obj;
-	}
 	return 0;
 }
-- 
1.9.1
Re: [PATCH 1/2] drm/scheduler: Rename cleanup functions.
Am Dienstag, den 05.06.2018, 13:02 -0400 schrieb Andrey Grodzovsky:
> Everything in the flush code path (i.e. waiting for SW queue
> to become empty) names with *_flush()
> and everything in the release code path names *_fini()
>
> This patch also affects the amdgpu and etnaviv drivers which
> use those functions.
>
> Signed-off-by: Andrey Grodzovsky
> Suggested-by: Christian König
> ---
[...]
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
> index 23e73c2..3dff4d0 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
> @@ -140,7 +140,7 @@ static void etnaviv_postclose(struct drm_device *dev, struct drm_file *file)
> 			gpu->lastctx = NULL;
> 			mutex_unlock(&gpu->lock);
>
> -		drm_sched_entity_fini(&gpu->sched,
> +		drm_sched_entity_destroy(&gpu->sched,
> 					      &ctx->sched_entity[i]);

Style nit: this disaligns the second row of parameters to the opening
parenthesis where it was previously aligned. I would prefer if the
second line is also changed to keep the alignment.

Acked-by: Lucas Stach
Re: [PATCH 1/3] drm/v3d: Take a lock across GPU scheduler job creation and queuing.
Am 06.06.2018 um 10:46 schrieb Lucas Stach:
> Am Dienstag, den 05.06.2018, 12:03 -0700 schrieb Eric Anholt:
>> Between creation and queueing of a job, you need to prevent any other
>> job from being created and queued. Otherwise the scheduler's fences
>> may be signaled out of seqno order.
>>
>> Signed-off-by: Eric Anholt
>> Fixes: 57692c94dcbe ("drm/v3d: Introduce a new DRM driver for Broadcom V3D V3.x+")
>> ---
>> [...]
>> +	mutex_lock(&v3d->sched_lock);
>> 	if (exec->bin.start != exec->bin.end) {
>> 		ret = drm_sched_job_init(&exec->bin.base,
>> 					 &v3d->queue[V3D_BIN].sched,
>> 					 &v3d_priv->sched_entity[V3D_BIN],
>> 					 v3d_priv);
>> -		if (ret)
>> +		if (ret) {
>> +			mutex_unlock(&v3d->sched_lock);
>> 			goto fail_unreserve;
>> [...]
>
> I don't see any path where you would go to fail_unreserve with the
> mutex not yet locked, so you could just fold the mutex_unlock into
> this error path for a bit less code duplication. Otherwise this looks
> fine.

Yeah, agree that could be cleaned up.

I can't judge the correctness of the driver, but at least the scheduler
handling looks good to me.

Regards,
Christian.

> Regards,
> Lucas
Re: [PATCH 1/3] drm/v3d: Take a lock across GPU scheduler job creation and queuing.
Am Dienstag, den 05.06.2018, 12:03 -0700 schrieb Eric Anholt:
> Between creation and queueing of a job, you need to prevent any other
> job from being created and queued. Otherwise the scheduler's fences
> may be signaled out of seqno order.
>
> Signed-off-by: Eric Anholt
> Fixes: 57692c94dcbe ("drm/v3d: Introduce a new DRM driver for Broadcom V3D V3.x+")
> ---
> ccing amd-gfx due to interaction of this series with the scheduler.
>
>  drivers/gpu/drm/v3d/v3d_drv.h |  5 +++++
>  drivers/gpu/drm/v3d/v3d_gem.c | 11 +++++++++--
>  2 files changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
> index a043ac3aae98..26005abd9c5d 100644
> --- a/drivers/gpu/drm/v3d/v3d_drv.h
> +++ b/drivers/gpu/drm/v3d/v3d_drv.h
> @@ -85,6 +85,11 @@ struct v3d_dev {
> 	 */
> 	struct mutex reset_lock;
>
> +	/* Lock taken when creating and pushing the GPU scheduler
> +	 * jobs, to keep the sched-fence seqnos in order.
> +	 */
> +	struct mutex sched_lock;
> +
> 	struct {
> 		u32 num_allocated;
> 		u32 pages_allocated;
> diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
> index b513f9189caf..9ea83bdb9a30 100644
> --- a/drivers/gpu/drm/v3d/v3d_gem.c
> +++ b/drivers/gpu/drm/v3d/v3d_gem.c
> @@ -550,13 +550,16 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
> 	if (ret)
> 		goto fail;
>
> +	mutex_lock(&v3d->sched_lock);
> 	if (exec->bin.start != exec->bin.end) {
> 		ret = drm_sched_job_init(&exec->bin.base,
> 					 &v3d->queue[V3D_BIN].sched,
> 					 &v3d_priv->sched_entity[V3D_BIN],
> 					 v3d_priv);
> -		if (ret)
> +		if (ret) {
> +			mutex_unlock(&v3d->sched_lock);
> 			goto fail_unreserve;

I don't see any path where you would go to fail_unreserve with the
mutex not yet locked, so you could just fold the mutex_unlock into this
error path for a bit less code duplication. Otherwise this looks fine.

Regards,
Lucas

> +		}
>
> 	exec->bin_done_fence =
> 		dma_fence_get(&exec->bin.base.s_fence->finished);
> @@ -570,12 +573,15 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
> 					 &v3d->queue[V3D_RENDER].sched,
> 					 &v3d_priv->sched_entity[V3D_RENDER],
> 					 v3d_priv);
> -	if (ret)
> +	if (ret) {
> +		mutex_unlock(&v3d->sched_lock);
> 		goto fail_unreserve;
> +	}
>
> 	kref_get(&exec->refcount); /* put by scheduler job completion */
> 	drm_sched_entity_push_job(&exec->render.base,
> 				  &v3d_priv->sched_entity[V3D_RENDER]);
> +	mutex_unlock(&v3d->sched_lock);
>
> 	v3d_attach_object_fences(exec);
>
> @@ -615,6 +621,7 @@ v3d_gem_init(struct drm_device *dev)
> 	spin_lock_init(&v3d->job_lock);
> 	mutex_init(&v3d->bo_lock);
> 	mutex_init(&v3d->reset_lock);
> +	mutex_init(&v3d->sched_lock);
>
> 	/* Note: We don't allocate address 0. Various bits of HW
> 	 * treat 0 as special, such as the occlusion query counters
Re: [PATCH 1/2] drm/scheduler: Rename cleanup functions.
On 06.06.2018 at 09:03, Michel Dänzer wrote:
> On 2018-06-05 07:02 PM, Andrey Grodzovsky wrote:
>> Everything in the flush code path (i.e. waiting for the SW queue
>> to become empty) is named *_flush(), and everything in the release
>> code path is named *_fini().
>>
>> This patch also affects the amdgpu and etnaviv drivers, which
>> use those functions.
>>
>> Signed-off-by: Andrey Grodzovsky
>> Suggested-by: Christian König
> [...]
>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
>> index 23e73c2..3dff4d0 100644
>> --- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c
>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
>> @@ -140,7 +140,7 @@ static void etnaviv_postclose(struct drm_device *dev, struct drm_file *file)
>>  		gpu->lastctx = NULL;
>>  		mutex_unlock(&gpu->lock);
>>
>> -		drm_sched_entity_fini(&gpu->sched,
>> +		drm_sched_entity_destroy(&gpu->sched,
>>  				      &ctx->sched_entity[i]);
>>  	}
>>  }
>
> The drm-next tree for 4.18 has a new v3d driver, which also uses
> drm_sched_entity_fini. Please either make sure to merge this change via
> a tree which has the v3d driver, and fix it up as well, or don't do the
> fini => destroy rename.

I think we should just wait for the next rebase of amd-staging-drm-next
before pushing this cleanup. Alex was already preparing that before his
vacation, so that should happen shortly after he's back.

Apart from that, looks good to me,
Christian.
Re: radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000010da43f last fence id 0x00000000010da52d on ring 0)
On 05.06.2018 at 16:44, Borislav Petkov wrote:
> Hi guys,
>
> X just froze here on top of 4.17-rc7+ tip/master (kernel is from last
> week) with the splat at the end.
>
> Box is an X470 chipset with a Ryzen 2700X.
>
> GPU gets detected as
>
> [7.440971] [drm] radeon kernel modesetting enabled.
> [7.441220] [drm] initializing kernel modesetting (RV635 0x1002:0x9598 0x1043:0x01DA 0x00).
>
> ... in the PCIe slot with two monitors connected to it. radeon firmware is
>
> Version: 20170823-1
>
> What practically happened is X froze and got restarted after the GPU
> reset. It seems to be ok now, as I'm typing in it.
>
> Thoughts?

Well, what did you do to trigger the lockup? It looks like an
application sent something to the hardware that crashed the GFX block.

Christian.
Re: radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000010da43f last fence id 0x00000000010da52d on ring 0)
On Tue, Jun 05, 2018 at 04:44:04PM +0200, Borislav Petkov wrote:
> Hi guys,
>
> X just froze here ontop of 4.17-rc7+ tip/master (kernel is from last
> week) with the splat at the end.
>
> Box is a x470 chipset with Ryzen 2700X.
>
> GPU gets detected as
>
> [7.440971] [drm] radeon kernel modesetting enabled.
> [7.441220] [drm] initializing kernel modesetting (RV635 0x1002:0x9598 0x1043:0x01DA 0x00).
> [7.441328] ATOM BIOS: 9598.10.88.0.3.AS05
> [7.441395] radeon 0000:1d:00.0: VRAM: 512M 0x - 0x1FFF (512M used)
> [7.441464] radeon 0000:1d:00.0: GTT: 512M 0x2000 - 0x3FFF
> [7.441531] [drm] Detected VRAM RAM=512M, BAR=256M
> [7.441588] [drm] RAM width 128bits DDR
> [7.441690] [TTM] Zone kernel: Available graphics memory: 16462214 kiB
> [7.441751] [TTM] Zone dma32: Available graphics memory: 2097152 kiB
> [7.441811] [TTM] Initializing pool allocator
> [7.441868] [TTM] Initializing DMA pool allocator
> [7.441934] [drm] radeon: 512M of VRAM memory ready
> [7.441990] [drm] radeon: 512M of GTT memory ready.
> [7.442050] [drm] Loading RV635 Microcode
> [7.442865] [drm] Internal thermal controller without fan control
> [7.442940] [drm] radeon: power management initialized
> [7.443222] [drm] GART: num cpu pages 131072, num gpu pages 131072
> [7.443487] [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
> [7.477319] [drm] PCIE GART of 512M enabled (table at 0x00142000).
> [7.477400] radeon 0000:1d:00.0: WB enabled
> [7.477455] radeon 0000:1d:00.0: fence driver on ring 0 use gpu addr 0x2c00 and cpu addr 0x(ptrval)
> [7.477708] radeon 0000:1d:00.0: fence driver on ring 5 use gpu addr 0x000521d0 and cpu addr 0x(ptrval)
> [7.48] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
> [7.477836] [drm] Driver supports precise vblank timestamp query.
> [7.477896] radeon 0000:1d:00.0: radeon: MSI limited to 32-bit
> [7.477990] radeon 0000:1d:00.0: radeon: using MSI.
> [7.478062] [drm] radeon: irq initialized.
> [7.509056] [drm] ring test on 0 succeeded in 0 usecs
> [7.683793] [drm] ring test on 5 succeeded in 1 usecs
> [7.683853] [drm] UVD initialized successfully.
> [7.684009] [drm] ib test on ring 0 succeeded in 0 usecs
> [8.348466] [drm] ib test on ring 5 succeeded
> [8.348921] [drm] Radeon Display Connectors
> [8.348978] [drm] Connector 0:
> [8.349031] [drm]   DVI-I-1
> [8.349082] [drm]   HPD1
> [8.349135] [drm]   DDC: 0x7e50 0x7e50 0x7e54 0x7e54 0x7e58 0x7e58 0x7e5c 0x7e5c
> [8.349200] [drm]   Encoders:
> [8.349252] [drm]     DFP1: INTERNAL_UNIPHY
> [8.349308] [drm]     CRT2: INTERNAL_KLDSCP_DAC2
> [8.349364] [drm] Connector 1:
> [8.349416] [drm]   DIN-1
> [8.349467] [drm]   Encoders:
> [8.349520] [drm]     TV1: INTERNAL_KLDSCP_DAC2
> [8.349576] [drm] Connector 2:
> [8.349628] [drm]   DVI-I-2
> [8.349680] [drm]   HPD2
> [8.349732] [drm]   DDC: 0x7e40 0x7e40 0x7e44 0x7e44 0x7e48 0x7e48 0x7e4c 0x7e4c
> [8.349797] [drm]   Encoders:
> [8.349849] [drm]     CRT1: INTERNAL_KLDSCP_DAC1
> [8.349905] [drm]     DFP2: INTERNAL_KLDSCP_LVTMA
> [8.430521] [drm] fb mappable at 0xE0243000
> [8.430575] [drm] vram apper at 0xE000
> [8.431194] [drm] size 9216000
> [8.431245] [drm] fb depth is 24
> [8.431295] [drm]    pitch is 7680
> [8.431406] fbcon: radeondrmfb (fb0) is primary device
> [8.496928] Console: switching to colour frame buffer device 240x75
> [8.501851] radeon 0000:1d:00.0: fb0: radeondrmfb frame buffer device
> [8.520179] [drm] Initialized radeon 2.50.0 20080528 for 0000:1d:00.0 on minor 0
>
> in the PCIe slot with two monitors connected to it. radeon firmware is
>
> Version: 20170823-1
>
> What practically happened is X froze and got restarted after the GPU
> reset. It seems to be ok now, as I'm typing in it.
>
> Thoughts?
>
> [197439.022249] Restarting tasks ... done.
> [197439.024043] PM: hibernation exit
> [197439.058296] r8169 0000:18:00.0 eth0: link up
> [200941.240184] perf: interrupt took too long (2507 > 2500), lowering kernel.perf_event_max_sample_rate to 79750
> [221973.686894] radeon 0000:1d:00.0: ring 0 stalled for more than 10176msec
> [221973.686900] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000010da43f last fence id 0x00000000010da52d on ring 0)
> [221973.686929] radeon 0000:1d:00.0: failed to get a new IB (-35)
> [221973.686950] [drm:radeon_cs_ioctl [radeon]] *ERROR* Failed to get ib !
> [221973.693971] radeon 0000:1d:00.0: Saved 7609 dwords of commands on ring 0.
> [221973.693985] radeon 0000:1d:00.0: GPU softreset: 0x0008
> [221973.693988] radeon 0000:1d:00.0: R_008010_GRBM_STATUS = 0xA0001030
> [221973.693990] radeon 0000:1d:00.0: R_008014_GRBM_STATUS2 = 0x0003
> [2
radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000010da43f last fence id 0x00000000010da52d on ring 0)
Hi guys,

X just froze here on top of 4.17-rc7+ tip/master (kernel is from last
week) with the splat at the end.

Box is an X470 chipset with a Ryzen 2700X.

GPU gets detected as

[7.440971] [drm] radeon kernel modesetting enabled.
[7.441220] [drm] initializing kernel modesetting (RV635 0x1002:0x9598 0x1043:0x01DA 0x00).
[7.441328] ATOM BIOS: 9598.10.88.0.3.AS05
[7.441395] radeon 0000:1d:00.0: VRAM: 512M 0x - 0x1FFF (512M used)
[7.441464] radeon 0000:1d:00.0: GTT: 512M 0x2000 - 0x3FFF
[7.441531] [drm] Detected VRAM RAM=512M, BAR=256M
[7.441588] [drm] RAM width 128bits DDR
[7.441690] [TTM] Zone kernel: Available graphics memory: 16462214 kiB
[7.441751] [TTM] Zone dma32: Available graphics memory: 2097152 kiB
[7.441811] [TTM] Initializing pool allocator
[7.441868] [TTM] Initializing DMA pool allocator
[7.441934] [drm] radeon: 512M of VRAM memory ready
[7.441990] [drm] radeon: 512M of GTT memory ready.
[7.442050] [drm] Loading RV635 Microcode
[7.442865] [drm] Internal thermal controller without fan control
[7.442940] [drm] radeon: power management initialized
[7.443222] [drm] GART: num cpu pages 131072, num gpu pages 131072
[7.443487] [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
[7.477319] [drm] PCIE GART of 512M enabled (table at 0x00142000).
[7.477400] radeon 0000:1d:00.0: WB enabled
[7.477455] radeon 0000:1d:00.0: fence driver on ring 0 use gpu addr 0x2c00 and cpu addr 0x(ptrval)
[7.477708] radeon 0000:1d:00.0: fence driver on ring 5 use gpu addr 0x000521d0 and cpu addr 0x(ptrval)
[7.48] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[7.477836] [drm] Driver supports precise vblank timestamp query.
[7.477896] radeon 0000:1d:00.0: radeon: MSI limited to 32-bit
[7.477990] radeon 0000:1d:00.0: radeon: using MSI.
[7.478062] [drm] radeon: irq initialized.
[7.509056] [drm] ring test on 0 succeeded in 0 usecs
[7.683793] [drm] ring test on 5 succeeded in 1 usecs
[7.683853] [drm] UVD initialized successfully.
[7.684009] [drm] ib test on ring 0 succeeded in 0 usecs
[8.348466] [drm] ib test on ring 5 succeeded
[8.348921] [drm] Radeon Display Connectors
[8.348978] [drm] Connector 0:
[8.349031] [drm]   DVI-I-1
[8.349082] [drm]   HPD1
[8.349135] [drm]   DDC: 0x7e50 0x7e50 0x7e54 0x7e54 0x7e58 0x7e58 0x7e5c 0x7e5c
[8.349200] [drm]   Encoders:
[8.349252] [drm]     DFP1: INTERNAL_UNIPHY
[8.349308] [drm]     CRT2: INTERNAL_KLDSCP_DAC2
[8.349364] [drm] Connector 1:
[8.349416] [drm]   DIN-1
[8.349467] [drm]   Encoders:
[8.349520] [drm]     TV1: INTERNAL_KLDSCP_DAC2
[8.349576] [drm] Connector 2:
[8.349628] [drm]   DVI-I-2
[8.349680] [drm]   HPD2
[8.349732] [drm]   DDC: 0x7e40 0x7e40 0x7e44 0x7e44 0x7e48 0x7e48 0x7e4c 0x7e4c
[8.349797] [drm]   Encoders:
[8.349849] [drm]     CRT1: INTERNAL_KLDSCP_DAC1
[8.349905] [drm]     DFP2: INTERNAL_KLDSCP_LVTMA
[8.430521] [drm] fb mappable at 0xE0243000
[8.430575] [drm] vram apper at 0xE000
[8.431194] [drm] size 9216000
[8.431245] [drm] fb depth is 24
[8.431295] [drm]    pitch is 7680
[8.431406] fbcon: radeondrmfb (fb0) is primary device
[8.496928] Console: switching to colour frame buffer device 240x75
[8.501851] radeon 0000:1d:00.0: fb0: radeondrmfb frame buffer device
[8.520179] [drm] Initialized radeon 2.50.0 20080528 for 0000:1d:00.0 on minor 0

... in the PCIe slot with two monitors connected to it. radeon firmware is

Version: 20170823-1

What practically happened is X froze and got restarted after the GPU
reset. It seems to be ok now, as I'm typing in it.

Thoughts?

[197439.022249] Restarting tasks ... done.
[197439.024043] PM: hibernation exit
[197439.058296] r8169 0000:18:00.0 eth0: link up
[200941.240184] perf: interrupt took too long (2507 > 2500), lowering kernel.perf_event_max_sample_rate to 79750
[221973.686894] radeon 0000:1d:00.0: ring 0 stalled for more than 10176msec
[221973.686900] radeon 0000:1d:00.0: GPU lockup (current fence id 0x00000000010da43f last fence id 0x00000000010da52d on ring 0)
[221973.686929] radeon 0000:1d:00.0: failed to get a new IB (-35)
[221973.686950] [drm:radeon_cs_ioctl [radeon]] *ERROR* Failed to get ib !
[221973.693971] radeon 0000:1d:00.0: Saved 7609 dwords of commands on ring 0.
[221973.693985] radeon 0000:1d:00.0: GPU softreset: 0x0008
[221973.693988] radeon 0000:1d:00.0: R_008010_GRBM_STATUS = 0xA0001030
[221973.693990] radeon 0000:1d:00.0: R_008014_GRBM_STATUS2 = 0x0003
[221973.693992] radeon 0000:1d:00.0: R_000E50_SRBM_STATUS = 0x200010C0
[221973.693994] radeon 0000:1d:00.0: R_008674_CP_STALLED_STAT1 = 0x
[221973.693996] radeon 0000:1d:00.0: R_008678_CP_STALLED_STAT2 = 0x
[221973.693998] radeon 0000:1d:00.0:
Re: [PATCH 1/2] drm/scheduler: Rename cleanup functions.
On 2018-06-05 07:02 PM, Andrey Grodzovsky wrote:
> Everything in the flush code path (i.e. waiting for the SW queue
> to become empty) is named *_flush(), and everything in the release
> code path is named *_fini().
>
> This patch also affects the amdgpu and etnaviv drivers, which
> use those functions.
>
> Signed-off-by: Andrey Grodzovsky
> Suggested-by: Christian König
[...]
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
> index 23e73c2..3dff4d0 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
> @@ -140,7 +140,7 @@ static void etnaviv_postclose(struct drm_device *dev, struct drm_file *file)
>  		gpu->lastctx = NULL;
>  		mutex_unlock(&gpu->lock);
>
> -		drm_sched_entity_fini(&gpu->sched,
> +		drm_sched_entity_destroy(&gpu->sched,
>  				      &ctx->sched_entity[i]);
>  	}
>  }

The drm-next tree for 4.18 has a new v3d driver, which also uses
drm_sched_entity_fini. Please either make sure to merge this change via
a tree which has the v3d driver, and fix it up as well, or don't do the
fini => destroy rename.

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer