Re: [PATCH] radeon: Use only a single work queue thread for crt
On Sun, Sep 6, 2020 at 4:54 AM Christian König < ckoenig.leichtzumer...@gmail.com> wrote: > Am 03.08.19 um 02:09 schrieb Andi Kleen: > > From: Andi Kleen > > > > I got tired of seeing a lot of radeon-crt kernel threads in ps on my > > workstation, one for each CPU and one for each display, which never use > any CPU time. > > Surely a single kernel thread is enough to handle the display. > > NAK, radeon blocks inside the kernel thread and those need to run in > parallel or otherwise the hardware can hang. > Shouldn't WQ_UNBOUND create a new worker thread whenever all current workers go to sleep/block (and the total number of worker threads is below 'max_active') ? Jan > > Christian. > > > > > Signed-off-by: Andi Kleen > > --- > > drivers/gpu/drm/radeon/radeon_display.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/radeon/radeon_display.c > b/drivers/gpu/drm/radeon/radeon_display.c > > index bd52f15e6330..fb0ca688f6fe 100644 > > --- a/drivers/gpu/drm/radeon/radeon_display.c > > +++ b/drivers/gpu/drm/radeon/radeon_display.c > > @@ -682,7 +682,7 @@ static void radeon_crtc_init(struct drm_device *dev, > int index) > > > > drm_mode_crtc_set_gamma_size(&radeon_crtc->base, 256); > > radeon_crtc->crtc_id = index; > > - radeon_crtc->flip_queue = alloc_workqueue("radeon-crtc", > WQ_HIGHPRI, 0); > > + radeon_crtc->flip_queue = alloc_workqueue("radeon-crtc", > WQ_HIGHPRI|WQ_UNBOUND, 0); > > rdev->mode_info.crtcs[index] = radeon_crtc; > > > > if (rdev->family >= CHIP_BONAIRE) { > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx > ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: Slow memory access when using OpenCL without X11
On Sat, Mar 9, 2019 at 1:54 AM Lauri Ehrenpreis wrote: > > Even if it's using CPU for OCL (I know it's not doing this), why does memcpy > on CPU slow down permanently, if I'm not doing anything with OpenCL after > clCreateContext? > > As you see from test program it just does clCreateContext and then a loop of > memcpy-s on CPU. > > Also I found out that writing different values to > /sys/class/drm/card0/device/power_dpm_force_performance_level changes my max > memcpy speed on CPU: > > echo "low" > /sys/class/drm/card0/device/power_dpm_force_performance_level > ./cl_slow_test 1 5 > got 1 platforms 1 devices > speed 731.810425 avg 731.810425 mbytes/s > speed 163.425583 avg 447.618011 mbytes/s > speed 123.441612 avg 339.559235 mbytes/s > speed 121.655266 avg 285.083252 mbytes/s > speed 123.806801 avg 252.827972 mbytes/s > > echo "high" > /sys/class/drm/card0/device/power_dpm_force_performance_level > ./cl_slow_test 1 5 > got 1 platforms 1 devices > speed 3742.063721 avg 3742.063721 mbytes/s > speed 836.148987 avg 2289.106445 mbytes/s > speed 189.379166 avg 1589.197266 mbytes/s > speed 189.271393 avg 1239.215820 mbytes/s > speed 188.290451 avg 1029.030762 mbytes/s > > echo "profile_standard" > > /sys/class/drm/card0/device/power_dpm_force_performance_level > ./cl_slow_test 1 5 > got 1 platforms 1 devices > speed 2303.955566 avg 2303.955566 mbytes/s > speed 2298.224121 avg 2301.089844 mbytes/s > speed 2295.585205 avg 2299.254883 mbytes/s > speed 2295.762939 avg 2298.381836 mbytes/s > speed 2288.766602 avg 2296.458740 mbytes/s > > echo "profile_peak" > > /sys/class/drm/card0/device/power_dpm_force_performance_level > ./cl_slow_test 1 5 > got 1 platforms 1 devices > speed 3710.360352 avg 3710.360352 mbytes/s > speed 3713.660400 avg 3712.010254 mbytes/s > speed 3797.630859 avg 3740.550537 mbytes/s > speed 3708.004883 avg 3732.414062 mbytes/s > speed 3796.403076 avg 3745.211914 mbytes/s > > However none of those is close to the memcpy speed I get when I don't do > clCreateContext (my test prog has first arg 0): > ./cl_slow_test 0 5 > speed 7299.201660 avg 7299.201660 mbytes/s > speed 9298.841797 avg 8299.021484 mbytes/s > speed 9360.181641 avg 8652.742188 mbytes/s > speed 9004.759766 avg 8740.746094 mbytes/s > speed 9414.607422 avg 8875.518555 mbytes/s > > Also attached clinfo.txt. It shows that opencl is using GPU so device node > permissions are probably not the issue. Is it only memory accesses or does overall CPU performance degrade (including compute - say sysbench) as well? Jan > -- > Lauri > > On Fri, Mar 8, 2019 at 10:35 PM Alex Deucher wrote: >> >> I think you are probably using the CPU for OCL in the remote login >> case. When you log into the desktop, the permissions on the device >> nodes get changed dynamically to support accelerated rendering. You >> probably need to change the permissions on the device nodes manually >> if you are not logging into the desktop. >> >> Alex >> >> On Fri, Mar 8, 2019 at 2:43 PM Lauri Ehrenpreis wrote: >> > >> > Hi! >> > >> > I am using Ryzen 2400G with Gigabyte AMD B450 AORUS board. I have latest >> > bios, ubuntu 18.04 and latest mainline kernel (5.0.0-05-generic) >> > installed. Also I have rocm-dev 2.1.96 but no rock-dkms installed. >> > >> > I found that when I log in over ssh and try to use OpenCL (doing >> > clCreateContext is enough) then cpu memory accesses after that will slow >> > down by 100x. >> > If I connect HDMI cable and log in to desktop mode then this does not >> > happen. Also if I don't call clCreateContext then everything works >> > properly. >> > >> > Attached the test program and kernel log also. Test works like that : >> > g++ cl_slow_test.cpp -o cl_slow_test -I /opt/rocm/opencl/include/ -L >> > /opt/rocm/opencl/lib/x86_64/ -lOpenCL >> > lauri@rv:~$ ./cl_slow_test 0 5 >> > speed 7003.145508 avg 7003.145508 mbytes/s >> > speed 8427.357422 avg 7715.251465 mbytes/s >> > speed 9203.049805 avg 8211.184570 mbytes/s >> > speed 9845.956055 avg 8619.877930 mbytes/s >> > speed 9882.748047 avg 8872.452148 mbytes/s >> > lauri@rv:~$ ./cl_slow_test 1 5 >> > got 1 platforms 1 devices >> > speed 1599.803589 avg 1599.803589 mbytes/s >> > speed 1665.426392 avg 1632.614990 mbytes/s >> > speed 146.137253 avg 1137.122437 mbytes/s >> > speed 121.056877 avg 883.106018 mbytes/s >> > speed 122.428970 avg 730.970581 mbytes/s >> > >> > I also tried latest amd-staging kernel >> > https://github.com/M-Bab/linux-kernel-amdgpu-binaries and it had the same >> > issue. >> > >> > Can anyone point me into right direction? >> > >> > Br, >> > Lauri >> > ___ >> > amd-gfx mailing list >> > amd-gfx@lists.freedesktop.org >> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ amd-gfx mailing list amd
Re: [PATCH] drm/amdgpu: Partially revert commit 2dc80b006
thank you. Every effort to make the patches easier to follow is appreciated. Jan On Wed, Jun 13, 2018 at 10:39 AM, Zhu, Rex wrote: > Hi Jan, > > > Thanks for your suggestion. Obvious the patch title was inappropriate. I > will update it. > > In fact, I originally plan to send this patch internally for discussion. > So just thought that "Revert x" can cause the attention. > > Best Regards > Rex > > > > > -- > *From:* jv...@scarletmail.rutgers.edu on > behalf of Jan Vesely > *Sent:* Wednesday, June 13, 2018 9:30 PM > *To:* Koenig, Christian > *Cc:* Zhu, Rex; amd-gfx list > *Subject:* Re: [PATCH] drm/amdgpu: Partially revert commit 2dc80b006 > > Hi, > > can you please improve the commit message? > seeing "Revert $HASH" conveys zero information about the code change. > I'm sorry for bringing this up again, but following AMDGPU/Radeon driver > development is an exercise in frustration for anyone who is not on AMD's > payroll. > git commit logs like: > "revert XYZ" or "fix bug #123" make it really cumbersome to actually look > at history and pick interesting/breaking commits. > > thanks, > Jan > > On Wed, Jun 13, 2018 at 8:46 AM, Christian König < > ckoenig.leichtzumer...@gmail.com> wrote: > > Am 13.06.2018 um 13:40 schrieb Rex Zhu: > > Move the CG enablement out of delay worker thread. > > 1. CG/PG enablement are part of gpu hw ip initialize, we should > wait for them complete. otherwise, there are some potential conflicts, > for example, Suspend and CG enablement concurrently. > 2. better run ib test after hw initialize completely. That is to say, > ib test should be after CG/PG enablement. otherwise, the test will > not cover the cg/pg/poweroff enable case. > > Signed-off-by: Rex Zhu > > > Yeah, that thought came to my mind as well. > > Essentially the IB test should simulate a submission from userspace to > make sure that the stack is working as expected. I think it was just moved > before CG/PG to avoid issues with that, which is actually not very clever. > > Patch is Reviewed-by: Christian König , but > there could be some fallout we could need to deal with. > > Thanks, > Christian. > > > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 14 -- > 1 file changed, 8 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index 9647f54..90b78c7 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > @@ -1709,10 +1709,6 @@ static int amdgpu_device_ip_late_set_cg_state(struct > amdgpu_device *adev) > if (amdgpu_emu_mode == 1) > return 0; > - r = amdgpu_ib_ring_tests(adev); > - if (r) > - DRM_ERROR("ib ring test failed (%d).\n", r); > - > for (i = 0; i < adev->num_ip_blocks; i++) { > if (!adev->ip_blocks[i].status.valid) > continue; > @@ -1793,6 +1789,9 @@ static int amdgpu_device_ip_late_init(struct > amdgpu_device *adev) > } > } > + amdgpu_device_ip_late_set_cg_state(adev); > + amdgpu_device_ip_late_set_pg_state(adev); > + > queue_delayed_work(system_wq, &adev->late_init_work, >msecs_to_jiffies(AMDGPU_RESUME_MS)); > @@ -1921,8 +1920,11 @@ static void > amdgpu_device_ip_late_init_func_handler(struct > work_struct *work) > { > struct amdgpu_device *adev = > container_of(work, struct amdgpu_device, > late_init_work.work); > - amdgpu_device_ip_late_set_cg_state(adev); > - amdgpu_device_ip_late_set_pg_state(adev); > + int r; > + > + r = amdgpu_ib_ring_tests(adev); > + if (r) > + DRM_ERROR("ib ring test failed (%d).\n", r); > } > /** > > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx > > > ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amdgpu: Partially revert commit 2dc80b006
Hi, can you please improve the commit message? seeing "Revert $HASH" conveys zero information about the code change. I'm sorry for bringing this up again, but following AMDGPU/Radeon driver development is an exercise in frustration for anyone who is not on AMD's payroll. git commit logs like: "revert XYZ" or "fix bug #123" make it really cumbersome to actually look at history and pick interesting/breaking commits. thanks, Jan On Wed, Jun 13, 2018 at 8:46 AM, Christian König < ckoenig.leichtzumer...@gmail.com> wrote: > Am 13.06.2018 um 13:40 schrieb Rex Zhu: > >> Move the CG enablement out of delay worker thread. >> >> 1. CG/PG enablement are part of gpu hw ip initialize, we should >> wait for them complete. otherwise, there are some potential conflicts, >> for example, Suspend and CG enablement concurrently. >> 2. better run ib test after hw initialize completely. That is to say, >> ib test should be after CG/PG enablement. otherwise, the test will >> not cover the cg/pg/poweroff enable case. >> >> Signed-off-by: Rex Zhu >> > > Yeah, that thought came to my mind as well. > > Essentially the IB test should simulate a submission from userspace to > make sure that the stack is working as expected. I think it was just moved > before CG/PG to avoid issues with that, which is actually not very clever. > > Patch is Reviewed-by: Christian König , but > there could be some fallout we could need to deal with. > > Thanks, > Christian. > > > --- >> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 14 -- >> 1 file changed, 8 insertions(+), 6 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> index 9647f54..90b78c7 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> @@ -1709,10 +1709,6 @@ static int amdgpu_device_ip_late_set_cg_state(struct >> amdgpu_device *adev) >> if (amdgpu_emu_mode == 1) >> return 0; >> - r = amdgpu_ib_ring_tests(adev); >> - if (r) >> - DRM_ERROR("ib ring test failed (%d).\n", r); >> - >> for (i = 0; i < adev->num_ip_blocks; i++) { >> if (!adev->ip_blocks[i].status.valid) >> continue; >> @@ -1793,6 +1789,9 @@ static int amdgpu_device_ip_late_init(struct >> amdgpu_device *adev) >> } >> } >> + amdgpu_device_ip_late_set_cg_state(adev); >> + amdgpu_device_ip_late_set_pg_state(adev); >> + >> queue_delayed_work(system_wq, &adev->late_init_work, >>msecs_to_jiffies(AMDGPU_RESUME_MS)); >> @@ -1921,8 +1920,11 @@ static void >> amdgpu_device_ip_late_init_func_handler(struct >> work_struct *work) >> { >> struct amdgpu_device *adev = >> container_of(work, struct amdgpu_device, >> late_init_work.work); >> - amdgpu_device_ip_late_set_cg_state(adev); >> - amdgpu_device_ip_late_set_pg_state(adev); >> + int r; >> + >> + r = amdgpu_ib_ring_tests(adev); >> + if (r) >> + DRM_ERROR("ib ring test failed (%d).\n", r); >> } >> /** >> > > ___ > amd-gfx mailing list > amd-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/amd-gfx > ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH libdrm 3/3] amdgpu: Destroy fd_hash table when the last device is removed.
On Fri, 2018-05-18 at 13:00 -0400, Jan Vesely wrote: > Fixes memory leak on module unload. > Analogous to mesa commit of the same name. > Signed-off-by: Jan Vesely > --- > amdgpu/amdgpu_device.c | 4 > 1 file changed, 4 insertions(+) > > diff --git a/amdgpu/amdgpu_device.c b/amdgpu/amdgpu_device.c > index e23dd3b3..34ac95b8 100644 > --- a/amdgpu/amdgpu_device.c > +++ b/amdgpu/amdgpu_device.c > @@ -128,6 +128,10 @@ static void > amdgpu_device_free_internal(amdgpu_device_handle dev) > { > pthread_mutex_lock(&fd_mutex); > util_hash_table_remove(fd_tab, UINT_TO_PTR(dev->fd)); > + if (util_hash_table_count(fd_tab) == 0) { > + util_hash_table_destroy(fd_tab); > + fd_tab = NULL; > + } > close(dev->fd); > if ((dev->flink_fd >= 0) && (dev->fd != dev->flink_fd)) > close(dev->flink_fd); gentle ping. Jan signature.asc Description: This is a digitally signed message part ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH libdrm 3/3] amdgpu: Destroy fd_hash table when the last device is removed.
Fixes memory leak on module unload. Analogous to mesa commit of the same name. Signed-off-by: Jan Vesely --- amdgpu/amdgpu_device.c | 4 1 file changed, 4 insertions(+) diff --git a/amdgpu/amdgpu_device.c b/amdgpu/amdgpu_device.c index e23dd3b3..34ac95b8 100644 --- a/amdgpu/amdgpu_device.c +++ b/amdgpu/amdgpu_device.c @@ -128,6 +128,10 @@ static void amdgpu_device_free_internal(amdgpu_device_handle dev) { pthread_mutex_lock(&fd_mutex); util_hash_table_remove(fd_tab, UINT_TO_PTR(dev->fd)); + if (util_hash_table_count(fd_tab) == 0) { + util_hash_table_destroy(fd_tab); + fd_tab = NULL; + } close(dev->fd); if ((dev->flink_fd >= 0) && (dev->fd != dev->flink_fd)) close(dev->flink_fd); -- 2.17.0 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH libdrm v2 1/3] amdgpu: Take lock before removing devices from fd_tab hash table.
Close the file descriptors under lock as well. v2: close fds after removing from hash table Signed-off-by: Jan Vesely --- amdgpu/amdgpu_device.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/amdgpu/amdgpu_device.c b/amdgpu/amdgpu_device.c index 983b19ab..e23dd3b3 100644 --- a/amdgpu/amdgpu_device.c +++ b/amdgpu/amdgpu_device.c @@ -126,6 +126,13 @@ static int amdgpu_get_auth(int fd, int *auth) static void amdgpu_device_free_internal(amdgpu_device_handle dev) { + pthread_mutex_lock(&fd_mutex); + util_hash_table_remove(fd_tab, UINT_TO_PTR(dev->fd)); + close(dev->fd); + if ((dev->flink_fd >= 0) && (dev->fd != dev->flink_fd)) + close(dev->flink_fd); + pthread_mutex_unlock(&fd_mutex); + amdgpu_vamgr_deinit(&dev->vamgr_32); amdgpu_vamgr_deinit(&dev->vamgr); amdgpu_vamgr_deinit(&dev->vamgr_high_32); @@ -133,10 +140,6 @@ static void amdgpu_device_free_internal(amdgpu_device_handle dev) util_hash_table_destroy(dev->bo_flink_names); util_hash_table_destroy(dev->bo_handles); pthread_mutex_destroy(&dev->bo_table_mutex); - util_hash_table_remove(fd_tab, UINT_TO_PTR(dev->fd)); - close(dev->fd); - if ((dev->flink_fd >= 0) && (dev->fd != dev->flink_fd)) - close(dev->flink_fd); free(dev->marketing_name); free(dev); } -- 2.17.0 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH libdrm 2/3] amdgpu/util_hash_table: Add helper function to count the number of entries in hash table
Analogous to the mesa commit of the same name. Signed-off-by: Jan Vesely --- amdgpu/util_hash_table.c | 12 amdgpu/util_hash_table.h | 2 ++ 2 files changed, 14 insertions(+) diff --git a/amdgpu/util_hash_table.c b/amdgpu/util_hash_table.c index 89a8bf9b..e06d4415 100644 --- a/amdgpu/util_hash_table.c +++ b/amdgpu/util_hash_table.c @@ -237,6 +237,18 @@ drm_private void util_hash_table_foreach(struct util_hash_table *ht, } } +static void util_hash_table_inc(void *k, void *v, void *d) +{ + ++*(size_t *)d; +} + +drm_private size_t util_hash_table_count(struct util_hash_table *ht) +{ + size_t count = 0; + util_hash_table_foreach(ht, util_hash_table_inc, &count); + return count; +} + drm_private void util_hash_table_destroy(struct util_hash_table *ht) { struct util_hash_iter iter; diff --git a/amdgpu/util_hash_table.h b/amdgpu/util_hash_table.h index 5e295a81..3ab81a12 100644 --- a/amdgpu/util_hash_table.h +++ b/amdgpu/util_hash_table.h @@ -64,6 +64,8 @@ drm_private void util_hash_table_foreach(struct util_hash_table *ht, void (*callback)(void *key, void *value, void *data), void *data); +drm_private size_t util_hash_table_count(struct util_hash_table *ht); + drm_private void util_hash_table_destroy(struct util_hash_table *ht); #endif /* U_HASH_TABLE_H_ */ -- 2.17.0 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH libdrm 1/1] amdgpu: Take lock before removing devices from fd_tab hash table.
On Thu, 2018-05-10 at 19:33 -0400, Jan Vesely wrote: > Close the file descriptors under lock as well. > > Signed-off-by: Jan Vesely > --- > amdgpu/amdgpu_device.c | 11 +++ > 1 file changed, 7 insertions(+), 4 deletions(-) > > diff --git a/amdgpu/amdgpu_device.c b/amdgpu/amdgpu_device.c > index 983b19ab..f71fc050 100644 > --- a/amdgpu/amdgpu_device.c > +++ b/amdgpu/amdgpu_device.c > @@ -126,6 +126,13 @@ static int amdgpu_get_auth(int fd, int *auth) > > static void amdgpu_device_free_internal(amdgpu_device_handle dev) > { > + pthread_mutex_lock(&fd_mutex); > + close(dev->fd); > + if ((dev->flink_fd >= 0) && (dev->fd != dev->flink_fd)) > + close(dev->flink_fd); > + util_hash_table_remove(fd_tab, UINT_TO_PTR(dev->fd)); > + pthread_mutex_unlock(&fd_mutex); > + > amdgpu_vamgr_deinit(&dev->vamgr_32); > amdgpu_vamgr_deinit(&dev->vamgr); > amdgpu_vamgr_deinit(&dev->vamgr_high_32); > @@ -133,10 +140,6 @@ static void > amdgpu_device_free_internal(amdgpu_device_handle dev) > util_hash_table_destroy(dev->bo_flink_names); > util_hash_table_destroy(dev->bo_handles); > pthread_mutex_destroy(&dev->bo_table_mutex); > - util_hash_table_remove(fd_tab, UINT_TO_PTR(dev->fd)); > - close(dev->fd); > - if ((dev->flink_fd >= 0) && (dev->fd != dev->flink_fd)) > - close(dev->flink_fd); > free(dev->marketing_name); > free(dev); > } ping. Jan signature.asc Description: This is a digitally signed message part ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH libdrm 1/1] amdgpu: Take lock before removing devices from fd_tab hash table.
Close the file descriptors under lock as well. Signed-off-by: Jan Vesely --- amdgpu/amdgpu_device.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/amdgpu/amdgpu_device.c b/amdgpu/amdgpu_device.c index 983b19ab..f71fc050 100644 --- a/amdgpu/amdgpu_device.c +++ b/amdgpu/amdgpu_device.c @@ -126,6 +126,13 @@ static int amdgpu_get_auth(int fd, int *auth) static void amdgpu_device_free_internal(amdgpu_device_handle dev) { + pthread_mutex_lock(&fd_mutex); + close(dev->fd); + if ((dev->flink_fd >= 0) && (dev->fd != dev->flink_fd)) + close(dev->flink_fd); + util_hash_table_remove(fd_tab, UINT_TO_PTR(dev->fd)); + pthread_mutex_unlock(&fd_mutex); + amdgpu_vamgr_deinit(&dev->vamgr_32); amdgpu_vamgr_deinit(&dev->vamgr); amdgpu_vamgr_deinit(&dev->vamgr_high_32); @@ -133,10 +140,6 @@ static void amdgpu_device_free_internal(amdgpu_device_handle dev) util_hash_table_destroy(dev->bo_flink_names); util_hash_table_destroy(dev->bo_handles); pthread_mutex_destroy(&dev->bo_table_mutex); - util_hash_table_remove(fd_tab, UINT_TO_PTR(dev->fd)); - close(dev->fd); - if ((dev->flink_fd >= 0) && (dev->fd != dev->flink_fd)) - close(dev->flink_fd); free(dev->marketing_name); free(dev); } -- 2.17.0 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 1/1] drm/amdkfd: Do not ignore requested queue size during allocation
On Wed, 2017-11-29 at 16:58 -0500, Felix Kuehling wrote: > You can see the state of the queues in debugfs: > /sys/kernel/debug/kfd/... You can look at MQDs and HQDs. thanks. how do I decode the information? The rptr always stops at pos 60 which looks like this in mqds: DIQ on device 45a2 : c0310800 4000 0020: 0001 0040: 0060: If I understood correctly that's the queue dump, so those fs look wrong > > If your application isn't stopping queues deliberately, queues get > disabled by evictions, usually temporarily. You'll see kernel messages > when that happens. > > A VM fault will result in queues of the offending process getting > disabled permanently. Again, you'll see messages about that in the > kernel log. > > The RPTR can also stop advancing if you have an infinite loop in a > shader program, or just a shader that takes a very long time to execute. > Or maybe if you have some dependencies (barriers) in your AQL packets > that never get satisfied. > > The function you changed only affects the HIQ, the queue that KFD uses > to control the HWS. It does not affect user mode queues. If your problem > is with a user mode queue, your change should have no effect at all. It's not a userspace queue that stops. I'm using kernel dbgdev to issue wave_resume commands. (waves are halted after executing s_sendmsg_halt). I bumped KFD_KERNEL_QUEUE_SIZE to 16KB to make sure all 320 resume commads fit (otherwise I get spurious ENOMEM when the queue is full but still advancing). thanks, Jan > > Regards, > Felix > > > On 2017-11-29 04:43 PM, Jan Vesely wrote: > > On Mon, 2017-11-20 at 14:22 -0500, Felix Kuehling wrote: > > > I think this patch is not correct. The EOP-mem is not associated with > > > the queue size. The EOP buffer is a separate buffer used by the firmware > > > to handle command completion. As I understand it, this allows more > > > concurrency, while still making it look like all commands in the queue > > > are completing in order. > > > > thanks for the explanation. I was looking for a source of a CP hang > > (rptr stops advancing), but bumping the eop size actually mode things > > worse. Is there a way to find out if a queue got disabled and for what > > reason? (I'm running ROCK-1.6.x based kernel) > > > > thanks, > > Jan > > > > > Regards, > > > Felix > > > > > > > > > On 2017-11-19 03:19 AM, Oded Gabbay wrote: > > > > On Thu, Nov 16, 2017 at 11:36 PM, Jan Vesely > > > > wrote: > > > > > Signed-off-by: Jan Vesely > > > > > --- > > > > > drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c | 5 +++-- > > > > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > > > > > > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c > > > > > b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c > > > > > index f1d48281e322..b3bee39661ab 100644 > > > > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c > > > > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c > > > > > @@ -37,15 +37,16 @@ static bool initialize_vi(struct kernel_queue > > > > > *kq, struct kfd_dev *dev, > > > > > enum kfd_queue_type type, unsigned int > > > > > queue_size) > > > > > { > > > > > int retval; > > > > > + unsigned int size = ALIGN(queue_size, PAGE_SIZE); > > > > > > > > > > - retval = kfd_gtt_sa_allocate(dev, PAGE_SIZE, &kq->eop_mem); > > > > > + retval = kfd_gtt_sa_allocate(dev, size, &kq->eop_mem); > > > > > if (retval != 0) > > > > > return false; > > > > > > > > > > kq->eop_gpu_addr = kq->eop_mem->gpu_addr; > > > > > kq->eop_kernel_addr = kq->eop_mem->cpu_ptr; > > > > > > > > > > - memset(kq->eop_kernel_addr, 0, PAGE_SIZE); > > > > > + memset(kq->eop_kernel_addr, 0, size); > > > > > > > > > > return true; > > > > > } > > > > > -- > > > > > 2.13.6 > > > > > > > > > > ___ > > > > > amd-gfx mailing list > > > > > amd-gfx@lists.freedesktop.org > > > > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx > > > > > > > > Thanks! > > > > Applied to -next tree > > > > Oded > > > > ___ > > > > amd-gfx mailing list > > > > amd-gfx@lists.freedesktop.org > > > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx > > signature.asc Description: This is a digitally signed message part ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 1/1] drm/amdkfd: Do not ignore requested queue size during allocation
On Mon, 2017-11-20 at 14:22 -0500, Felix Kuehling wrote: > I think this patch is not correct. The EOP-mem is not associated with > the queue size. The EOP buffer is a separate buffer used by the firmware > to handle command completion. As I understand it, this allows more > concurrency, while still making it look like all commands in the queue > are completing in order. thanks for the explanation. I was looking for a source of a CP hang (rptr stops advancing), but bumping the eop size actually mode things worse. Is there a way to find out if a queue got disabled and for what reason? (I'm running ROCK-1.6.x based kernel) thanks, Jan > > Regards, > Felix > > > On 2017-11-19 03:19 AM, Oded Gabbay wrote: > > On Thu, Nov 16, 2017 at 11:36 PM, Jan Vesely wrote: > > > Signed-off-by: Jan Vesely > > > --- > > > drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c | 5 +++-- > > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c > > > b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c > > > index f1d48281e322..b3bee39661ab 100644 > > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c > > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c > > > @@ -37,15 +37,16 @@ static bool initialize_vi(struct kernel_queue *kq, > > > struct kfd_dev *dev, > > > enum kfd_queue_type type, unsigned int queue_size) > > > { > > > int retval; > > > + unsigned int size = ALIGN(queue_size, PAGE_SIZE); > > > > > > - retval = kfd_gtt_sa_allocate(dev, PAGE_SIZE, &kq->eop_mem); > > > + retval = kfd_gtt_sa_allocate(dev, size, &kq->eop_mem); > > > if (retval != 0) > > > return false; > > > > > > kq->eop_gpu_addr = kq->eop_mem->gpu_addr; > > > kq->eop_kernel_addr = kq->eop_mem->cpu_ptr; > > > > > > - memset(kq->eop_kernel_addr, 0, PAGE_SIZE); > > > + memset(kq->eop_kernel_addr, 0, size); > > > > > > return true; > > > } > > > -- > > > 2.13.6 > > > > > > ___ > > > amd-gfx mailing list > > > amd-gfx@lists.freedesktop.org > > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx > > > > Thanks! > > Applied to -next tree > > Oded > > ___ > > amd-gfx mailing list > > amd-gfx@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx > > signature.asc Description: This is a digitally signed message part ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 1/2] drm/amdgpu:fix virtual dce bug
On Fri, 2017-11-17 at 04:26 +, Liu, Monk wrote: > I think it's already clear enough nice. what a friendly response. good job! "fix a bug" is definitely not descriptive of the change, and the commit message does not even parse as a sentence. Jan > > -Original Message- > From: Jan Vesely [mailto:jv...@scarletmail.rutgers.edu] On Behalf Of Jan > Vesely > Sent: 2017年11月17日 0:40 > To: Liu, Monk ; amd-gfx@lists.freedesktop.org > Subject: Re: [PATCH 1/2] drm/amdgpu:fix virtual dce bug > > On Thu, 2017-11-16 at 11:14 +0800, Monk Liu wrote: > > this fix the issue that access memory after freed after driver > > unloaded. > > can you please change the patch subject and commit message to something more > descriptive? > > Jan > > > > > Change-Id: I64e2488c18f5dc044b57c74567785da21fc028da > > Signed-off-by: Monk Liu > > --- > > drivers/gpu/drm/amd/amdgpu/dce_virtual.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/dce_virtual.c > > b/drivers/gpu/drm/amd/amdgpu/dce_virtual.c > > index a8829af..39460eb 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/dce_virtual.c > > +++ b/drivers/gpu/drm/amd/amdgpu/dce_virtual.c > > @@ -437,6 +437,8 @@ static int dce_virtual_sw_fini(void *handle) > > drm_kms_helper_poll_fini(adev->ddev); > > > > drm_mode_config_cleanup(adev->ddev); > > + /* clear crtcs pointer to avoid dce irq finish routine access freed > > data */ > > + memset(adev->mode_info.crtcs, 0, sizeof(adev->mode_info.crtcs[0]) * > > +AMDGPU_MAX_CRTCS); > > adev->mode_info.mode_config_initialized = false; > > return 0; > > } > > @@ -723,7 +725,7 @@ static void > > dce_virtual_set_crtc_vblank_interrupt_state(struct amdgpu_device *ad > > int crtc, > > enum > > amdgpu_interrupt_state state) { > > - if (crtc >= adev->mode_info.num_crtc) { > > + if (crtc >= adev->mode_info.num_crtc || > > +!adev->mode_info.crtcs[crtc]) { > > DRM_DEBUG("invalid crtc %d\n", crtc); > > return; > > } -- Jan Vesely signature.asc Description: This is a digitally signed message part ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH 1/1] drm/amdkfd: Do not ignore requested queue size during allocation
Signed-off-by: Jan Vesely --- drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c index f1d48281e322..b3bee39661ab 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c @@ -37,15 +37,16 @@ static bool initialize_vi(struct kernel_queue *kq, struct kfd_dev *dev, enum kfd_queue_type type, unsigned int queue_size) { int retval; + unsigned int size = ALIGN(queue_size, PAGE_SIZE); - retval = kfd_gtt_sa_allocate(dev, PAGE_SIZE, &kq->eop_mem); + retval = kfd_gtt_sa_allocate(dev, size, &kq->eop_mem); if (retval != 0) return false; kq->eop_gpu_addr = kq->eop_mem->gpu_addr; kq->eop_kernel_addr = kq->eop_mem->cpu_ptr; - memset(kq->eop_kernel_addr, 0, PAGE_SIZE); + memset(kq->eop_kernel_addr, 0, size); return true; } -- 2.13.6 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 1/2] drm/amdgpu:fix virtual dce bug
On Thu, 2017-11-16 at 11:14 +0800, Monk Liu wrote: > this fix the issue that access memory after freed > after driver unloaded. can you please change the patch subject and commit message to something more descriptive? Jan > > Change-Id: I64e2488c18f5dc044b57c74567785da21fc028da > Signed-off-by: Monk Liu > --- > drivers/gpu/drm/amd/amdgpu/dce_virtual.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/dce_virtual.c > b/drivers/gpu/drm/amd/amdgpu/dce_virtual.c > index a8829af..39460eb 100644 > --- a/drivers/gpu/drm/amd/amdgpu/dce_virtual.c > +++ b/drivers/gpu/drm/amd/amdgpu/dce_virtual.c > @@ -437,6 +437,8 @@ static int dce_virtual_sw_fini(void *handle) > drm_kms_helper_poll_fini(adev->ddev); > > drm_mode_config_cleanup(adev->ddev); > + /* clear crtcs pointer to avoid dce irq finish routine access freed > data */ > + memset(adev->mode_info.crtcs, 0, sizeof(adev->mode_info.crtcs[0]) * > AMDGPU_MAX_CRTCS); > adev->mode_info.mode_config_initialized = false; > return 0; > } > @@ -723,7 +725,7 @@ static void > dce_virtual_set_crtc_vblank_interrupt_state(struct amdgpu_device *ad > int crtc, > enum > amdgpu_interrupt_state state) > { > - if (crtc >= adev->mode_info.num_crtc) { > + if (crtc >= adev->mode_info.num_crtc || !adev->mode_info.crtcs[crtc]) { > DRM_DEBUG("invalid crtc %d\n", crtc); > return; > } signature.asc Description: This is a digitally signed message part ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 03/11] drm/amdkfd: Rectify the jiffies calculation error with milliseconds
On Fri, 2017-09-15 at 19:42 -0400, Felix Kuehling wrote: > From: Yong Zhao > > The timeout in milliseconds should not be regarded as jiffies. This > commit fixed that. > > Signed-off-by: Yong Zhao > Signed-off-by: Felix Kuehling > --- > drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 8 +--- > drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 2 +- > 2 files changed, 6 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c > b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c > index 5db82b8..3db6a31 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c > @@ -835,12 +835,14 @@ static int create_queue_cpsch(struct > device_queue_manager *dqm, struct queue *q, > > int amdkfd_fence_wait_timeout(unsigned int *fence_addr, > unsigned int fence_value, > - unsigned long timeout) > + unsigned long timeout_ms) > { > - timeout += jiffies; > + unsigned long end_jiffies; > + > + end_jiffies = (timeout_ms * HZ / 1000) + jiffies; msecs_to_jiffies? (jiffies.h) Jan > > while (*fence_addr != fence_value) { > - if (time_after(jiffies, timeout)) { > + if (time_after(jiffies, end_jiffies)) { > pr_err("qcm fence wait loop timeout expired\n"); > return -ETIME; > } > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h > b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h > index ef582cc..f8d6a8e 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h > @@ -669,7 +669,7 @@ struct kernel_queue *pqm_get_kernel_queue(struct > process_queue_manager *pqm, > > int amdkfd_fence_wait_timeout(unsigned int *fence_addr, > unsigned int fence_value, > - unsigned long timeout); > + unsigned long timeout_ms); > > /* Packet Manager */ > signature.asc Description: This is a digitally signed message part ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH libdrm 1/1] amdgpu: Add FX-9800P Bristol Ridge iGPU id
Signed-off-by: Jan Vesely --- data/amdgpu.ids | 1 + 1 file changed, 1 insertion(+) diff --git a/data/amdgpu.ids b/data/amdgpu.ids index 0b98c3c3..f6c65dd9 100644 --- a/data/amdgpu.ids +++ b/data/amdgpu.ids @@ -153,6 +153,7 @@ 9874, C5, AMD Radeon R6 Graphics 9874, C6, AMD Radeon R6 Graphics 9874, C7, AMD Radeon R5 Graphics +9874, C8, AMD Radeon R7 Graphics 9874, 81, AMD Radeon R6 Graphics 9874, 87, AMD Radeon R5 Graphics 9874, 85, AMD Radeon R6 Graphics -- 2.13.3 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] iommu/amd: flush IOTLB for specific domains only (v2)
On Fri, 2017-05-19 at 15:32 +0530, arindam.n...@amd.com wrote: > From: Arindam Nath > > Change History > -- > > v2: changes suggested by Joerg > - add flush flag to improve efficiency of flush operation > > v1: > - The idea behind flush queues is to defer the IOTLB flushing > for domains for which the mappings are no longer valid. We > add such domains in queue_add(), and when the queue size > reaches FLUSH_QUEUE_SIZE, we perform __queue_flush(). > > Since we have already taken lock before __queue_flush() > is called, we need to make sure the IOTLB flushing is > performed as quickly as possible. > > In the current implementation, we perform IOTLB flushing > for all domains irrespective of which ones were actually > added in the flush queue initially. This can be quite > expensive especially for domains for which unmapping is > not required at this point of time. > > This patch makes use of domain information in > 'struct flush_queue_entry' to make sure we only flush > IOTLBs for domains who need it, skipping others. Hi, just a note, the patch also fixes "AMD-Vi: Completion-Wait loop timed out" boot hang on my system (carrizo + iceland) [0,1,2]. Presumably the old loop also tried to flush domains that included powered-off devices. regards, Jan [0] https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/issues/20 [1] https://bugs.freedesktop.org/show_bug.cgi?id=101029 [2] https://bugzilla.redhat.com/show_bug.cgi?id=1448121 > > Suggested-by: Joerg Roedel > Signed-off-by: Arindam Nath > --- > drivers/iommu/amd_iommu.c | 27 --- > drivers/iommu/amd_iommu_types.h | 2 ++ > 2 files changed, 22 insertions(+), 7 deletions(-) > > diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c > index 63cacf5..1edeebec 100644 > --- a/drivers/iommu/amd_iommu.c > +++ b/drivers/iommu/amd_iommu.c > @@ -2227,15 +2227,26 @@ static struct iommu_group > *amd_iommu_device_group(struct device *dev) > > static void __queue_flush(struct flush_queue *queue) > { > - struct protection_domain *domain; > - unsigned long flags; > int idx; > > - /* First flush TLB of all known domains */ > - spin_lock_irqsave(&amd_iommu_pd_lock, flags); > - list_for_each_entry(domain, &amd_iommu_pd_list, list) > - domain_flush_tlb(domain); > - spin_unlock_irqrestore(&amd_iommu_pd_lock, flags); > + /* First flush TLB of all domains which were added to flush queue */ > + for (idx = 0; idx < queue->next; ++idx) { > + struct flush_queue_entry *entry; > + > + entry = queue->entries + idx; > + > + /* > + * There might be cases where multiple IOVA entries for the > + * same domain are queued in the flush queue. To avoid > + * flushing the same domain again, we check whether the > + * flag is set or not. This improves the efficiency of > + * flush operation. > + */ > + if (!entry->dma_dom->domain.already_flushed) { > + entry->dma_dom->domain.already_flushed = true; > + domain_flush_tlb(&entry->dma_dom->domain); > + } > + } > > /* Wait until flushes have completed */ > domain_flush_complete(NULL); > @@ -2289,6 +2300,8 @@ static void queue_add(struct dma_ops_domain *dma_dom, > pages = __roundup_pow_of_two(pages); > address >>= PAGE_SHIFT; > > + dma_dom->domain.already_flushed = false; > + > queue = get_cpu_ptr(&flush_queue); > spin_lock_irqsave(&queue->lock, flags); > > diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h > index 4de8f41..4f5519d 100644 > --- a/drivers/iommu/amd_iommu_types.h > +++ b/drivers/iommu/amd_iommu_types.h > @@ -454,6 +454,8 @@ struct protection_domain { > bool updated; /* complete domain flush required */ > unsigned dev_cnt; /* devices assigned to this domain */ > unsigned dev_iommu[MAX_IOMMUS]; /* per-IOMMU reference count */ > + bool already_flushed; /* flag to avoid flushing the same domain again > +in a single invocation of __queue_flush() */ > }; > > /* -- Jan Vesely signature.asc Description: This is a digitally signed message part ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx