Re: [PATCH] radeon: Use only a single work queue thread for crt

2020-09-06 Thread Jan Vesely
On Sun, Sep 6, 2020 at 4:54 AM Christian König <
ckoenig.leichtzumer...@gmail.com> wrote:

> Am 03.08.19 um 02:09 schrieb Andi Kleen:
> > From: Andi Kleen 
> >
> > I got tired of seeing a lot of radeon-crt kernel threads in ps on my
> > workstation, one for each CPU and one for each display, which never use
> any CPU time.
> > Surely a single kernel thread is enough to handle the display.
>
> NAK, radeon blocks inside the kernel thread and those need to run in
> parallel or otherwise the hardware can hang.
>

Shouldn't WQ_UNBOUND create a new worker thread whenever all current
workers go to sleep/block (and the total number of worker threads is below
'max_active') ?

Jan

>
> Christian.
>
> >
> > Signed-off-by: Andi Kleen 
> > ---
> >   drivers/gpu/drm/radeon/radeon_display.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/radeon/radeon_display.c
> b/drivers/gpu/drm/radeon/radeon_display.c
> > index bd52f15e6330..fb0ca688f6fe 100644
> > --- a/drivers/gpu/drm/radeon/radeon_display.c
> > +++ b/drivers/gpu/drm/radeon/radeon_display.c
> > @@ -682,7 +682,7 @@ static void radeon_crtc_init(struct drm_device *dev,
> int index)
> >
> >   drm_mode_crtc_set_gamma_size(&radeon_crtc->base, 256);
> >   radeon_crtc->crtc_id = index;
> > - radeon_crtc->flip_queue = alloc_workqueue("radeon-crtc",
> WQ_HIGHPRI, 0);
> > + radeon_crtc->flip_queue = alloc_workqueue("radeon-crtc",
> WQ_HIGHPRI|WQ_UNBOUND, 0);
> >   rdev->mode_info.crtcs[index] = radeon_crtc;
> >
> >   if (rdev->family >= CHIP_BONAIRE) {
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: Slow memory access when using OpenCL without X11

2019-03-09 Thread Jan Vesely
On Sat, Mar 9, 2019 at 1:54 AM Lauri Ehrenpreis  wrote:
>
> Even if it's using CPU for OCL (I know it's not doing this), why does memcpy 
> on CPU slow down permanently, if I'm not doing anything with OpenCL after 
> clCreateContext?
>
> As you see from test program it just does clCreateContext and then a loop of 
> memcpy-s on CPU.
>
> Also I found out that writing different values to 
> /sys/class/drm/card0/device/power_dpm_force_performance_level changes my max 
> memcpy speed on CPU:
>
> echo "low" > /sys/class/drm/card0/device/power_dpm_force_performance_level
> ./cl_slow_test 1 5
> got 1 platforms 1 devices
> speed 731.810425 avg 731.810425 mbytes/s
> speed 163.425583 avg 447.618011 mbytes/s
> speed 123.441612 avg 339.559235 mbytes/s
> speed 121.655266 avg 285.083252 mbytes/s
> speed 123.806801 avg 252.827972 mbytes/s
>
> echo "high" > /sys/class/drm/card0/device/power_dpm_force_performance_level
> ./cl_slow_test 1 5
> got 1 platforms 1 devices
> speed 3742.063721 avg 3742.063721 mbytes/s
> speed 836.148987 avg 2289.106445 mbytes/s
> speed 189.379166 avg 1589.197266 mbytes/s
> speed 189.271393 avg 1239.215820 mbytes/s
> speed 188.290451 avg 1029.030762 mbytes/s
>
> echo "profile_standard" > 
> /sys/class/drm/card0/device/power_dpm_force_performance_level
> ./cl_slow_test 1 5
> got 1 platforms 1 devices
> speed 2303.955566 avg 2303.955566 mbytes/s
> speed 2298.224121 avg 2301.089844 mbytes/s
> speed 2295.585205 avg 2299.254883 mbytes/s
> speed 2295.762939 avg 2298.381836 mbytes/s
> speed 2288.766602 avg 2296.458740 mbytes/s
>
>  echo "profile_peak" > 
> /sys/class/drm/card0/device/power_dpm_force_performance_level
> ./cl_slow_test 1 5
> got 1 platforms 1 devices
> speed 3710.360352 avg 3710.360352 mbytes/s
> speed 3713.660400 avg 3712.010254 mbytes/s
> speed 3797.630859 avg 3740.550537 mbytes/s
> speed 3708.004883 avg 3732.414062 mbytes/s
> speed 3796.403076 avg 3745.211914 mbytes/s
>
> However none of those is close to the memcpy speed I get when I don't do 
> clCreateContext (my test prog has first arg 0):
> ./cl_slow_test 0 5
> speed 7299.201660 avg 7299.201660 mbytes/s
> speed 9298.841797 avg 8299.021484 mbytes/s
> speed 9360.181641 avg 8652.742188 mbytes/s
> speed 9004.759766 avg 8740.746094 mbytes/s
> speed 9414.607422 avg 8875.518555 mbytes/s
>
> Also attached clinfo.txt. It shows that opencl is using GPU so device node 
> permissions are probably not the issue.

Is it only memory accesses or does overall CPU performance degrade
(including compute - say sysbench) as well?

Jan

> --
> Lauri
>
> On Fri, Mar 8, 2019 at 10:35 PM Alex Deucher  wrote:
>>
>> I think you are probably using the CPU for OCL in the remote login
>> case.  When you log into the desktop, the permissions on the device
>> nodes get changed dynamically to support accelerated rendering.  You
>> probably need to change the permissions on the device nodes manually
>> if you are not logging into the desktop.
>>
>> Alex
>>
>> On Fri, Mar 8, 2019 at 2:43 PM Lauri Ehrenpreis  wrote:
>> >
>> > Hi!
>> >
>> > I am using Ryzen 2400G with Gigabyte AMD B450 AORUS board. I have latest 
>> > bios, ubuntu 18.04 and latest mainline kernel (5.0.0-05-generic) 
>> > installed. Also I have rocm-dev 2.1.96 but no rock-dkms installed.
>> >
>> > I found that when I log in over ssh and try to use OpenCL (doing 
>> > clCreateContext is enough) then cpu memory accesses after that will slow 
>> > down by 100x.
>> > If I connect HDMI cable and log in to desktop mode then this does not 
>> > happen. Also if I don't call clCreateContext then everything works 
>> > properly.
>> >
>> > Attached the test program and kernel log also. Test works like that :
>> > g++ cl_slow_test.cpp -o cl_slow_test -I /opt/rocm/opencl/include/ -L 
>> > /opt/rocm/opencl/lib/x86_64/  -lOpenCL
>> > lauri@rv:~$ ./cl_slow_test 0 5
>> > speed 7003.145508 avg 7003.145508 mbytes/s
>> > speed 8427.357422 avg 7715.251465 mbytes/s
>> > speed 9203.049805 avg 8211.184570 mbytes/s
>> > speed 9845.956055 avg 8619.877930 mbytes/s
>> > speed 9882.748047 avg 8872.452148 mbytes/s
>> > lauri@rv:~$ ./cl_slow_test 1 5
>> > got 1 platforms 1 devices
>> > speed 1599.803589 avg 1599.803589 mbytes/s
>> > speed 1665.426392 avg 1632.614990 mbytes/s
>> > speed 146.137253 avg 1137.122437 mbytes/s
>> > speed 121.056877 avg 883.106018 mbytes/s
>> > speed 122.428970 avg 730.970581 mbytes/s
>> >
>> > I also tried latest amd-staging kernel 
>> > https://github.com/M-Bab/linux-kernel-amdgpu-binaries and it had the same 
>> > issue.
>> >
>> > Can anyone point me into right direction?
>> >
>> > Br,
>> > Lauri
>> > ___
>> > amd-gfx mailing list
>> > amd-gfx@lists.freedesktop.org
>> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd

Re: [PATCH] drm/amdgpu: Partially revert commit 2dc80b006

2018-06-13 Thread Jan Vesely
thank you. Every effort to make the patches easier to follow is appreciated.

Jan

On Wed, Jun 13, 2018 at 10:39 AM, Zhu, Rex  wrote:

> Hi Jan,
>
>
> Thanks for your suggestion. Obvious the patch title was inappropriate. I
> will update it.
>
> In fact, I originally plan to send this patch internally for discussion.
> So just thought that "Revert x" can cause the attention.
>
> Best Regards
> Rex
>
>
>
>
> --
> *From:* jv...@scarletmail.rutgers.edu  on
> behalf of Jan Vesely 
> *Sent:* Wednesday, June 13, 2018 9:30 PM
> *To:* Koenig, Christian
> *Cc:* Zhu, Rex; amd-gfx list
> *Subject:* Re: [PATCH] drm/amdgpu: Partially revert commit 2dc80b006
>
> Hi,
>
> can you please improve the commit message?
> seeing "Revert $HASH" conveys zero information about the code change.
> I'm sorry for bringing this up again, but following AMDGPU/Radeon driver
> development is an exercise in frustration for anyone who is not on AMD's
> payroll.
> git commit logs like:
> "revert XYZ" or "fix bug #123" make it really cumbersome to actually look
> at history and pick interesting/breaking commits.
>
> thanks,
> Jan
>
> On Wed, Jun 13, 2018 at 8:46 AM, Christian König <
> ckoenig.leichtzumer...@gmail.com> wrote:
>
> Am 13.06.2018 um 13:40 schrieb Rex Zhu:
>
> Move the CG enablement out of delay worker thread.
>
> 1. CG/PG enablement are part of gpu hw ip initialize, we should
> wait for them complete. otherwise, there are some potential conflicts,
> for example, Suspend and CG enablement concurrently.
> 2. better run ib test after hw initialize completely. That is to say,
> ib test should be after CG/PG enablement. otherwise, the test will
> not cover the cg/pg/poweroff enable case.
>
> Signed-off-by: Rex Zhu 
>
>
> Yeah, that thought came to my mind as well.
>
> Essentially the IB test should simulate a submission from userspace to
> make sure that the stack is working as expected. I think it was just moved
> before CG/PG to avoid issues with that, which is actually not very clever.
>
> Patch is Reviewed-by: Christian König , but
> there could be some fallout we could need to deal with.
>
> Thanks,
> Christian.
>
>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 14 --
>   1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 9647f54..90b78c7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1709,10 +1709,6 @@ static int amdgpu_device_ip_late_set_cg_state(struct
> amdgpu_device *adev)
> if (amdgpu_emu_mode == 1)
> return 0;
>   - r = amdgpu_ib_ring_tests(adev);
> -   if (r)
> -   DRM_ERROR("ib ring test failed (%d).\n", r);
> -
> for (i = 0; i < adev->num_ip_blocks; i++) {
> if (!adev->ip_blocks[i].status.valid)
> continue;
> @@ -1793,6 +1789,9 @@ static int amdgpu_device_ip_late_init(struct
> amdgpu_device *adev)
> }
> }
>   + amdgpu_device_ip_late_set_cg_state(adev);
> +   amdgpu_device_ip_late_set_pg_state(adev);
> +
> queue_delayed_work(system_wq, &adev->late_init_work,
>msecs_to_jiffies(AMDGPU_RESUME_MS));
>   @@ -1921,8 +1920,11 @@ static void 
> amdgpu_device_ip_late_init_func_handler(struct
> work_struct *work)
>   {
> struct amdgpu_device *adev =
> container_of(work, struct amdgpu_device,
> late_init_work.work);
> -   amdgpu_device_ip_late_set_cg_state(adev);
> -   amdgpu_device_ip_late_set_pg_state(adev);
> +   int r;
> +
> +   r = amdgpu_ib_ring_tests(adev);
> +   if (r)
> +   DRM_ERROR("ib ring test failed (%d).\n", r);
>   }
> /**
>
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
>
>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: Partially revert commit 2dc80b006

2018-06-13 Thread Jan Vesely
Hi,

can you please improve the commit message?
seeing "Revert $HASH" conveys zero information about the code change.
I'm sorry for bringing this up again, but following AMDGPU/Radeon driver
development is an exercise in frustration for anyone who is not on AMD's
payroll.
git commit logs like:
"revert XYZ" or "fix bug #123" make it really cumbersome to actually look
at history and pick interesting/breaking commits.

thanks,
Jan

On Wed, Jun 13, 2018 at 8:46 AM, Christian König <
ckoenig.leichtzumer...@gmail.com> wrote:

> Am 13.06.2018 um 13:40 schrieb Rex Zhu:
>
>> Move the CG enablement out of delay worker thread.
>>
>> 1. CG/PG enablement are part of gpu hw ip initialize, we should
>> wait for them complete. otherwise, there are some potential conflicts,
>> for example, Suspend and CG enablement concurrently.
>> 2. better run ib test after hw initialize completely. That is to say,
>> ib test should be after CG/PG enablement. otherwise, the test will
>> not cover the cg/pg/poweroff enable case.
>>
>> Signed-off-by: Rex Zhu 
>>
>
> Yeah, that thought came to my mind as well.
>
> Essentially the IB test should simulate a submission from userspace to
> make sure that the stack is working as expected. I think it was just moved
> before CG/PG to avoid issues with that, which is actually not very clever.
>
> Patch is Reviewed-by: Christian König , but
> there could be some fallout we could need to deal with.
>
> Thanks,
> Christian.
>
>
> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 14 --
>>   1 file changed, 8 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 9647f54..90b78c7 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -1709,10 +1709,6 @@ static int amdgpu_device_ip_late_set_cg_state(struct
>> amdgpu_device *adev)
>> if (amdgpu_emu_mode == 1)
>> return 0;
>>   - r = amdgpu_ib_ring_tests(adev);
>> -   if (r)
>> -   DRM_ERROR("ib ring test failed (%d).\n", r);
>> -
>> for (i = 0; i < adev->num_ip_blocks; i++) {
>> if (!adev->ip_blocks[i].status.valid)
>> continue;
>> @@ -1793,6 +1789,9 @@ static int amdgpu_device_ip_late_init(struct
>> amdgpu_device *adev)
>> }
>> }
>>   + amdgpu_device_ip_late_set_cg_state(adev);
>> +   amdgpu_device_ip_late_set_pg_state(adev);
>> +
>> queue_delayed_work(system_wq, &adev->late_init_work,
>>msecs_to_jiffies(AMDGPU_RESUME_MS));
>>   @@ -1921,8 +1920,11 @@ static void 
>> amdgpu_device_ip_late_init_func_handler(struct
>> work_struct *work)
>>   {
>> struct amdgpu_device *adev =
>> container_of(work, struct amdgpu_device,
>> late_init_work.work);
>> -   amdgpu_device_ip_late_set_cg_state(adev);
>> -   amdgpu_device_ip_late_set_pg_state(adev);
>> +   int r;
>> +
>> +   r = amdgpu_ib_ring_tests(adev);
>> +   if (r)
>> +   DRM_ERROR("ib ring test failed (%d).\n", r);
>>   }
>> /**
>>
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH libdrm 3/3] amdgpu: Destroy fd_hash table when the last device is removed.

2018-05-24 Thread Jan Vesely
On Fri, 2018-05-18 at 13:00 -0400, Jan Vesely wrote:
> Fixes memory leak on module unload.
> Analogous to mesa commit of the same name.
> Signed-off-by: Jan Vesely 
> ---
>  amdgpu/amdgpu_device.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/amdgpu/amdgpu_device.c b/amdgpu/amdgpu_device.c
> index e23dd3b3..34ac95b8 100644
> --- a/amdgpu/amdgpu_device.c
> +++ b/amdgpu/amdgpu_device.c
> @@ -128,6 +128,10 @@ static void 
> amdgpu_device_free_internal(amdgpu_device_handle dev)
>  {
>   pthread_mutex_lock(&fd_mutex);
>   util_hash_table_remove(fd_tab, UINT_TO_PTR(dev->fd));
> + if (util_hash_table_count(fd_tab) == 0) {
> + util_hash_table_destroy(fd_tab);
> + fd_tab = NULL;
> + }
>   close(dev->fd);
>   if ((dev->flink_fd >= 0) && (dev->fd != dev->flink_fd))
>   close(dev->flink_fd);

gentle ping.

Jan

signature.asc
Description: This is a digitally signed message part
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH libdrm 3/3] amdgpu: Destroy fd_hash table when the last device is removed.

2018-05-18 Thread Jan Vesely
Fixes memory leak on module unload.
Analogous to mesa commit of the same name.
Signed-off-by: Jan Vesely 
---
 amdgpu/amdgpu_device.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/amdgpu/amdgpu_device.c b/amdgpu/amdgpu_device.c
index e23dd3b3..34ac95b8 100644
--- a/amdgpu/amdgpu_device.c
+++ b/amdgpu/amdgpu_device.c
@@ -128,6 +128,10 @@ static void 
amdgpu_device_free_internal(amdgpu_device_handle dev)
 {
pthread_mutex_lock(&fd_mutex);
util_hash_table_remove(fd_tab, UINT_TO_PTR(dev->fd));
+   if (util_hash_table_count(fd_tab) == 0) {
+   util_hash_table_destroy(fd_tab);
+   fd_tab = NULL;
+   }
close(dev->fd);
if ((dev->flink_fd >= 0) && (dev->fd != dev->flink_fd))
close(dev->flink_fd);
-- 
2.17.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH libdrm v2 1/3] amdgpu: Take lock before removing devices from fd_tab hash table.

2018-05-18 Thread Jan Vesely
Close the file descriptors under lock as well.
v2: close fds after removing from hash table

Signed-off-by: Jan Vesely 
---
 amdgpu/amdgpu_device.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/amdgpu/amdgpu_device.c b/amdgpu/amdgpu_device.c
index 983b19ab..e23dd3b3 100644
--- a/amdgpu/amdgpu_device.c
+++ b/amdgpu/amdgpu_device.c
@@ -126,6 +126,13 @@ static int amdgpu_get_auth(int fd, int *auth)
 
 static void amdgpu_device_free_internal(amdgpu_device_handle dev)
 {
+   pthread_mutex_lock(&fd_mutex);
+   util_hash_table_remove(fd_tab, UINT_TO_PTR(dev->fd));
+   close(dev->fd);
+   if ((dev->flink_fd >= 0) && (dev->fd != dev->flink_fd))
+   close(dev->flink_fd);
+   pthread_mutex_unlock(&fd_mutex);
+
amdgpu_vamgr_deinit(&dev->vamgr_32);
amdgpu_vamgr_deinit(&dev->vamgr);
amdgpu_vamgr_deinit(&dev->vamgr_high_32);
@@ -133,10 +140,6 @@ static void 
amdgpu_device_free_internal(amdgpu_device_handle dev)
util_hash_table_destroy(dev->bo_flink_names);
util_hash_table_destroy(dev->bo_handles);
pthread_mutex_destroy(&dev->bo_table_mutex);
-   util_hash_table_remove(fd_tab, UINT_TO_PTR(dev->fd));
-   close(dev->fd);
-   if ((dev->flink_fd >= 0) && (dev->fd != dev->flink_fd))
-   close(dev->flink_fd);
free(dev->marketing_name);
free(dev);
 }
-- 
2.17.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH libdrm 2/3] amdgpu/util_hash_table: Add helper function to count the number of entries in hash table

2018-05-18 Thread Jan Vesely
Analogous to the mesa commit of the same name.
Signed-off-by: Jan Vesely 
---
 amdgpu/util_hash_table.c | 12 
 amdgpu/util_hash_table.h |  2 ++
 2 files changed, 14 insertions(+)

diff --git a/amdgpu/util_hash_table.c b/amdgpu/util_hash_table.c
index 89a8bf9b..e06d4415 100644
--- a/amdgpu/util_hash_table.c
+++ b/amdgpu/util_hash_table.c
@@ -237,6 +237,18 @@ drm_private void util_hash_table_foreach(struct 
util_hash_table *ht,
}
 }
 
+static void util_hash_table_inc(void *k, void *v, void *d)
+{
+   ++*(size_t *)d;
+}
+
+drm_private size_t util_hash_table_count(struct util_hash_table *ht)
+{
+   size_t count = 0;
+   util_hash_table_foreach(ht, util_hash_table_inc, &count);
+   return count;
+}
+
 drm_private void util_hash_table_destroy(struct util_hash_table *ht)
 {
struct util_hash_iter iter;
diff --git a/amdgpu/util_hash_table.h b/amdgpu/util_hash_table.h
index 5e295a81..3ab81a12 100644
--- a/amdgpu/util_hash_table.h
+++ b/amdgpu/util_hash_table.h
@@ -64,6 +64,8 @@ drm_private void util_hash_table_foreach(struct 
util_hash_table *ht,
void (*callback)(void *key, void *value, void *data),
void *data);
 
+drm_private size_t util_hash_table_count(struct util_hash_table *ht);
+
 drm_private void util_hash_table_destroy(struct util_hash_table *ht);
 
 #endif /* U_HASH_TABLE_H_ */
-- 
2.17.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH libdrm 1/1] amdgpu: Take lock before removing devices from fd_tab hash table.

2018-05-17 Thread Jan Vesely
On Thu, 2018-05-10 at 19:33 -0400, Jan Vesely wrote:
> Close the file descriptors under lock as well.
> 
> Signed-off-by: Jan Vesely 
> ---
>  amdgpu/amdgpu_device.c | 11 +++
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/amdgpu/amdgpu_device.c b/amdgpu/amdgpu_device.c
> index 983b19ab..f71fc050 100644
> --- a/amdgpu/amdgpu_device.c
> +++ b/amdgpu/amdgpu_device.c
> @@ -126,6 +126,13 @@ static int amdgpu_get_auth(int fd, int *auth)
>  
>  static void amdgpu_device_free_internal(amdgpu_device_handle dev)
>  {
> + pthread_mutex_lock(&fd_mutex);
> + close(dev->fd);
> + if ((dev->flink_fd >= 0) && (dev->fd != dev->flink_fd))
> + close(dev->flink_fd);
> + util_hash_table_remove(fd_tab, UINT_TO_PTR(dev->fd));
> + pthread_mutex_unlock(&fd_mutex);
> +
>   amdgpu_vamgr_deinit(&dev->vamgr_32);
>   amdgpu_vamgr_deinit(&dev->vamgr);
>   amdgpu_vamgr_deinit(&dev->vamgr_high_32);
> @@ -133,10 +140,6 @@ static void 
> amdgpu_device_free_internal(amdgpu_device_handle dev)
>   util_hash_table_destroy(dev->bo_flink_names);
>   util_hash_table_destroy(dev->bo_handles);
>   pthread_mutex_destroy(&dev->bo_table_mutex);
> - util_hash_table_remove(fd_tab, UINT_TO_PTR(dev->fd));
> - close(dev->fd);
> - if ((dev->flink_fd >= 0) && (dev->fd != dev->flink_fd))
> - close(dev->flink_fd);
>   free(dev->marketing_name);
>   free(dev);
>  }
ping.

Jan

signature.asc
Description: This is a digitally signed message part
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH libdrm 1/1] amdgpu: Take lock before removing devices from fd_tab hash table.

2018-05-10 Thread Jan Vesely
Close the file descriptors under lock as well.

Signed-off-by: Jan Vesely 
---
 amdgpu/amdgpu_device.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/amdgpu/amdgpu_device.c b/amdgpu/amdgpu_device.c
index 983b19ab..f71fc050 100644
--- a/amdgpu/amdgpu_device.c
+++ b/amdgpu/amdgpu_device.c
@@ -126,6 +126,13 @@ static int amdgpu_get_auth(int fd, int *auth)
 
 static void amdgpu_device_free_internal(amdgpu_device_handle dev)
 {
+   pthread_mutex_lock(&fd_mutex);
+   close(dev->fd);
+   if ((dev->flink_fd >= 0) && (dev->fd != dev->flink_fd))
+   close(dev->flink_fd);
+   util_hash_table_remove(fd_tab, UINT_TO_PTR(dev->fd));
+   pthread_mutex_unlock(&fd_mutex);
+
amdgpu_vamgr_deinit(&dev->vamgr_32);
amdgpu_vamgr_deinit(&dev->vamgr);
amdgpu_vamgr_deinit(&dev->vamgr_high_32);
@@ -133,10 +140,6 @@ static void 
amdgpu_device_free_internal(amdgpu_device_handle dev)
util_hash_table_destroy(dev->bo_flink_names);
util_hash_table_destroy(dev->bo_handles);
pthread_mutex_destroy(&dev->bo_table_mutex);
-   util_hash_table_remove(fd_tab, UINT_TO_PTR(dev->fd));
-   close(dev->fd);
-   if ((dev->flink_fd >= 0) && (dev->fd != dev->flink_fd))
-   close(dev->flink_fd);
free(dev->marketing_name);
free(dev);
 }
-- 
2.17.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/1] drm/amdkfd: Do not ignore requested queue size during allocation

2017-11-30 Thread Jan Vesely
On Wed, 2017-11-29 at 16:58 -0500, Felix Kuehling wrote:
> You can see the state of the queues in debugfs:
> /sys/kernel/debug/kfd/... You can look at MQDs and HQDs.

thanks. how do I decode the information?
The rptr always stops at pos 60 which looks like this in mqds:

 DIQ on device 45a2
: c0310800 4000      

0020:    0001    

0040:        

0060:        


If I understood correctly that's the queue dump, so those fs look
wrong

> 
> If your application isn't stopping queues deliberately, queues get
> disabled by evictions, usually temporarily. You'll see kernel messages
> when that happens.
> 
> A VM fault will result in queues of the offending process getting
> disabled permanently. Again, you'll see messages about that in the
> kernel log.
> 
> The RPTR can also stop advancing if you have an infinite loop in a
> shader program, or just a shader that takes a very long time to execute.
> Or maybe if you have some dependencies (barriers) in your AQL packets
> that never get satisfied.
> 
> The function you changed only affects the HIQ, the queue that KFD uses
> to control the HWS. It does not affect user mode queues. If your problem
> is with a user mode queue, your change should have no effect at all.

It's not a userspace queue that stops. I'm using kernel dbgdev to issue
wave_resume commands. (waves are halted after executing
s_sendmsg_halt).
I bumped KFD_KERNEL_QUEUE_SIZE to 16KB to make sure all 320 resume
commads fit (otherwise I get spurious ENOMEM when the queue is full but
still advancing).

thanks,
Jan

> 
> Regards,
>   Felix
> 
> 
> On 2017-11-29 04:43 PM, Jan Vesely wrote:
> > On Mon, 2017-11-20 at 14:22 -0500, Felix Kuehling wrote:
> > > I think this patch is not correct. The EOP-mem is not associated with
> > > the queue size. The EOP buffer is a separate buffer used by the firmware
> > > to handle command completion. As I understand it, this allows more
> > > concurrency, while still making it look like all commands in the queue
> > > are completing in order.
> > 
> > thanks for the explanation. I was looking for a source of a CP hang
> > (rptr stops advancing), but bumping the eop size actually mode things
> > worse. Is there a way to find out if a queue got disabled and for what
> > reason? (I'm running ROCK-1.6.x based kernel)
> > 
> > thanks,
> > Jan
> > 
> > > Regards,
> > >   Felix
> > > 
> > > 
> > > On 2017-11-19 03:19 AM, Oded Gabbay wrote:
> > > > On Thu, Nov 16, 2017 at 11:36 PM, Jan Vesely  
> > > > wrote:
> > > > > Signed-off-by: Jan Vesely 
> > > > > ---
> > > > >  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c | 5 +++--
> > > > >  1 file changed, 3 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c 
> > > > > b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c
> > > > > index f1d48281e322..b3bee39661ab 100644
> > > > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c
> > > > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c
> > > > > @@ -37,15 +37,16 @@ static bool initialize_vi(struct kernel_queue 
> > > > > *kq, struct kfd_dev *dev,
> > > > > enum kfd_queue_type type, unsigned int 
> > > > > queue_size)
> > > > >  {
> > > > > int retval;
> > > > > +   unsigned int size = ALIGN(queue_size, PAGE_SIZE);
> > > > > 
> > > > > -   retval = kfd_gtt_sa_allocate(dev, PAGE_SIZE, &kq->eop_mem);
> > > > > +   retval = kfd_gtt_sa_allocate(dev, size, &kq->eop_mem);
> > > > > if (retval != 0)
> > > > > return false;
> > > > > 
> > > > > kq->eop_gpu_addr = kq->eop_mem->gpu_addr;
> > > > > kq->eop_kernel_addr = kq->eop_mem->cpu_ptr;
> > > > > 
> > > > > -   memset(kq->eop_kernel_addr, 0, PAGE_SIZE);
> > > > > +   memset(kq->eop_kernel_addr, 0, size);
> > > > > 
> > > > > return true;
> > > > >  }
> > > > > --
> > > > > 2.13.6
> > > > > 
> > > > > ___
> > > > > amd-gfx mailing list
> > > > > amd-gfx@lists.freedesktop.org
> > > > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> > > > 
> > > > Thanks!
> > > > Applied to -next tree
> > > > Oded
> > > > ___
> > > > amd-gfx mailing list
> > > > amd-gfx@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> 
> 


signature.asc
Description: This is a digitally signed message part
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/1] drm/amdkfd: Do not ignore requested queue size during allocation

2017-11-29 Thread Jan Vesely
On Mon, 2017-11-20 at 14:22 -0500, Felix Kuehling wrote:
> I think this patch is not correct. The EOP-mem is not associated with
> the queue size. The EOP buffer is a separate buffer used by the firmware
> to handle command completion. As I understand it, this allows more
> concurrency, while still making it look like all commands in the queue
> are completing in order.

thanks for the explanation. I was looking for a source of a CP hang
(rptr stops advancing), but bumping the eop size actually mode things
worse. Is there a way to find out if a queue got disabled and for what
reason? (I'm running ROCK-1.6.x based kernel)

thanks,
Jan

> 
> Regards,
>   Felix
> 
> 
> On 2017-11-19 03:19 AM, Oded Gabbay wrote:
> > On Thu, Nov 16, 2017 at 11:36 PM, Jan Vesely  wrote:
> > > Signed-off-by: Jan Vesely 
> > > ---
> > >  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c | 5 +++--
> > >  1 file changed, 3 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c 
> > > b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c
> > > index f1d48281e322..b3bee39661ab 100644
> > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c
> > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c
> > > @@ -37,15 +37,16 @@ static bool initialize_vi(struct kernel_queue *kq, 
> > > struct kfd_dev *dev,
> > > enum kfd_queue_type type, unsigned int queue_size)
> > >  {
> > > int retval;
> > > +   unsigned int size = ALIGN(queue_size, PAGE_SIZE);
> > > 
> > > -   retval = kfd_gtt_sa_allocate(dev, PAGE_SIZE, &kq->eop_mem);
> > > +   retval = kfd_gtt_sa_allocate(dev, size, &kq->eop_mem);
> > > if (retval != 0)
> > > return false;
> > > 
> > > kq->eop_gpu_addr = kq->eop_mem->gpu_addr;
> > > kq->eop_kernel_addr = kq->eop_mem->cpu_ptr;
> > > 
> > > -   memset(kq->eop_kernel_addr, 0, PAGE_SIZE);
> > > +   memset(kq->eop_kernel_addr, 0, size);
> > > 
> > > return true;
> > >  }
> > > --
> > > 2.13.6
> > > 
> > > ___
> > > amd-gfx mailing list
> > > amd-gfx@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> > 
> > Thanks!
> > Applied to -next tree
> > Oded
> > ___
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> 
> 


signature.asc
Description: This is a digitally signed message part
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/2] drm/amdgpu:fix virtual dce bug

2017-11-16 Thread Jan Vesely
On Fri, 2017-11-17 at 04:26 +, Liu, Monk wrote:
> I think it's already clear enough 

nice. what a friendly response. good job!

"fix a bug" is definitely not descriptive of the change, and the commit
message does not even parse as a sentence.


Jan

> 
> -Original Message-
> From: Jan Vesely [mailto:jv...@scarletmail.rutgers.edu] On Behalf Of Jan 
> Vesely
> Sent: 2017年11月17日 0:40
> To: Liu, Monk ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 1/2] drm/amdgpu:fix virtual dce bug
> 
> On Thu, 2017-11-16 at 11:14 +0800, Monk Liu wrote:
> > this fix the issue that access memory after freed after driver 
> > unloaded.
> 
> can you please change the patch subject and commit message to something more 
> descriptive?
> 
> Jan
> 
> > 
> > Change-Id: I64e2488c18f5dc044b57c74567785da21fc028da
> > Signed-off-by: Monk Liu 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/dce_virtual.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/dce_virtual.c 
> > b/drivers/gpu/drm/amd/amdgpu/dce_virtual.c
> > index a8829af..39460eb 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/dce_virtual.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/dce_virtual.c
> > @@ -437,6 +437,8 @@ static int dce_virtual_sw_fini(void *handle)
> > drm_kms_helper_poll_fini(adev->ddev);
> >  
> > drm_mode_config_cleanup(adev->ddev);
> > +   /* clear crtcs pointer to avoid dce irq finish routine access freed 
> > data */
> > +   memset(adev->mode_info.crtcs, 0, sizeof(adev->mode_info.crtcs[0]) * 
> > +AMDGPU_MAX_CRTCS);
> > adev->mode_info.mode_config_initialized = false;
> > return 0;
> >  }
> > @@ -723,7 +725,7 @@ static void 
> > dce_virtual_set_crtc_vblank_interrupt_state(struct amdgpu_device *ad
> > int crtc,
> > enum 
> > amdgpu_interrupt_state state)  {
> > -   if (crtc >= adev->mode_info.num_crtc) {
> > +   if (crtc >= adev->mode_info.num_crtc || 
> > +!adev->mode_info.crtcs[crtc]) {
> > DRM_DEBUG("invalid crtc %d\n", crtc);
> > return;
> > }

-- 
Jan Vesely 

signature.asc
Description: This is a digitally signed message part
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 1/1] drm/amdkfd: Do not ignore requested queue size during allocation

2017-11-16 Thread Jan Vesely
Signed-off-by: Jan Vesely 
---
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c
index f1d48281e322..b3bee39661ab 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue_vi.c
@@ -37,15 +37,16 @@ static bool initialize_vi(struct kernel_queue *kq, struct 
kfd_dev *dev,
enum kfd_queue_type type, unsigned int queue_size)
 {
int retval;
+   unsigned int size = ALIGN(queue_size, PAGE_SIZE);
 
-   retval = kfd_gtt_sa_allocate(dev, PAGE_SIZE, &kq->eop_mem);
+   retval = kfd_gtt_sa_allocate(dev, size, &kq->eop_mem);
if (retval != 0)
return false;
 
kq->eop_gpu_addr = kq->eop_mem->gpu_addr;
kq->eop_kernel_addr = kq->eop_mem->cpu_ptr;
 
-   memset(kq->eop_kernel_addr, 0, PAGE_SIZE);
+   memset(kq->eop_kernel_addr, 0, size);
 
return true;
 }
-- 
2.13.6

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/2] drm/amdgpu:fix virtual dce bug

2017-11-16 Thread Jan Vesely
On Thu, 2017-11-16 at 11:14 +0800, Monk Liu wrote:
> this fix the issue that access memory after freed
> after driver unloaded.

can you please change the patch subject and commit message to something
more descriptive?

Jan

> 
> Change-Id: I64e2488c18f5dc044b57c74567785da21fc028da
> Signed-off-by: Monk Liu 
> ---
>  drivers/gpu/drm/amd/amdgpu/dce_virtual.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/dce_virtual.c 
> b/drivers/gpu/drm/amd/amdgpu/dce_virtual.c
> index a8829af..39460eb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/dce_virtual.c
> +++ b/drivers/gpu/drm/amd/amdgpu/dce_virtual.c
> @@ -437,6 +437,8 @@ static int dce_virtual_sw_fini(void *handle)
>   drm_kms_helper_poll_fini(adev->ddev);
>  
>   drm_mode_config_cleanup(adev->ddev);
> + /* clear crtcs pointer to avoid dce irq finish routine access freed 
> data */
> + memset(adev->mode_info.crtcs, 0, sizeof(adev->mode_info.crtcs[0]) * 
> AMDGPU_MAX_CRTCS);
>   adev->mode_info.mode_config_initialized = false;
>   return 0;
>  }
> @@ -723,7 +725,7 @@ static void 
> dce_virtual_set_crtc_vblank_interrupt_state(struct amdgpu_device *ad
>   int crtc,
>   enum 
> amdgpu_interrupt_state state)
>  {
> - if (crtc >= adev->mode_info.num_crtc) {
> + if (crtc >= adev->mode_info.num_crtc || !adev->mode_info.crtcs[crtc]) {
>   DRM_DEBUG("invalid crtc %d\n", crtc);
>   return;
>   }


signature.asc
Description: This is a digitally signed message part
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 03/11] drm/amdkfd: Rectify the jiffies calculation error with milliseconds

2017-09-15 Thread Jan Vesely
On Fri, 2017-09-15 at 19:42 -0400, Felix Kuehling wrote:
> From: Yong Zhao 
> 
> The timeout in milliseconds should not be regarded as jiffies. This
> commit fixed that.
> 
> Signed-off-by: Yong Zhao 
> Signed-off-by: Felix Kuehling 
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 8 +---
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 2 +-
>  2 files changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index 5db82b8..3db6a31 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -835,12 +835,14 @@ static int create_queue_cpsch(struct 
> device_queue_manager *dqm, struct queue *q,
>  
>  int amdkfd_fence_wait_timeout(unsigned int *fence_addr,
>   unsigned int fence_value,
> - unsigned long timeout)
> + unsigned long timeout_ms)
>  {
> - timeout += jiffies;
> + unsigned long end_jiffies;
> +
> + end_jiffies = (timeout_ms * HZ / 1000) + jiffies;

msecs_to_jiffies? (jiffies.h)

Jan

>  
>   while (*fence_addr != fence_value) {
> - if (time_after(jiffies, timeout)) {
> + if (time_after(jiffies, end_jiffies)) {
>   pr_err("qcm fence wait loop timeout expired\n");
>   return -ETIME;
>   }
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
> b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> index ef582cc..f8d6a8e 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> @@ -669,7 +669,7 @@ struct kernel_queue *pqm_get_kernel_queue(struct 
> process_queue_manager *pqm,
>  
>  int amdkfd_fence_wait_timeout(unsigned int *fence_addr,
>   unsigned int fence_value,
> - unsigned long timeout);
> + unsigned long timeout_ms);
>  
>  /* Packet Manager */
>  


signature.asc
Description: This is a digitally signed message part
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH libdrm 1/1] amdgpu: Add FX-9800P Bristol Ridge iGPU id

2017-07-28 Thread Jan Vesely
Signed-off-by: Jan Vesely 
---
 data/amdgpu.ids | 1 +
 1 file changed, 1 insertion(+)

diff --git a/data/amdgpu.ids b/data/amdgpu.ids
index 0b98c3c3..f6c65dd9 100644
--- a/data/amdgpu.ids
+++ b/data/amdgpu.ids
@@ -153,6 +153,7 @@
 9874,  C5, AMD Radeon R6 Graphics
 9874,  C6, AMD Radeon R6 Graphics
 9874,  C7, AMD Radeon R5 Graphics
+9874,  C8, AMD Radeon R7 Graphics
 9874,  81, AMD Radeon R6 Graphics
 9874,  87, AMD Radeon R5 Graphics
 9874,  85, AMD Radeon R6 Graphics
-- 
2.13.3

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] iommu/amd: flush IOTLB for specific domains only (v2)

2017-05-19 Thread Jan Vesely
On Fri, 2017-05-19 at 15:32 +0530, arindam.n...@amd.com wrote:
> From: Arindam Nath 
> 
> Change History
> --
> 
> v2: changes suggested by Joerg
> - add flush flag to improve efficiency of flush operation
> 
> v1:
> - The idea behind flush queues is to defer the IOTLB flushing
>   for domains for which the mappings are no longer valid. We
>   add such domains in queue_add(), and when the queue size
>   reaches FLUSH_QUEUE_SIZE, we perform __queue_flush().
> 
>   Since we have already taken lock before __queue_flush()
>   is called, we need to make sure the IOTLB flushing is
>   performed as quickly as possible.
> 
>   In the current implementation, we perform IOTLB flushing
>   for all domains irrespective of which ones were actually
>   added in the flush queue initially. This can be quite
>   expensive especially for domains for which unmapping is
>   not required at this point of time.
> 
>   This patch makes use of domain information in
>   'struct flush_queue_entry' to make sure we only flush
>   IOTLBs for domains who need it, skipping others.

Hi,

just a note, the patch also fixes "AMD-Vi: Completion-Wait loop timed
out" boot hang on my system (carrizo + iceland) [0,1,2]. Presumably the
old loop also tried to flush domains that included powered-off devices.

regards,
Jan

[0] https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/issues/20
[1] https://bugs.freedesktop.org/show_bug.cgi?id=101029
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1448121

> 
> Suggested-by: Joerg Roedel 
> Signed-off-by: Arindam Nath 
> ---
>  drivers/iommu/amd_iommu.c   | 27 ---
>  drivers/iommu/amd_iommu_types.h |  2 ++
>  2 files changed, 22 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 63cacf5..1edeebec 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -2227,15 +2227,26 @@ static struct iommu_group 
> *amd_iommu_device_group(struct device *dev)
>  
>  static void __queue_flush(struct flush_queue *queue)
>  {
> - struct protection_domain *domain;
> - unsigned long flags;
>   int idx;
>  
> - /* First flush TLB of all known domains */
> - spin_lock_irqsave(&amd_iommu_pd_lock, flags);
> - list_for_each_entry(domain, &amd_iommu_pd_list, list)
> - domain_flush_tlb(domain);
> - spin_unlock_irqrestore(&amd_iommu_pd_lock, flags);
> + /* First flush TLB of all domains which were added to flush queue */
> + for (idx = 0; idx < queue->next; ++idx) {
> + struct flush_queue_entry *entry;
> +
> + entry = queue->entries + idx;
> +
> + /*
> +  * There might be cases where multiple IOVA entries for the
> +  * same domain are queued in the flush queue. To avoid
> +  * flushing the same domain again, we check whether the
> +  * flag is set or not. This improves the efficiency of
> +  * flush operation.
> +  */
> + if (!entry->dma_dom->domain.already_flushed) {
> + entry->dma_dom->domain.already_flushed = true;
> + domain_flush_tlb(&entry->dma_dom->domain);
> + }
> + }
>  
>   /* Wait until flushes have completed */
>   domain_flush_complete(NULL);
> @@ -2289,6 +2300,8 @@ static void queue_add(struct dma_ops_domain *dma_dom,
>   pages = __roundup_pow_of_two(pages);
>   address >>= PAGE_SHIFT;
>  
> + dma_dom->domain.already_flushed = false;
> +
>   queue = get_cpu_ptr(&flush_queue);
>   spin_lock_irqsave(&queue->lock, flags);
>  
> diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
> index 4de8f41..4f5519d 100644
> --- a/drivers/iommu/amd_iommu_types.h
> +++ b/drivers/iommu/amd_iommu_types.h
> @@ -454,6 +454,8 @@ struct protection_domain {
>   bool updated;   /* complete domain flush required */
>   unsigned dev_cnt;   /* devices assigned to this domain */
>   unsigned dev_iommu[MAX_IOMMUS]; /* per-IOMMU reference count */
> + bool already_flushed;   /* flag to avoid flushing the same domain again
> +in a single invocation of __queue_flush() */
>  };
>  
>  /*

-- 
Jan Vesely 

signature.asc
Description: This is a digitally signed message part
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx