Re: [PATCH] drm/amdgpu: Fix incorrect return value

2024-04-12 Thread Christian König

Am 03.04.24 um 09:06 schrieb YiPeng Chai:

[Why]
   After calling amdgpu_vram_mgr_reserve_range
multiple times with the same address, calling
amdgpu_vram_mgr_query_page_status will always
return -EBUSY.
   From the second call to amdgpu_vram_mgr_reserve_range,
the same address will be added to the reservations_pending
list again and is never moved to the reserved_pages list
because the address had been reserved.


Well that sounds like a really bad idea to me. Why is the function 
called multiple times with the same address in the first place ?


Apart from that a note on the coding style below.



[How]
   First add the address status check before calling
amdgpu_vram_mgr_do_reserve, if the address is already
reserved, do nothing; If the address is already in the
reservations_pending list, directly reserve memory;
only add new nodes for the addresses that are not in the
reserved_pages list and reservations_pending list.

Signed-off-by: YiPeng Chai 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 28 +---
  1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 1e36c428d254..0bf3f4092900 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -317,7 +317,6 @@ static void amdgpu_vram_mgr_do_reserve(struct 
ttm_resource_manager *man)
  
  		dev_dbg(adev->dev, "Reservation 0x%llx - %lld, Succeeded\n",

rsv->start, rsv->size);
-
vis_usage = amdgpu_vram_mgr_vis_size(adev, block);
atomic64_add(vis_usage, >vis_usage);
spin_lock(>bdev->lru_lock);
@@ -340,19 +339,30 @@ int amdgpu_vram_mgr_reserve_range(struct amdgpu_vram_mgr 
*mgr,
  uint64_t start, uint64_t size)
  {
struct amdgpu_vram_reservation *rsv;
+   int ret = 0;


Don't initialize local variables when it isn't necessary.

  
-	rsv = kzalloc(sizeof(*rsv), GFP_KERNEL);

-   if (!rsv)
-   return -ENOMEM;
+   ret = amdgpu_vram_mgr_query_page_status(mgr, start);
+   if (!ret)
+   return 0;
+
+   if (ret == -ENOENT) {
+   rsv = kzalloc(sizeof(*rsv), GFP_KERNEL);
+   if (!rsv)
+   return -ENOMEM;
  
-	INIT_LIST_HEAD(>allocated);

-   INIT_LIST_HEAD(>blocks);
+   INIT_LIST_HEAD(>allocated);
+   INIT_LIST_HEAD(>blocks);
  
-	rsv->start = start;

-   rsv->size = size;
+   rsv->start = start;
+   rsv->size = size;
+
+   mutex_lock(>lock);
+   list_add_tail(>blocks, >reservations_pending);
+   mutex_unlock(>lock);
+
+   }


You should probably not lock/unlock here.

Regards,
Christian.

  
  	mutex_lock(>lock);

-   list_add_tail(>blocks, >reservations_pending);
amdgpu_vram_mgr_do_reserve(>manager);
mutex_unlock(>lock);
  




RE: [PATCH] drm/amdgpu: Fix incorrect return value

2024-04-09 Thread Chai, Thomas
[AMD Official Use Only - General]

OK


-
Best Regards,
Thomas

-Original Message-
From: Zhou1, Tao 
Sent: Tuesday, April 9, 2024 10:52 AM
To: Chai, Thomas ; amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Li, Candice ; 
Wang, Yang(Kevin) ; Yang, Stanley 
Subject: RE: [PATCH] drm/amdgpu: Fix incorrect return value

[AMD Official Use Only - General]

> -Original Message-
> From: Chai, Thomas 
> Sent: Wednesday, April 3, 2024 3:07 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Chai, Thomas ; Zhang, Hawking
> ; Zhou1, Tao ; Li, Candice
> ; Wang, Yang(Kevin) ;
> Yang, Stanley ; Chai, Thomas
> 
> Subject: [PATCH] drm/amdgpu: Fix incorrect return value
>
> [Why]
>   After calling amdgpu_vram_mgr_reserve_range multiple times with the
> same address, calling amdgpu_vram_mgr_query_page_status will always
> return - EBUSY.
>   From the second call to amdgpu_vram_mgr_reserve_range, the same
> address will be added to the reservations_pending list again and is
> never moved to the reserved_pages list because the address had been reserved.
>
> [How]
>   First add the address status check before calling
> amdgpu_vram_mgr_do_reserve, if the address is already reserved, do
> nothing; If the address is already in the reservations_pending list,
> directly reserve memory; only add new nodes for the addresses that are
> not in the reserved_pages list and reservations_pending list.
>
> Signed-off-by: YiPeng Chai 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 28
> +---
>  1 file changed, 19 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> index 1e36c428d254..0bf3f4092900 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> @@ -317,7 +317,6 @@ static void amdgpu_vram_mgr_do_reserve(struct
> ttm_resource_manager *man)
>
>   dev_dbg(adev->dev, "Reservation 0x%llx - %lld, Succeeded\n",
>   rsv->start, rsv->size);
> -
>   vis_usage = amdgpu_vram_mgr_vis_size(adev, block);
>   atomic64_add(vis_usage, >vis_usage);
>   spin_lock(>bdev->lru_lock); @@ -340,19 +339,30 @@
> int amdgpu_vram_mgr_reserve_range(struct
> amdgpu_vram_mgr *mgr,
> uint64_t start, uint64_t size)  {
>   struct amdgpu_vram_reservation *rsv;
> + int ret = 0;
>
> - rsv = kzalloc(sizeof(*rsv), GFP_KERNEL);
> - if (!rsv)
> - return -ENOMEM;
> + ret = amdgpu_vram_mgr_query_page_status(mgr, start);
> + if (!ret)
> + return 0;
> +
> + if (ret == -ENOENT) {
> + rsv = kzalloc(sizeof(*rsv), GFP_KERNEL);
> + if (!rsv)
> + return -ENOMEM;
>
> - INIT_LIST_HEAD(>allocated);
> - INIT_LIST_HEAD(>blocks);
> + INIT_LIST_HEAD(>allocated);
> + INIT_LIST_HEAD(>blocks);
>
> - rsv->start = start;
> - rsv->size = size;
> + rsv->start = start;
> + rsv->size = size;
> +
> + mutex_lock(>lock);
> + list_add_tail(>blocks, >reservations_pending);
> + mutex_unlock(>lock);

[Tao] we can drop the mutex_unlock and add if (ret != -ENOENT) for the second 
mutex_lock to avoid unlocking/locking repeatedly.

> +
> + }
>
>   mutex_lock(>lock);
> - list_add_tail(>blocks, >reservations_pending);
>   amdgpu_vram_mgr_do_reserve(>manager);
>   mutex_unlock(>lock);
>
> --
> 2.34.1




RE: [PATCH] drm/amdgpu: Fix incorrect return value

2024-04-08 Thread Zhou1, Tao
[AMD Official Use Only - General]

> -Original Message-
> From: Chai, Thomas 
> Sent: Wednesday, April 3, 2024 3:07 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Chai, Thomas ; Zhang, Hawking
> ; Zhou1, Tao ; Li, Candice
> ; Wang, Yang(Kevin) ; Yang,
> Stanley ; Chai, Thomas 
> Subject: [PATCH] drm/amdgpu: Fix incorrect return value
>
> [Why]
>   After calling amdgpu_vram_mgr_reserve_range multiple times with the same
> address, calling amdgpu_vram_mgr_query_page_status will always return -
> EBUSY.
>   From the second call to amdgpu_vram_mgr_reserve_range, the same address
> will be added to the reservations_pending list again and is never moved to the
> reserved_pages list because the address had been reserved.
>
> [How]
>   First add the address status check before calling
> amdgpu_vram_mgr_do_reserve, if the address is already reserved, do nothing; If
> the address is already in the reservations_pending list, directly reserve 
> memory;
> only add new nodes for the addresses that are not in the reserved_pages list 
> and
> reservations_pending list.
>
> Signed-off-by: YiPeng Chai 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 28 +---
>  1 file changed, 19 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> index 1e36c428d254..0bf3f4092900 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> @@ -317,7 +317,6 @@ static void amdgpu_vram_mgr_do_reserve(struct
> ttm_resource_manager *man)
>
>   dev_dbg(adev->dev, "Reservation 0x%llx - %lld, Succeeded\n",
>   rsv->start, rsv->size);
> -
>   vis_usage = amdgpu_vram_mgr_vis_size(adev, block);
>   atomic64_add(vis_usage, >vis_usage);
>   spin_lock(>bdev->lru_lock);
> @@ -340,19 +339,30 @@ int amdgpu_vram_mgr_reserve_range(struct
> amdgpu_vram_mgr *mgr,
> uint64_t start, uint64_t size)
>  {
>   struct amdgpu_vram_reservation *rsv;
> + int ret = 0;
>
> - rsv = kzalloc(sizeof(*rsv), GFP_KERNEL);
> - if (!rsv)
> - return -ENOMEM;
> + ret = amdgpu_vram_mgr_query_page_status(mgr, start);
> + if (!ret)
> + return 0;
> +
> + if (ret == -ENOENT) {
> + rsv = kzalloc(sizeof(*rsv), GFP_KERNEL);
> + if (!rsv)
> + return -ENOMEM;
>
> - INIT_LIST_HEAD(>allocated);
> - INIT_LIST_HEAD(>blocks);
> + INIT_LIST_HEAD(>allocated);
> + INIT_LIST_HEAD(>blocks);
>
> - rsv->start = start;
> - rsv->size = size;
> + rsv->start = start;
> + rsv->size = size;
> +
> + mutex_lock(>lock);
> + list_add_tail(>blocks, >reservations_pending);
> + mutex_unlock(>lock);

[Tao] we can drop the mutex_unlock and add if (ret != -ENOENT) for the second 
mutex_lock to avoid unlocking/locking repeatedly.

> +
> + }
>
>   mutex_lock(>lock);
> - list_add_tail(>blocks, >reservations_pending);
>   amdgpu_vram_mgr_do_reserve(>manager);
>   mutex_unlock(>lock);
>
> --
> 2.34.1



RE: [PATCH] drm/amdgpu: Fix incorrect return value

2024-04-08 Thread Chai, Thomas
[AMD Official Use Only - General]

-
Best Regards,
Thomas

-Original Message-
From: Zhou1, Tao 
Sent: Monday, April 8, 2024 4:41 PM
To: Chai, Thomas ; amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Li, Candice ; 
Wang, Yang(Kevin) ; Yang, Stanley 
Subject: RE: [PATCH] drm/amdgpu: Fix incorrect return value

[AMD Official Use Only - General]

> -Original Message-
> From: Chai, Thomas 
> Sent: Sunday, April 7, 2024 10:21 AM
> To: Zhou1, Tao ; amd-gfx@lists.freedesktop.org
> Cc: Zhang, Hawking ; Li, Candice
> ; Wang, Yang(Kevin) ;
> Yang, Stanley 
> Subject: RE: [PATCH] drm/amdgpu: Fix incorrect return value
>
> [AMD Official Use Only - General]
>
> -
> Best Regards,
> Thomas
>
> -Original Message-
> From: Zhou1, Tao 
> Sent: Wednesday, April 3, 2024 6:36 PM
> To: Chai, Thomas ; amd-gfx@lists.freedesktop.org
> Cc: Zhang, Hawking ; Li, Candice
> ; Wang, Yang(Kevin) ;
> Yang, Stanley 
> Subject: RE: [PATCH] drm/amdgpu: Fix incorrect return value
>
> [AMD Official Use Only - General]
>
> > -Original Message-
> > From: Chai, Thomas 
> > Sent: Wednesday, April 3, 2024 3:07 PM
> > To: amd-gfx@lists.freedesktop.org
> > Cc: Chai, Thomas ; Zhang, Hawking
> > ; Zhou1, Tao ; Li, Candice
> > ; Wang, Yang(Kevin) ;
> > Yang, Stanley ; Chai, Thomas
> > 
> > Subject: [PATCH] drm/amdgpu: Fix incorrect return value
> >
> > [Why]
> >   After calling amdgpu_vram_mgr_reserve_range multiple times with
> > the same address, calling amdgpu_vram_mgr_query_page_status will
> > always return - EBUSY.
>
> >[Tao] could you explain why we call amdgpu_vram_mgr_reserve_range
> >multiple
> times with the same  address? IIRC, we skip duplicate address before
> reserve memory.
>
> [Thomas]
>When poison creation interrupt is received, since some poisoning
> addresses may have been allocated by some processes, reserving these memories 
> will fail.
> These memory will be tried to reserve again after killing the poisoned
> process in the subsequent poisoning consumption interrupt handler.
> so amdgpu_vram_mgr_reserve_range needs to be called multiple times
> with the same address.
>
> >   From the second call to amdgpu_vram_mgr_reserve_range, the same
> > address will be added to the reservations_pending list again and is
> > never moved to the reserved_pages list because the address had been
> reserved.

>[Tao] but if a page is added to reservations_pending list, it should also be 
>put in data->bps array, and when we call amdgpu_ras_add_bad_pages again, 
>amdgpu_ras_check_bad_page_unlock could ignore this page.
So except for amdgpu_ras_add_bad_pages, would you like to call 
amdgpu_vram_mgr_reserve_range in other place?

[Thomas] Yes,  Since after amdgpu_ras_add_bad_pages is called, the bad pages 
will be saved to eeprom. When a large number of bad pages need to be reserved, 
this will delay subsequent memory reservation.
I want to call amdgpu_vram_mgr_reserve_range to reserve memory immediately when 
driver receives poison creation interrupt, this can reduce the probability of 
bad memory pages being allocated and storing the bad pages to eeprom can be 
done slowly.

> >
> > [How]
> >   First add the address status check before calling
> > amdgpu_vram_mgr_do_reserve, if the address is already reserved, do
> > nothing; If the address is already in the reservations_pending list,
> > directly reserve memory; only add new nodes for the addresses that
> > are not in the reserved_pages list and reservations_pending list.
> >
> > Signed-off-by: YiPeng Chai 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 28
> > +---
> >  1 file changed, 19 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> > index 1e36c428d254..0bf3f4092900 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> > @@ -317,7 +317,6 @@ static void amdgpu_vram_mgr_do_reserve(struct
> > ttm_resource_manager *man)
> >
> >   dev_dbg(adev->dev, "Reservation 0x%llx - %lld, Succeeded\n",
> >   rsv->start, rsv->size);
> > -
> >   vis_usage = amdgpu_vram_mgr_vis_size(adev, block);
> >   atomic64_add(vis_usage, >vis_usage);
> >   spin_lock(>bdev->lru_lock); @@ -340,19 +339,30 @@
> > int amdgpu_vram_mgr_reserve_range(struct
> > amdgpu_vram_mgr *mgr,
> > uint64_t start, uint64_

RE: [PATCH] drm/amdgpu: Fix incorrect return value

2024-04-08 Thread Zhou1, Tao
[AMD Official Use Only - General]

> -Original Message-
> From: Chai, Thomas 
> Sent: Sunday, April 7, 2024 10:21 AM
> To: Zhou1, Tao ; amd-gfx@lists.freedesktop.org
> Cc: Zhang, Hawking ; Li, Candice
> ; Wang, Yang(Kevin) ; Yang,
> Stanley 
> Subject: RE: [PATCH] drm/amdgpu: Fix incorrect return value
>
> [AMD Official Use Only - General]
>
> -
> Best Regards,
> Thomas
>
> -Original Message-
> From: Zhou1, Tao 
> Sent: Wednesday, April 3, 2024 6:36 PM
> To: Chai, Thomas ; amd-gfx@lists.freedesktop.org
> Cc: Zhang, Hawking ; Li, Candice
> ; Wang, Yang(Kevin) ; Yang,
> Stanley 
> Subject: RE: [PATCH] drm/amdgpu: Fix incorrect return value
>
> [AMD Official Use Only - General]
>
> > -Original Message-
> > From: Chai, Thomas 
> > Sent: Wednesday, April 3, 2024 3:07 PM
> > To: amd-gfx@lists.freedesktop.org
> > Cc: Chai, Thomas ; Zhang, Hawking
> > ; Zhou1, Tao ; Li, Candice
> > ; Wang, Yang(Kevin) ;
> > Yang, Stanley ; Chai, Thomas
> > 
> > Subject: [PATCH] drm/amdgpu: Fix incorrect return value
> >
> > [Why]
> >   After calling amdgpu_vram_mgr_reserve_range multiple times with the
> > same address, calling amdgpu_vram_mgr_query_page_status will always
> > return - EBUSY.
>
> >[Tao] could you explain why we call amdgpu_vram_mgr_reserve_range multiple
> times with the same  address? IIRC, we skip duplicate address before reserve
> memory.
>
> [Thomas]
>When poison creation interrupt is received, since some poisoning addresses 
> may
> have been allocated by some processes, reserving these memories will fail.
> These memory will be tried to reserve again after killing the poisoned 
> process in
> the subsequent poisoning consumption interrupt handler.
> so amdgpu_vram_mgr_reserve_range needs to be called multiple times with the
> same address.
>
> >   From the second call to amdgpu_vram_mgr_reserve_range, the same
> > address will be added to the reservations_pending list again and is
> > never moved to the reserved_pages list because the address had been
> reserved.

[Tao] but if a page is added to reservations_pending list, it should also be 
put in data->bps array, and when we call amdgpu_ras_add_bad_pages again, 
amdgpu_ras_check_bad_page_unlock could ignore this page.
So except for amdgpu_ras_add_bad_pages, would you like to call 
amdgpu_vram_mgr_reserve_range in other place?

> >
> > [How]
> >   First add the address status check before calling
> > amdgpu_vram_mgr_do_reserve, if the address is already reserved, do
> > nothing; If the address is already in the reservations_pending list,
> > directly reserve memory; only add new nodes for the addresses that are
> > not in the reserved_pages list and reservations_pending list.
> >
> > Signed-off-by: YiPeng Chai 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 28
> > +---
> >  1 file changed, 19 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> > index 1e36c428d254..0bf3f4092900 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> > @@ -317,7 +317,6 @@ static void amdgpu_vram_mgr_do_reserve(struct
> > ttm_resource_manager *man)
> >
> >   dev_dbg(adev->dev, "Reservation 0x%llx - %lld, Succeeded\n",
> >   rsv->start, rsv->size);
> > -
> >   vis_usage = amdgpu_vram_mgr_vis_size(adev, block);
> >   atomic64_add(vis_usage, >vis_usage);
> >   spin_lock(>bdev->lru_lock); @@ -340,19 +339,30 @@
> > int amdgpu_vram_mgr_reserve_range(struct
> > amdgpu_vram_mgr *mgr,
> > uint64_t start, uint64_t size)  {
> >   struct amdgpu_vram_reservation *rsv;
> > + int ret = 0;
> >
> > - rsv = kzalloc(sizeof(*rsv), GFP_KERNEL);
> > - if (!rsv)
> > - return -ENOMEM;
> > + ret = amdgpu_vram_mgr_query_page_status(mgr, start);
> > + if (!ret)
> > + return 0;
> > +
> > + if (ret == -ENOENT) {
> > + rsv = kzalloc(sizeof(*rsv), GFP_KERNEL);
> > + if (!rsv)
> > + return -ENOMEM;
> >
> > - INIT_LIST_HEAD(>allocated);
> > - INIT_LIST_HEAD(>blocks);
> > + INIT_LIST_HEAD(>allocated);
> > + INIT_LIST_HEAD(>blocks);
> >
> > - rsv->start = start;
> > - rsv->size = size;
> > + rsv->start = start;
> > + rsv->size = size;
> > +
> > + mutex_lock(>lock);
> > + list_add_tail(>blocks, >reservations_pending);
> > + mutex_unlock(>lock);
> > +
> > + }
> >
> >   mutex_lock(>lock);
> > - list_add_tail(>blocks, >reservations_pending);
> >   amdgpu_vram_mgr_do_reserve(>manager);
> >   mutex_unlock(>lock);
> >
> > --
> > 2.34.1
>
>



RE: [PATCH] drm/amdgpu: Fix incorrect return value

2024-04-06 Thread Chai, Thomas
[AMD Official Use Only - General]

-
Best Regards,
Thomas

-Original Message-
From: Zhou1, Tao 
Sent: Wednesday, April 3, 2024 6:36 PM
To: Chai, Thomas ; amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking ; Li, Candice ; 
Wang, Yang(Kevin) ; Yang, Stanley 
Subject: RE: [PATCH] drm/amdgpu: Fix incorrect return value

[AMD Official Use Only - General]

> -Original Message-
> From: Chai, Thomas 
> Sent: Wednesday, April 3, 2024 3:07 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Chai, Thomas ; Zhang, Hawking
> ; Zhou1, Tao ; Li, Candice
> ; Wang, Yang(Kevin) ;
> Yang, Stanley ; Chai, Thomas
> 
> Subject: [PATCH] drm/amdgpu: Fix incorrect return value
>
> [Why]
>   After calling amdgpu_vram_mgr_reserve_range multiple times with the
> same address, calling amdgpu_vram_mgr_query_page_status will always
> return - EBUSY.

>[Tao] could you explain why we call amdgpu_vram_mgr_reserve_range multiple 
>times with the same  address? IIRC, we skip duplicate address before reserve 
>memory.

[Thomas]
   When poison creation interrupt is received, since some poisoning addresses 
may have been allocated by some processes, reserving these memories will fail.
These memory will be tried to reserve again after killing the poisoned process 
in the subsequent poisoning consumption interrupt handler.
so amdgpu_vram_mgr_reserve_range needs to be called multiple times with the 
same address.

>   From the second call to amdgpu_vram_mgr_reserve_range, the same
> address will be added to the reservations_pending list again and is
> never moved to the reserved_pages list because the address had been reserved.
>
> [How]
>   First add the address status check before calling
> amdgpu_vram_mgr_do_reserve, if the address is already reserved, do
> nothing; If the address is already in the reservations_pending list,
> directly reserve memory; only add new nodes for the addresses that are
> not in the reserved_pages list and reservations_pending list.
>
> Signed-off-by: YiPeng Chai 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 28
> +---
>  1 file changed, 19 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> index 1e36c428d254..0bf3f4092900 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> @@ -317,7 +317,6 @@ static void amdgpu_vram_mgr_do_reserve(struct
> ttm_resource_manager *man)
>
>   dev_dbg(adev->dev, "Reservation 0x%llx - %lld, Succeeded\n",
>   rsv->start, rsv->size);
> -
>   vis_usage = amdgpu_vram_mgr_vis_size(adev, block);
>   atomic64_add(vis_usage, >vis_usage);
>   spin_lock(>bdev->lru_lock); @@ -340,19 +339,30 @@
> int amdgpu_vram_mgr_reserve_range(struct
> amdgpu_vram_mgr *mgr,
> uint64_t start, uint64_t size)  {
>   struct amdgpu_vram_reservation *rsv;
> + int ret = 0;
>
> - rsv = kzalloc(sizeof(*rsv), GFP_KERNEL);
> - if (!rsv)
> - return -ENOMEM;
> + ret = amdgpu_vram_mgr_query_page_status(mgr, start);
> + if (!ret)
> + return 0;
> +
> + if (ret == -ENOENT) {
> + rsv = kzalloc(sizeof(*rsv), GFP_KERNEL);
> + if (!rsv)
> + return -ENOMEM;
>
> - INIT_LIST_HEAD(>allocated);
> - INIT_LIST_HEAD(>blocks);
> + INIT_LIST_HEAD(>allocated);
> + INIT_LIST_HEAD(>blocks);
>
> - rsv->start = start;
> - rsv->size = size;
> + rsv->start = start;
> + rsv->size = size;
> +
> + mutex_lock(>lock);
> + list_add_tail(>blocks, >reservations_pending);
> + mutex_unlock(>lock);
> +
> + }
>
>   mutex_lock(>lock);
> - list_add_tail(>blocks, >reservations_pending);
>   amdgpu_vram_mgr_do_reserve(>manager);
>   mutex_unlock(>lock);
>
> --
> 2.34.1




RE: [PATCH] drm/amdgpu: Fix incorrect return value

2024-04-03 Thread Zhou1, Tao
[AMD Official Use Only - General]

> -Original Message-
> From: Chai, Thomas 
> Sent: Wednesday, April 3, 2024 3:07 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Chai, Thomas ; Zhang, Hawking
> ; Zhou1, Tao ; Li, Candice
> ; Wang, Yang(Kevin) ; Yang,
> Stanley ; Chai, Thomas 
> Subject: [PATCH] drm/amdgpu: Fix incorrect return value
>
> [Why]
>   After calling amdgpu_vram_mgr_reserve_range multiple times with the same
> address, calling amdgpu_vram_mgr_query_page_status will always return -
> EBUSY.

[Tao] could you explain why we call amdgpu_vram_mgr_reserve_range multiple 
times with the same
 address? IIRC, we skip duplicate address before reserve memory.

>   From the second call to amdgpu_vram_mgr_reserve_range, the same address
> will be added to the reservations_pending list again and is never moved to the
> reserved_pages list because the address had been reserved.
>
> [How]
>   First add the address status check before calling
> amdgpu_vram_mgr_do_reserve, if the address is already reserved, do nothing; If
> the address is already in the reservations_pending list, directly reserve 
> memory;
> only add new nodes for the addresses that are not in the reserved_pages list 
> and
> reservations_pending list.
>
> Signed-off-by: YiPeng Chai 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 28 +---
>  1 file changed, 19 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> index 1e36c428d254..0bf3f4092900 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
> @@ -317,7 +317,6 @@ static void amdgpu_vram_mgr_do_reserve(struct
> ttm_resource_manager *man)
>
>   dev_dbg(adev->dev, "Reservation 0x%llx - %lld, Succeeded\n",
>   rsv->start, rsv->size);
> -
>   vis_usage = amdgpu_vram_mgr_vis_size(adev, block);
>   atomic64_add(vis_usage, >vis_usage);
>   spin_lock(>bdev->lru_lock);
> @@ -340,19 +339,30 @@ int amdgpu_vram_mgr_reserve_range(struct
> amdgpu_vram_mgr *mgr,
> uint64_t start, uint64_t size)
>  {
>   struct amdgpu_vram_reservation *rsv;
> + int ret = 0;
>
> - rsv = kzalloc(sizeof(*rsv), GFP_KERNEL);
> - if (!rsv)
> - return -ENOMEM;
> + ret = amdgpu_vram_mgr_query_page_status(mgr, start);
> + if (!ret)
> + return 0;
> +
> + if (ret == -ENOENT) {
> + rsv = kzalloc(sizeof(*rsv), GFP_KERNEL);
> + if (!rsv)
> + return -ENOMEM;
>
> - INIT_LIST_HEAD(>allocated);
> - INIT_LIST_HEAD(>blocks);
> + INIT_LIST_HEAD(>allocated);
> + INIT_LIST_HEAD(>blocks);
>
> - rsv->start = start;
> - rsv->size = size;
> + rsv->start = start;
> + rsv->size = size;
> +
> + mutex_lock(>lock);
> + list_add_tail(>blocks, >reservations_pending);
> + mutex_unlock(>lock);
> +
> + }
>
>   mutex_lock(>lock);
> - list_add_tail(>blocks, >reservations_pending);
>   amdgpu_vram_mgr_do_reserve(>manager);
>   mutex_unlock(>lock);
>
> --
> 2.34.1



Re: [PATCH] drm/amdgpu: Fix incorrect return value in sysfs for pp_od_clk_voltage

2020-08-14 Thread Deucher, Alexander
[AMD Public Use]

No need to apologize, it was a good catch.  Something to double check in the 
future if something similar ends up getting applied again.

Thanks!

Alex

From: amd-gfx  on behalf of Matt Coffin 

Sent: Friday, August 14, 2020 3:13 PM
To: amd-gfx@lists.freedesktop.org 
Cc: Quan, Evan ; Koenig, Christian 
; Li, Dennis 
Subject: Re: [PATCH] drm/amdgpu: Fix incorrect return value in sysfs for 
pp_od_clk_voltage

Hi all,

As of 026acaeac2d205f22c0f682cc1c7b1a85b9ccd00 ("drm/amdgpu: revert "fix
system hang issue during GPU reset""), this patch is no longer needed,
and won't apply, because the badly-behaving code was removed; I'll
withdraw this patch for now.

Sorry to those who wasted their time reviewing.

Cheers,
Matt

On 8/13/20 7:15 PM, Matt Coffin wrote:
> The changes in edad8312cbbf9a33c86873fc4093664f150dd5c1 introduced an
> issue with the sysfs interface for pp_od_clk_voltage. It overwrites the
> return value to 0 when it calls another function, then returns 0. The
> intended behavior is that a positive return value indicates the number
> of bytes from the buffer that you processed in that call.
>
> With the 0 return value, clients would submit the same value to be
> written over and over again, resulting in an infinite loop.
>
> This is resolved by returning the count of bytes read (in this case the
> whole message), when the desired return is 0 (success).
>
> Fixes: edad8312cbbf ("drm/amdgpu: fix system hang issue during GPU")
> Bug: 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fdrm%2Famd%2F-%2Fissues%2F1245data=02%7C01%7Calexander.deucher%40amd.com%7C3fb1fd9f95ad441a88a208d840862e80%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637330292979204288sdata=iABLxfzQVQa5zBK6a0JozvRYl%2Fg5eTFnfOS86r0g9rU%3Dreserved=0
> Signed-off-by: Matt Coffin 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> index 1705e328c6fc..f00c7ed361d4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> @@ -937,7 +937,11 @@ static ssize_t amdgpu_set_pp_od_clk_voltage(struct 
> device *dev,
>
>  pro_end:
>up_read(>reset_sem);
> - return ret;
> + if (ret) {
> + return ret;
> + } else {
> + return count;
> + }
>  }
>
>  static ssize_t amdgpu_get_pp_od_clk_voltage(struct device *dev,
>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfxdata=02%7C01%7Calexander.deucher%40amd.com%7C3fb1fd9f95ad441a88a208d840862e80%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637330292979204288sdata=GLWlbhgo0grFjcOh5ZAJWhXGxSm6Ufw8eDFDHZf95Uk%3Dreserved=0
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: Fix incorrect return value in sysfs for pp_od_clk_voltage

2020-08-14 Thread Matt Coffin
Hi all,

As of 026acaeac2d205f22c0f682cc1c7b1a85b9ccd00 ("drm/amdgpu: revert "fix
system hang issue during GPU reset""), this patch is no longer needed,
and won't apply, because the badly-behaving code was removed; I'll
withdraw this patch for now.

Sorry to those who wasted their time reviewing.

Cheers,
Matt

On 8/13/20 7:15 PM, Matt Coffin wrote:
> The changes in edad8312cbbf9a33c86873fc4093664f150dd5c1 introduced an
> issue with the sysfs interface for pp_od_clk_voltage. It overwrites the
> return value to 0 when it calls another function, then returns 0. The
> intended behavior is that a positive return value indicates the number
> of bytes from the buffer that you processed in that call.
> 
> With the 0 return value, clients would submit the same value to be
> written over and over again, resulting in an infinite loop.
> 
> This is resolved by returning the count of bytes read (in this case the
> whole message), when the desired return is 0 (success).
> 
> Fixes: edad8312cbbf ("drm/amdgpu: fix system hang issue during GPU")
> Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1245
> Signed-off-by: Matt Coffin 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> index 1705e328c6fc..f00c7ed361d4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> @@ -937,7 +937,11 @@ static ssize_t amdgpu_set_pp_od_clk_voltage(struct 
> device *dev,
>  
>  pro_end:
>   up_read(>reset_sem);
> - return ret;
> + if (ret) {
> + return ret;
> + } else {
> + return count;
> + }
>  }
>  
>  static ssize_t amdgpu_get_pp_od_clk_voltage(struct device *dev,
> 
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu: Fix incorrect return value in sysfs for pp_od_clk_voltage

2020-08-13 Thread Quan, Evan
[AMD Official Use Only - Internal Distribution Only]

Thanks for catching this. The patch is reviewed-by: Evan Quan 

I have the same copy as Matt. @Li, Dennis maybe you were working on an old copy?

BR
Evan
-Original Message-
From: amd-gfx  On Behalf Of Matt Coffin
Sent: Friday, August 14, 2020 11:14 AM
To: Li, Dennis ; amd-gfx@lists.freedesktop.org
Cc: Koenig, Christian 
Subject: Re: [PATCH] drm/amdgpu: Fix incorrect return value in sysfs for 
pp_od_clk_voltage

Hey Dennis,



Thanks for the testing.



I'm having some issues reproducing, as that command is working fine for me in 
sh, bash, and zsh. So just to confirm a few things while I look at it...



1. What kind of SMU is on whatever card you're testing on? Looks like smu_v11+ 
to me?

2. (shouldn't matter if you're right about which -EINVAL return is being hit), 
but is OverDrive enabled?

3. Is this based off of latest amd-staging-drm-next?



This is the code block I'm seeing on the HEAD of alex's branch... which is a 
bit different from what you pasted.



This error also happens **before** the infinite loop I was fixing with this 
patch, but might as well get both birds with one stone if there's still an 
issue.



while (tmp_str[0]) {

sub_str = strsep(_str, delimiter);

ret = kstrtol(sub_str, 0, [parameter_size]);

if (ret)

return -EINVAL;

parameter_size++;



while (isspace(*tmp_str))

tmp_str++;

}

On 8/13/20 8:14 PM, Li, Dennis wrote:
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi, Matt,
>   With your change, I still could reproduce the following issue:
>
> # echo "s 1 1900" > /sys/class/drm/card0/device/pp_od_clk_voltage
> bash: echo: write error: Invalid argument
>
>  I found that it is related the following lines code, could you help 
> double check it?
>
> while ((sub_str = strsep(_str, delimiter)) != NULL) {  // sub_str will be 
> empty string
> ret = kstrtol(sub_str, 0, [parameter_size]);
> if (ret)
> return -EINVAL; // return here
> parameter_size++;
>
> while (isspace(*tmp_str))
> tmp_str++;
> }
>
> Best Regards
> Dennis Li
> -Original Message-
> From: Matt Coffin 
> Sent: Friday, August 14, 2020 9:15 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Koenig, Christian ; Li, Dennis
> ; Matt Coffin 
> Subject: [PATCH] drm/amdgpu: Fix incorrect return value in sysfs for
> pp_od_clk_voltage
>
> The changes in edad8312cbbf9a33c86873fc4093664f150dd5c1 introduced an issue 
> with the sysfs interface for pp_od_clk_voltage. It overwrites the return 
> value to 0 when it calls another function, then returns 0. The intended 
> behavior is that a positive return value indicates the number of bytes from 
> the buffer that you processed in that call.
>
> With the 0 return value, clients would submit the same value to be written 
> over and over again, resulting in an infinite loop.
>
> This is resolved by returning the count of bytes read (in this case the whole 
> message), when the desired return is 0 (success).
>
> Fixes: edad8312cbbf ("drm/amdgpu: fix system hang issue during GPU")
> Bug:
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitl
> ab.freedesktop.org%2Fdrm%2Famd%2F-%2Fissues%2F1245data=02%7C01%7C
> evan.quan%40amd.com%7C08991e3858f144a4dd0908d840001324%7C3dd8961fe4884
> e608e11a82d994e183d%7C0%7C0%7C637329716385273288sdata=w%2FBlX8CpG
> 0TfTYd1InX8Z%2FMBrRbVu%2FT4zWehWKHClCE%3Dreserved=0
> Signed-off-by: Matt Coffin 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> index 1705e328c6fc..f00c7ed361d4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> @@ -937,7 +937,11 @@ static ssize_t
> amdgpu_set_pp_od_clk_voltage(struct device *dev,
>
>  pro_end:
>  up_read(>reset_sem);
> -return ret;
> +if (ret) {
> +return ret;
> +} else {
> +return count;
> +}
>  }
>
>  static ssize_t amdgpu_get_pp_od_clk_voltage(struct device *dev,
> --
> 2.28.0
>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfxdata=02%7C01%7Cevan.quan%40amd.com%7C08991e3858f144a4dd0908d840001324%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637329716385273288sdata=IbsPW7%2Fr2HXLxKK4%2FOb6fqFmZXtyivbTsP9ftd7P2Dw%3Dreserved=0
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: Fix incorrect return value in sysfs for pp_od_clk_voltage

2020-08-13 Thread Matt Coffin
Hey Dennis,



Thanks for the testing.



I'm having some issues reproducing, as that command is working fine for
me in sh, bash, and zsh. So just to confirm a few things while I look at
it...



1. What kind of SMU is on whatever card you're testing on? Looks like
smu_v11+ to me?

2. (shouldn't matter if you're right about which -EINVAL return is being
hit), but is OverDrive enabled?

3. Is this based off of latest amd-staging-drm-next?



This is the code block I'm seeing on the HEAD of alex's branch... which
is a bit different from what you pasted.



This error also happens **before** the infinite loop I was fixing with
this patch, but might as well get both birds with one stone if there's
still an issue.



while (tmp_str[0]) {

sub_str = strsep(_str, delimiter);

ret = kstrtol(sub_str, 0, [parameter_size]);

if (ret)

return -EINVAL;

parameter_size++;



while (isspace(*tmp_str))

tmp_str++;

}

On 8/13/20 8:14 PM, Li, Dennis wrote:
> [AMD Official Use Only - Internal Distribution Only]
> 
> Hi, Matt,
>   With your change, I still could reproduce the following issue:
> 
> # echo "s 1 1900" > /sys/class/drm/card0/device/pp_od_clk_voltage
> bash: echo: write error: Invalid argument
> 
>  I found that it is related the following lines code, could you help 
> double check it?
> 
>   while ((sub_str = strsep(_str, delimiter)) != NULL) {  // sub_str 
> will be empty string
>   ret = kstrtol(sub_str, 0, [parameter_size]);
>   if (ret)
>   return -EINVAL; // return here
>   parameter_size++;
> 
>   while (isspace(*tmp_str))
>   tmp_str++;
>   }
> 
> Best Regards
> Dennis Li
> -Original Message-
> From: Matt Coffin  
> Sent: Friday, August 14, 2020 9:15 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Koenig, Christian ; Li, Dennis 
> ; Matt Coffin 
> Subject: [PATCH] drm/amdgpu: Fix incorrect return value in sysfs for 
> pp_od_clk_voltage
> 
> The changes in edad8312cbbf9a33c86873fc4093664f150dd5c1 introduced an issue 
> with the sysfs interface for pp_od_clk_voltage. It overwrites the return 
> value to 0 when it calls another function, then returns 0. The intended 
> behavior is that a positive return value indicates the number of bytes from 
> the buffer that you processed in that call.
> 
> With the 0 return value, clients would submit the same value to be written 
> over and over again, resulting in an infinite loop.
> 
> This is resolved by returning the count of bytes read (in this case the whole 
> message), when the desired return is 0 (success).
> 
> Fixes: edad8312cbbf ("drm/amdgpu: fix system hang issue during GPU")
> Bug: 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fdrm%2Famd%2F-%2Fissues%2F1245data=02%7C01%7CDennis.Li%40amd.com%7C4de8308bf7974ea9e62308d83fef922b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637329646078379799sdata=N9c6e7cUMCDpvBIYUEzxkadJbJdBryXyfhfhb%2BUEwjg%3Dreserved=0
> Signed-off-by: Matt Coffin 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> index 1705e328c6fc..f00c7ed361d4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
> @@ -937,7 +937,11 @@ static ssize_t amdgpu_set_pp_od_clk_voltage(struct 
> device *dev,
>  
>  pro_end:
>   up_read(>reset_sem);
> - return ret;
> + if (ret) {
> + return ret;
> + } else {
> + return count;
> + }
>  }
>  
>  static ssize_t amdgpu_get_pp_od_clk_voltage(struct device *dev,
> --
> 2.28.0
> 
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu: Fix incorrect return value in sysfs for pp_od_clk_voltage

2020-08-13 Thread Li, Dennis
[AMD Official Use Only - Internal Distribution Only]

Hi, Matt,
  With your change, I still could reproduce the following issue:

# echo "s 1 1900" > /sys/class/drm/card0/device/pp_od_clk_voltage
bash: echo: write error: Invalid argument

 I found that it is related the following lines code, could you help double 
check it?

while ((sub_str = strsep(_str, delimiter)) != NULL) {  // sub_str 
will be empty string
ret = kstrtol(sub_str, 0, [parameter_size]);
if (ret)
return -EINVAL; // return here
parameter_size++;

while (isspace(*tmp_str))
tmp_str++;
}

Best Regards
Dennis Li
-Original Message-
From: Matt Coffin  
Sent: Friday, August 14, 2020 9:15 AM
To: amd-gfx@lists.freedesktop.org
Cc: Koenig, Christian ; Li, Dennis 
; Matt Coffin 
Subject: [PATCH] drm/amdgpu: Fix incorrect return value in sysfs for 
pp_od_clk_voltage

The changes in edad8312cbbf9a33c86873fc4093664f150dd5c1 introduced an issue 
with the sysfs interface for pp_od_clk_voltage. It overwrites the return value 
to 0 when it calls another function, then returns 0. The intended behavior is 
that a positive return value indicates the number of bytes from the buffer that 
you processed in that call.

With the 0 return value, clients would submit the same value to be written over 
and over again, resulting in an infinite loop.

This is resolved by returning the count of bytes read (in this case the whole 
message), when the desired return is 0 (success).

Fixes: edad8312cbbf ("drm/amdgpu: fix system hang issue during GPU")
Bug: 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fdrm%2Famd%2F-%2Fissues%2F1245data=02%7C01%7CDennis.Li%40amd.com%7C4de8308bf7974ea9e62308d83fef922b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637329646078379799sdata=N9c6e7cUMCDpvBIYUEzxkadJbJdBryXyfhfhb%2BUEwjg%3Dreserved=0
Signed-off-by: Matt Coffin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
index 1705e328c6fc..f00c7ed361d4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
@@ -937,7 +937,11 @@ static ssize_t amdgpu_set_pp_od_clk_voltage(struct device 
*dev,
 
 pro_end:
up_read(>reset_sem);
-   return ret;
+   if (ret) {
+   return ret;
+   } else {
+   return count;
+   }
 }
 
 static ssize_t amdgpu_get_pp_od_clk_voltage(struct device *dev,
--
2.28.0
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx