RE: [PATCH 1/2] drm/amdgpu: skip SMU FW reloading in runpm BACO case (v2)

2022-07-13 Thread Deucher, Alexander
> -----Original Message-----
> From: Chen, Guchun
> Sent: Wednesday, July 13, 2022 11:53 PM
> To: Alex Deucher
> Cc: amd-gfx list; Deucher, Alexander; Zhang, Hawking; Lazar, Lijo;
> Quan, Evan; Feng, Kenneth
> Subject: RE: [PATCH 1/2] drm/amdgpu: skip SMU FW reloading in runpm
> BACO case (v2)
> 
> Sounds good, Alex. Let me update it soon.
> 
> Also, after a discussion with Lijo, we think that once rpm mode is
> introduced, it's fine to replace the adev->runpm indicator with rpm mode,
> since the two overlap.
> So for a check like adev->runpm, we can use 'rpm_mode !=
> AMDGPU_RUNPM_NONE' instead.
> If that makes sense, I will provide a follow-up patch as well to optimize it.

Sure.  Sounds good.

Alex

> 
> Regards,
> Guchun
> 
> -----Original Message-----
> From: Alex Deucher
> Sent: Thursday, July 14, 2022 11:44 AM
> To: Chen, Guchun
> Cc: amd-gfx list; Deucher, Alexander; Zhang, Hawking; Lazar, Lijo;
> Quan, Evan; Feng, Kenneth
> Subject: Re: [PATCH 1/2] drm/amdgpu: skip SMU FW reloading in runpm
> BACO case (v2)
> 
> On Wed, Jul 13, 2022 at 11:15 PM Chen, Guchun wrote:
> >
> > Re: I think this would be better as:
> > if (adev->in_runpm && (adev->pm.rpm_mode != AMDGPU_RUNPM_BOCO))
> > or something like that.
> >
> > Yes, patch 2 in this series addresses it. Patch 1 intends to fix SMU
> > reloading, while patch 2 focuses on fixing a race issue when getting
> > the feature mask during the runtime cycle.
>
> I get that, but I think it would be better to switch the order of the
> patches and then use the rpm_mode in this patch as well.  That way we
> are consistent and we don't miss some case if we change the BACO or
> BOCO logic in the future.
> 
> Alex
> 
> >
> > Regards,
> > Guchun
> >
> > -----Original Message-----
> > From: Alex Deucher
> > Sent: Wednesday, July 13, 2022 11:31 PM
> > To: Chen, Guchun
> > Cc: amd-gfx list; Deucher, Alexander; Zhang, Hawking; Lazar, Lijo;
> > Quan, Evan; Feng, Kenneth
> > Subject: Re: [PATCH 1/2] drm/amdgpu: skip SMU FW reloading in runpm
> > BACO case (v2)
> > >
> > > On Tue, Jul 12, 2022 at 11:18 PM Guchun Chen wrote:
> > >
> > > > SMU is always alive, so it's fine to skip SMU FW reloading when
> > > > runpm resumes from BACO. This avoids some race issues when
> > > > resuming SMU FW.
> > > >
> > > > v2: Exclude boco case if an ASIC supports both boco and baco
> > > >
> > > > Suggested-by: Evan Quan
> > > > Signed-off-by: Guchun Chen
> > > > ---
> > > >  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 8 ++++++++
> > > >  1 file changed, 8 insertions(+)
> > > >
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> > > > index e9411c28d88b..de59dc051340 100644
> > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> > > > @@ -2348,6 +2348,14 @@ static int psp_load_smu_fw(struct psp_context *psp)
> > > >                         &adev->firmware.ucode[AMDGPU_UCODE_ID_SMC];
> > > >         struct amdgpu_ras *ras = psp->ras_context.ras;
> > > >
> > > > +       /* Skip SMU FW reloading in case of using BACO for runpm only,
> > > > +        * as SMU is always alive.
> > > > +        */
> > > > +       if (adev->in_runpm &&
> > > > +           !amdgpu_device_supports_boco(adev_to_drm(adev)) &&
> > > > +           amdgpu_device_supports_baco(adev_to_drm(adev)))
> > >
> > > I think this would be better as:
> > > if (adev->in_runpm && (adev->pm.rpm_mode != AMDGPU_RUNPM_BOCO))
> > > or something like that.
> > >
> > > Alex
> > >
> > > > +               return 0;
> > > > +
> > > >         if (!ucode->fw || amdgpu_sriov_vf(psp->adev))
> > > >                 return 0;
> > > >
> > > > --
> > > > 2.17.1
> > >


RE: [PATCH 1/2] drm/amdgpu: skip SMU FW reloading in runpm BACO case (v2)

2022-07-13 Thread Chen, Guchun
Sounds good, Alex. Let me update it soon.

Also, after a discussion with Lijo, we think that once rpm mode is
introduced, it's fine to replace the adev->runpm indicator with rpm mode,
since the two overlap.
So for a check like adev->runpm, we can use 'rpm_mode != AMDGPU_RUNPM_NONE'
instead.
If that makes sense, I will provide a follow-up patch as well to optimize it.

Regards,
Guchun

-----Original Message-----
From: Alex Deucher
Sent: Thursday, July 14, 2022 11:44 AM
To: Chen, Guchun
Cc: amd-gfx list; Deucher, Alexander; Zhang, Hawking; Lazar, Lijo;
Quan, Evan; Feng, Kenneth
Subject: Re: [PATCH 1/2] drm/amdgpu: skip SMU FW reloading in runpm BACO case
(v2)

On Wed, Jul 13, 2022 at 11:15 PM Chen, Guchun  wrote:
>
> Re: I think this would be better as:
> if (adev->in_runpm && (adev->pm.rpm_mode != AMDGPU_RUNPM_BOCO)) or something 
> like that.
>
> Yes, patch 2 in this series addresses it. Patch 1 intends to fix SMU
> reloading, while patch 2 focuses on fixing a race issue when getting the
> feature mask during the runtime cycle.

I get that, but I think it would be better to switch the order of the patches 
and then use the rpm_mode in this patch as well.  That way we are consistent 
and we don't miss some case if we change the BACO or BOCO logic in the future.

Alex

>
> Regards,
> Guchun
>
> -----Original Message-----
> From: Alex Deucher
> Sent: Wednesday, July 13, 2022 11:31 PM
> To: Chen, Guchun
> Cc: amd-gfx list; Deucher, Alexander; Zhang, Hawking; Lazar, Lijo;
> Quan, Evan; Feng, Kenneth
> Subject: Re: [PATCH 1/2] drm/amdgpu: skip SMU FW reloading in runpm
> BACO case (v2)
>
> On Tue, Jul 12, 2022 at 11:18 PM Guchun Chen  wrote:
> >
> > SMU is always alive, so it's fine to skip SMU FW reloading when
> > runpm resumes from BACO. This avoids some race issues when
> > resuming SMU FW.
> >
> > v2: Exclude boco case if an ASIC supports both boco and baco
> >
> > Suggested-by: Evan Quan
> > Signed-off-by: Guchun Chen
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> > index e9411c28d88b..de59dc051340 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> > @@ -2348,6 +2348,14 @@ static int psp_load_smu_fw(struct psp_context *psp)
> >                         &adev->firmware.ucode[AMDGPU_UCODE_ID_SMC];
> >         struct amdgpu_ras *ras = psp->ras_context.ras;
> >
> > +       /* Skip SMU FW reloading in case of using BACO for runpm only,
> > +        * as SMU is always alive.
> > +        */
> > +       if (adev->in_runpm &&
> > +           !amdgpu_device_supports_boco(adev_to_drm(adev)) &&
> > +           amdgpu_device_supports_baco(adev_to_drm(adev)))
>
> I think this would be better as:
> if (adev->in_runpm && (adev->pm.rpm_mode != AMDGPU_RUNPM_BOCO)) or something
> like that.
>
> Alex
>
> > +               return 0;
> > +
> >         if (!ucode->fw || amdgpu_sriov_vf(psp->adev))
> >                 return 0;
> >
> > --
> > 2.17.1
> >


Re: [PATCH 1/2] drm/amdgpu: skip SMU FW reloading in runpm BACO case (v2)

2022-07-13 Thread Alex Deucher
On Wed, Jul 13, 2022 at 11:15 PM Chen, Guchun  wrote:
>
> Re: I think this would be better as:
> if (adev->in_runpm && (adev->pm.rpm_mode != AMDGPU_RUNPM_BOCO)) or something 
> like that.
>
> Yes, patch 2 in this series addresses it. Patch 1 intends to fix SMU
> reloading, while patch 2 focuses on fixing a race issue when getting the
> feature mask during the runtime cycle.

I get that, but I think it would be better to switch the order of the
patches and then use the rpm_mode in this patch as well.  That way we
are consistent and we don't miss some case if we change the BACO or
BOCO logic in the future.

Alex

>
> Regards,
> Guchun
>
> -----Original Message-----
> From: Alex Deucher
> Sent: Wednesday, July 13, 2022 11:31 PM
> To: Chen, Guchun
> Cc: amd-gfx list; Deucher, Alexander; Zhang, Hawking; Lazar, Lijo;
> Quan, Evan; Feng, Kenneth
> Subject: Re: [PATCH 1/2] drm/amdgpu: skip SMU FW reloading in runpm BACO case
> (v2)
>
> On Tue, Jul 12, 2022 at 11:18 PM Guchun Chen  wrote:
> >
> > SMU is always alive, so it's fine to skip SMU FW reloading when runpm
> > resumes from BACO. This avoids some race issues when resuming SMU
> > FW.
> >
> > v2: Exclude boco case if an ASIC supports both boco and baco
> >
> > Suggested-by: Evan Quan
> > Signed-off-by: Guchun Chen
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> > index e9411c28d88b..de59dc051340 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> > @@ -2348,6 +2348,14 @@ static int psp_load_smu_fw(struct psp_context *psp)
> >                         &adev->firmware.ucode[AMDGPU_UCODE_ID_SMC];
> >         struct amdgpu_ras *ras = psp->ras_context.ras;
> >
> > +       /* Skip SMU FW reloading in case of using BACO for runpm only,
> > +        * as SMU is always alive.
> > +        */
> > +       if (adev->in_runpm &&
> > +           !amdgpu_device_supports_boco(adev_to_drm(adev)) &&
> > +           amdgpu_device_supports_baco(adev_to_drm(adev)))
>
> I think this would be better as:
> if (adev->in_runpm && (adev->pm.rpm_mode != AMDGPU_RUNPM_BOCO)) or something
> like that.
>
> Alex
>
> > +               return 0;
> > +
> >         if (!ucode->fw || amdgpu_sriov_vf(psp->adev))
> >                 return 0;
> >
> > --
> > 2.17.1
> >


RE: [PATCH 1/2] drm/amdgpu: skip SMU FW reloading in runpm BACO case (v2)

2022-07-13 Thread Chen, Guchun
Re: I think this would be better as:
if (adev->in_runpm && (adev->pm.rpm_mode != AMDGPU_RUNPM_BOCO)) or something 
like that.

Yes, patch 2 in this series addresses it. Patch 1 intends to fix SMU reloading,
while patch 2 focuses on fixing a race issue when getting the feature mask
during the runtime cycle.

Regards,
Guchun

-----Original Message-----
From: Alex Deucher
Sent: Wednesday, July 13, 2022 11:31 PM
To: Chen, Guchun
Cc: amd-gfx list; Deucher, Alexander; Zhang, Hawking; Lazar, Lijo;
Quan, Evan; Feng, Kenneth
Subject: Re: [PATCH 1/2] drm/amdgpu: skip SMU FW reloading in runpm BACO case
(v2)

On Tue, Jul 12, 2022 at 11:18 PM Guchun Chen  wrote:
>
> SMU is always alive, so it's fine to skip SMU FW reloading when runpm
> resumes from BACO. This avoids some race issues when resuming SMU
> FW.
>
> v2: Exclude boco case if an ASIC supports both boco and baco
>
> Suggested-by: Evan Quan
> Signed-off-by: Guchun Chen
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> index e9411c28d88b..de59dc051340 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> @@ -2348,6 +2348,14 @@ static int psp_load_smu_fw(struct psp_context *psp)
>                         &adev->firmware.ucode[AMDGPU_UCODE_ID_SMC];
>         struct amdgpu_ras *ras = psp->ras_context.ras;
>
> +       /* Skip SMU FW reloading in case of using BACO for runpm only,
> +        * as SMU is always alive.
> +        */
> +       if (adev->in_runpm &&
> +           !amdgpu_device_supports_boco(adev_to_drm(adev)) &&
> +           amdgpu_device_supports_baco(adev_to_drm(adev)))

I think this would be better as:
if (adev->in_runpm && (adev->pm.rpm_mode != AMDGPU_RUNPM_BOCO)) or something
like that.

Alex

> +               return 0;
> +
>         if (!ucode->fw || amdgpu_sriov_vf(psp->adev))
>                 return 0;
>
> --
> 2.17.1
>


Re: [PATCH] drm/amdgpu: Call trace info was found in dmesg when loading amdgpu

2022-07-13 Thread JingWen Chen
feel free to add

Reviewed-by: Jingwen Chen 

On 7/14/22 10:31 AM, lin cao wrote:
> In the case of SRIOV, the register smnMp1_PMI_3_FIFO returns an invalid
> value, which causes a "shift out of bounds" condition. On Ubuntu 22.04,
> this issue is detected and the related call trace is reported in dmesg.
>
> Signed-off-by: lin cao
> ---
>  drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> index b71860e5324a..fa520d79ef67 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> @@ -886,6 +886,7 @@ static void sienna_cichlid_stb_init(struct smu_context *smu);
>
>  static int sienna_cichlid_init_smc_tables(struct smu_context *smu)
>  {
> +       struct amdgpu_device *adev = smu->adev;
>         int ret = 0;
>
>         ret = sienna_cichlid_tables_init(smu);
> @@ -896,7 +897,8 @@ static int sienna_cichlid_init_smc_tables(struct smu_context *smu)
>         if (ret)
>                 return ret;
>
> -       sienna_cichlid_stb_init(smu);
> +       if (!amdgpu_sriov_vf(adev))
> +               sienna_cichlid_stb_init(smu);
>
>         return smu_v11_0_init_smc_tables(smu);
>  }


[PATCH 2/2] libsubcmd: Fix use-after-free for realloc(..., 0)

2022-07-13 Thread Tales Aparecida
From: Kees Cook 

GCC 12 correctly reports a potential use-after-free condition in the
xrealloc helper. Fix the warning by avoiding an implicit "free(ptr)"
when size == 0:

In file included from help.c:12:
In function 'xrealloc',
    inlined from 'add_cmdname' at help.c:24:2:
subcmd-util.h:56:23: error: pointer may be used after 'realloc' [-Werror=use-after-free]
   56 |                 ret = realloc(ptr, size);
      |                       ^~~~~~~~~~~~~~~~~~
subcmd-util.h:52:21: note: call to 'realloc' here
   52 |         void *ret = realloc(ptr, size);
      |                     ^~~~~~~~~~~~~~~~~~
subcmd-util.h:58:31: error: pointer may be used after 'realloc' [-Werror=use-after-free]
   58 |                         ret = realloc(ptr, 1);
      |                               ^~~~~~~~~~~~~~~
subcmd-util.h:52:21: note: call to 'realloc' here
   52 |         void *ret = realloc(ptr, size);
      |                     ^~~~~~~~~~~~~~~~~~

Fixes: 2f4ce5ec1d447beb ("perf tools: Finalize subcmd independence")
Reported-by: Valdis Klētnieks 
Signed-off-by: Kees Cook
Tested-by: Valdis Klētnieks 
Tested-by: Justin M. Forbes 
Acked-by: Josh Poimboeuf 
Cc: linux-harden...@vger.kernel.org
Cc: Valdis Klētnieks 
Link: http://lore.kernel.org/lkml/20220213182443.4037039-1-keesc...@chromium.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/lib/subcmd/subcmd-util.h | 11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/tools/lib/subcmd/subcmd-util.h b/tools/lib/subcmd/subcmd-util.h
index 794a375dad36..b2aec04fce8f 100644
--- a/tools/lib/subcmd/subcmd-util.h
+++ b/tools/lib/subcmd/subcmd-util.h
@@ -50,15 +50,8 @@ static NORETURN inline void die(const char *err, ...)
 static inline void *xrealloc(void *ptr, size_t size)
 {
        void *ret = realloc(ptr, size);
-       if (!ret && !size)
-               ret = realloc(ptr, 1);
-       if (!ret) {
-               ret = realloc(ptr, size);
-               if (!ret && !size)
-                       ret = realloc(ptr, 1);
-               if (!ret)
-                       die("Out of memory, realloc failed");
-       }
+       if (!ret)
+               die("Out of memory, realloc failed");
        return ret;
 }
 
-- 
2.37.0
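
For reference, the helper as it reads after this patch can be sketched
as a standalone unit. This is a minimal sketch under assumptions: die()
below is a simplified stand-in for the one in subcmd-util.h (the
"fatal: " prefix and the exit code are assumptions), and the NORETURN
and inline annotations are left out.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Simplified stand-in for die(): print the message and exit. */
static void die(const char *err, ...)
{
	va_list ap;

	va_start(ap, err);
	fputs("fatal: ", stderr);
	vfprintf(stderr, err, ap);
	fputc('\n', stderr);
	va_end(ap);
	exit(128);	/* exit code is an assumption */
}

/*
 * realloc() wrapper that never returns NULL. ptr is passed to
 * realloc() exactly once, so it is never reused after a call that
 * may already have released it.
 */
static void *xrealloc(void *ptr, size_t size)
{
	void *ret = realloc(ptr, size);

	if (!ret)
		die("Out of memory, realloc failed");
	return ret;
}
```

One consequence of the simpler form is that a zero-size request is
treated as a failure on implementations where realloc(ptr, 0) returns
NULL, so callers are expected to pass a non-zero size.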



[PATCH 1/2] objtool: Fix truncated string warning

2022-07-13 Thread Tales Aparecida
From: Sergei Trofimovich 

On GCC 12, the build fails due to a possible truncated string:

check.c: In function 'validate_call':
check.c:2865:58: error: '%d' directive output may be truncated writing between 1 and 10 bytes into a region of size 9 [-Werror=format-truncation=]
 2865 |                 snprintf(pvname, sizeof(pvname), "pv_ops[%d]", idx);
      |                                                          ^~

In theory it's a valid bug:

static char pvname[16];
int idx;
...
idx = (rel->addend / sizeof(void *));
snprintf(pvname, sizeof(pvname), "pv_ops[%d]", idx);

There are only 7 chars for %d while it could take up to 10, so the
printed "pv_ops[%d]" string could get truncated.

In reality the bug should never happen, because pv_ops only has ~80
entries, so 7 chars for the integer is more than enough.  Still, it's
worth fixing.  Bump the buffer size to 19 bytes to silence the warning.

[ jpoimboe: changed size to 19; massaged changelog ]

Fixes: db2b0c5d7b6f ("objtool: Support pv_ops indirect calls for noinstr")
Reported-by: Adam Borowski 
Reported-by: Martin Liška 
Signed-off-by: Sergei Trofimovich 
Signed-off-by: Josh Poimboeuf 
Link: https://lore.kernel.org/r/20220120233748.2062559-1-sly...@gmail.com
---
 tools/objtool/check.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 21735829b860..750ef1c446c8 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2823,7 +2823,7 @@ static inline bool func_uaccess_safe(struct symbol *func)
 
 static inline const char *call_dest_name(struct instruction *insn)
 {
-       static char pvname[16];
+       static char pvname[19];
        struct reloc *rel;
        int idx;
 
-- 
2.37.0
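
The buffer arithmetic behind the warning can be checked directly. The
sketch below is illustrative only: pvname_fits() is a hypothetical
helper, not part of objtool. It relies on the standard behaviour of
snprintf(NULL, 0, ...), which returns the number of characters the
formatted string needs, not counting the terminating NUL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Returns true if "pv_ops[<idx>]" plus its terminating NUL fits in a
 * buffer of buf_size bytes. Hypothetical helper for illustration.
 */
static bool pvname_fits(size_t buf_size, int idx)
{
	/* Length of the formatted string, excluding the NUL. */
	int needed = snprintf(NULL, 0, "pv_ops[%d]", idx);

	return needed >= 0 && (size_t)needed < buf_size;
}
```

With the fixed text "pv_ops[" (7 chars), "]" (1 char) and the NUL, a
16-byte buffer leaves 7 chars for the number, while a 19-byte buffer
leaves 10, enough for any non-negative 32-bit int.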



[GIT CHERRY-PICK 0/2] Fix compilation errors on GCC12

2022-07-13 Thread Tales Aparecida
Hello Alex,

I believe you are already working on a rebase right now, but could you
please cherry-pick these two commits from torvalds/master to fix the
compilation errors raised by GCC12 in the meantime?

SHA-1:82880283d7fcd0a1d20964a56d6d1a5cc0df0713
patch-id: 684ed745d944c90c2aae3c9eda5a4f5aa9cd48e5

SHA-1:52a9dab6d892763b2a8334a568bd4e2c1a6fde66
patch-id: 6b15e90354234809c3e054332d5d37612c5995dc

Thanks in advance,
Tales

Kees Cook (1):
  libsubcmd: Fix use-after-free for realloc(..., 0)

Sergei Trofimovich (1):
  objtool: Fix truncated string warning

 tools/lib/subcmd/subcmd-util.h | 11 ++---------
 tools/objtool/check.c          |  2 +-
 2 files changed, 3 insertions(+), 10 deletions(-)

-- 
2.37.0



[PATCH] drm/amdgpu: Call trace info was found in dmesg when loading amdgpu

2022-07-13 Thread lin cao
In the case of SRIOV, the register smnMp1_PMI_3_FIFO returns an invalid
value, which causes a "shift out of bounds" condition. On Ubuntu 22.04,
this issue is detected and the related call trace is reported in dmesg.

Signed-off-by: lin cao
---
 drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index b71860e5324a..fa520d79ef67 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -886,6 +886,7 @@ static void sienna_cichlid_stb_init(struct smu_context *smu);

 static int sienna_cichlid_init_smc_tables(struct smu_context *smu)
 {
+       struct amdgpu_device *adev = smu->adev;
        int ret = 0;

        ret = sienna_cichlid_tables_init(smu);
@@ -896,7 +897,8 @@ static int sienna_cichlid_init_smc_tables(struct smu_context *smu)
        if (ret)
                return ret;

-       sienna_cichlid_stb_init(smu);
+       if (!amdgpu_sriov_vf(adev))
+               sienna_cichlid_stb_init(smu);

        return smu_v11_0_init_smc_tables(smu);
 }
-- 
2.25.1
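
The patch avoids the problem by not calling sienna_cichlid_stb_init()
on a virtual function at all. As a separate illustration of why an
unchecked register value is dangerous as a shift count, here is a
minimal sketch of a guarded shift. checked_shl32() is a hypothetical
helper, not part of the driver, and how the driver actually uses the
register value is not shown in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Stores 1 << exp in *out and returns true when exp is a valid shift
 * count for a 32-bit value. Returns false and leaves *out untouched
 * otherwise, since shifting by 32 or more is undefined behaviour in C.
 */
static bool checked_shl32(uint32_t exp, uint32_t *out)
{
	if (exp >= 32)
		return false;

	*out = UINT32_C(1) << exp;
	return true;
}
```

A caller would treat a false return as "the value read back is not
usable" and skip whatever sizing depends on it.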



Re: [PATCH] drm/amd/display: Add missing hard-float compile flags for PPC64 builds

2022-07-13 Thread Alex Deucher
On Wed, Jul 13, 2022 at 7:09 PM Guenter Roeck  wrote:
>
> On Wed, Jul 13, 2022 at 05:20:40PM -0400, Alex Deucher wrote:
> > >
> > > The problem is not the FPU operations, but the fact that soft-float
> > > and hard-float compiled code is linked together. The soft-float and
> > > hard-float ABIs on powerpc are not compatible, so one ends up with
> > > an object file which is partially soft-float and partially hard-float
> > > compiled and thus uses different ABIs. That can only create chaos,
> > > so the linker complains about it.
> >
> > I get that, I just don't see why only DCN 3.1.x files have this
> > problem.  The DCN 2.x files should as well.
> >
>
> Seen in drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile:
>
> # prevent build errors regarding soft-float vs hard-float FP ABI tags
> # this code is currently unused on ppc64, as it applies to Renoir APUs only
> ifdef CONFIG_PPC64
> CFLAGS_$(AMDDALPATH)/dc/clk_mgr/dcn21/rn_clk_mgr.o := $(call cc-option,-mno-gnu-attribute)
> endif
>
> Does that explain it ?

I would expect to see it in dcn20_resource.c and dcn30_clk_mgr.c for
example.  They follow the same pattern as the dcn 3.1.x files.  They
call functions that use FP, but maybe there is some FP code still in
those functions that we missed somehow.

Alex


Re: [PATCH] drm/amd/display: Add missing hard-float compile flags for PPC64 builds

2022-07-13 Thread Guenter Roeck
On Wed, Jul 13, 2022 at 05:20:40PM -0400, Alex Deucher wrote:
> >
> > The problem is not the FPU operations, but the fact that soft-float
> > and hard-float compiled code is linked together. The soft-float and
> > hard-float ABIs on powerpc are not compatible, so one ends up with
> > an object file which is partially soft-float and partially hard-float
> > compiled and thus uses different ABIs. That can only create chaos,
> > so the linker complains about it.
> 
> I get that, I just don't see why only DCN 3.1.x files have this
> problem.  The DCN 2.x files should as well.
> 

Seen in drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile:

# prevent build errors regarding soft-float vs hard-float FP ABI tags
# this code is currently unused on ppc64, as it applies to Renoir APUs only
ifdef CONFIG_PPC64
CFLAGS_$(AMDDALPATH)/dc/clk_mgr/dcn21/rn_clk_mgr.o := $(call cc-option,-mno-gnu-attribute)
endif

Does that explain it?

Guenter


Re: Linux 5.19-rc6

2022-07-13 Thread Sudip Mukherjee
On Thu, Jul 14, 2022 at 12:12 AM Guenter Roeck  wrote:
>
> On Thu, Jul 14, 2022 at 12:09:24AM +0100, Sudip Mukherjee wrote:
> > On Wed, Jul 13, 2022 at 11:56 PM Guenter Roeck  wrote:
> > >
> > > On Wed, Jul 13, 2022 at 10:50:06PM +0100, Sudip Mukherjee wrote:
> > > > On Wed, Jul 13, 2022 at 10:45 PM Linus Torvalds
> > > >  wrote:
> > > > >
> > > > > On Wed, Jul 13, 2022 at 2:36 PM Sudip Mukherjee
> > > > >  wrote:
> > > > > >
> > > > > > > >
> > > > > > > > https://lore.kernel.org/all/20220524025139.40212-1-wangkefeng.w...@huawei.com/
> > > > > > >
> > > > > > > That patch looks sane to me, but I guess Guenter would need to 
> > > > > > > check
> > > > > >
> > > > > > I still see the failure in my builds with this patch. But 
> > > > > > surprisingly
> > > > > > I dont see the build failure (with or without this patch) with 
> > > > > > gcc-12,
> > > > > > only with gcc-11.
> > > > >
> > > > > Arrghs. "build failure"?
> > > >
> > > > Uhh.. no, sorry.. I meant the same problem which Guenter reported with
> > > > powerpc64-linux-ld, hard float and soft float.
> > > > But I dont see this problem with gcc-12, only with gcc-11.
> > > >
> > >
> > > Weird. It works for me with gcc 11.3.0 / binutils 2.38 as well as with
> > > gcc 11.2.0 / binutils 2.36.1.
> >
> > Its entirely possible that I have messed up, there are references to
> > many patches in this thread. :)
> > Can you please paste the link of the patch that you say is working for
> > you. I will try a clean build with that.
> >
>
> The patch is at:
>
> https://lore.kernel.org/lkml/20220618232737.2036722-1-li...@roeck-us.net/raw

Thanks, that works. Tested with gcc-11.3.1 and binutils 2.38 on top
of latest mainline (4a57a8400075bc5287c5c877702c68aeae2a033d).

When I said I still had the problem, I tested with
https://lore.kernel.org/all/20220524025139.40212-1-wangkefeng.w...@huawei.com/


-- 
Regards
Sudip


Re: Linux 5.19-rc6

2022-07-13 Thread Guenter Roeck
On Wed, Jul 13, 2022 at 10:50:06PM +0100, Sudip Mukherjee wrote:
> On Wed, Jul 13, 2022 at 10:45 PM Linus Torvalds
>  wrote:
> >
> > On Wed, Jul 13, 2022 at 2:36 PM Sudip Mukherjee
> >  wrote:
> > >
> > > > >
> > > > > https://lore.kernel.org/all/20220524025139.40212-1-wangkefeng.w...@huawei.com/
> > > >
> > > > That patch looks sane to me, but I guess Guenter would need to check
> > >
> > > I still see the failure in my builds with this patch. But surprisingly
> > > I dont see the build failure (with or without this patch) with gcc-12,
> > > only with gcc-11.
> >
> > Arrghs. "build failure"?
> 
> Uhh.. no, sorry.. I meant the same problem which Guenter reported with
> powerpc64-linux-ld, hard float and soft float.
> But I dont see this problem with gcc-12, only with gcc-11.

I am wondering ... you say "my builds". Is this possibly not
allmodconfig ? It may well be that there are other configurations
which still have a problem after my patch has been applied.

Guenter


Re: Linux 5.19-rc6

2022-07-13 Thread Guenter Roeck
On Thu, Jul 14, 2022 at 12:26:27AM +0100, Sudip Mukherjee wrote:
> On Thu, Jul 14, 2022 at 12:12 AM Guenter Roeck  wrote:
> >
> > On Thu, Jul 14, 2022 at 12:09:24AM +0100, Sudip Mukherjee wrote:
> > > On Wed, Jul 13, 2022 at 11:56 PM Guenter Roeck  wrote:
> > > >
> > > > On Wed, Jul 13, 2022 at 10:50:06PM +0100, Sudip Mukherjee wrote:
> > > > > On Wed, Jul 13, 2022 at 10:45 PM Linus Torvalds
> > > > >  wrote:
> > > > > >
> > > > > > On Wed, Jul 13, 2022 at 2:36 PM Sudip Mukherjee
> > > > > >  wrote:
> > > > > > >
> > > > > > > > >
> > > > > > > > > https://lore.kernel.org/all/20220524025139.40212-1-wangkefeng.w...@huawei.com/
> > > > > > > >
> > > > > > > > That patch looks sane to me, but I guess Guenter would need to 
> > > > > > > > check
> > > > > > >
> > > > > > > I still see the failure in my builds with this patch. But 
> > > > > > > surprisingly
> > > > > > > I dont see the build failure (with or without this patch) with 
> > > > > > > gcc-12,
> > > > > > > only with gcc-11.
> > > > > >
> > > > > > Arrghs. "build failure"?
> > > > >
> > > > > Uhh.. no, sorry.. I meant the same problem which Guenter reported with
> > > > > powerpc64-linux-ld, hard float and soft float.
> > > > > But I dont see this problem with gcc-12, only with gcc-11.
> > > > >
> > > >
> > > > Weird. It works for me with gcc 11.3.0 / binutils 2.38 as well as with
> > > > gcc 11.2.0 / binutils 2.36.1.
> > >
> > > Its entirely possible that I have messed up, there are references to
> > > many patches in this thread. :)
> > > Can you please paste the link of the patch that you say is working for
> > > you. I will try a clean build with that.
> > >
> >
> > The patch is at:
> >
> > https://lore.kernel.org/lkml/20220618232737.2036722-1-li...@roeck-us.net/raw
> 
> Thanks, that works. tested with gcc-11.3.1, and binutils 2.38 on top
> of latest mainline (4a57a8400075bc5287c5c877702c68aeae2a033d)
> 
> When I said I still had the problem, I tested with
> https://lore.kernel.org/all/20220524025139.40212-1-wangkefeng.w...@huawei.com/

Makes sense. That was the patch fixing the runtime problem.

Guenter


Re: Linux 5.19-rc6

2022-07-13 Thread Sudip Mukherjee
On Wed, Jul 13, 2022 at 11:56 PM Guenter Roeck  wrote:
>
> On Wed, Jul 13, 2022 at 10:50:06PM +0100, Sudip Mukherjee wrote:
> > On Wed, Jul 13, 2022 at 10:45 PM Linus Torvalds
> >  wrote:
> > >
> > > On Wed, Jul 13, 2022 at 2:36 PM Sudip Mukherjee
> > >  wrote:
> > > >
> > > > > >
> > > > > > https://lore.kernel.org/all/20220524025139.40212-1-wangkefeng.w...@huawei.com/
> > > > >
> > > > > That patch looks sane to me, but I guess Guenter would need to check
> > > >
> > > > I still see the failure in my builds with this patch. But surprisingly
> > > > I dont see the build failure (with or without this patch) with gcc-12,
> > > > only with gcc-11.
> > >
> > > Arrghs. "build failure"?
> >
> > Uhh.. no, sorry.. I meant the same problem which Guenter reported with
> > powerpc64-linux-ld, hard float and soft float.
> > But I dont see this problem with gcc-12, only with gcc-11.
> >
>
> Weird. It works for me with gcc 11.3.0 / binutils 2.38 as well as with
> gcc 11.2.0 / binutils 2.36.1.

It's entirely possible that I have messed up; there are references to
many patches in this thread. :)
Can you please paste the link to the patch that you say is working for
you? I will try a clean build with that.


-- 
Regards
Sudip


Re: Linux 5.19-rc6

2022-07-13 Thread Guenter Roeck
On Wed, Jul 13, 2022 at 10:50:06PM +0100, Sudip Mukherjee wrote:
> On Wed, Jul 13, 2022 at 10:45 PM Linus Torvalds
>  wrote:
> >
> > On Wed, Jul 13, 2022 at 2:36 PM Sudip Mukherjee
> >  wrote:
> > >
> > > > >
> > > > > https://lore.kernel.org/all/20220524025139.40212-1-wangkefeng.w...@huawei.com/
> > > >
> > > > That patch looks sane to me, but I guess Guenter would need to check
> > >
> > > I still see the failure in my builds with this patch. But surprisingly
> > > I dont see the build failure (with or without this patch) with gcc-12,
> > > only with gcc-11.
> >
> > Arrghs. "build failure"?
> 
> Uhh.. no, sorry.. I meant the same problem which Guenter reported with
> powerpc64-linux-ld, hard float and soft float.
> But I dont see this problem with gcc-12, only with gcc-11.
> 

Weird. It works for me with gcc 11.3.0 / binutils 2.38 as well as with
gcc 11.2.0 / binutils 2.36.1.

Guenter


Re: [PATCH] drm/amd/display: Add missing hard-float compile flags for PPC64 builds

2022-07-13 Thread Guenter Roeck
On Wed, Jul 13, 2022 at 05:20:40PM -0400, Alex Deucher wrote:
[ ... ]
> > The problem is not the FPU operations, but the fact that soft-float
> > and hard-float compiled code is linked together. The soft-float and
> > hard-float ABIs on powerpc are not compatible, so one ends up with
> > an object file which is partially soft-float and partially hard-float
> > compiled and thus uses different ABIs. That can only create chaos,
> > so the linker complains about it.
> 
> I get that, I just don't see why only DCN 3.1.x files have this
> problem.  The DCN 2.x files should as well.
> 

No idea. Maybe ppc:allmodconfig only builds DCN 3.1.x, and other builds
don't use -Werror and the warning is ignored.

Guenter


Re: Linux 5.19-rc6

2022-07-13 Thread Guenter Roeck
On Thu, Jul 14, 2022 at 12:09:24AM +0100, Sudip Mukherjee wrote:
> On Wed, Jul 13, 2022 at 11:56 PM Guenter Roeck  wrote:
> >
> > On Wed, Jul 13, 2022 at 10:50:06PM +0100, Sudip Mukherjee wrote:
> > > On Wed, Jul 13, 2022 at 10:45 PM Linus Torvalds
> > >  wrote:
> > > >
> > > > On Wed, Jul 13, 2022 at 2:36 PM Sudip Mukherjee
> > > >  wrote:
> > > > >
> > > > > > >
> > > > > > > https://lore.kernel.org/all/20220524025139.40212-1-wangkefeng.w...@huawei.com/
> > > > > >
> > > > > > That patch looks sane to me, but I guess Guenter would need to check
> > > > >
> > > > > I still see the failure in my builds with this patch. But surprisingly
> > > > > I dont see the build failure (with or without this patch) with gcc-12,
> > > > > only with gcc-11.
> > > >
> > > > Arrghs. "build failure"?
> > >
> > > Uhh.. no, sorry.. I meant the same problem which Guenter reported with
> > > powerpc64-linux-ld, hard float and soft float.
> > > But I dont see this problem with gcc-12, only with gcc-11.
> > >
> >
> > Weird. It works for me with gcc 11.3.0 / binutils 2.38 as well as with
> > gcc 11.2.0 / binutils 2.36.1.
> 
> Its entirely possible that I have messed up, there are references to
> many patches in this thread. :)
> Can you please paste the link of the patch that you say is working for
> you. I will try a clean build with that.
> 

The patch is at:

https://lore.kernel.org/lkml/20220618232737.2036722-1-li...@roeck-us.net/raw

Guenter


Re: Linux 5.19-rc6

2022-07-13 Thread Sudip Mukherjee
On Wed, Jul 13, 2022 at 10:45 PM Linus Torvalds
 wrote:
>
> On Wed, Jul 13, 2022 at 2:36 PM Sudip Mukherjee
>  wrote:
> >
> > > >
> > > > https://lore.kernel.org/all/20220524025139.40212-1-wangkefeng.w...@huawei.com/
> > >
> > > That patch looks sane to me, but I guess Guenter would need to check
> >
> > I still see the failure in my builds with this patch. But surprisingly
> > I dont see the build failure (with or without this patch) with gcc-12,
> > only with gcc-11.
>
> Arrghs. "build failure"?

Uhh.. no, sorry.. I meant the same problem which Guenter reported with
powerpc64-linux-ld, hard float and soft float.
But I don't see this problem with gcc-12, only with gcc-11.


-- 
Regards
Sudip


Re: Linux 5.19-rc6

2022-07-13 Thread Linus Torvalds
On Wed, Jul 13, 2022 at 2:36 PM Sudip Mukherjee
 wrote:
>
> > >
> > > https://lore.kernel.org/all/20220524025139.40212-1-wangkefeng.w...@huawei.com/
> >
> > That patch looks sane to me, but I guess Guenter would need to check
>
> I still see the failure in my builds with this patch. But surprisingly
> I dont see the build failure (with or without this patch) with gcc-12,
> only with gcc-11.

Arrghs. "build failure"?

So is there another problem than the runtime issue that Guenter reports:

  OF: amba_device_add() failed (-19) for /amba/smc@1010

in this area? That patch looks harmless from a build standpoint, but
that's not saying much, so can you please quote the actual build
failure here?

  Linus


Re: Linux 5.19-rc6

2022-07-13 Thread Sudip Mukherjee
On Wed, Jul 13, 2022 at 9:42 PM Linus Torvalds
 wrote:
>
> On Wed, Jul 13, 2022 at 12:49 PM Russell King (Oracle)
>  wrote:
> >
> > There may be a patch that solves that, but it's never been submitted to
> > my patch system:
> >
> > https://lore.kernel.org/all/20220524025139.40212-1-wangkefeng.w...@huawei.com/
>
> That patch looks sane to me, but I guess Guenter would need to check

I still see the failure in my builds with this patch. But surprisingly
I don't see the build failure (with or without this patch) with gcc-12,
only with gcc-11.


-- 
Regards
Sudip


Re: Linux 5.19-rc6

2022-07-13 Thread Linus Torvalds
On Wed, Jul 13, 2022 at 2:01 PM Alex Deucher  wrote:
>
> If you want to apply Guenter's original patch:
> https://patchwork.freedesktop.org/patch/490184/
> That's fine with me.

Honestly, by this time I feel that it's too little, too late.

The ppc people apparently didn't care at all about the fact that this
driver didn't compile.

At least Michael Ellerman and Daniel Axtens were cc'd on that thread
with the proposed fix originally.

I don't see any replies from ppc people as to why it happened, even
though apparently a bog-standard "make allmodconfig" just doesn't
build.

I'd try it myself, since I do have a cross-build environment for some
earlier cross-build verification I did.

But since my upgrade to F36 it now uses gcc-12, and possibly due to
that I get hundreds of errors long before I get to any drm drivers:

  Cannot find symbol for section 19: .text.create_section_mapping.
  arch/powerpc/mm/mem.o: failed
  ...
  Cannot find symbol for section 19: .text.cpu_show_meltdown.
  drivers/base/cpu.o: failed
  Error: External symbol 'memset' referenced from prom_init.c

this cross environment used to work for me, but something changed (I
mention gcc-12, but honestly, that's based on nothing at all, except
for the few problems it caused on x86-64. It could be something
entirely unrelated, but it does look like some bad interaction with
-ffunction-sections).

So considering that the ppc people ignored this whole issue since the
merge window, I think it's entirely unreasonable to then apply a
ppc-specific patch for this at this time, when people literally asked
"why is this needed", and there was no reply from the powerpc side.

Does any of that sound like "we should support this driver on powerpc" to you?

 Linus


Re: [PATCH] drm/amd/display: Add missing hard-float compile flags for PPC64 builds

2022-07-13 Thread Guenter Roeck

On 7/13/22 13:57, Alex Deucher wrote:

On Thu, Jun 30, 2022 at 5:01 PM Rodrigo Siqueira Jordao
 wrote:




On 2022-06-18 19:27, Guenter Roeck wrote:

ppc:allmodconfig builds fail with the following error.

powerpc64-linux-ld:
   drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o
   uses hard float,
   drivers/gpu/drm/amd/amdgpu/../display/dc/dcn31/dcn31_resource.o
   uses soft float
powerpc64-linux-ld:
   failed to merge target specific data of file
   drivers/gpu/drm/amd/amdgpu/../display/dc/dcn31/dcn31_resource.o
powerpc64-linux-ld:
   drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o
   uses hard float,
   drivers/gpu/drm/amd/amdgpu/../display/dc/dcn315/dcn315_resource.o
   uses soft float
powerpc64-linux-ld:
   failed to merge target specific data of
   file drivers/gpu/drm/amd/amdgpu/../display/dc/dcn315/dcn315_resource.o
powerpc64-linux-ld:
   drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o
   uses hard float,
   drivers/gpu/drm/amd/amdgpu/../display/dc/dcn316/dcn316_resource.o
   uses soft float
powerpc64-linux-ld:
   failed to merge target specific data of file
   drivers/gpu/drm/amd/amdgpu/../display/dc/dcn316/dcn316_resource.o

The problem was introduced with commit 41b7a347bf14 ("powerpc: Book3S
64-bit outline-only KASAN support") which adds support for KASAN. This
commit in turn enables DRM_AMD_DC_DCN because KCOV_INSTRUMENT_ALL and
KCOV_ENABLE_COMPARISONS are no longer enabled. As result, new files are
compiled which lack the selection of hard-float.

Fixes: 41b7a347bf14 ("powerpc: Book3S 64-bit outline-only KASAN support")
Cc: Michael Ellerman 
Cc: Daniel Axtens 
Signed-off-by: Guenter Roeck 
---
   drivers/gpu/drm/amd/display/dc/dcn31/Makefile  | 4 
   drivers/gpu/drm/amd/display/dc/dcn315/Makefile | 4 
   drivers/gpu/drm/amd/display/dc/dcn316/Makefile | 4 
   3 files changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile 
b/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
index ec041e3cda30..74be02114ae4 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
@@ -15,6 +15,10 @@ DCN31 = dcn31_resource.o dcn31_hubbub.o dcn31_hwseq.o 
dcn31_init.o dcn31_hubp.o
   dcn31_apg.o dcn31_hpo_dp_stream_encoder.o dcn31_hpo_dp_link_encoder.o \
   dcn31_afmt.o dcn31_vpg.o

+ifdef CONFIG_PPC64
+CFLAGS_$(AMDDALPATH)/dc/dcn31/dcn31_resource.o := -mhard-float -maltivec
+endif
+
   AMD_DAL_DCN31 = $(addprefix $(AMDDALPATH)/dc/dcn31/,$(DCN31))

   AMD_DISPLAY_FILES += $(AMD_DAL_DCN31)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn315/Makefile 
b/drivers/gpu/drm/amd/display/dc/dcn315/Makefile
index 59381d24800b..1395c1ced8c5 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn315/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dcn315/Makefile
@@ -25,6 +25,10 @@

   DCN315 = dcn315_resource.o

+ifdef CONFIG_PPC64
+CFLAGS_$(AMDDALPATH)/dc/dcn315/dcn315_resource.o := -mhard-float -maltivec
+endif
+
   AMD_DAL_DCN315 = $(addprefix $(AMDDALPATH)/dc/dcn315/,$(DCN315))

   AMD_DISPLAY_FILES += $(AMD_DAL_DCN315)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn316/Makefile 
b/drivers/gpu/drm/amd/display/dc/dcn316/Makefile
index 819d44a9439b..c3d2dd78f1e2 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn316/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dcn316/Makefile
@@ -25,6 +25,10 @@

   DCN316 = dcn316_resource.o

+ifdef CONFIG_PPC64
+CFLAGS_$(AMDDALPATH)/dc/dcn316/dcn316_resource.o := -mhard-float -maltivec
+endif
+
   AMD_DAL_DCN316 = $(addprefix $(AMDDALPATH)/dc/dcn316/,$(DCN316))

   AMD_DISPLAY_FILES += $(AMD_DAL_DCN316)


Hi,

I don't want to re-introduce those FPU flags for DCN31/DCN314/DCN316
since we fully isolate FPU operations for those ASICs inside the DML


I don't understand why we don't need to add the hard-float flags back
on the other DCN blocks.  Did we miss something in the DML cleanup for
DCN 3.1.x?  Anyway, at this point, the patch is:
Acked-by: Alex Deucher 
We can sort the rest out for 5.20.



The problem is not the FPU operations, but the fact that soft-float
and hard-float compiled code is linked together. The soft-float and
hard-float ABIs on powerpc are not compatible, so one ends up with
an object file which is partially soft-float and partially hard-float
compiled and thus uses different ABIs. That can only create chaos,
so the linker complains about it.

Guenter
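
The linker behaviour described above can be illustrated with a small model. This is only a sketch under stated assumptions, not the binutils implementation: the enum, `merge_fp_abi()` and `link_objects()` are hypothetical names invented here. It assumes each object file carries one floating-point ABI tag, that an object which passes no floating-point values is compatible with either ABI, and that the link is refused as soon as a hard-float object meets a soft-float one.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified per-object floating-point ABI tag (hypothetical, for illustration). */
enum fp_abi {
    FP_ABI_NONE = 0, /* object passes no FP values; compatible with either ABI */
    FP_ABI_HARD = 1, /* FP values are passed in FPU registers */
    FP_ABI_SOFT = 2, /* FP values are passed in general-purpose registers */
};

/*
 * Merge the tag of one input object into the tag accumulated for the output.
 * Returns false when the two ABIs cannot be combined.
 */
bool merge_fp_abi(enum fp_abi *out, enum fp_abi in)
{
    if (in == FP_ABI_NONE)
        return true;            /* nothing to reconcile */
    if (*out == FP_ABI_NONE) {
        *out = in;              /* first object that states an ABI decides */
        return true;
    }
    return *out == in;          /* hard + soft is rejected */
}

/* "Link" n objects: succeeds only if all stated FP ABIs agree. */
bool link_objects(const enum fp_abi *objs, int n)
{
    enum fp_abi out = FP_ABI_NONE;

    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        if (!merge_fp_abi(&out, objs[i]))
            return false;
    }
    return true;
}
```

In this model, a hard-float display_mode_lib.o followed by a soft-float dcn31_resource.o makes link_objects() return false, which corresponds to the "uses hard float ... uses soft float" failure in the build log, regardless of whether the second object performs any FPU operations itself.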


Re: [PATCH] drm/amd/display: Add missing hard-float compile flags for PPC64 builds

2022-07-13 Thread Alex Deucher
On Wed, Jul 13, 2022 at 5:18 PM Guenter Roeck  wrote:
>
> On 7/13/22 13:57, Alex Deucher wrote:
> > On Thu, Jun 30, 2022 at 5:01 PM Rodrigo Siqueira Jordao
> >  wrote:
> >>
> >>
> >>
> >> On 2022-06-18 19:27, Guenter Roeck wrote:
> >>> ppc:allmodconfig builds fail with the following error.
> >>>
> >>> powerpc64-linux-ld:
> >>>drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o
> >>>uses hard float,
> >>>drivers/gpu/drm/amd/amdgpu/../display/dc/dcn31/dcn31_resource.o
> >>>uses soft float
> >>> powerpc64-linux-ld:
> >>>failed to merge target specific data of file
> >>>drivers/gpu/drm/amd/amdgpu/../display/dc/dcn31/dcn31_resource.o
> >>> powerpc64-linux-ld:
> >>>drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o
> >>>uses hard float,
> >>>drivers/gpu/drm/amd/amdgpu/../display/dc/dcn315/dcn315_resource.o
> >>>uses soft float
> >>> powerpc64-linux-ld:
> >>>failed to merge target specific data of
> >>>file 
> >>> drivers/gpu/drm/amd/amdgpu/../display/dc/dcn315/dcn315_resource.o
> >>> powerpc64-linux-ld:
> >>>drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o
> >>>uses hard float,
> >>>drivers/gpu/drm/amd/amdgpu/../display/dc/dcn316/dcn316_resource.o
> >>>uses soft float
> >>> powerpc64-linux-ld:
> >>>failed to merge target specific data of file
> >>>drivers/gpu/drm/amd/amdgpu/../display/dc/dcn316/dcn316_resource.o
> >>>
> >>> The problem was introduced with commit 41b7a347bf14 ("powerpc: Book3S
> >>> 64-bit outline-only KASAN support") which adds support for KASAN. This
> >>> commit in turn enables DRM_AMD_DC_DCN because KCOV_INSTRUMENT_ALL and
> >>> KCOV_ENABLE_COMPARISONS are no longer enabled. As result, new files are
> >>> compiled which lack the selection of hard-float.
> >>>
> >>> Fixes: 41b7a347bf14 ("powerpc: Book3S 64-bit outline-only KASAN support")
> >>> Cc: Michael Ellerman 
> >>> Cc: Daniel Axtens 
> >>> Signed-off-by: Guenter Roeck 
> >>> ---
> >>>drivers/gpu/drm/amd/display/dc/dcn31/Makefile  | 4 
> >>>drivers/gpu/drm/amd/display/dc/dcn315/Makefile | 4 
> >>>drivers/gpu/drm/amd/display/dc/dcn316/Makefile | 4 
> >>>3 files changed, 12 insertions(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile 
> >>> b/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
> >>> index ec041e3cda30..74be02114ae4 100644
> >>> --- a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
> >>> +++ b/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
> >>> @@ -15,6 +15,10 @@ DCN31 = dcn31_resource.o dcn31_hubbub.o dcn31_hwseq.o 
> >>> dcn31_init.o dcn31_hubp.o
> >>>dcn31_apg.o dcn31_hpo_dp_stream_encoder.o 
> >>> dcn31_hpo_dp_link_encoder.o \
> >>>dcn31_afmt.o dcn31_vpg.o
> >>>
> >>> +ifdef CONFIG_PPC64
> >>> +CFLAGS_$(AMDDALPATH)/dc/dcn31/dcn31_resource.o := -mhard-float -maltivec
> >>> +endif
> >>> +
> >>>AMD_DAL_DCN31 = $(addprefix $(AMDDALPATH)/dc/dcn31/,$(DCN31))
> >>>
> >>>AMD_DISPLAY_FILES += $(AMD_DAL_DCN31)
> >>> diff --git a/drivers/gpu/drm/amd/display/dc/dcn315/Makefile 
> >>> b/drivers/gpu/drm/amd/display/dc/dcn315/Makefile
> >>> index 59381d24800b..1395c1ced8c5 100644
> >>> --- a/drivers/gpu/drm/amd/display/dc/dcn315/Makefile
> >>> +++ b/drivers/gpu/drm/amd/display/dc/dcn315/Makefile
> >>> @@ -25,6 +25,10 @@
> >>>
> >>>DCN315 = dcn315_resource.o
> >>>
> >>> +ifdef CONFIG_PPC64
> >>> +CFLAGS_$(AMDDALPATH)/dc/dcn315/dcn315_resource.o := -mhard-float 
> >>> -maltivec
> >>> +endif
> >>> +
> >>>AMD_DAL_DCN315 = $(addprefix $(AMDDALPATH)/dc/dcn315/,$(DCN315))
> >>>
> >>>AMD_DISPLAY_FILES += $(AMD_DAL_DCN315)
> >>> diff --git a/drivers/gpu/drm/amd/display/dc/dcn316/Makefile 
> >>> b/drivers/gpu/drm/amd/display/dc/dcn316/Makefile
> >>> index 819d44a9439b..c3d2dd78f1e2 100644
> >>> --- a/drivers/gpu/drm/amd/display/dc/dcn316/Makefile
> >>> +++ b/drivers/gpu/drm/amd/display/dc/dcn316/Makefile
> >>> @@ -25,6 +25,10 @@
> >>>
> >>>DCN316 = dcn316_resource.o
> >>>
> >>> +ifdef CONFIG_PPC64
> >>> +CFLAGS_$(AMDDALPATH)/dc/dcn316/dcn316_resource.o := -mhard-float 
> >>> -maltivec
> >>> +endif
> >>> +
> >>>AMD_DAL_DCN316 = $(addprefix $(AMDDALPATH)/dc/dcn316/,$(DCN316))
> >>>
> >>>AMD_DISPLAY_FILES += $(AMD_DAL_DCN316)
> >>
> >> Hi,
> >>
> >> I don't want to re-introduce those FPU flags for DCN31/DCN314/DCN316
> >> since we fully isolate FPU operations for those ASICs inside the DML
> >
> > I don't understand why we don't need to add the hard-float flags back
> > on the other DCN blocks.  Did we miss something in the DML cleanup for
> > DCN 3.1.x?  Anyway, at this point, the patch is:
> > Acked-by: Alex Deucher 
> > We can sort the rest out for 5.20.
> >
>
> The problem is not the FPU operations, but the fact that soft-float
> and hard-float compiled code is linked together. The soft-float and
> hard-float ABIs on powerpc 

Re: Linux 5.19-rc6

2022-07-13 Thread Alex Deucher
On Wed, Jul 13, 2022 at 4:46 PM Guenter Roeck  wrote:
>
> On 7/13/22 12:36, Linus Torvalds wrote:
> > On Tue, Jul 12, 2022 at 10:07 PM Guenter Roeck  wrote:
> >>
> >> Same problems as every week.
> >>
> >> Building powerpc:allmodconfig ... failed
> >
> > Ok, this has been going on since -rc1, which is much too long.
> >
> > From your patch submission that was rejected:
> >
> >> The problem was introduced with commit 41b7a347bf14 ("powerpc: Book3S
> >> 64-bit outline-only KASAN support") which adds support for KASAN. This
> >> commit in turn enables DRM_AMD_DC_DCN because KCOV_INSTRUMENT_ALL and
> >> KCOV_ENABLE_COMPARISONS are no longer enabled. As result, new files are
> >> compiled which lack the selection of hard-float.
> >
> > And considering that neither the ppc people nor the drm people seem
> > interested in fixing this, and it doesn't revert cleanly I think the
> > sane solution seems to be to just remove PPC64 support for DRM_AMD_DC
> > entirely.
> >
> > IOW, does something like this (obviously not a proper patch, but you
> > get the idea) fix the ppc build for you?
> >
> >@@ -6,7 +6,7 @@ config DRM_AMD_DC
> >bool "AMD DC - Enable new display engine"
> >default y
> >select SND_HDA_COMPONENT if SND_HDA_CORE
> >-   select DRM_AMD_DC_DCN if (X86 || PPC64) &&
> > !(KCOV_INSTRUMENT_ALL && KCOV_ENABLE_COMPARISONS)
> >+   select DRM_AMD_DC_DCN if X86 && !(KCOV_INSTRUMENT_ALL &&
> > KCOV_ENABLE_COMPARISONS)
> >help
> >  Choose this option if you want to use the new display engine
> >  support for AMDGPU. This adds required support for Vega and
> >
>
> It does, but I can't imagine that the drm or ppc people would be happy
> about it.

If you want to apply Guenter's original patch:
https://patchwork.freedesktop.org/patch/490184/
That's fine with me.  It just kind of slipped off my radar.  We can
dig deeper on a better fix next cycle.
Acked-by: Alex Deucher 

>
> Guenter


Re: Linux 5.19-rc6

2022-07-13 Thread Linus Torvalds
On Wed, Jul 13, 2022 at 1:46 PM Guenter Roeck  wrote:
>
> It does, but I can't imagine that the drm or ppc people would be happy
> about it.

When something has been reported as not building for five weeks?

I have zero f's to give at that point about their "happiness".

 Linus


Re: [PATCH] drm/amd/display: Add missing hard-float compile flags for PPC64 builds

2022-07-13 Thread Alex Deucher
On Thu, Jun 30, 2022 at 5:01 PM Rodrigo Siqueira Jordao
 wrote:
>
>
>
> On 2022-06-18 19:27, Guenter Roeck wrote:
> > ppc:allmodconfig builds fail with the following error.
> >
> > powerpc64-linux-ld:
> >   drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o
> >   uses hard float,
> >   drivers/gpu/drm/amd/amdgpu/../display/dc/dcn31/dcn31_resource.o
> >   uses soft float
> > powerpc64-linux-ld:
> >   failed to merge target specific data of file
> >   drivers/gpu/drm/amd/amdgpu/../display/dc/dcn31/dcn31_resource.o
> > powerpc64-linux-ld:
> >   drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o
> >   uses hard float,
> >   drivers/gpu/drm/amd/amdgpu/../display/dc/dcn315/dcn315_resource.o
> >   uses soft float
> > powerpc64-linux-ld:
> >   failed to merge target specific data of
> >   file drivers/gpu/drm/amd/amdgpu/../display/dc/dcn315/dcn315_resource.o
> > powerpc64-linux-ld:
> >   drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o
> >   uses hard float,
> >   drivers/gpu/drm/amd/amdgpu/../display/dc/dcn316/dcn316_resource.o
> >   uses soft float
> > powerpc64-linux-ld:
> >   failed to merge target specific data of file
> >   drivers/gpu/drm/amd/amdgpu/../display/dc/dcn316/dcn316_resource.o
> >
> > The problem was introduced with commit 41b7a347bf14 ("powerpc: Book3S
> > 64-bit outline-only KASAN support") which adds support for KASAN. This
> > commit in turn enables DRM_AMD_DC_DCN because KCOV_INSTRUMENT_ALL and
> > KCOV_ENABLE_COMPARISONS are no longer enabled. As result, new files are
> > compiled which lack the selection of hard-float.
> >
> > Fixes: 41b7a347bf14 ("powerpc: Book3S 64-bit outline-only KASAN support")
> > Cc: Michael Ellerman 
> > Cc: Daniel Axtens 
> > Signed-off-by: Guenter Roeck 
> > ---
> >   drivers/gpu/drm/amd/display/dc/dcn31/Makefile  | 4 
> >   drivers/gpu/drm/amd/display/dc/dcn315/Makefile | 4 
> >   drivers/gpu/drm/amd/display/dc/dcn316/Makefile | 4 
> >   3 files changed, 12 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile 
> > b/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
> > index ec041e3cda30..74be02114ae4 100644
> > --- a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
> > +++ b/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
> > @@ -15,6 +15,10 @@ DCN31 = dcn31_resource.o dcn31_hubbub.o dcn31_hwseq.o 
> > dcn31_init.o dcn31_hubp.o
> >   dcn31_apg.o dcn31_hpo_dp_stream_encoder.o dcn31_hpo_dp_link_encoder.o 
> > \
> >   dcn31_afmt.o dcn31_vpg.o
> >
> > +ifdef CONFIG_PPC64
> > +CFLAGS_$(AMDDALPATH)/dc/dcn31/dcn31_resource.o := -mhard-float -maltivec
> > +endif
> > +
> >   AMD_DAL_DCN31 = $(addprefix $(AMDDALPATH)/dc/dcn31/,$(DCN31))
> >
> >   AMD_DISPLAY_FILES += $(AMD_DAL_DCN31)
> > diff --git a/drivers/gpu/drm/amd/display/dc/dcn315/Makefile 
> > b/drivers/gpu/drm/amd/display/dc/dcn315/Makefile
> > index 59381d24800b..1395c1ced8c5 100644
> > --- a/drivers/gpu/drm/amd/display/dc/dcn315/Makefile
> > +++ b/drivers/gpu/drm/amd/display/dc/dcn315/Makefile
> > @@ -25,6 +25,10 @@
> >
> >   DCN315 = dcn315_resource.o
> >
> > +ifdef CONFIG_PPC64
> > +CFLAGS_$(AMDDALPATH)/dc/dcn315/dcn315_resource.o := -mhard-float -maltivec
> > +endif
> > +
> >   AMD_DAL_DCN315 = $(addprefix $(AMDDALPATH)/dc/dcn315/,$(DCN315))
> >
> >   AMD_DISPLAY_FILES += $(AMD_DAL_DCN315)
> > diff --git a/drivers/gpu/drm/amd/display/dc/dcn316/Makefile 
> > b/drivers/gpu/drm/amd/display/dc/dcn316/Makefile
> > index 819d44a9439b..c3d2dd78f1e2 100644
> > --- a/drivers/gpu/drm/amd/display/dc/dcn316/Makefile
> > +++ b/drivers/gpu/drm/amd/display/dc/dcn316/Makefile
> > @@ -25,6 +25,10 @@
> >
> >   DCN316 = dcn316_resource.o
> >
> > +ifdef CONFIG_PPC64
> > +CFLAGS_$(AMDDALPATH)/dc/dcn316/dcn316_resource.o := -mhard-float -maltivec
> > +endif
> > +
> >   AMD_DAL_DCN316 = $(addprefix $(AMDDALPATH)/dc/dcn316/,$(DCN316))
> >
> >   AMD_DISPLAY_FILES += $(AMD_DAL_DCN316)
>
> Hi,
>
> I don't want to re-introduce those FPU flags for DCN31/DCN314/DCN316
> since we fully isolate FPU operations for those ASICs inside the DML

I don't understand why we don't need to add the hard-float flags back
on the other DCN blocks.  Did we miss something in the DML cleanup for
DCN 3.1.x?  Anyway, at this point, the patch is:
Acked-by: Alex Deucher 
We can sort the rest out for 5.20.

Alex

> folder. Notice that we have the PPC64 in the DML Makefile:
>
> https://gitlab.freedesktop.org/agd5f/linux/-/blob/amd-staging-drm-next/drivers/gpu/drm/amd/display/dc/dml/Makefile
>
> Could you share what you see without your patch in the
> amd-staging-drm-next? Also:
> * Are you using cross-compilation? If so, could you share your setup?
> * Which GCC/Clang version are you using?
>
> Thanks
> Siqueira
>


Re: Linux 5.19-rc6

2022-07-13 Thread Guenter Roeck

On 7/13/22 12:36, Linus Torvalds wrote:

On Tue, Jul 12, 2022 at 10:07 PM Guenter Roeck  wrote:


Same problems as every week.

Building powerpc:allmodconfig ... failed


Ok, this has been going on since -rc1, which is much too long.


From your patch submission that was rejected:



The problem was introduced with commit 41b7a347bf14 ("powerpc: Book3S
64-bit outline-only KASAN support") which adds support for KASAN. This
commit in turn enables DRM_AMD_DC_DCN because KCOV_INSTRUMENT_ALL and
KCOV_ENABLE_COMPARISONS are no longer enabled. As result, new files are
compiled which lack the selection of hard-float.


And considering that neither the ppc people nor the drm people seem
interested in fixing this, and it doesn't revert cleanly I think the
sane solution seems to be to just remove PPC64 support for DRM_AMD_DC
entirely.

IOW, does something like this (obviously not a proper patch, but you
get the idea) fix the ppc build for you?

   @@ -6,7 +6,7 @@ config DRM_AMD_DC
   bool "AMD DC - Enable new display engine"
   default y
   select SND_HDA_COMPONENT if SND_HDA_CORE
   -   select DRM_AMD_DC_DCN if (X86 || PPC64) &&
!(KCOV_INSTRUMENT_ALL && KCOV_ENABLE_COMPARISONS)
   +   select DRM_AMD_DC_DCN if X86 && !(KCOV_INSTRUMENT_ALL &&
KCOV_ENABLE_COMPARISONS)
   help
 Choose this option if you want to use the new display engine
 support for AMDGPU. This adds required support for Vega and



It does, but I can't imagine that the drm or ppc people would be happy
about it.

Guenter


Re: Linux 5.19-rc6

2022-07-13 Thread Guenter Roeck

On 7/13/22 13:22, Linus Torvalds wrote:

On Wed, Jul 13, 2022 at 12:53 PM Alex Deucher  wrote:


Does this patch fix it?
https://patchwork.freedesktop.org/patch/493799/


Guenter? Willing to check this one too for your setup, and we can
hopefully close down both issues?



No, that fixes a different problem (I tried). We (Google) are trying to run
tests with KCOV enabled images on AMD hardware which requires the new display
engine, and we need that patch to enable it. That is unrelated to the PPC
build problem.

Guenter


Re: Linux 5.19-rc6

2022-07-13 Thread Linus Torvalds
On Wed, Jul 13, 2022 at 1:40 PM Guenter Roeck  wrote:
>
> That patch is (and has been) in linux-next for a long time,
> as commit d2ca1fd2bc70, and with the following tags.
>
>  Fixes: 7719a68b2fa4 ("ARM: 9192/1: amba: fix memory leak in 
> amba_device_try_add()")
>  Reported-by: Guenter Roeck 
>  Tested-by: Guenter Roeck 
>  Signed-off-by: Kefeng Wang 
>  Signed-off-by: Russell King (Oracle) 
>
> So, yes, it fixes the problem. I don't know where it is pulled from, though.
> I thought that it is from Russell's tree, given his Signed-off-by:,
> but I never really checked.

Heh. Yeah, with that sign-off, I bet it's in Russell's queue, but it
just ended up in the "for next release" branch. Russell?

 Linus


Re: Linux 5.19-rc6

2022-07-13 Thread Guenter Roeck

On 7/13/22 13:21, Linus Torvalds wrote:

On Wed, Jul 13, 2022 at 12:49 PM Russell King (Oracle)
 wrote:


There may be a patch that solves that, but it's never been submitted to
my patch system:

https://lore.kernel.org/all/20220524025139.40212-1-wangkefeng.w...@huawei.com/


That patch looks sane to me, but I guess Guenter would need to check
... Guenter?



That patch is (and has been) in linux-next for a long time,
as commit d2ca1fd2bc70, and with the following tags.

Fixes: 7719a68b2fa4 ("ARM: 9192/1: amba: fix memory leak in 
amba_device_try_add()")
Reported-by: Guenter Roeck 
Tested-by: Guenter Roeck 
Signed-off-by: Kefeng Wang 
Signed-off-by: Russell King (Oracle) 

So, yes, it fixes the problem. I don't know where it is pulled from, though.
I thought that it is from Russell's tree, given his Signed-off-by:,
but I never really checked.

Guenter


Re: Linux 5.19-rc6

2022-07-13 Thread Linus Torvalds
On Wed, Jul 13, 2022 at 12:53 PM Alex Deucher  wrote:
>
> Does this patch fix it?
> https://patchwork.freedesktop.org/patch/493799/

Guenter? Willing to check this one too for your setup, and we can
hopefully close down both issues?

 Linus


Re: Linux 5.19-rc6

2022-07-13 Thread Linus Torvalds
On Wed, Jul 13, 2022 at 12:49 PM Russell King (Oracle)
 wrote:
>
> There may be a patch that solves that, but it's never been submitted to
> my patch system:
>
> https://lore.kernel.org/all/20220524025139.40212-1-wangkefeng.w...@huawei.com/

That patch looks sane to me, but I guess Guenter would need to check
... Guenter?

 Linus


Re: [PATCH] drm/amd/display: Enable building new display engine with KCOV enabled

2022-07-13 Thread Harry Wentland
On 2022-07-12 18:42, Guenter Roeck wrote:
> The new display engine uses floating point math, which is not supported
> by KCOV. Commit 9d1d02ff3678 ("drm/amd/display: Don't build DCN1 when kcov
> is enabled") tried to work around the problem by disabling
> CONFIG_DRM_AMD_DC_DCN if KCOV_INSTRUMENT_ALL and KCOV_ENABLE_COMPARISONS
> are enabled. The result is that KCOV can not be enabled on systems which
> require this display engine. A much simpler and less invasive solution is
> to disable KCOV selectively when compiling the display engine while
> keeping it enabled for the rest of the kernel.
> 
> Fixes: 9d1d02ff3678 ("drm/amd/display: Don't build DCN1 when kcov is enabled")
> Cc: Arnd Bergmann 
> Cc: Leo Li 
> Signed-off-by: Guenter Roeck 

Reviewed-by: Harry Wentland 

Harry

> ---
>  drivers/gpu/drm/amd/display/Kconfig | 2 +-
>  drivers/gpu/drm/amd/display/dc/Makefile | 3 +++
>  2 files changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/Kconfig 
> b/drivers/gpu/drm/amd/display/Kconfig
> index b4029c0d5d8c..96cbc87f7b6b 100644
> --- a/drivers/gpu/drm/amd/display/Kconfig
> +++ b/drivers/gpu/drm/amd/display/Kconfig
> @@ -6,7 +6,7 @@ config DRM_AMD_DC
>   bool "AMD DC - Enable new display engine"
>   default y
>   select SND_HDA_COMPONENT if SND_HDA_CORE
> - select DRM_AMD_DC_DCN if (X86 || PPC64) && !(KCOV_INSTRUMENT_ALL && 
> KCOV_ENABLE_COMPARISONS)
> + select DRM_AMD_DC_DCN if (X86 || PPC64)
>   help
> Choose this option if you want to use the new display engine
> support for AMDGPU. This adds required support for Vega and
> diff --git a/drivers/gpu/drm/amd/display/dc/Makefile 
> b/drivers/gpu/drm/amd/display/dc/Makefile
> index b4eca0236435..b801973749d2 100644
> --- a/drivers/gpu/drm/amd/display/dc/Makefile
> +++ b/drivers/gpu/drm/amd/display/dc/Makefile
> @@ -26,6 +26,9 @@
>  DC_LIBS = basics bios dml clk_mgr dce gpio irq link virtual
>  
>  ifdef CONFIG_DRM_AMD_DC_DCN
> +
> +KCOV_INSTRUMENT := n
> +
>  DC_LIBS += dcn20
>  DC_LIBS += dsc
>  DC_LIBS += dcn10
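
The three Kconfig conditions discussed in this thread can be compared side by side with a small truth-function sketch. The struct and function names below are hypothetical and only model the `select DRM_AMD_DC_DCN if ...` expressions quoted in the thread; this is not kernel code and it ignores every other dependency of the option.

```c
#include <assert.h>
#include <stdbool.h>

/* Only the config symbols that appear in the quoted "select ... if" lines. */
struct kconfig {
    bool x86;
    bool ppc64;
    bool kcov_instrument_all;
    bool kcov_enable_comparisons;
};

/* Condition before the patch: (X86 || PPC64) && !(KCOV_INSTRUMENT_ALL && KCOV_ENABLE_COMPARISONS) */
bool dcn_selected_current(const struct kconfig *c)
{
    assert(c);
    return (c->x86 || c->ppc64) &&
           !(c->kcov_instrument_all && c->kcov_enable_comparisons);
}

/* Condition with the KCOV patch above applied: (X86 || PPC64) */
bool dcn_selected_kcov_patch(const struct kconfig *c)
{
    assert(c);
    return c->x86 || c->ppc64;
}

/* Condition in the "drop PPC64" proposal: X86 && !(KCOV_INSTRUMENT_ALL && KCOV_ENABLE_COMPARISONS) */
bool dcn_selected_no_ppc64(const struct kconfig *c)
{
    assert(c);
    return c->x86 &&
           !(c->kcov_instrument_all && c->kcov_enable_comparisons);
}
```

With ppc64 set and both KCOV symbols off, the first function returns true, which matches the report that the DCN files started being built in powerpc:allmodconfig once those symbols were no longer enabled. The second function shows why the KCOV patch does not address the PPC64 link failure: it keeps DCN selected on PPC64 in every case.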



Re: Linux 5.19-rc6

2022-07-13 Thread Russell King (Oracle)
On Wed, Jul 13, 2022 at 12:36:50PM -0700, Linus Torvalds wrote:
> On Tue, Jul 12, 2022 at 10:07 PM Guenter Roeck  wrote:
> > OF: amba_device_add() failed (-19) for /amba/smc@1010
> > [ cut here ]
> > WARNING: CPU: 0 PID: 1 at lib/refcount.c:28 
> > of_platform_bus_create+0x33c/0x3dc
> > refcount_t: underflow; use-after-free.
> 
> This too has been going on since -rc1, but it's not obvious what caused it.
> 
> At a guess, looking around the amba changes, I'm assuming it's
> 
>   7719a68b2fa4 ("ARM: 9192/1: amba: fix memory leak in amba_device_try_add()")
> 
> Does reverting that commit make it go away?

There may be a patch that solves that, but it's never been submitted to
my patch system:

https://lore.kernel.org/all/20220524025139.40212-1-wangkefeng.w...@huawei.com/

I'm sorry, but I'm utterly crap at picking up patches off mailing lists,
so if stuff doesn't end up in the patch system, it gets missed.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: Linux 5.19-rc6

2022-07-13 Thread Alex Deucher
On Wed, Jul 13, 2022 at 3:42 PM Linus Torvalds
 wrote:
>
> On Tue, Jul 12, 2022 at 10:07 PM Guenter Roeck  wrote:
> >
> > Same problems as every week.
> >
> > Building powerpc:allmodconfig ... failed
>
> Ok, this has been going on since -rc1, which is much too long.
>
> From your patch submission that was rejected:
>
> > The problem was introduced with commit 41b7a347bf14 ("powerpc: Book3S
> > 64-bit outline-only KASAN support") which adds support for KASAN. This
> > commit in turn enables DRM_AMD_DC_DCN because KCOV_INSTRUMENT_ALL and
> > KCOV_ENABLE_COMPARISONS are no longer enabled. As result, new files are
> > compiled which lack the selection of hard-float.
>
> And considering that neither the ppc people nor the drm people seem
> interested in fixing this, and it doesn't revert cleanly I think the
> sane solution seems to be to just remove PPC64 support for DRM_AMD_DC
> entirely.

Does this patch fix it?
https://patchwork.freedesktop.org/patch/493799/

Alex

>
> IOW, does something like this (obviously not a proper patch, but you
> get the idea) fix the ppc build for you?
>
>   @@ -6,7 +6,7 @@ config DRM_AMD_DC
>   bool "AMD DC - Enable new display engine"
>   default y
>   select SND_HDA_COMPONENT if SND_HDA_CORE
>   -   select DRM_AMD_DC_DCN if (X86 || PPC64) &&
> !(KCOV_INSTRUMENT_ALL && KCOV_ENABLE_COMPARISONS)
>   +   select DRM_AMD_DC_DCN if X86 && !(KCOV_INSTRUMENT_ALL &&
> KCOV_ENABLE_COMPARISONS)
>   help
> Choose this option if you want to use the new display engine
> support for AMDGPU. This adds required support for Vega and
>
> > OF: amba_device_add() failed (-19) for /amba/smc@1010
> > [ cut here ]
> > WARNING: CPU: 0 PID: 1 at lib/refcount.c:28 
> > of_platform_bus_create+0x33c/0x3dc
> > refcount_t: underflow; use-after-free.
>
> This too has been going on since -rc1, but it's not obvious what caused it.
>
> At a guess, looking around the amba changes, I'm assuming it's
>
>   7719a68b2fa4 ("ARM: 9192/1: amba: fix memory leak in amba_device_try_add()")
>
> Does reverting that commit make it go away?
>
> Linus


Re: Linux 5.19-rc6

2022-07-13 Thread Linus Torvalds
On Tue, Jul 12, 2022 at 10:07 PM Guenter Roeck  wrote:
>
> Same problems as every week.
>
> Building powerpc:allmodconfig ... failed

Ok, this has been going on since -rc1, which is much too long.

From your patch submission that was rejected:

> The problem was introduced with commit 41b7a347bf14 ("powerpc: Book3S
> 64-bit outline-only KASAN support") which adds support for KASAN. This
> commit in turn enables DRM_AMD_DC_DCN because KCOV_INSTRUMENT_ALL and
> KCOV_ENABLE_COMPARISONS are no longer enabled. As result, new files are
> compiled which lack the selection of hard-float.

And considering that neither the ppc people nor the drm people seem
interested in fixing this, and it doesn't revert cleanly I think the
sane solution seems to be to just remove PPC64 support for DRM_AMD_DC
entirely.

IOW, does something like this (obviously not a proper patch, but you
get the idea) fix the ppc build for you?

  @@ -6,7 +6,7 @@ config DRM_AMD_DC
  bool "AMD DC - Enable new display engine"
  default y
  select SND_HDA_COMPONENT if SND_HDA_CORE
  -   select DRM_AMD_DC_DCN if (X86 || PPC64) &&
!(KCOV_INSTRUMENT_ALL && KCOV_ENABLE_COMPARISONS)
  +   select DRM_AMD_DC_DCN if X86 && !(KCOV_INSTRUMENT_ALL &&
KCOV_ENABLE_COMPARISONS)
  help
Choose this option if you want to use the new display engine
support for AMDGPU. This adds required support for Vega and

> OF: amba_device_add() failed (-19) for /amba/smc@1010
> [ cut here ]
> WARNING: CPU: 0 PID: 1 at lib/refcount.c:28 of_platform_bus_create+0x33c/0x3dc
> refcount_t: underflow; use-after-free.

This too has been going on since -rc1, but it's not obvious what caused it.

At a guess, looking around the amba changes, I'm assuming it's

  7719a68b2fa4 ("ARM: 9192/1: amba: fix memory leak in amba_device_try_add()")

Does reverting that commit make it go away?

Linus


Re: [PATCH] drm/amdgpu: Get rid of amdgpu_job->external_hw_fence

2022-07-13 Thread Andrey Grodzovsky



On 2022-07-13 13:33, Christian König wrote:

Am 13.07.22 um 19:13 schrieb Andrey Grodzovsky:

This is a follow-up cleanup to [1]. See below refcount balancing
for calling amdgpu_job_submit_direct after this cleanup as far
as I calculated.

amdgpu_fence_emit
dma_fence_init 1
dma_fence_get(fence) 2
rcu_assign_pointer(*ptr, dma_fence_get(fence)) 3

---> amdgpu_job_submit_direct completes before fence signaled
    amdgpu_sa_bo_free
    (*sa_bo)->fence = dma_fence_get(fence) 4

    amdgpu_job_free
    dma_fence_put 3

    amdgpu_vcn_enc_get_destroy_msg
    *fence = dma_fence_get(f) 4
    dma_fence_put(f); 3

    amdgpu_vcn_enc_ring_test_ib
    dma_fence_put(fence) 2

    amdgpu_fence_process
    dma_fence_put 1

    amdgpu_sa_bo_remove_locked
    dma_fence_put 0

---> amdgpu_job_submit_direct completes after fence signaled
    amdgpu_fence_process
    dma_fence_put 2

    amdgpu_job_free
    dma_fence_put 1

    amdgpu_vcn_enc_get_destroy_msg
    *fence = dma_fence_get(f) 2
    dma_fence_put(f); 1

    amdgpu_vcn_enc_ring_test_ib
    dma_fence_put(fence) 0

[1] - 
https://patchwork.kernel.org/project/dri-devel/cover/20220624180955.485440-1-andrey.grodzov...@amd.com/
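
The refcount walk-through above can be replayed mechanically. The sketch below is not driver code: `struct toy_fence` and the `toy_*` helpers are hypothetical stand-ins that model only the reference count, and the comments map each step to the function named in the commit message. It assumes exactly the get/put sequence listed there for each of the two orderings.

```c
#include <assert.h>

/* Hypothetical stand-in for a dma_fence; only the reference count is modeled. */
struct toy_fence {
    int refcount;
};

static void toy_init(struct toy_fence *f) { f->refcount = 1; }

static void toy_get(struct toy_fence *f)
{
    assert(f->refcount > 0);    /* taking a reference on a dead fence is a bug */
    f->refcount++;
}

static void toy_put(struct toy_fence *f)
{
    assert(f->refcount > 0);    /* would be an underflow / use-after-free */
    f->refcount--;
}

/* The three references held once amdgpu_fence_emit has run. */
static void toy_emit(struct toy_fence *f)
{
    toy_init(f);    /* dma_fence_init                        -> 1 */
    toy_get(f);     /* dma_fence_get(fence)                  -> 2 */
    toy_get(f);     /* rcu_assign_pointer(..., get)          -> 3 */
}

/* amdgpu_job_submit_direct completes before the fence is signaled. */
int submit_completes_before_signal(void)
{
    struct toy_fence f;

    toy_emit(&f);
    toy_get(&f);    /* amdgpu_sa_bo_free                     -> 4 */
    toy_put(&f);    /* amdgpu_job_free                       -> 3 */
    toy_get(&f);    /* amdgpu_vcn_enc_get_destroy_msg: get   -> 4 */
    toy_put(&f);    /* amdgpu_vcn_enc_get_destroy_msg: put   -> 3 */
    toy_put(&f);    /* amdgpu_vcn_enc_ring_test_ib           -> 2 */
    toy_put(&f);    /* amdgpu_fence_process                  -> 1 */
    toy_put(&f);    /* amdgpu_sa_bo_remove_locked            -> 0 */
    return f.refcount;
}

/* amdgpu_job_submit_direct completes after the fence is signaled. */
int submit_completes_after_signal(void)
{
    struct toy_fence f;

    toy_emit(&f);
    toy_put(&f);    /* amdgpu_fence_process                  -> 2 */
    toy_put(&f);    /* amdgpu_job_free                       -> 1 */
    toy_get(&f);    /* amdgpu_vcn_enc_get_destroy_msg: get   -> 2 */
    toy_put(&f);    /* amdgpu_vcn_enc_get_destroy_msg: put   -> 1 */
    toy_put(&f);    /* amdgpu_vcn_enc_ring_test_ib           -> 0 */
    return f.refcount;
}
```

Both functions return 0 and none of the asserts fire, so under the listed sequence every reference is dropped exactly once in either ordering.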


Signed-off-by: Andrey Grodzovsky 
Suggested-by: Christian König 


Offhand that looks correct to me, but it could be that I'm missing
something as well.


Anyway I think I can give a Reviewed-by: Christian König for this.


Thanks,
Christian.



Pushed, thanks.

Andrey





---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  3 +--
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c    | 27 --
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.h    |  1 -
  3 files changed, 6 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

index 16faea7ed1cd..b79ee4ffb879 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5229,8 +5229,7 @@ int amdgpu_device_gpu_recover(struct 
amdgpu_device *adev,

   *
   * job->base holds a reference to parent fence
   */
-    if (job && (job->hw_fence.ops != NULL) &&
-    dma_fence_is_signaled(&job->hw_fence)) {
+    if (job && dma_fence_is_signaled(&job->hw_fence)) {
  job_signaled = true;
  dev_info(adev->dev, "Guilty job already signaled, skipping 
HW reset");

  goto skip_hw_reset;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c

index 6fa381ee5fa0..10fdd12cf853 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -134,16 +134,10 @@ void amdgpu_job_free_resources(struct 
amdgpu_job *job)

  {
  struct amdgpu_ring *ring = to_amdgpu_ring(job->base.sched);
  struct dma_fence *f;
-    struct dma_fence *hw_fence;
  unsigned i;
  -    if (job->hw_fence.ops == NULL)
-    hw_fence = job->external_hw_fence;
-    else
-    hw_fence = &job->hw_fence;
-
  /* use sched fence if available */
-    f = job->base.s_fence ? &job->base.s_fence->finished : hw_fence;
+    f = job->base.s_fence ? &job->base.s_fence->finished :  &job->hw_fence;

  for (i = 0; i < job->num_ibs; ++i)
  amdgpu_ib_free(ring->adev, &job->ibs[i], f);
  }
@@ -157,11 +151,7 @@ static void amdgpu_job_free_cb(struct 
drm_sched_job *s_job)

  amdgpu_sync_free(&job->sync);
  amdgpu_sync_free(&job->sched_sync);

-    /* only put the hw fence if has embedded fence */
-    if (job->hw_fence.ops != NULL)
-    dma_fence_put(&job->hw_fence);
-    else
-    kfree(job);
+    dma_fence_put(&job->hw_fence);
  }

  void amdgpu_job_free(struct amdgpu_job *job)
@@ -170,11 +160,7 @@ void amdgpu_job_free(struct amdgpu_job *job)
  amdgpu_sync_free(&job->sync);
  amdgpu_sync_free(&job->sched_sync);

-    /* only put the hw fence if has embedded fence */
-    if (job->hw_fence.ops != NULL)
-    dma_fence_put(&job->hw_fence);
-    else
-    kfree(job);
+    dma_fence_put(&job->hw_fence);
  }

  int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
@@ -204,15 +190,12 @@ int amdgpu_job_submit_direct(struct amdgpu_job 
*job, struct amdgpu_ring *ring,

  int r;

  job->base.sched = &ring->sched;
-    r = amdgpu_ib_schedule(ring, job->num_ibs, job->ibs, NULL, fence);
-    /* record external_hw_fence for direct submit */
-    job->external_hw_fence = dma_fence_get(*fence);
+    r = amdgpu_ib_schedule(ring, job->num_ibs, job->ibs, job, fence);
+
  if (r)
  return r;

  amdgpu_job_free(job);
-    dma_fence_put(*fence);
-
  return 0;
  }

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h

index d599c0540b46..babc0af751c2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
+++ 

RE: [PATCH 2/2] drm/amdgpu: use the same HDP flush registers for all nbio 2.3.x

2022-07-13 Thread Russell, Kent
[AMD Official Use Only - General]

Series is  Reviewed-by: Kent Russell 



> -Original Message-
> From: amd-gfx  On Behalf Of Alex
> Deucher
> Sent: Wednesday, July 13, 2022 2:01 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander 
> Subject: [PATCH 2/2] drm/amdgpu: use the same HDP flush registers for all nbio
> 2.3.x
> 
> Align RDNA2.x with other asics.  One HDP bit per SDMA instance,
> aligned with firmware.  This is effectively a revert of
> commit 369b7d04baf3 ("drm/amdgpu/nbio2.3: don't use GPU_HDP_FLUSH bit
> 12").
> On further discussions with the relevant hardware teams,
> re-align the bits for SDMA.
> 
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c |  5 +
>  drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c| 21 ---
>  drivers/gpu/drm/amd/amdgpu/nbio_v2_3.h|  1 -
>  3 files changed, 1 insertion(+), 26 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> index 4f83897a54a8..22144ba6c7ec 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
> @@ -2229,15 +2229,12 @@ int amdgpu_discovery_set_ip_blocks(struct
> amdgpu_device *adev)
>   case IP_VERSION(2, 3, 0):
>   case IP_VERSION(2, 3, 1):
>   case IP_VERSION(2, 3, 2):
> - adev->nbio.funcs = &nbio_v2_3_funcs;
> - adev->nbio.hdp_flush_reg = &nbio_v2_3_hdp_flush_reg;
> - break;
>   case IP_VERSION(3, 3, 0):
>   case IP_VERSION(3, 3, 1):
>   case IP_VERSION(3, 3, 2):
>   case IP_VERSION(3, 3, 3):
>   adev->nbio.funcs = &nbio_v2_3_funcs;
> - adev->nbio.hdp_flush_reg = &nbio_v2_3_hdp_flush_reg_sc;
> + adev->nbio.hdp_flush_reg = &nbio_v2_3_hdp_flush_reg;
>   break;
>   case IP_VERSION(4, 3, 0):
>   case IP_VERSION(4, 3, 1):
> diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
> b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
> index 34c610b9157d..b465baa26762 100644
> --- a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
> +++ b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
> @@ -328,27 +328,6 @@ const struct nbio_hdp_flush_reg
> nbio_v2_3_hdp_flush_reg = {
>   .ref_and_mask_sdma1 =
> BIF_BX_PF_GPU_HDP_FLUSH_DONE__SDMA1_MASK,
>  };
> 
> -const struct nbio_hdp_flush_reg nbio_v2_3_hdp_flush_reg_sc = {
> - .ref_and_mask_cp0 =
> BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP0_MASK,
> - .ref_and_mask_cp1 =
> BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP1_MASK,
> - .ref_and_mask_cp2 =
> BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP2_MASK,
> - .ref_and_mask_cp3 =
> BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP3_MASK,
> - .ref_and_mask_cp4 =
> BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP4_MASK,
> - .ref_and_mask_cp5 =
> BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP5_MASK,
> - .ref_and_mask_cp6 =
> BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP6_MASK,
> - .ref_and_mask_cp7 =
> BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP7_MASK,
> - .ref_and_mask_cp8 =
> BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP8_MASK,
> - .ref_and_mask_cp9 =
> BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP9_MASK,
> - .ref_and_mask_sdma0 = GPU_HDP_FLUSH_DONE__RSVD_ENG1_MASK,
> - .ref_and_mask_sdma1 = GPU_HDP_FLUSH_DONE__RSVD_ENG2_MASK,
> - .ref_and_mask_sdma2 = GPU_HDP_FLUSH_DONE__RSVD_ENG3_MASK,
> - .ref_and_mask_sdma3 = GPU_HDP_FLUSH_DONE__RSVD_ENG4_MASK,
> - .ref_and_mask_sdma4 = GPU_HDP_FLUSH_DONE__RSVD_ENG5_MASK,
> - .ref_and_mask_sdma5 = GPU_HDP_FLUSH_DONE__RSVD_ENG6_MASK,
> - .ref_and_mask_sdma6 = GPU_HDP_FLUSH_DONE__RSVD_ENG7_MASK,
> - .ref_and_mask_sdma7 = GPU_HDP_FLUSH_DONE__RSVD_ENG8_MASK,
> -};
> -
>  static void nbio_v2_3_init_registers(struct amdgpu_device *adev)
>  {
>   uint32_t def, data;
> diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.h
> b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.h
> index 6074dd3a1ed8..a43b60acf7f6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.h
> +++ b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.h
> @@ -27,7 +27,6 @@
>  #include "soc15_common.h"
> 
>  extern const struct nbio_hdp_flush_reg nbio_v2_3_hdp_flush_reg;
> -extern const struct nbio_hdp_flush_reg nbio_v2_3_hdp_flush_reg_sc;
>  extern const struct amdgpu_nbio_funcs nbio_v2_3_funcs;
> 
>  #endif
> --
> 2.35.3


[PATCH 1/2] drm/amdgpu: use the same HDP flush registers for all nbio 7.4.x

2022-07-13 Thread Alex Deucher
Align aldebaran with all other asics.  One HDP bit per
SDMA instance, aligned with firmware.  This is effectively
a revert of
commit a0f9f8546668 ("drm/amdgpu/nbio7.4: don't use GPU_HDP_FLUSH bit 12").
On further discussions with the relevant hardware teams,
re-align the bits for SDMA.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c |  5 +
 drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c| 21 ---
 drivers/gpu/drm/amd/amdgpu/nbio_v7_4.h|  1 -
 3 files changed, 1 insertion(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 9c5f29159a2d..4f83897a54a8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -2213,12 +2213,9 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device 
*adev)
break;
case IP_VERSION(7, 4, 0):
case IP_VERSION(7, 4, 1):
-   adev->nbio.funcs = &nbio_v7_4_funcs;
-   adev->nbio.hdp_flush_reg = &nbio_v7_4_hdp_flush_reg;
-   break;
case IP_VERSION(7, 4, 4):
adev->nbio.funcs = &nbio_v7_4_funcs;
-   adev->nbio.hdp_flush_reg = &nbio_v7_4_hdp_flush_reg_ald;
+   adev->nbio.hdp_flush_reg = &nbio_v7_4_hdp_flush_reg;
break;
case IP_VERSION(7, 2, 0):
case IP_VERSION(7, 2, 1):
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c 
b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
index 4531761dcf77..11848d1e238b 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
@@ -339,27 +339,6 @@ const struct nbio_hdp_flush_reg nbio_v7_4_hdp_flush_reg = {
.ref_and_mask_sdma1 = GPU_HDP_FLUSH_DONE__SDMA1_MASK,
 };
 
-const struct nbio_hdp_flush_reg nbio_v7_4_hdp_flush_reg_ald = {
-   .ref_and_mask_cp0 = GPU_HDP_FLUSH_DONE__CP0_MASK,
-   .ref_and_mask_cp1 = GPU_HDP_FLUSH_DONE__CP1_MASK,
-   .ref_and_mask_cp2 = GPU_HDP_FLUSH_DONE__CP2_MASK,
-   .ref_and_mask_cp3 = GPU_HDP_FLUSH_DONE__CP3_MASK,
-   .ref_and_mask_cp4 = GPU_HDP_FLUSH_DONE__CP4_MASK,
-   .ref_and_mask_cp5 = GPU_HDP_FLUSH_DONE__CP5_MASK,
-   .ref_and_mask_cp6 = GPU_HDP_FLUSH_DONE__CP6_MASK,
-   .ref_and_mask_cp7 = GPU_HDP_FLUSH_DONE__CP7_MASK,
-   .ref_and_mask_cp8 = GPU_HDP_FLUSH_DONE__CP8_MASK,
-   .ref_and_mask_cp9 = GPU_HDP_FLUSH_DONE__CP9_MASK,
-   .ref_and_mask_sdma0 = GPU_HDP_FLUSH_DONE__RSVD_ENG1_MASK,
-   .ref_and_mask_sdma1 = GPU_HDP_FLUSH_DONE__RSVD_ENG2_MASK,
-   .ref_and_mask_sdma2 = GPU_HDP_FLUSH_DONE__RSVD_ENG3_MASK,
-   .ref_and_mask_sdma3 = GPU_HDP_FLUSH_DONE__RSVD_ENG4_MASK,
-   .ref_and_mask_sdma4 = GPU_HDP_FLUSH_DONE__RSVD_ENG5_MASK,
-   .ref_and_mask_sdma5 = GPU_HDP_FLUSH_DONE__RSVD_ENG6_MASK,
-   .ref_and_mask_sdma6 = GPU_HDP_FLUSH_DONE__RSVD_ENG7_MASK,
-   .ref_and_mask_sdma7 = GPU_HDP_FLUSH_DONE__RSVD_ENG8_MASK,
-};
-
 static void nbio_v7_4_init_registers(struct amdgpu_device *adev)
 {
uint32_t baco_cntl;
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.h 
b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.h
index 7490022d79d4..f27c41728822 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.h
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.h
@@ -27,7 +27,6 @@
 #include "soc15_common.h"
 
 extern const struct nbio_hdp_flush_reg nbio_v7_4_hdp_flush_reg;
-extern const struct nbio_hdp_flush_reg nbio_v7_4_hdp_flush_reg_ald;
 extern const struct amdgpu_nbio_funcs nbio_v7_4_funcs;
 extern struct amdgpu_nbio_ras nbio_v7_4_ras;
 
-- 
2.35.3
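
For readers skimming the amdgpu_discovery.c hunk above: dropping the first
assignment block lets the 7.4.0 and 7.4.1 case labels fall through to the
7.4.4 body, so all three versions end up with the same flush table. A
minimal user-space sketch of that pattern follows. Everything prefixed
toy_ or TOY_ is hypothetical and only stands in for the kernel's
IP_VERSION() switch; it is not the driver code.

```c
#include <stddef.h>

/* Hypothetical packing of major/minor/revision into one integer. */
#define TOY_IP_VERSION(mj, mn, rv) (((mj) << 16) | ((mn) << 8) | (rv))

/* Hypothetical stand-in for a per-IP flush register table. */
struct toy_flush_reg {
	const char *name;
};

static const struct toy_flush_reg toy_v7_4_flush_reg = { "toy_v7_4" };

/* Return the flush table for an IP version, or NULL if unknown. */
const struct toy_flush_reg *toy_pick_flush_reg(unsigned int version)
{
	switch (version) {
	case TOY_IP_VERSION(7, 4, 0):
	case TOY_IP_VERSION(7, 4, 1):
	case TOY_IP_VERSION(7, 4, 4):
		/* all three labels share one body, hence one table */
		return &toy_v7_4_flush_reg;
	default:
		return NULL;
	}
}
```

With this shape, adding another 7.4.x variant that uses the common bit
layout is a one-line case label rather than a new table.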



[PATCH 2/2] drm/amdgpu: use the same HDP flush registers for all nbio 2.3.x

2022-07-13 Thread Alex Deucher
Align RDNA2.x with other asics.  One HDP bit per SDMA instance,
aligned with firmware.  This is effectively a revert of
commit 369b7d04baf3 ("drm/amdgpu/nbio2.3: don't use GPU_HDP_FLUSH bit 12").
On further discussions with the relevant hardware teams,
re-align the bits for SDMA.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c |  5 +
 drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c| 21 ---
 drivers/gpu/drm/amd/amdgpu/nbio_v2_3.h|  1 -
 3 files changed, 1 insertion(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 4f83897a54a8..22144ba6c7ec 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -2229,15 +2229,12 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device 
*adev)
case IP_VERSION(2, 3, 0):
case IP_VERSION(2, 3, 1):
case IP_VERSION(2, 3, 2):
-   adev->nbio.funcs = &nbio_v2_3_funcs;
-   adev->nbio.hdp_flush_reg = &nbio_v2_3_hdp_flush_reg;
-   break;
case IP_VERSION(3, 3, 0):
case IP_VERSION(3, 3, 1):
case IP_VERSION(3, 3, 2):
case IP_VERSION(3, 3, 3):
adev->nbio.funcs = &nbio_v2_3_funcs;
-   adev->nbio.hdp_flush_reg = &nbio_v2_3_hdp_flush_reg_sc;
+   adev->nbio.hdp_flush_reg = &nbio_v2_3_hdp_flush_reg;
break;
case IP_VERSION(4, 3, 0):
case IP_VERSION(4, 3, 1):
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c 
b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
index 34c610b9157d..b465baa26762 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.c
@@ -328,27 +328,6 @@ const struct nbio_hdp_flush_reg nbio_v2_3_hdp_flush_reg = {
.ref_and_mask_sdma1 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__SDMA1_MASK,
 };
 
-const struct nbio_hdp_flush_reg nbio_v2_3_hdp_flush_reg_sc = {
-   .ref_and_mask_cp0 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP0_MASK,
-   .ref_and_mask_cp1 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP1_MASK,
-   .ref_and_mask_cp2 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP2_MASK,
-   .ref_and_mask_cp3 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP3_MASK,
-   .ref_and_mask_cp4 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP4_MASK,
-   .ref_and_mask_cp5 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP5_MASK,
-   .ref_and_mask_cp6 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP6_MASK,
-   .ref_and_mask_cp7 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP7_MASK,
-   .ref_and_mask_cp8 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP8_MASK,
-   .ref_and_mask_cp9 = BIF_BX_PF_GPU_HDP_FLUSH_DONE__CP9_MASK,
-   .ref_and_mask_sdma0 = GPU_HDP_FLUSH_DONE__RSVD_ENG1_MASK,
-   .ref_and_mask_sdma1 = GPU_HDP_FLUSH_DONE__RSVD_ENG2_MASK,
-   .ref_and_mask_sdma2 = GPU_HDP_FLUSH_DONE__RSVD_ENG3_MASK,
-   .ref_and_mask_sdma3 = GPU_HDP_FLUSH_DONE__RSVD_ENG4_MASK,
-   .ref_and_mask_sdma4 = GPU_HDP_FLUSH_DONE__RSVD_ENG5_MASK,
-   .ref_and_mask_sdma5 = GPU_HDP_FLUSH_DONE__RSVD_ENG6_MASK,
-   .ref_and_mask_sdma6 = GPU_HDP_FLUSH_DONE__RSVD_ENG7_MASK,
-   .ref_and_mask_sdma7 = GPU_HDP_FLUSH_DONE__RSVD_ENG8_MASK,
-};
-
 static void nbio_v2_3_init_registers(struct amdgpu_device *adev)
 {
uint32_t def, data;
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.h 
b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.h
index 6074dd3a1ed8..a43b60acf7f6 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.h
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v2_3.h
@@ -27,7 +27,6 @@
 #include "soc15_common.h"
 
 extern const struct nbio_hdp_flush_reg nbio_v2_3_hdp_flush_reg;
-extern const struct nbio_hdp_flush_reg nbio_v2_3_hdp_flush_reg_sc;
 extern const struct amdgpu_nbio_funcs nbio_v2_3_funcs;
 
 #endif
-- 
2.35.3



Re: [PATCH] drm/amdgpu: Get rid of amdgpu_job->external_hw_fence

2022-07-13 Thread Christian König

Am 13.07.22 um 19:13 schrieb Andrey Grodzovsky:

This is a follow-up cleanup to [1]. See below for the refcount balancing
when calling amdgpu_job_submit_direct after this cleanup, as far
as I calculated.

amdgpu_fence_emit
dma_fence_init 1
dma_fence_get(fence) 2
rcu_assign_pointer(*ptr, dma_fence_get(fence)) 3

---> amdgpu_job_submit_direct completes before fence signaled
amdgpu_sa_bo_free
(*sa_bo)->fence = dma_fence_get(fence) 4

amdgpu_job_free
dma_fence_put 3

amdgpu_vcn_enc_get_destroy_msg
*fence = dma_fence_get(f) 4
dma_fence_put(f); 3

amdgpu_vcn_enc_ring_test_ib
dma_fence_put(fence) 2

amdgpu_fence_process
dma_fence_put 1

amdgpu_sa_bo_remove_locked
dma_fence_put 0

---> amdgpu_job_submit_direct completes after fence signaled
amdgpu_fence_process
dma_fence_put 2

amdgpu_job_free
dma_fence_put 1

amdgpu_vcn_enc_get_destroy_msg
*fence = dma_fence_get(f) 2
dma_fence_put(f); 1

amdgpu_vcn_enc_ring_test_ib
dma_fence_put(fence) 0

[1] - 
https://patchwork.kernel.org/project/dri-devel/cover/20220624180955.485440-1-andrey.grodzov...@amd.com/

Signed-off-by: Andrey Grodzovsky 
Suggested-by: Christian König 


Offhand that looks correct to me, but it could be that I'm missing
something as well.


Anyway I think I can give a Reviewed-by: Christian König for this.


Thanks,
Christian.


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  3 +--
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c| 27 --
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.h|  1 -
  3 files changed, 6 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 16faea7ed1cd..b79ee4ffb879 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5229,8 +5229,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
 *
 * job->base holds a reference to parent fence
 */
-   if (job && (job->hw_fence.ops != NULL) &&
-   dma_fence_is_signaled(&job->hw_fence)) {
+   if (job && dma_fence_is_signaled(&job->hw_fence)) {
job_signaled = true;
dev_info(adev->dev, "Guilty job already signaled, skipping HW 
reset");
goto skip_hw_reset;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 6fa381ee5fa0..10fdd12cf853 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -134,16 +134,10 @@ void amdgpu_job_free_resources(struct amdgpu_job *job)
  {
struct amdgpu_ring *ring = to_amdgpu_ring(job->base.sched);
struct dma_fence *f;
-   struct dma_fence *hw_fence;
unsigned i;

-   if (job->hw_fence.ops == NULL)
-   hw_fence = job->external_hw_fence;
-   else
-   hw_fence = &job->hw_fence;
-
/* use sched fence if available */
-   f = job->base.s_fence ? &job->base.s_fence->finished : hw_fence;
+   f = job->base.s_fence ? &job->base.s_fence->finished :  &job->hw_fence;
for (i = 0; i < job->num_ibs; ++i)
amdgpu_ib_free(ring->adev, &job->ibs[i], f);
  }
@@ -157,11 +151,7 @@ static void amdgpu_job_free_cb(struct drm_sched_job *s_job)
amdgpu_sync_free(&job->sync);
amdgpu_sync_free(&job->sched_sync);

-/* only put the hw fence if has embedded fence */
-   if (job->hw_fence.ops != NULL)
-   dma_fence_put(&job->hw_fence);
-   else
-   kfree(job);
+   dma_fence_put(&job->hw_fence);
  }

  void amdgpu_job_free(struct amdgpu_job *job)
@@ -170,11 +160,7 @@ void amdgpu_job_free(struct amdgpu_job *job)
amdgpu_sync_free(&job->sync);
amdgpu_sync_free(&job->sched_sync);

-   /* only put the hw fence if has embedded fence */
-   if (job->hw_fence.ops != NULL)
-   dma_fence_put(&job->hw_fence);
-   else
-   kfree(job);
+   dma_fence_put(&job->hw_fence);
  }

  int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
@@ -204,15 +190,12 @@ int amdgpu_job_submit_direct(struct amdgpu_job *job, 
struct amdgpu_ring *ring,
int r;

job->base.sched = &ring->sched;
-   r = amdgpu_ib_schedule(ring, job->num_ibs, job->ibs, NULL, fence);
-   /* record external_hw_fence for direct submit */
-   job->external_hw_fence = dma_fence_get(*fence);
+   r = 

[pull] amdgpu drm-fixes-5.19

2022-07-13 Thread Alex Deucher
Hi Dave, Daniel,

Fixes for 5.19.

The following changes since commit 3590b44b9434af1b9c81c3f40189087ed4fe3635:

  Merge tag 'drm-misc-fixes-2022-07-07-1' of 
ssh://git.freedesktop.org/git/drm/drm-misc into drm-fixes (2022-07-12 10:44:40 
+1000)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git 
tags/amd-drm-fixes-5.19-2022-07-13

for you to fetch changes up to 3283c83eb6fcfbda8ea03d7149d8e42e71c5d45e:

  drm/amd/display: Ensure valid event timestamp for cursor-only commits 
(2022-07-13 12:20:37 -0400)


amd-drm-fixes-5.19-2022-07-13:

amdgpu:
- DP MST blank screen fix for specific platforms
- MEC firmware check fix for GC 10.3.7
- Deep color fix for DCE
- Fix possible divide by 0
- Coverage blend mode fix
- Fix cursor only commit timestamps


Fangzhi Zuo (1):
  drm/amd/display: Ignore First MST Sideband Message Return Error

Mario Kleiner (1):
  drm/amd/display: Only use depth 36 bpp linebuffers on DCN display engines.

Melissa Wen (1):
  drm/amd/display: correct check of coverage blend mode

Michel Dänzer (1):
  drm/amd/display: Ensure valid event timestamp for cursor-only commits

Prike Liang (1):
  drm/amdkfd: correct the MEC atomic support firmware checking for GC 10.3.7

Yefim Barashkin (1):
  drm/amd/pm: Prevent divide by zero

 drivers/gpu/drm/amd/amdkfd/kfd_device.c|  2 +
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  | 84 --
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h  |  8 +++
 .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.c| 17 +
 drivers/gpu/drm/amd/display/dc/core/dc_resource.c  | 11 +--
 drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c |  2 +
 6 files changed, 115 insertions(+), 9 deletions(-)


RE: [PATCH v2 27/29] ACPI: video: Drop Clevo/TUXEDO NL5xRU and NL5xNU acpi_backlight=native quirks

2022-07-13 Thread Limonciello, Mario
[Public]



> -Original Message-
> From: Werner Sembach 
> Sent: Wednesday, July 13, 2022 12:08
> To: Hans de Goede ; Ben Skeggs
> ; Karol Herbst ; Lyude
> ; Daniel Dadap ; Maarten
> Lankhorst ; Maxime Ripard
> ; Thomas Zimmermann ;
> Jani Nikula ; Joonas Lahtinen
> ; Rodrigo Vivi ;
> Tvrtko Ursulin ; Deucher, Alexander
> ; Koenig, Christian
> ; p...@vger.kernel.org; Pan, Xinhui
> ; Rafael J . Wysocki ; Mika
> Westerberg ; Lukas Wunner
> ; Mark Gross ; Andy
> Shevchenko 
> Cc: nouv...@lists.freedesktop.org; Daniel Vetter ; David
> Airlie ; intel-gfx ; dri-
> de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Len Brown
> ; linux-a...@vger.kernel.org; platform-driver-
> x...@vger.kernel.org
> Subject: Re: [PATCH v2 27/29] ACPI: video: Drop Clevo/TUXEDO NL5xRU and
> NL5xNU acpi_backlight=native quirks
> 
> Hi,
> 
> On 7/12/22 21:39, Hans de Goede wrote:
> > acpi_backlight=native is the default for these, but as the comment
> > explains the quirk was still necessary because even briefly registering
> > the acpi_video0 backlight; and then unregistering it once the native
> > driver showed up, was leading to issues.
> >
> > After the "ACPI: video: Make backlight class device registration
> > a separate step" patch from earlier in this patch-series, we no
> > longer briefly register the acpi_video0 backlight on systems where
> > the native driver should be used.
> >
> > So this is no longer an issue and the quirks are no longer needed.
> >
> > Cc: Werner Sembach 
> > Signed-off-by: Hans de Goede 
> 
> Tested and can confirm: The quirks are no longer needed with this Patchset.
> 
> Tested-by: Werner Sembach 

Probably should include this link tag in this commit too then as it fixes
the Tong Fang systems too.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215683

> 
> Kind Regards,
> 
> Werner Sembach
> 
> > ---
> >   drivers/acpi/video_detect.c | 75 -
> >   1 file changed, 75 deletions(-)
> >
> > diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
> > index 2a4d376a703e..4b9395d1bda7 100644
> > --- a/drivers/acpi/video_detect.c
> > +++ b/drivers/acpi/video_detect.c
> > @@ -599,81 +599,6 @@ static const struct dmi_system_id
> video_detect_dmi_table[] = {
> > DMI_MATCH(DMI_BOARD_NAME, "N250P"),
> > },
> > },
> > -   /*
> > -* Clevo NL5xRU and NL5xNU/TUXEDO Aura 15 Gen1 and Gen2 have
> both a
> > -* working native and video interface. However the default detection
> > -* mechanism first registers the video interface before unregistering
> > -* it again and switching to the native interface during boot. This
> > -* results in a dangling SBIOS request for backlight change for some
> > -* reason, causing the backlight to switch to ~2% once per boot on
> the
> > -* first power cord connect or disconnect event. Setting the native
> > -* interface explicitly circumvents this buggy behaviour, by avoiding
> > -* the unregistering process.
> > -*/
> > -   {
> > -   .callback = video_detect_force_native,
> > -   .ident = "Clevo NL5xRU",
> > -   .matches = {
> > -   DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
> > -   DMI_MATCH(DMI_BOARD_NAME, "NL5xRU"),
> > -   },
> > -   },
> > -   {
> > -   .callback = video_detect_force_native,
> > -   .ident = "Clevo NL5xRU",
> > -   .matches = {
> > -   DMI_MATCH(DMI_SYS_VENDOR,
> "SchenkerTechnologiesGmbH"),
> > -   DMI_MATCH(DMI_BOARD_NAME, "NL5xRU"),
> > -   },
> > -   },
> > -   {
> > -   .callback = video_detect_force_native,
> > -   .ident = "Clevo NL5xRU",
> > -   .matches = {
> > -   DMI_MATCH(DMI_SYS_VENDOR, "Notebook"),
> > -   DMI_MATCH(DMI_BOARD_NAME, "NL5xRU"),
> > -   },
> > -   },
> > -   {
> > -   .callback = video_detect_force_native,
> > -   .ident = "Clevo NL5xRU",
> > -   .matches = {
> > -   DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
> > -   DMI_MATCH(DMI_BOARD_NAME, "AURA1501"),
> > -   },
> > -   },
> > -   {
> > -   .callback = video_detect_force_native,
> > -   .ident = "Clevo NL5xRU",
> > -   .matches = {
> > -   DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
> > -   DMI_MATCH(DMI_BOARD_NAME, "EDUBOOK1502"),
> > -   },
> > -   },
> > -   {
> > -   .callback = video_detect_force_native,
> > -   .ident = "Clevo NL5xNU",
> > -   .matches = {
> > -   DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
> > -   DMI_MATCH(DMI_BOARD_NAME, "NL5xNU"),
> > -   },
> > -   },
> > -   {
> > -   .callback = video_detect_force_native,
> > -   .ident = "Clevo NL5xNU",
> > -   .matches = {
> > -   DMI_MATCH(DMI_SYS_VENDOR,
> "SchenkerTechnologiesGmbH"),
> > -   DMI_MATCH(DMI_BOARD_NAME, "NL5xNU"),
> > -   },
> > -   },
> > -   {
> > -   .callback = video_detect_force_native,
> > -   .ident = "Clevo NL5xNU",
> > -   .matches = {
> > -   DMI_MATCH(DMI_SYS_VENDOR, "Notebook"),
> > -   

Re: [PATCH v2 27/29] ACPI: video: Drop Clevo/TUXEDO NL5xRU and NL5xNU acpi_backlight=native quirks

2022-07-13 Thread Werner Sembach

Hi,

On 7/12/22 21:39, Hans de Goede wrote:

acpi_backlight=native is the default for these, but as the comment
explains the quirk was still necessary because even briefly registering
the acpi_video0 backlight; and then unregistering it once the native
driver showed up, was leading to issues.

After the "ACPI: video: Make backlight class device registration
a separate step" patch from earlier in this patch-series, we no
longer briefly register the acpi_video0 backlight on systems where
the native driver should be used.

So this is no longer an issue and the quirks are no longer needed.

Cc: Werner Sembach 
Signed-off-by: Hans de Goede 


Tested and can confirm: The quirks are no longer needed with this Patchset.

Tested-by: Werner Sembach 

Kind Regards,

Werner Sembach
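
For context on what the entries removed below were doing: each quirk pairs
a set of DMI strings with a callback, and the table is scanned for the
first entry whose strings all match the running machine. The following is
a rough user-space sketch of that lookup, assuming substring matching on
vendor and board name. All toy_ names are hypothetical and the table only
mirrors two of the removed entries; it is not the video_detect.c code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one DMI quirk entry. */
struct toy_quirk {
	const char *ident;
	const char *sys_vendor;
	const char *board_name;
};

static const struct toy_quirk toy_quirks[] = {
	{ "Clevo NL5xRU", "TUXEDO", "NL5xRU" },
	{ "Clevo NL5xNU", "TUXEDO", "NL5xNU" },
};

/* Return the first quirk whose strings both occur in the machine's DMI data. */
const struct toy_quirk *toy_find_quirk(const char *vendor, const char *board)
{
	size_t i;

	for (i = 0; i < sizeof(toy_quirks) / sizeof(toy_quirks[0]); i++) {
		const struct toy_quirk *q = &toy_quirks[i];

		if (strstr(vendor, q->sys_vendor) && strstr(board, q->board_name))
			return q;
	}
	return NULL;	/* no quirk: fall back to the default detection */
}
```

Once the default detection no longer registers acpi_video0 on these
machines, a lookup like this has nothing left to override, which is why
the entries can go.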


---
  drivers/acpi/video_detect.c | 75 -
  1 file changed, 75 deletions(-)

diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
index 2a4d376a703e..4b9395d1bda7 100644
--- a/drivers/acpi/video_detect.c
+++ b/drivers/acpi/video_detect.c
@@ -599,81 +599,6 @@ static const struct dmi_system_id video_detect_dmi_table[] 
= {
DMI_MATCH(DMI_BOARD_NAME, "N250P"),
},
},
-   /*
-* Clevo NL5xRU and NL5xNU/TUXEDO Aura 15 Gen1 and Gen2 have both a
-* working native and video interface. However the default detection
-* mechanism first registers the video interface before unregistering
-* it again and switching to the native interface during boot. This
-* results in a dangling SBIOS request for backlight change for some
-* reason, causing the backlight to switch to ~2% once per boot on the
-* first power cord connect or disconnect event. Setting the native
-* interface explicitly circumvents this buggy behaviour, by avoiding
-* the unregistering process.
-*/
-   {
-   .callback = video_detect_force_native,
-   .ident = "Clevo NL5xRU",
-   .matches = {
-   DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
-   DMI_MATCH(DMI_BOARD_NAME, "NL5xRU"),
-   },
-   },
-   {
-   .callback = video_detect_force_native,
-   .ident = "Clevo NL5xRU",
-   .matches = {
-   DMI_MATCH(DMI_SYS_VENDOR, "SchenkerTechnologiesGmbH"),
-   DMI_MATCH(DMI_BOARD_NAME, "NL5xRU"),
-   },
-   },
-   {
-   .callback = video_detect_force_native,
-   .ident = "Clevo NL5xRU",
-   .matches = {
-   DMI_MATCH(DMI_SYS_VENDOR, "Notebook"),
-   DMI_MATCH(DMI_BOARD_NAME, "NL5xRU"),
-   },
-   },
-   {
-   .callback = video_detect_force_native,
-   .ident = "Clevo NL5xRU",
-   .matches = {
-   DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
-   DMI_MATCH(DMI_BOARD_NAME, "AURA1501"),
-   },
-   },
-   {
-   .callback = video_detect_force_native,
-   .ident = "Clevo NL5xRU",
-   .matches = {
-   DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
-   DMI_MATCH(DMI_BOARD_NAME, "EDUBOOK1502"),
-   },
-   },
-   {
-   .callback = video_detect_force_native,
-   .ident = "Clevo NL5xNU",
-   .matches = {
-   DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
-   DMI_MATCH(DMI_BOARD_NAME, "NL5xNU"),
-   },
-   },
-   {
-   .callback = video_detect_force_native,
-   .ident = "Clevo NL5xNU",
-   .matches = {
-   DMI_MATCH(DMI_SYS_VENDOR, "SchenkerTechnologiesGmbH"),
-   DMI_MATCH(DMI_BOARD_NAME, "NL5xNU"),
-   },
-   },
-   {
-   .callback = video_detect_force_native,
-   .ident = "Clevo NL5xNU",
-   .matches = {
-   DMI_MATCH(DMI_SYS_VENDOR, "Notebook"),
-   DMI_MATCH(DMI_BOARD_NAME, "NL5xNU"),
-   },
-   },
  
  	/*

 * Desktops which falsely report a backlight and which our heuristics


[PATCH] drm/amdgpu: Get rid of amdgpu_job->external_hw_fence

2022-07-13 Thread Andrey Grodzovsky
This is a follow-up cleanup to [1]. See below for the refcount balancing
when calling amdgpu_job_submit_direct after this cleanup, as far
as I calculated.

amdgpu_fence_emit
dma_fence_init 1
dma_fence_get(fence) 2
rcu_assign_pointer(*ptr, dma_fence_get(fence)) 3

---> amdgpu_job_submit_direct completes before fence signaled
amdgpu_sa_bo_free
(*sa_bo)->fence = dma_fence_get(fence) 4

amdgpu_job_free
dma_fence_put 3

amdgpu_vcn_enc_get_destroy_msg
*fence = dma_fence_get(f) 4
dma_fence_put(f); 3

amdgpu_vcn_enc_ring_test_ib
dma_fence_put(fence) 2

amdgpu_fence_process
dma_fence_put 1

amdgpu_sa_bo_remove_locked
dma_fence_put 0

---> amdgpu_job_submit_direct completes after fence signaled
amdgpu_fence_process
dma_fence_put 2

amdgpu_job_free
dma_fence_put 1

amdgpu_vcn_enc_get_destroy_msg
*fence = dma_fence_get(f) 2
dma_fence_put(f); 1

amdgpu_vcn_enc_ring_test_ib
dma_fence_put(fence) 0
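
The two orderings in the trace above can be replayed with a toy counter to
confirm that each one ends at zero. This is only a user-space sketch of
the bookkeeping: toy_fence, toy_get and toy_put are hypothetical stand-ins
for dma_fence, dma_fence_get and dma_fence_put, and the comments name the
driver function each step is taken from in the trace.

```c
#include <assert.h>

/* Toy stand-in for a dma_fence reference count (hypothetical). */
struct toy_fence {
	int refcount;
};

static void toy_get(struct toy_fence *f)
{
	f->refcount++;
}

static void toy_put(struct toy_fence *f)
{
	assert(f->refcount > 0);	/* a put at zero would be an underflow */
	f->refcount--;
}

/* amdgpu_fence_emit: init (1), get (2), get for the published pointer (3). */
static void toy_emit(struct toy_fence *f)
{
	f->refcount = 1;
	toy_get(f);
	toy_get(f);
}

/* Direct submit completes before the fence signals. */
int toy_balance_before_signal(void)
{
	struct toy_fence f;

	toy_emit(&f);	/* 3 */
	toy_get(&f);	/* amdgpu_sa_bo_free: 4 */
	toy_put(&f);	/* amdgpu_job_free: 3 */
	toy_get(&f);	/* amdgpu_vcn_enc_get_destroy_msg, get: 4 */
	toy_put(&f);	/* amdgpu_vcn_enc_get_destroy_msg, put: 3 */
	toy_put(&f);	/* amdgpu_vcn_enc_ring_test_ib: 2 */
	toy_put(&f);	/* amdgpu_fence_process: 1 */
	toy_put(&f);	/* amdgpu_sa_bo_remove_locked: 0 */
	return f.refcount;
}

/* Direct submit completes after the fence signals. */
int toy_balance_after_signal(void)
{
	struct toy_fence f;

	toy_emit(&f);	/* 3 */
	toy_put(&f);	/* amdgpu_fence_process: 2 */
	toy_put(&f);	/* amdgpu_job_free: 1 */
	toy_get(&f);	/* amdgpu_vcn_enc_get_destroy_msg, get: 2 */
	toy_put(&f);	/* amdgpu_vcn_enc_get_destroy_msg, put: 1 */
	toy_put(&f);	/* amdgpu_vcn_enc_ring_test_ib: 0 */
	return f.refcount;
}
```

Both functions should return 0 if the trace is balanced; the assert in
toy_put would trip first if any step dropped a reference it did not hold.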

[1] - 
https://patchwork.kernel.org/project/dri-devel/cover/20220624180955.485440-1-andrey.grodzov...@amd.com/

Signed-off-by: Andrey Grodzovsky 
Suggested-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  3 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c| 27 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.h|  1 -
 3 files changed, 6 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 16faea7ed1cd..b79ee4ffb879 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5229,8 +5229,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
 *
 * job->base holds a reference to parent fence
 */
-   if (job && (job->hw_fence.ops != NULL) &&
-   dma_fence_is_signaled(&job->hw_fence)) {
+   if (job && dma_fence_is_signaled(&job->hw_fence)) {
job_signaled = true;
dev_info(adev->dev, "Guilty job already signaled, skipping HW 
reset");
goto skip_hw_reset;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 6fa381ee5fa0..10fdd12cf853 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -134,16 +134,10 @@ void amdgpu_job_free_resources(struct amdgpu_job *job)
 {
struct amdgpu_ring *ring = to_amdgpu_ring(job->base.sched);
struct dma_fence *f;
-   struct dma_fence *hw_fence;
unsigned i;
 
-   if (job->hw_fence.ops == NULL)
-   hw_fence = job->external_hw_fence;
-   else
-   hw_fence = &job->hw_fence;
-
/* use sched fence if available */
-   f = job->base.s_fence ? &job->base.s_fence->finished : hw_fence;
+   f = job->base.s_fence ? &job->base.s_fence->finished :  &job->hw_fence;
for (i = 0; i < job->num_ibs; ++i)
amdgpu_ib_free(ring->adev, &job->ibs[i], f);
 }
@@ -157,11 +151,7 @@ static void amdgpu_job_free_cb(struct drm_sched_job *s_job)
amdgpu_sync_free(&job->sync);
amdgpu_sync_free(&job->sched_sync);
 
-/* only put the hw fence if has embedded fence */
-   if (job->hw_fence.ops != NULL)
-   dma_fence_put(&job->hw_fence);
-   else
-   kfree(job);
+   dma_fence_put(&job->hw_fence);
 }
 
 void amdgpu_job_free(struct amdgpu_job *job)
@@ -170,11 +160,7 @@ void amdgpu_job_free(struct amdgpu_job *job)
amdgpu_sync_free(&job->sync);
amdgpu_sync_free(&job->sched_sync);
 
-   /* only put the hw fence if has embedded fence */
-   if (job->hw_fence.ops != NULL)
-   dma_fence_put(&job->hw_fence);
-   else
-   kfree(job);
+   dma_fence_put(&job->hw_fence);
 }
 
 int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
@@ -204,15 +190,12 @@ int amdgpu_job_submit_direct(struct amdgpu_job *job, 
struct amdgpu_ring *ring,
int r;
 
job->base.sched = &ring->sched;
-   r = amdgpu_ib_schedule(ring, job->num_ibs, job->ibs, NULL, fence);
-   /* record external_hw_fence for direct submit */
-   job->external_hw_fence = dma_fence_get(*fence);
+   r = amdgpu_ib_schedule(ring, job->num_ibs, job->ibs, job, fence);
+
if (r)
return r;
 
amdgpu_job_free(job);
-   dma_fence_put(*fence);
-
return 0;
 }
 
diff --git 

Re: [PATCH v2] drm/amdgpu: limiting AV1 to first instance on VCN4 decode

2022-07-13 Thread Zhu, James
[AMD Official Use Only - General]

This patch is Reviewed-by: James Zhu 



From: amd-gfx  on behalf of Sonny Jiang 

Sent: Wednesday, July 13, 2022 11:59 AM
To: amd-gfx@lists.freedesktop.org 
Cc: Jiang, Sonny 
Subject: [PATCH v2] drm/amdgpu: limiting AV1 to first instance on VCN4 decode

AV1 is only supported on first instance.

Signed-off-by: Sonny Jiang 
---
 drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c | 131 ++
 1 file changed, 131 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
index 84ac2401895a..a91ffbf902d4 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
@@ -25,6 +25,7 @@
 #include "amdgpu.h"
 #include "amdgpu_vcn.h"
 #include "amdgpu_pm.h"
+#include "amdgpu_cs.h"
 #include "soc15.h"
 #include "soc15d.h"
 #include "soc15_hw_ip.h"
@@ -44,6 +45,9 @@
 #define VCN_VID_SOC_ADDRESS_2_0
 0x1fb00
 #define VCN1_VID_SOC_ADDRESS_3_0   
 0x48300

+#define RDECODE_MSG_CREATE 
0x
+#define RDECODE_MESSAGE_CREATE 
0x0001
+
 static int amdgpu_ih_clientid_vcns[] = {
 SOC15_IH_CLIENTID_VCN,
 SOC15_IH_CLIENTID_VCN1
@@ -1323,6 +1327,132 @@ static void vcn_v4_0_unified_ring_set_wptr(struct 
amdgpu_ring *ring)
 }
 }

+static int vcn_v4_0_limit_sched(struct amdgpu_cs_parser *p)
+{
+   struct drm_gpu_scheduler **scheds;
+
+   /* The create msg must be in the first IB submitted */
+   if (atomic_read(&p->entity->fence_seq))
+   return -EINVAL;
+
+   scheds = p->adev->gpu_sched[AMDGPU_HW_IP_VCN_ENC]
+   [AMDGPU_RING_PRIO_0].sched;
+   drm_sched_entity_modify_sched(p->entity, scheds, 1);
+   return 0;
+}
+
+static int vcn_v4_0_dec_msg(struct amdgpu_cs_parser *p, uint64_t addr)
+{
+   struct ttm_operation_ctx ctx = { false, false };
+   struct amdgpu_bo_va_mapping *map;
+   uint32_t *msg, num_buffers;
+   struct amdgpu_bo *bo;
+   uint64_t start, end;
+   unsigned int i;
+   void *ptr;
+   int r;
+
+   addr &= AMDGPU_GMC_HOLE_MASK;
+   r = amdgpu_cs_find_mapping(p, addr, &bo, &map);
+   if (r) {
+   DRM_ERROR("Can't find BO for addr 0x%08llx\n", addr);
+   return r;
+   }
+
+   start = map->start * AMDGPU_GPU_PAGE_SIZE;
+   end = (map->last + 1) * AMDGPU_GPU_PAGE_SIZE;
+   if (addr & 0x7) {
+   DRM_ERROR("VCN messages must be 8 byte aligned!\n");
+   return -EINVAL;
+   }
+
+   bo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
+   amdgpu_bo_placement_from_domain(bo, bo->allowed_domains);
+   r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
+   if (r) {
+   DRM_ERROR("Failed validating the VCN message BO (%d)!\n", r);
+   return r;
+   }
+
+   r = amdgpu_bo_kmap(bo, &ptr);
+   if (r) {
+   DRM_ERROR("Failed mapping the VCN message (%d)!\n", r);
+   return r;
+   }
+
+   msg = ptr + addr - start;
+
+   /* Check length */
+   if (msg[1] > end - addr) {
+   r = -EINVAL;
+   goto out;
+   }
+
+   if (msg[3] != RDECODE_MSG_CREATE)
+   goto out;
+
+   num_buffers = msg[2];
+   for (i = 0, msg = &msg[6]; i < num_buffers; ++i, msg += 4) {
+   uint32_t offset, size, *create;
+
+   if (msg[0] != RDECODE_MESSAGE_CREATE)
+   continue;
+
+   offset = msg[1];
+   size = msg[2];
+
+   if (offset + size > end) {
+   r = -EINVAL;
+   goto out;
+   }
+
+   create = ptr + addr + offset - start;
+
+   /* H264, HEVC and VP9 can run on any instance */
+   if (create[0] == 0x7 || create[0] == 0x10 || create[0] == 0x11)
+   continue;
+
+   r = vcn_v4_0_limit_sched(p);
+   if (r)
+   goto out;
+   }
+
+out:
+   amdgpu_bo_kunmap(bo);
+   return r;
+}
+
+#define RADEON_VCN_ENGINE_TYPE_DECODE 
(0x0003)
+
+static int vcn_v4_0_ring_patch_cs_in_place(struct amdgpu_cs_parser *p,
+   struct amdgpu_job *job,
+   struct amdgpu_ib *ib)
+{
+   struct amdgpu_ring *ring = to_amdgpu_ring(p->entity->rq->sched);
+   struct amdgpu_vcn_decode_buffer *decode_buffer = NULL;
+   uint32_t val;
+   int r = 0;
+
+   /* The first instance can decode anything */
+   if (!ring->me)
+   return r;
+
+   /* unified queue ib header has 8 double words. */
+   if (ib->length_dw < 8)
+   return r;
+
+   val = amdgpu_ib_get_value(ib, 6); 

[PATCH v2] drm/amdgpu: limiting AV1 to first instance on VCN4 decode

2022-07-13 Thread Sonny Jiang
AV1 is only supported on first instance.

Signed-off-by: Sonny Jiang 
---
 drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c | 131 ++
 1 file changed, 131 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
index 84ac2401895a..a91ffbf902d4 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
@@ -25,6 +25,7 @@
 #include "amdgpu.h"
 #include "amdgpu_vcn.h"
 #include "amdgpu_pm.h"
+#include "amdgpu_cs.h"
 #include "soc15.h"
 #include "soc15d.h"
 #include "soc15_hw_ip.h"
@@ -44,6 +45,9 @@
 #define VCN_VID_SOC_ADDRESS_2_0
0x1fb00
 #define VCN1_VID_SOC_ADDRESS_3_0   
0x48300
 
+#define RDECODE_MSG_CREATE 
0x
+#define RDECODE_MESSAGE_CREATE 
0x0001
+
 static int amdgpu_ih_clientid_vcns[] = {
SOC15_IH_CLIENTID_VCN,
SOC15_IH_CLIENTID_VCN1
@@ -1323,6 +1327,132 @@ static void vcn_v4_0_unified_ring_set_wptr(struct 
amdgpu_ring *ring)
}
 }
 
+static int vcn_v4_0_limit_sched(struct amdgpu_cs_parser *p)
+{
+   struct drm_gpu_scheduler **scheds;
+
+   /* The create msg must be in the first IB submitted */
+   if (atomic_read(&p->entity->fence_seq))
+   return -EINVAL;
+
+   scheds = p->adev->gpu_sched[AMDGPU_HW_IP_VCN_ENC]
+   [AMDGPU_RING_PRIO_0].sched;
+   drm_sched_entity_modify_sched(p->entity, scheds, 1);
+   return 0;
+}
+
+static int vcn_v4_0_dec_msg(struct amdgpu_cs_parser *p, uint64_t addr)
+{
+   struct ttm_operation_ctx ctx = { false, false };
+   struct amdgpu_bo_va_mapping *map;
+   uint32_t *msg, num_buffers;
+   struct amdgpu_bo *bo;
+   uint64_t start, end;
+   unsigned int i;
+   void *ptr;
+   int r;
+
+   addr &= AMDGPU_GMC_HOLE_MASK;
+   r = amdgpu_cs_find_mapping(p, addr, &bo, &map);
+   if (r) {
+   DRM_ERROR("Can't find BO for addr 0x%08llx\n", addr);
+   return r;
+   }
+
+   start = map->start * AMDGPU_GPU_PAGE_SIZE;
+   end = (map->last + 1) * AMDGPU_GPU_PAGE_SIZE;
+   if (addr & 0x7) {
+   DRM_ERROR("VCN messages must be 8 byte aligned!\n");
+   return -EINVAL;
+   }
+
+   bo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
+   amdgpu_bo_placement_from_domain(bo, bo->allowed_domains);
+   r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
+   if (r) {
+   DRM_ERROR("Failed validating the VCN message BO (%d)!\n", r);
+   return r;
+   }
+
+   r = amdgpu_bo_kmap(bo, &ptr);
+   if (r) {
+   DRM_ERROR("Failed mapping the VCN message (%d)!\n", r);
+   return r;
+   }
+
+   msg = ptr + addr - start;
+
+   /* Check length */
+   if (msg[1] > end - addr) {
+   r = -EINVAL;
+   goto out;
+   }
+
+   if (msg[3] != RDECODE_MSG_CREATE)
+   goto out;
+
+   num_buffers = msg[2];
+   for (i = 0, msg = &msg[6]; i < num_buffers; ++i, msg += 4) {
+   uint32_t offset, size, *create;
+
+   if (msg[0] != RDECODE_MESSAGE_CREATE)
+   continue;
+
+   offset = msg[1];
+   size = msg[2];
+
+   if (offset + size > end) {
+   r = -EINVAL;
+   goto out;
+   }
+
+   create = ptr + addr + offset - start;
+
+   /* H264, HEVC and VP9 can run on any instance */
+   if (create[0] == 0x7 || create[0] == 0x10 || create[0] == 0x11)
+   continue;
+
+   r = vcn_v4_0_limit_sched(p);
+   if (r)
+   goto out;
+   }
+
+out:
+   amdgpu_bo_kunmap(bo);
+   return r;
+}
+
+#define RADEON_VCN_ENGINE_TYPE_DECODE 
(0x0003)
+
+static int vcn_v4_0_ring_patch_cs_in_place(struct amdgpu_cs_parser *p,
+   struct amdgpu_job *job,
+   struct amdgpu_ib *ib)
+{
+   struct amdgpu_ring *ring = to_amdgpu_ring(p->entity->rq->sched);
+   struct amdgpu_vcn_decode_buffer *decode_buffer = NULL;
+   uint32_t val;
+   int r = 0;
+
+   /* The first instance can decode anything */
+   if (!ring->me)
+   return r;
+
+   /* unified queue ib header has 8 double words. */
+   if (ib->length_dw < 8)
+   return r;
+
+   val = amdgpu_ib_get_value(ib, 6); //RADEON_VCN_ENGINE_TYPE
+
+   if (val == RADEON_VCN_ENGINE_TYPE_DECODE) {
+   decode_buffer = (struct amdgpu_vcn_decode_buffer *)&ib->ptr[10];
+
+   if (decode_buffer->valid_buf_flag  & 0x1)
+   r = vcn_v4_0_dec_msg(p, 
((u64)decode_buffer->msg_buffer_address_hi) << 32 |
+  
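
The per-message checks in vcn_v4_0_dec_msg() can be summarised with two small predicates. This is only a user-space sketch, not the driver code: the helper names vcn4_codec_runs_anywhere() and vcn4_msg_addr_aligned() are hypothetical, and the codec IDs are simply the three literals the patch compares against (which its comment associates with H264, HEVC and VP9).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if the codec id read from a create message is
 * one of the three values the patch allows on any VCN instance. Anything
 * else would be routed to the first instance by vcn_v4_0_limit_sched().
 */
static bool vcn4_codec_runs_anywhere(uint32_t codec)
{
	return codec == 0x7 || codec == 0x10 || codec == 0x11;
}

/* Hypothetical helper mirroring the patch's "addr & 0x7" alignment test. */
static bool vcn4_msg_addr_aligned(uint64_t addr)
{
	return (addr & 0x7) == 0;
}
```

Any codec id for which the first helper returns false is what triggers the scheduler restriction in the patch.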

Re: [PATCH 1/2] drm/amdgpu: skip SMU FW reloading in runpm BACO case (v2)

2022-07-13 Thread Alex Deucher
On Tue, Jul 12, 2022 at 11:18 PM Guchun Chen  wrote:
>
> SMU is always alive, so it's fine to skip SMU FW reloading
> when runpm resumes from BACO. This avoids some race issues
> when resuming SMU FW.
>
> v2: Exclude boco case if an ASIC supports both boco and baco
>
> Suggested-by: Evan Quan 
> Signed-off-by: Guchun Chen 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> index e9411c28d88b..de59dc051340 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
> @@ -2348,6 +2348,14 @@ static int psp_load_smu_fw(struct psp_context *psp)
> &adev->firmware.ucode[AMDGPU_UCODE_ID_SMC];
> struct amdgpu_ras *ras = psp->ras_context.ras;
>
> +   /* Skip SMU FW reloading in case of using BACO for runpm only,
> +* as SMU is always alive.
> +*/
> +   if (adev->in_runpm &&
> +   !amdgpu_device_supports_boco(adev_to_drm(adev)) &&
> +   amdgpu_device_supports_baco(adev_to_drm(adev)))

I think this would be better as:
if (adev->in_runpm && (adev->pm.rpm_mode != AMDGPU_RUNPM_BOCO))
or something like that.

Alex

> +   return 0;
> +
> if (!ucode->fw || amdgpu_sriov_vf(psp->adev))
> return 0;
>
> --
> 2.17.1
>
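
For reference, the condition suggested above can be written as a standalone predicate. This is a sketch under assumptions, not the driver code: the enum is a stand-in whose enumerator names echo the AMDGPU_RUNPM_* names used in this thread (the numeric values are arbitrary), and skip_smu_fw_reload() is a hypothetical name.

```c
#include <stdbool.h>

/* Stand-in for the driver's runtime PM mode; values are arbitrary. */
enum rpm_mode_sketch {
	RUNPM_NONE_SKETCH,
	RUNPM_BOCO_SKETCH,
	RUNPM_BACO_SKETCH,
};

/*
 * Hypothetical predicate in the suggested form: skip the SMU FW reload
 * whenever we are in a runtime PM cycle that is not using BOCO.
 */
static bool skip_smu_fw_reload(bool in_runpm, enum rpm_mode_sketch mode)
{
	return in_runpm && mode != RUNPM_BOCO_SKETCH;
}
```

Keying the check on the selected mode rather than on the two supports_boco()/supports_baco() queries keeps it consistent if the BACO/BOCO selection logic changes later.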


[PATCH] drm/amd/display: make retrieve_dmi_info() static

2022-07-13 Thread Alex Deucher
It's not used outside of amdgpu_dm.c.

Reported-by: kernel test robot 
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 21aec55abd1a..c03f300851fa 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -1446,7 +1446,7 @@ static const struct dmi_system_id 
hpd_disconnect_quirk_table[] = {
{}
 };
 
-void retrieve_dmi_info(struct amdgpu_display_manager *dm)
+static void retrieve_dmi_info(struct amdgpu_display_manager *dm)
 {
const struct dmi_system_id *dmi_id;
 
-- 
2.35.3



[PATCH] drm/amd/debugfs: Expose GFXOFF state to userspace

2022-07-13 Thread André Almeida
GFXOFF has two different "state" values: one that defines whether the GPU is
allowed/disallowed to enter GFXOFF, usually called state; and another
that defines whether GFXOFF is currently in use, usually called status.
Even when GFXOFF is allowed, the GPU firmware can decide not to use it,
depending on the GPU load.

Userspace can allow/disallow GPUs to enter into GFXOFF via debugfs. The
kernel maintains a counter of requests for GFXOFF (gfx_off_req_count)
that should be decreased to allow GFXOFF and increased to disallow.

The issue with this interface is that userspace can't be sure if GFXOFF
is currently allowed. Even by checking the amdgpu_gfxoff file, one might get
an ambiguous 2, which means that the GPU is currently out of GFXOFF, but that
can be either because it's currently disallowed or because it's allowed
but not in use given the current GPU load. Userspace then has to
rely on the fact that GFXOFF is enabled by default on boot and track
this information itself.

To make userspace life easier and GFXOFF more reliable, return the
current state of GFXOFF to userspace when reading amdgpu_gfxoff with the
same semantics of writing: 0 means not allowed, not 0 means allowed.

Expose the current status of GFXOFF through a new file,
amdgpu_gfxoff_status.

Signed-off-by: André Almeida 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 49 -
 1 file changed, 47 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index f3b3c688e4e7..e2eec985adb3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -1117,13 +1117,50 @@ static ssize_t amdgpu_debugfs_gfxoff_read(struct file 
*f, char __user *buf,
}
 
while (size) {
-   uint32_t value;
+   u32 value = adev->gfx.gfx_off_state;
+
+   r = put_user(value, (u32 *)buf);
+   if (r)
+   goto out;
+
+   result += 4;
+   buf += 4;
+   *pos += 4;
+   size -= 4;
+   }
+
+   r = result;
+out:
+   pm_runtime_mark_last_busy(adev_to_drm(adev)->dev);
+   pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
+
+   return r;
+}
+
+static ssize_t amdgpu_debugfs_gfxoff_status_read(struct file *f, char __user 
*buf,
+size_t size, loff_t *pos)
+{
+   struct amdgpu_device *adev = file_inode(f)->i_private;
+   ssize_t result = 0;
+   int r;
+
+   if (size & 0x3 || *pos & 0x3)
+   return -EINVAL;
+
+   r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
+   if (r < 0) {
+   pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
+   return r;
+   }
+
+   while (size) {
+   u32 value;
 
r = amdgpu_get_gfx_off_status(adev, &value);
if (r)
goto out;
 
-   r = put_user(value, (uint32_t *)buf);
+   r = put_user(value, (u32 *)buf);
if (r)
goto out;
 
@@ -1206,6 +1243,12 @@ static const struct file_operations 
amdgpu_debugfs_gfxoff_fops = {
.llseek = default_llseek
 };
 
+static const struct file_operations amdgpu_debugfs_gfxoff_status_fops = {
+   .owner = THIS_MODULE,
+   .read = amdgpu_debugfs_gfxoff_status_read,
+   .llseek = default_llseek
+};
+
 static const struct file_operations *debugfs_regs[] = {
&amdgpu_debugfs_regs_fops,
&amdgpu_debugfs_regs2_fops,
@@ -1217,6 +1260,7 @@ static const struct file_operations *debugfs_regs[] = {
&amdgpu_debugfs_wave_fops,
&amdgpu_debugfs_gpr_fops,
&amdgpu_debugfs_gfxoff_fops,
+   &amdgpu_debugfs_gfxoff_status_fops,
 };
 
 static const char *debugfs_regs_names[] = {
@@ -1230,6 +1274,7 @@ static const char *debugfs_regs_names[] = {
"amdgpu_wave",
"amdgpu_gpr",
"amdgpu_gfxoff",
+   "amdgpu_gfxoff_status",
 };
 
 /**
-- 
2.37.0
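
A userspace consumer of the two files might look roughly like the sketch below. It is an illustration only: read_u32_file() and gfxoff_state_name() are hypothetical helpers, and the location of the files (typically something like /sys/kernel/debug/dri/<N>/amdgpu_gfxoff, depending on where debugfs is mounted) is an assumption rather than something this patch defines.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: read one 32-bit value from the start of a file.
 * Returns 0 on success and -1 on any error.
 */
static int read_u32_file(const char *path, uint32_t *out)
{
	FILE *f = fopen(path, "rb");
	size_t n;

	if (!f)
		return -1;
	n = fread(out, sizeof(*out), 1, f);
	fclose(f);
	return n == 1 ? 0 : -1;
}

/* Per the commit message: 0 means not allowed, anything else means allowed. */
static const char *gfxoff_state_name(uint32_t state)
{
	return state ? "allowed" : "not allowed";
}
```

The same reader could be pointed at amdgpu_gfxoff_status to obtain the status value, whose encoding is left to amdgpu_get_gfx_off_status() and is not interpreted here.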



Re: [PATCH] x86/configs: Update defconfig with peer-to-peer configs

2022-07-13 Thread Felix Kuehling

On 2022-07-08 19:17, Ramesh Errabolu wrote:

 - Update defconfig for PCI_P2PDMA
 - Update defconfig for DMABUF_MOVE_NOTIFY
 - Update defconfig for HSA_AMD_P2P
---


The patch is missing a Signed-off-by. With that fixed

Reviewed-by: Felix Kuehling 




Notes:
 Following procedure was applied:
 make rock-dbg_defconfig
 make menuconfig
 Enable PCI_P2PDMA
 Enable DMABUF_MOVE_NOTIFY
 Enable HSA_AMD_P2P
 make savedefconfig
 cp defconfig rock-dbg_defconfig
 commit changes

  arch/x86/configs/rock-dbg_defconfig | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/arch/x86/configs/rock-dbg_defconfig 
b/arch/x86/configs/rock-dbg_defconfig
index 406fdfaceb55..0ad80a8c8eab 100644
--- a/arch/x86/configs/rock-dbg_defconfig
+++ b/arch/x86/configs/rock-dbg_defconfig
@@ -303,6 +303,7 @@ CONFIG_PCIEAER=y
  CONFIG_PCI_REALLOC_ENABLE_AUTO=y
  CONFIG_PCI_STUB=y
  CONFIG_PCI_IOV=y
+CONFIG_PCI_P2PDMA=y
  CONFIG_HOTPLUG_PCI=y
  CONFIG_RAPIDIO=y
  CONFIG_RAPIDIO_DMA_ENGINE=y
@@ -417,6 +418,7 @@ CONFIG_DRM_AMDGPU=m
  CONFIG_DRM_AMDGPU_SI=y
  CONFIG_DRM_AMDGPU_CIK=y
  CONFIG_HSA_AMD=y
+CONFIG_HSA_AMD_P2P=y
  CONFIG_DRM_AST=m
  CONFIG_FB=y
  CONFIG_BACKLIGHT_CLASS_DEVICE=y
@@ -453,6 +455,7 @@ CONFIG_LEDS_TRIGGERS=y
  CONFIG_RTC_CLASS=y
  # CONFIG_RTC_HCTOSYS is not set
  CONFIG_DMADEVICES=y
+CONFIG_DMABUF_MOVE_NOTIFY=y
  # CONFIG_X86_PLATFORM_DEVICES is not set
  CONFIG_AMD_IOMMU=y
  CONFIG_INTEL_IOMMU=y


Re: [PATCH] drm/amdkfd: Remove Align VRAM allocations to 1MB on APU ASIC

2022-07-13 Thread Felix Kuehling



On 2022-07-13 at 05:14, shikai guo wrote:

From: Shikai Guo 

While executing KFDMemoryTest.MMBench, the test case allocates 4KB of memory
1000 times.
Each time, user space is given 2MB of memory. APU VRAM is 512MB, so there is not
enough memory to be allocated.
So the 2MB alignment feature is not suitable for APUs.

NAK. We can try to make the estimate of available VRAM more accurate. 
But in the end, this comes down to limitations of the VRAM manager and 
how it handles memory fragmentation.


A large discrepancy between total VRAM and available VRAM can have a few 
reasons:


 * Big system memory means we need to reserve more space for page tables
 * Many small allocations causing lots of fragmentation. This may be
   the result of memory leaks in previous tests

This patch can "fix" a situation where a leak caused excessive 
fragmentation. But that just papers over the leak. And it will cause the 
opposite problem for the new AvailableMemory test that checks that we 
can really allocate as much memory as we promised.


Regards,
  Felix




guoshikai@guoshikai-MayanKD-RMB:~/linth/libhsakmt/tests/kfdtest/build$ 
./kfdtest --gtest_filter=KFDMemoryTest.MMBench
[  ] Profile: Full Test
[  ] HW capabilities: 0x9
Note: Google Test filter = KFDMemoryTest.MMBench
[==] Running 1 test from 1 test case.
[--] Global test environment set-up.
[--] 1 test from KFDMemoryTest
[ RUN  ] KFDMemoryTest.MMBench
[  ] Found VRAM of 512MB.
[  ] Available VRAM 328MB.
[  ] Test (avg. ns) alloc   mapOne  umapOne   mapAll  umapAll   
  free
[  ] 
--
[  ]   4K-SysMem-noSDMA 2656110350 5212 3787 3981   
 12372
[  ]  64K-SysMem-noSDMA 42864 6648 3973 5223 3843   
 15100
[  ]   2M-SysMem-noSDMA31290612614 4390 6254 4790   
 70260
[  ]  32M-SysMem-noSDMA   4417812   130437216259768718500   
929562
[  ]   1G-SysMem-noSDMA 132161000  2738000   583000  2181000   499000 
39091000
[  ] 
--
/home/guoshikai/linth/libhsakmt/tests/kfdtest/src/KFDMemoryTest.cpp:922: Failure
Value of: (hsaKmtAllocMemory(allocNode, bufSize, memFlags, [i]))
   Actual: 6
Expected: HSAKMT_STATUS_SUCCESS
Which is: 0
[  FAILED  ] KFDMemoryTest.MMBench (749 ms)

Fix this issue by treating APUs and dGPUs differently.

Signed-off-by: ruili ji 
Signed-off-by: shikai guo 
---
  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c   | 18 +-
  1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index d1657de5f875..2ad2cd5e3e8b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -115,7 +115,9 @@ void amdgpu_amdkfd_reserve_system_mem(uint64_t size)
   * compromise that should work in most cases without reserving too
   * much memory for page tables unnecessarily (factor 16K, >> 14).
   */
-#define ESTIMATE_PT_SIZE(mem_size) max(((mem_size) >> 14), 
AMDGPU_VM_RESERVED_VRAM)
+
+#define ESTIMATE_PT_SIZE(adev, mem_size)   (adev->flags & AMD_IS_APU) ? \
+(mem_size >> 14) : max(((mem_size) >> 14), 
AMDGPU_VM_RESERVED_VRAM)
  
  static size_t amdgpu_amdkfd_acc_size(uint64_t size)

  {
@@ -142,7 +144,7 @@ static int amdgpu_amdkfd_reserve_mem_limit(struct 
amdgpu_device *adev,
uint64_t size, u32 alloc_flag)
  {
uint64_t reserved_for_pt =
-   ESTIMATE_PT_SIZE(amdgpu_amdkfd_total_mem_size);
+   ESTIMATE_PT_SIZE(adev, amdgpu_amdkfd_total_mem_size);
size_t acc_size, system_mem_needed, ttm_mem_needed, vram_needed;
int ret = 0;
  
@@ -156,12 +158,15 @@ static int amdgpu_amdkfd_reserve_mem_limit(struct amdgpu_device *adev,

system_mem_needed = acc_size;
ttm_mem_needed = acc_size;
  
+		if (adev->flags & AMD_IS_APU)

+   vram_needed = size;
+   else
/*
 * Conservatively round up the allocation requirement to 2 MB
 * to avoid fragmentation caused by 4K allocations in the tail
 * 2M BO chunk.
 */
-   vram_needed = ALIGN(size, VRAM_ALLOCATION_ALIGN);
+   vram_needed = ALIGN(size, VRAM_ALLOCATION_ALIGN);
} else if (alloc_flag & KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) {
system_mem_needed = acc_size + size;
ttm_mem_needed = acc_size;
@@ -220,7 +225,10 @@ static void unreserve_mem_limit(struct amdgpu_device *adev,
} else if (alloc_flag & KFD_IOC_ALLOC_MEM_FLAGS_VRAM) {
kfd_mem_limit.system_mem_used -= acc_size;
kfd_mem_limit.ttm_mem_used -= acc_size;

Re: [PATCH] drm/amdkfd: Remove Align VRAM allocations to 1MB on APU ASIC

2022-07-13 Thread Alex Deucher
On Wed, Jul 13, 2022 at 5:14 AM shikai guo  wrote:
>
> From: Shikai Guo 
>
> While executing KFDMemoryTest.MMBench, the test case allocates 4KB of
> memory 1000 times.
> Each time, user space is given 2MB of memory. APU VRAM is 512MB, so there is
> not enough memory to be allocated.
> So the 2MB alignment feature is not suitable for APUs.

Wouldn't it be better to decide based on vram size rather than APU vs
dGPU?  some APUs have large carve outs.

Alex

>
> guoshikai@guoshikai-MayanKD-RMB:~/linth/libhsakmt/tests/kfdtest/build$ 
> ./kfdtest --gtest_filter=KFDMemoryTest.MMBench
> [  ] Profile: Full Test
> [  ] HW capabilities: 0x9
> Note: Google Test filter = KFDMemoryTest.MMBench
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from KFDMemoryTest
> [ RUN  ] KFDMemoryTest.MMBench
> [  ] Found VRAM of 512MB.
> [  ] Available VRAM 328MB.
> [  ] Test (avg. ns) alloc   mapOne  umapOne   mapAll  umapAll 
> free
> [  ] 
> --
> [  ]   4K-SysMem-noSDMA 2656110350 5212 3787 3981 
>12372
> [  ]  64K-SysMem-noSDMA 42864 6648 3973 5223 3843 
>15100
> [  ]   2M-SysMem-noSDMA31290612614 4390 6254 4790 
>70260
> [  ]  32M-SysMem-noSDMA   4417812   130437216259768718500 
>   929562
> [  ]   1G-SysMem-noSDMA 132161000  2738000   583000  2181000   499000 
> 39091000
> [  ] 
> --
> /home/guoshikai/linth/libhsakmt/tests/kfdtest/src/KFDMemoryTest.cpp:922: 
> Failure
> Value of: (hsaKmtAllocMemory(allocNode, bufSize, memFlags, [i]))
>   Actual: 6
> Expected: HSAKMT_STATUS_SUCCESS
> Which is: 0
> [  FAILED  ] KFDMemoryTest.MMBench (749 ms)
>
> Fix this issue by treating APUs and dGPUs differently.
>
> Signed-off-by: ruili ji 
> Signed-off-by: shikai guo 
> ---
>  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c   | 18 +-
>  1 file changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index d1657de5f875..2ad2cd5e3e8b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -115,7 +115,9 @@ void amdgpu_amdkfd_reserve_system_mem(uint64_t size)
>   * compromise that should work in most cases without reserving too
>   * much memory for page tables unnecessarily (factor 16K, >> 14).
>   */
> -#define ESTIMATE_PT_SIZE(mem_size) max(((mem_size) >> 14), 
> AMDGPU_VM_RESERVED_VRAM)
> +
> +#define ESTIMATE_PT_SIZE(adev, mem_size)   (adev->flags & AMD_IS_APU) ? \
> +(mem_size >> 14) : max(((mem_size) >> 14), 
> AMDGPU_VM_RESERVED_VRAM)
>
>  static size_t amdgpu_amdkfd_acc_size(uint64_t size)
>  {
> @@ -142,7 +144,7 @@ static int amdgpu_amdkfd_reserve_mem_limit(struct 
> amdgpu_device *adev,
> uint64_t size, u32 alloc_flag)
>  {
> uint64_t reserved_for_pt =
> -   ESTIMATE_PT_SIZE(amdgpu_amdkfd_total_mem_size);
> +   ESTIMATE_PT_SIZE(adev, amdgpu_amdkfd_total_mem_size);
> size_t acc_size, system_mem_needed, ttm_mem_needed, vram_needed;
> int ret = 0;
>
> @@ -156,12 +158,15 @@ static int amdgpu_amdkfd_reserve_mem_limit(struct 
> amdgpu_device *adev,
> system_mem_needed = acc_size;
> ttm_mem_needed = acc_size;
>
> +   if (adev->flags & AMD_IS_APU)
> +   vram_needed = size;
> +   else
> /*
>  * Conservatively round up the allocation requirement to 2 MB
>  * to avoid fragmentation caused by 4K allocations in the tail
>  * 2M BO chunk.
>  */
> -   vram_needed = ALIGN(size, VRAM_ALLOCATION_ALIGN);
> +   vram_needed = ALIGN(size, VRAM_ALLOCATION_ALIGN);
> } else if (alloc_flag & KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) {
> system_mem_needed = acc_size + size;
> ttm_mem_needed = acc_size;
> @@ -220,7 +225,10 @@ static void unreserve_mem_limit(struct amdgpu_device 
> *adev,
> } else if (alloc_flag & KFD_IOC_ALLOC_MEM_FLAGS_VRAM) {
> kfd_mem_limit.system_mem_used -= acc_size;
> kfd_mem_limit.ttm_mem_used -= acc_size;
> -   adev->kfd.vram_used -= ALIGN(size, VRAM_ALLOCATION_ALIGN);
> +   if (adev->flags & AMD_IS_APU)
> +   adev->kfd.vram_used -= size;
> +   else
> +   adev->kfd.vram_used -= ALIGN(size, 
> VRAM_ALLOCATION_ALIGN);
> } else if (alloc_flag & KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) {
> kfd_mem_limit.system_mem_used -= (acc_size + 

Re: [Bug][5.19-rc0] Between commits fdaf9a5840ac and babf0bb978e3 GPU stopped entering in graphic mode.

2022-07-13 Thread Mikhail Gavrilov
On Sat, Jul 9, 2022 at 5:10 PM Mikhail Gavrilov
 wrote:

> Hi Christian,
> if you read my initial post, you will see that I tried to bisect the issue.
> But it is very problematic because at each step I see different symptoms.
> And if we mark the steps with different symptoms as skip, we end up with a
> lot of possible commits:
> Here is my bisect from the initial post: https://pastebin.com/AhLMNfyv

> [8.291298] [ cut here ]
> [8.291309] kernel BUG at mm/page_alloc.c:1329!
> [8.291324] invalid opcode:  [#1] PREEMPT SMP NOPTI
> [8.291328] CPU: 8 PID: 599 Comm: systemd-udevd Not tainted
> 5.18.0-rc2-003-790b45f1bc6736a8dd48ba5731b6871e0217311e+ #361
> [8.291333] Hardware name: System manufacturer System Product
> Name/ROG STRIX X570-I GAMING, BIOS 4403 04/27/2022
> [8.291338] RIP: 0010:free_pcp_prepare+0x58d/0x5a0

There will be a 5.19 release soon. I haven't got a working kernel
newer than the fdaf9a5840ac commit on any machine (all machines have
AMD graphics).

Bisecting the kernel with the mutex issue treated as the "bad" state
and all other non-working states as "skip" did not lead to anything
useful.

Even if we treat as "bad" every commit on which the kernel does not
work, this does not lead to anything useful either.
Here is that attempt:
$ git bisect log
git bisect start
# status: waiting for both good and bad commits
# good: [fdaf9a5840acaab18694a19e0eb0aa51162d] Merge tag
'folio-5.19' of git://git.infradead.org/users/willy/pagecache
git bisect good fdaf9a5840acaab18694a19e0eb0aa51162d
# status: waiting for bad commit, 1 good commit known
# bad: [babf0bb978e3c9fce6c4eba6b744c8754fd43d8e] Merge tag
'xfs-5.19-for-linus' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
git bisect bad babf0bb978e3c9fce6c4eba6b744c8754fd43d8e

# 01 - good: [86c87bea6b42100c67418af690919c44de6ede6e] Merge tag
'devicetree-for-5.19' of
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
git bisect good 86c87bea6b42100c67418af690919c44de6ede6e

# 02 - observed initial problem with mutex
# bad: [43ab20c599f4dc4c3972a8386ef4ca3943b5f9cd] drm/i915/gt: Fix
build error without CONFIG_PM
git bisect bad 43ab20c599f4dc4c3972a8386ef4ca3943b5f9cd

# 03 - observed invalid opcode:  [#1] PREEMPT SMP NOPTI - RIP:
0010:free_pcp_prepare+0x58d/0x5a0
# bad: [790b45f1bc6736a8dd48ba5731b6871e0217311e] drm/i915/bios: Parse
the seamless DRRS min refresh rate
git bisect bad 790b45f1bc6736a8dd48ba5731b6871e0217311e

# 04 - observed invalid opcode:  [#1] PREEMPT SMP NOPTI - RIP:
0010:free_pcp_prepare+0x455/0x650
# bad: [c6ed9f66eb70aeaac9998bd3552ada740d90e20c]
drm/nouveau/gr/gf100-: change gf108_gr_fwif from global to static
git bisect bad c6ed9f66eb70aeaac9998bd3552ada740d90e20c

# 05 good: [3123109284176b1532874591f7c81f3837bbdc17] Linux 5.18-rc1
git bisect good 3123109284176b1532874591f7c81f3837bbdc17

# 06 good: [711c7adc4687250deb550ee8a6994203f817b2ca] drm: exynos:
dsi: Use drm panel_bridge API
git bisect good 711c7adc4687250deb550ee8a6994203f817b2ca

# 07 - observed invalid opcode:  [#1] PREEMPT SMP NOPTI - RIP:
0010:free_pcp_prepare+0x35e/0x410
# bad: [047a1b877ed48098bed71fcfb1d4891e1b54441d] dma-buf &
drm/amdgpu: remove dma_resv workaround
git bisect bad 047a1b877ed48098bed71fcfb1d4891e1b54441d

# 08 good: [644704740b8282c9ee9483a38666ee4a4561c37c] drm/amdgpu: use
dma_resv_for_each_fence for CS workaround v2
git bisect good 644704740b8282c9ee9483a38666ee4a4561c37c

# 09 - observed invalid opcode:  [#1] PREEMPT SMP NOPTI - RIP:
0010:free_pcp_prepare+0x35e/0x410
# bad: [61fe0ab26e36998cebec48805d6873e31f0d79d7] drm/gma500: fix a
missing break in psb_intel_crtc_mode_set
git bisect bad 61fe0ab26e36998cebec48805d6873e31f0d79d7

# 10 good: [1c3b2a27def609473ed13b1cd668cb10deab49b4] drm/nouveau/clk:
Fix an incorrect NULL check on list iterator
git bisect good 1c3b2a27def609473ed13b1cd668cb10deab49b4

# 11 - observed invalid opcode:  [#1] PREEMPT SMP NOPTI - RIP:
0010:free_pcp_prepare+0x35e/0x410
# bad: [aa46154355e1e81ef746470d2e88bdb283508bff] drm/ingenic: Add
ingenic_drm_bridge_atomic_enable and disable
git bisect bad aa46154355e1e81ef746470d2e88bdb283508bff

# 12 good: [71d637823cac7748079a912e0373476c7cf6f985] dma-buf: finally
make dma_resv_excl_fence private v2
git bisect good 71d637823cac7748079a912e0373476c7cf6f985

# 13 - observed invalid opcode:  [#1] PREEMPT SMP NOPTI - RIP:
0010:free_pcp_prepare+0x35e/0x410
# bad: [33f2069fb6a9c2d6509accc39521d3f4d6369576] drm/nouveau: support
more than one write fence in fenv50_wndw_prepare_fb
git bisect bad 33f2069fb6a9c2d6509accc39521d3f4d6369576

# 14 - observed invalid opcode:  [#1] PREEMPT SMP NOPTI - RIP:
0010:free_pcp_prepare+0x35e/0x410
# bad: [9cbbd694a58bdf24def2462276514c90cab7cf80] Merge drm/drm-next
into drm-misc-next
git bisect bad 9cbbd694a58bdf24def2462276514c90cab7cf80

# first bad commit: [9cbbd694a58bdf24def2462276514c90cab7cf80] Merge
drm/drm-next into drm-misc-next


We need an alternative way to find the problem.

[PATCH] drm/amd/display: Remove unnecessary NULL check in commit_planes_for_stream()

2022-07-13 Thread Dan Carpenter
Smatch complains that:

drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:3369 
commit_planes_for_stream()
warn: variable dereferenced before check 'stream' (see line 3114)

The 'stream' pointer cannot be NULL and the check can be removed.

Signed-off-by: Dan Carpenter 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index dc2c59995a19..76f9af2c5e19 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -3366,7 +3366,7 @@ static void commit_planes_for_stream(struct dc *dc,
top_pipe_to_program->stream_res.tg,
CRTC_STATE_VACTIVE);
 
-   if (stream && should_use_dmub_lock(stream->link)) {
+   if (should_use_dmub_lock(stream->link)) {
union dmub_hw_lock_flags hw_locks = { 0 };
struct dmub_hw_lock_inst_flags inst_flags = { 0 
};
 
-- 
2.35.1
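
The class of warning Smatch reports here can be shown with a tiny standalone example. This is an illustrative sketch only; struct link_sketch, struct stream_sketch and both functions are hypothetical and do not correspond to the DC structures.

```c
#include <stddef.h>

struct link_sketch {
	int uses_dmub_lock;
};

struct stream_sketch {
	struct link_sketch *link;
};

/*
 * Pattern that gets flagged: 's' is dereferenced first, so the later NULL
 * test can never be what protects the dereference.
 */
static int flagged_pattern(struct stream_sketch *s)
{
	struct link_sketch *link = s->link;	/* dereference happens here */

	if (s && link->uses_dmub_lock)		/* check comes too late */
		return 1;
	return 0;
}

/* After the cleanup: the caller guarantees 's' is non-NULL. */
static int cleaned_up(struct stream_sketch *s)
{
	return s->link->uses_dmub_lock ? 1 : 0;
}
```

For every non-NULL input both functions return the same value, which is why dropping the check does not change behaviour.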



Re: [PATCH v2] drm/amdgpu: Check BO's requested pinning domains against its preferred_domains

2022-07-13 Thread Christian König

On 12.07.22 at 18:30, sunpeng...@amd.com wrote:

From: Leo Li 

When pinning a buffer, we should check to see if there are any
additional restrictions imposed by bo->preferred_domains. This will
prevent the BO from being moved to an invalid domain when pinning.

For example, this can happen if the user requests to create a BO in GTT
domain for display scanout. amdgpu_dm will allow pinning to either VRAM
or GTT domains, since DCN can scan out from either. However, in
amdgpu_bo_pin_restricted(), pinning to VRAM is preferred if there is
adequate carveout. This can lead to pinning to VRAM despite the user
requesting GTT placement for the BO.

v2: Allow the kernel to override the domain, which can happen when
 exporting a BO to a V4L camera (for example).

Signed-off-by: Leo Li 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 8a7b0f6162da..bbd3b8b14cfb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -883,6 +883,10 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 domain,
if (WARN_ON_ONCE(min_offset > max_offset))
return -EINVAL;
  
+	/* Check domain to be pinned to against preferred domains */
+	if (bo->preferred_domains & domain)
+		domain = bo->preferred_domains & domain;
+
/* A shared bo cannot be migrated to VRAM */
if (bo->tbo.base.import_attach) {
if (domain & AMDGPU_GEM_DOMAIN_GTT)




[PATCH] drm/amdkfd: Remove Align VRAM allocations to 1MB on APU ASIC

2022-07-13 Thread shikai guo
From: Shikai Guo 

While executing KFDMemoryTest.MMBench, the test case allocates 4 KB of memory
1000 times. Each time, user space gets 2 MB of memory because of the alignment.
An APU has only 512 MB of VRAM, so there is not enough memory to satisfy all of
the allocations. The 2 MB alignment is therefore not suitable for APUs.
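
To make the arithmetic concrete, here is a rough sketch using the figures from this report: 4 KB requests, a 2 MB accounting granularity, and the 328 MB of available VRAM printed by the test (treated here as MiB, which is an assumption). The helper names are hypothetical, and the real limit check in the driver also subtracts a page-table reserve that this sketch ignores.

```c
#include <stdint.h>

#define MIB (1024ULL * 1024ULL)

/* Round 'size' up to a multiple of 'align' (align must be a power of two). */
static uint64_t align_up(uint64_t size, uint64_t align)
{
	return (size + align - 1) & ~(align - 1);
}

/*
 * Number of allocations of 'size' bytes that fit into 'avail' bytes when
 * each one is accounted as align_up(size, align).
 */
static uint64_t max_allocs(uint64_t avail, uint64_t size, uint64_t align)
{
	return avail / align_up(size, align);
}
```

Under these assumptions, max_allocs(328 * MIB, 4096, 2 * MIB) is 164, well short of the 1000 allocations the test requests, while accounting each request at its 4 KB size leaves room for 83968.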

guoshikai@guoshikai-MayanKD-RMB:~/linth/libhsakmt/tests/kfdtest/build$ ./kfdtest --gtest_filter=KFDMemoryTest.MMBench
[  ] Profile: Full Test
[  ] HW capabilities: 0x9
Note: Google Test filter = KFDMemoryTest.MMBench
[==] Running 1 test from 1 test case.
[--] Global test environment set-up.
[--] 1 test from KFDMemoryTest
[ RUN  ] KFDMemoryTest.MMBench
[  ] Found VRAM of 512MB.
[  ] Available VRAM 328MB.
[  ] Test (avg. ns)          alloc   mapOne  umapOne   mapAll  umapAll      free
[  ] ---------------------------------------------------------------------------
[  ]   4K-SysMem-noSDMA      26561    10350     5212     3787     3981     12372
[  ]  64K-SysMem-noSDMA      42864     6648     3973     5223     3843     15100
[  ]   2M-SysMem-noSDMA     312906    12614     4390     6254     4790     70260
[  ]  32M-SysMem-noSDMA    4417812   130437    21625    97687    18500    929562
[  ]   1G-SysMem-noSDMA  132161000  2738000   583000  2181000   499000  39091000
[  ] ---------------------------------------------------------------------------
/home/guoshikai/linth/libhsakmt/tests/kfdtest/src/KFDMemoryTest.cpp:922: Failure
Value of: (hsaKmtAllocMemory(allocNode, bufSize, memFlags, [i]))
  Actual: 6
Expected: HSAKMT_STATUS_SUCCESS
Which is: 0
[  FAILED  ] KFDMemoryTest.MMBench (749 ms)

Fix this issue by handling APUs and dGPUs differently.
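
As a rough illustration of the approach, the sketch below keeps the APU/dGPU decision in a single helper so that the reserve and unreserve paths cannot drift apart. All names are hypothetical and the 2 MB constant is taken from the code comment in the patch; this is not the driver code itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* 2 MB granularity, per the comment in the patch (assumed value). */
#define VRAM_ALIGN_SKETCH (2ULL << 20)

/* Hypothetical accounting state, standing in for adev->kfd.vram_used. */
struct vram_account {
	uint64_t used;
};

/* Size charged against the VRAM limit for one allocation. */
static uint64_t vram_accounted_size(bool is_apu, uint64_t size)
{
	if (is_apu)
		return size;	/* APU: charge the exact size */

	/* dGPU: round up to avoid fragmentation in the tail chunk */
	return (size + VRAM_ALIGN_SKETCH - 1) & ~(VRAM_ALIGN_SKETCH - 1);
}

static void vram_reserve(struct vram_account *a, bool is_apu, uint64_t size)
{
	a->used += vram_accounted_size(is_apu, size);
}

static void vram_unreserve(struct vram_account *a, bool is_apu, uint64_t size)
{
	a->used -= vram_accounted_size(is_apu, size);
}
```

Because both paths call the same helper, a reserve followed by an unreserve of the same size always returns the counter to its starting value, on APU and dGPU alike.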

Signed-off-by: ruili ji 
Signed-off-by: shikai guo 
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c   | 18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index d1657de5f875..2ad2cd5e3e8b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -115,7 +115,9 @@ void amdgpu_amdkfd_reserve_system_mem(uint64_t size)
  * compromise that should work in most cases without reserving too
  * much memory for page tables unnecessarily (factor 16K, >> 14).
  */
-#define ESTIMATE_PT_SIZE(mem_size) max(((mem_size) >> 14), AMDGPU_VM_RESERVED_VRAM)
+
+#define ESTIMATE_PT_SIZE(adev, mem_size)   (adev->flags & AMD_IS_APU) ? \
+(mem_size >> 14) : max(((mem_size) >> 14), AMDGPU_VM_RESERVED_VRAM)
 
 static size_t amdgpu_amdkfd_acc_size(uint64_t size)
 {
@@ -142,7 +144,7 @@ static int amdgpu_amdkfd_reserve_mem_limit(struct amdgpu_device *adev,
uint64_t size, u32 alloc_flag)
 {
uint64_t reserved_for_pt =
-   ESTIMATE_PT_SIZE(amdgpu_amdkfd_total_mem_size);
+   ESTIMATE_PT_SIZE(adev, amdgpu_amdkfd_total_mem_size);
size_t acc_size, system_mem_needed, ttm_mem_needed, vram_needed;
int ret = 0;
 
@@ -156,12 +158,15 @@ static int amdgpu_amdkfd_reserve_mem_limit(struct amdgpu_device *adev,
system_mem_needed = acc_size;
ttm_mem_needed = acc_size;
 
+   if (adev->flags & AMD_IS_APU)
+   vram_needed = size;
+   else
/*
 * Conservatively round up the allocation requirement to 2 MB
 * to avoid fragmentation caused by 4K allocations in the tail
 * 2M BO chunk.
 */
-   vram_needed = ALIGN(size, VRAM_ALLOCATION_ALIGN);
+   vram_needed = ALIGN(size, VRAM_ALLOCATION_ALIGN);
} else if (alloc_flag & KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) {
system_mem_needed = acc_size + size;
ttm_mem_needed = acc_size;
@@ -220,7 +225,10 @@ static void unreserve_mem_limit(struct amdgpu_device *adev,
} else if (alloc_flag & KFD_IOC_ALLOC_MEM_FLAGS_VRAM) {
kfd_mem_limit.system_mem_used -= acc_size;
kfd_mem_limit.ttm_mem_used -= acc_size;
-   adev->kfd.vram_used -= ALIGN(size, VRAM_ALLOCATION_ALIGN);
+   if (adev->flags & AMD_IS_APU)
+   adev->kfd.vram_used -= size;
+   else
+   adev->kfd.vram_used -= ALIGN(size, VRAM_ALLOCATION_ALIGN);
} else if (alloc_flag & KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) {
kfd_mem_limit.system_mem_used -= (acc_size + size);
kfd_mem_limit.ttm_mem_used -= acc_size;
@@ -1666,7 +1674,7 @@ int amdgpu_amdkfd_criu_resume(void *p)
 size_t amdgpu_amdkfd_get_available_memory(struct amdgpu_device *adev)
 {
uint64_t reserved_for_pt =
-   ESTIMATE_PT_SIZE(amdgpu_amdkfd_total_mem_size);
+   ESTIMATE_PT_SIZE(adev, amdgpu_amdkfd_total_mem_size);
size_t