[PATCH v5] drm/amdkfd: Provide SMI events watch

2020-04-15 Thread Amber Lin
When compute is malfunctioning or performance drops, the system admin will use the SMI (System Management Interface) tool to monitor and diagnose what went wrong. This patch provides an event watch interface for user space to register devices and subscribe to the events they are interested in. After regist

RE: [PATCH] drm/amd/powerplay: fix resume failed as smu table initialize early exit

2020-04-15 Thread Huang, Ray
[AMD Official Use Only - Internal Distribution Only] Reviewed-by: Huang Rui -----Original Message----- From: Liang, Prike Sent: Wednesday, April 15, 2020 11:43 PM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander; Huang, Ray; Liang, Prike Subject: [PATCH] drm/amd/powerplay: fix resu

[pull] amdgpu drm-fixes-5.7

2020-04-15 Thread Alex Deucher
Hi Dave, Daniel, Fixes for 5.7. The following changes since commit 7e7ea24f0b46cd3078bc9af29d1c1aced89d1c8e: drm/amdgpu/display: fix warning when compiling without debugfs (2020-04-08 17:53:11 -0400) are available in the Git repository at: git://people.freedesktop.org/~agd5f/linux tags/am

RE: [PATCH] drm/amdgpu/gmc: Fix spelling mistake.

2020-04-15 Thread Russell, Kent
[AMD Official Use Only - Internal Distribution Only] Reviewed-By: Kent Russell > -----Original Message----- > From: amd-gfx On Behalf Of > Rajneesh Bhardwaj > Sent: Wednesday, April 15, 2020 12:34 PM > To: amd-gfx@lists.freedesktop.org > Cc: Bhardwaj, Rajneesh > Subject: [PATCH] drm/amdgpu/gmc

[PATCH] drm/amdgpu/gmc: Fix spelling mistake.

2020-04-15 Thread Rajneesh Bhardwaj
Fixes a minor typo in the file. Reviewed-by: Christian König Reviewed-by: Alex Deucher Signed-off-by: Rajneesh Bhardwaj --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/driver

Re: [PATCH] drm/amd/powerplay: fix resume failed as smu table initialize early exit

2020-04-15 Thread Deucher, Alexander
[AMD Public Use] Reviewed-by: Alex Deucher From: Liang, Prike Sent: Wednesday, April 15, 2020 11:43 AM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander; Huang, Ray; Liang, Prike Subject: [PATCH] drm/amd/powerplay: fix resume failed as smu table initi

[PATCH] drm/amd/powerplay: fix resume failed as smu table initialize early exit

2020-04-15 Thread Prike Liang
When amdgpu is in the suspend/resume loop, it needs to notify that DPM is disabled; otherwise the SMU table will be left uninitialized and resume will fail. Signed-off-by: Prike Liang Tested-by: Mengbing Wang --- drivers/gpu/drm/amd/powerplay/renoir_ppt.c | 7 ++- 1 file changed, 6 insertions(+), 1 d

Re: [PATCH v4] drm/amdkfd: Provide SMI events watch

2020-04-15 Thread Felix Kuehling
On 2020-04-15 at 9:48 a.m., Deucher, Alexander wrote: > > [AMD Public Use] > > > We use the drm major/minor in all cases. Bump KMS_DRIVER_MINOR in > amdgpu_drv.c and add a note about what was added in the comment. The KFD ioctl API has its own major and minor version defined in include/uapi/li

Re: [PATCH] Revert "drm/amdgpu: use the BAR if possible in amdgpu_device_vram_access v2"

2020-04-15 Thread Kuehling, Felix
[AMD Official Use Only - Internal Distribution Only] The test does not access outside of the allocated memory. But it deliberately crosses a boundary where memory can be allocated non-contiguously. This is meant to catch problems where the access function doesn't handle non-contiguous VRAM allo
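The failure mode described above, an access function that does not handle a request straddling two separately allocated pieces of memory, can be illustrated with a small user-space sketch. Everything here is hypothetical (`CHUNK_WORDS`, `struct chunked_buf`, `chunked_read`); it is not the amdgpu VRAM access code, only an example of splitting one read at each chunk boundary.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_WORDS 4  /* hypothetical chunk size, in 32-bit words */

/* A buffer built from separately allocated chunks (hypothetical layout). */
struct chunked_buf {
    uint32_t **chunks;  /* chunks[i] points to CHUNK_WORDS words */
    size_t nchunks;
};

/*
 * Copy count words starting at word offset pos into dst, splitting the copy
 * wherever it crosses a chunk boundary. Returns the number of words copied,
 * which is less than count if the request runs past the end of the buffer.
 */
size_t chunked_read(const struct chunked_buf *buf, size_t pos,
                    uint32_t *dst, size_t count)
{
    size_t done = 0;

    while (done < count) {
        size_t chunk = pos / CHUNK_WORDS;
        size_t off = pos % CHUNK_WORDS;
        size_t n = CHUNK_WORDS - off;  /* words left in this chunk */

        if (chunk >= buf->nchunks)
            break;
        if (n > count - done)
            n = count - done;

        memcpy(dst + done, buf->chunks[chunk] + off, n * sizeof(uint32_t));
        done += n;
        pos += n;
    }
    return done;
}
```

A test in the spirit of the one described would read a few words on each side of a chunk boundary in a single call and check that both halves come back intact.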

Re: [PATCH] Optimized division operation to shift operation

2020-04-15 Thread Deucher, Alexander
[AMD Public Use] I've gone ahead and dropped the patch. Alex From: Koenig, Christian Sent: Wednesday, April 15, 2020 3:57 AM To: Jani Nikula; Alex Deucher; Bernard Zhao Cc: Sierra Guiza, Alejandro (Alex); Zeng, Oak; Maling list - DRI developers; David Ai

Re: [PATCH v4] drm/amdkfd: Provide SMI events watch

2020-04-15 Thread Deucher, Alexander
[AMD Public Use] We use the drm major/minor in all cases. Bump KMS_DRIVER_MINOR in amdgpu_drv.c and add a note about what was added in the comment. Alex From: Lin, Amber Sent: Wednesday, April 15, 2020 9:36 AM To: Deucher, Alexander; Kuehling, Felix; amd-gf

Re: [PATCH v4] drm/amdkfd: Provide SMI events watch

2020-04-15 Thread Amber Lin
Thank you, Felix. Now I understand that the global client ID opens a hole for potential attackers. I didn't take that into consideration. I'll change that following your advice below. Hi Alex, Thank you for the link. It's helpful. I have a question regarding the versioning. One topi

RE: [PATCH] Optimized division operation to shift operation

2020-04-15 Thread David Laight
From: Christian König > Sent: 15 April 2020 08:57 > On 15.04.20 at 09:41, Jani Nikula wrote: > > On Tue, 14 Apr 2020, Alex Deucher wrote: > >> On Tue, Apr 14, 2020 at 9:05 AM Bernard Zhao wrote: > >>> On some processors, the / operate will call the compiler`s div lib, > >>> which is low efficien

Re: [Intel-gfx] [PATCH 4/5] drm/amdgpu: utilize subconnector property for DP through atombios

2020-04-15 Thread Jani Nikula
Alex, Harry, Christian, can you please eyeball this series and see if it makes sense for you? Thanks, Jani. On Tue, 07 Apr 2020, Jeevan B wrote: > From: Oleg Vasilev > > Since DP-specific information is stored in driver's structures, every > driver needs to implement subconnector property by

Re: AMD DC graphics display code enables -mhard-float, -msse, -msse2 without any visible FPU state protection

2020-04-15 Thread Peter Zijlstra
On Fri, Apr 10, 2020 at 04:31:39PM +0200, Christian König wrote: > Can we put this new automated check behind a configuration flag > initially? Or at least make it a warning and not a hard error. I'll try and get the patches merged in mainline objtool but with a flag that isn't used by def

Re: [PATCH] drm/amdgpu/vcn: fix gfxoff issue

2020-04-15 Thread James Zhu
I think this code in amdgpu_vcn.c is currently only for vcn2.0 and above. Why would it affect raven? We need to rerun the video playback test case on renoir to see if it still needs this WA. Thanks! James On 2020-04-15 7:27 a.m., Zhu, Changfeng wrote: [AMD Official Use Only - Internal Distribution

Re: [PATCH] drm/amdgpu/vcn: fix gfxoff issue

2020-04-15 Thread Deucher, Alexander
[AMD Official Use Only - Internal Distribution Only] Do we know if whatever issue was actually fixed on renoir? If not, I'd say just leave it for now. Alex From: amd-gfx on behalf of Zhu, Changfeng Sent: Wednesday, April 15, 2020 7:27 AM To: Zhang, Hawking ;

RE: [PATCH] drm/scheduler: fix drm_sched_get_cleanup_job

2020-04-15 Thread Russell, Kent
[AMD Official Use Only - Internal Distribution Only] It's all good. I pushed a copy to amd-staging-drm-next, reviewed by Andrey. Thanks for pushing it to drm-misc-fixes! Kent > -----Original Message----- > From: Koenig, Christian > Sent: Wednesday, April 15, 2020 6:36 AM > To: Grodzovsky, And

[PATCH AUTOSEL 4.14 23/30] drm/amdkfd: kfree the wrong pointer

2020-04-15 Thread Sasha Levin
From: Jack Zhang [ Upstream commit 3148a6a0ef3cf93570f30a477292768f7eb5d3c3 ] Originally, it kfrees the wrong pointer for mem_obj. It would cause memory leak under stress test. Signed-off-by: Jack Zhang Acked-by: Nirmoy Das Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers

[PATCH AUTOSEL 4.19 28/40] drm/amdkfd: kfree the wrong pointer

2020-04-15 Thread Sasha Levin
From: Jack Zhang [ Upstream commit 3148a6a0ef3cf93570f30a477292768f7eb5d3c3 ] Originally, it kfrees the wrong pointer for mem_obj. It would cause memory leak under stress test. Signed-off-by: Jack Zhang Acked-by: Nirmoy Das Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers

[PATCH AUTOSEL 5.4 61/84] drm/amdkfd: kfree the wrong pointer

2020-04-15 Thread Sasha Levin
From: Jack Zhang [ Upstream commit 3148a6a0ef3cf93570f30a477292768f7eb5d3c3 ] Originally, it kfrees the wrong pointer for mem_obj. It would cause memory leak under stress test. Signed-off-by: Jack Zhang Acked-by: Nirmoy Das Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers

[PATCH AUTOSEL 5.5 079/106] drm/amdkfd: kfree the wrong pointer

2020-04-15 Thread Sasha Levin
From: Jack Zhang [ Upstream commit 3148a6a0ef3cf93570f30a477292768f7eb5d3c3 ] Originally, it kfrees the wrong pointer for mem_obj. It would cause memory leak under stress test. Signed-off-by: Jack Zhang Acked-by: Nirmoy Das Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers

[PATCH AUTOSEL 5.6 091/129] drm/amdkfd: kfree the wrong pointer

2020-04-15 Thread Sasha Levin
From: Jack Zhang [ Upstream commit 3148a6a0ef3cf93570f30a477292768f7eb5d3c3 ] Originally, it kfrees the wrong pointer for mem_obj. It would cause memory leak under stress test. Signed-off-by: Jack Zhang Acked-by: Nirmoy Das Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers
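The truncated excerpts above do not show the diff, so the following is only a user-space sketch of one common way the wrong pointer ends up being freed: confusing an output parameter (`struct mem_obj **out`) with the object it points to. The type and function names are hypothetical, `malloc`/`free` stand in for the kernel allocator, and the assumption that the bug has this shape is mine, not the commit message's.

```c
#include <stdlib.h>

/* Hypothetical type and function names; this is not the amdkfd code. */
struct mem_obj {
    int *range;
};

/*
 * Allocates *out and its range buffer. If fail_second is nonzero the second
 * allocation is treated as failed, which exercises the error path.
 * Returns 0 on success, -1 on failure.
 */
int mem_obj_create(struct mem_obj **out, size_t count, int fail_second)
{
    *out = malloc(sizeof(**out));
    if (!*out)
        return -1;

    (*out)->range = fail_second ? NULL : malloc(count * sizeof(int));
    if (!(*out)->range) {
        /*
         * free(out) would be the bug: out is the address of the caller's
         * variable, not the heap object. Free what was allocated instead.
         */
        free(*out);
        *out = NULL;
        return -1;
    }
    return 0;
}

/* Releases an object returned by mem_obj_create(); NULL is accepted. */
void mem_obj_destroy(struct mem_obj *obj)
{
    if (obj) {
        free(obj->range);
        free(obj);
    }
}
```

Freeing the outer pointer on the error path leaves the real object allocated, which matches the kind of leak that only shows up when the failure path is hit repeatedly, for example under a stress test.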

[PATCH AUTOSEL 5.6 063/129] drm/amd/display: Don't try hdcp1.4 when content_type is set to type1

2020-04-15 Thread Sasha Levin
From: Bhawanpreet Lakha [ Upstream commit c2850c125d919efbb3a9ab46410d23912934f585 ] [Why] When content type property is set to 1. We should enable hdcp2.2 and if we cant then stop. Currently the way it works in DC is that if we fail hdcp2, we will try hdcp1 after. [How] Use link config to forc

RE: [PATCH] drm/amdgpu/vcn: fix gfxoff issue

2020-04-15 Thread Zhu, Changfeng
[AMD Official Use Only - Internal Distribution Only] After dropping this WA, raven2 can't enter GFXOFF, and S3 can't run successfully on Picasso and raven1. I suggest adding a chip-type check and dropping this WA only on renoir. BR, Changfeng -----Original Message----- From: Zhang, Hawking

[PATCH] drm/amdgpu/powerplay:avoid to show invalid DPM table info

2020-04-15 Thread Yuxian Dai
We should avoid showing invalid level values when the number of supported DPM_LEVELS changes. Signed-off-by: Yuxian Dai Change-Id: Ib66d0cf34a866fa6f0cedd1d5fc642f59236787d --- drivers/gpu/drm/amd/powerplay/renoir_ppt.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/pow

Re: [PATCH] Revert "drm/amdgpu: use the BAR if possible in amdgpu_device_vram_access v2"

2020-04-15 Thread Christian König
To elaborate on the PTRACE test, we PEEK 2 DWORDs inside thunk allocated mapped memory and 2 DWORDS outside that boundary (it’s only about 4MB to the boundary).  Then we POKE to swap the DWORD positions across the boundary.  The RAS event on the single failing machine happens on the out of boun

Re: [PATCH] drm/scheduler: fix drm_sched_get_cleanup_job

2020-04-15 Thread Christian König
Sorry for the holiday/vacation/COVID-19 delay. I've just pushed this patch into drm-misc-fixes. I assume it already landed in our internal branches? Thanks, Christian. On 14.04.20 at 16:33, Andrey Grodzovsky wrote: Reviewed-by: Andrey Grodzovsky Andrey On 4/14/20 10:22 AM, Kent Russell wr

RE: [PATCH] Revert "drm/amdgpu: use the BAR if possible in amdgpu_device_vram_access v2"

2020-04-15 Thread Kim, Jonathan
[AMD Public Use] Hi Christian, That could potentially be it. With additional testing, 2 of 3 Vega20 machines never hit error over BAR access with the PTRACE test. 3 of 3 machines (from the same pool) always hit error with CWSR. To elaborate on the PTRACE test, we PEEK 2 DWORDs inside thunk al

Re: [PATCH] Optimized division operation to shift operation

2020-04-15 Thread Daniel Vetter
On Wed, Apr 15, 2020 at 9:57 AM Christian König wrote: > > On 15.04.20 at 09:41, Jani Nikula wrote: > > On Tue, 14 Apr 2020, Alex Deucher wrote: > >> On Tue, Apr 14, 2020 at 9:05 AM Bernard Zhao wrote: > >>> On some processors, the / operate will call the compiler`s div lib, > >>> which is low

[PATCH v4] x86: insn: Add insn_is_fpu()

2020-04-15 Thread Masami Hiramatsu
Add insn_is_fpu(insn), which tells whether the insn touches an FPU/SSE/MMX register or is an FP coprocessor instruction. Signed-off-by: Masami Hiramatsu --- Changes in v4: - Fix to match x87-opcode pattern with opcode instead of ext(ension). --- arch/x86/include/asm/inat.h

Re: [PATCH v3] x86: insn: Add insn_is_fpu()

2020-04-15 Thread Masami Hiramatsu
On Fri, 10 Apr 2020 10:22:30 +0900 Masami Hiramatsu wrote: > @@ -318,10 +331,14 @@ function convert_operands(count,opnd, i,j,imm,mod) > if (match(opcode, rex_expr)) > flags = add_flags(flags, > "INAT_MAKE_PREFIX(INAT_PFX_REX)") > > - # chec

Re: [PATCH] Revert "drm/amdgpu: use the BAR if possible in amdgpu_device_vram_access v2"

2020-04-15 Thread Christian König
Hi Jon, Also cwsr tests fail on Vega20 with or without the revert with the same RAS error. That sounds like the system/setup has a more general problem. Could it be that we are seeing RAS errors because there really is some hardware failure, but with the MM path we don't trigger a RAS interr

Re: [PATCH] Optimized division operation to shift operation

2020-04-15 Thread Christian König
On 15.04.20 at 09:41, Jani Nikula wrote: On Tue, 14 Apr 2020, Alex Deucher wrote: On Tue, Apr 14, 2020 at 9:05 AM Bernard Zhao wrote: On some processors, the / operate will call the compiler`s div lib, which is low efficient, We can replace the / operation with shift, so that we can replace

Re: [PATCH] Optimized division operation to shift operation

2020-04-15 Thread Jani Nikula
On Tue, 14 Apr 2020, Alex Deucher wrote: > On Tue, Apr 14, 2020 at 9:05 AM Bernard Zhao wrote: >> >> On some processors, the / operate will call the compiler`s div lib, >> which is low efficient, We can replace the / operation with shift, >> so that we can replace the call of the division library
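For readers following this thread, the equivalence the patch relies on can be sketched in a few lines of C. The helper names below are hypothetical and do not come from the patch. For unsigned operands and a power-of-two divisor, division and a right shift give the same result; compilers commonly emit the shift themselves when the divisor is a compile-time constant, so writing the shift by hand mostly matters when the divisor is only known at run time.

```c
#include <stdint.h>

/* Hypothetical helpers; neither name comes from the patch under discussion. */

/* Divide by 2^log2_divisor using plain division (log2_divisor must be < 32). */
uint32_t div_pow2(uint32_t value, unsigned int log2_divisor)
{
    return value / ((uint32_t)1 << log2_divisor);
}

/* The same result computed with a right shift (log2_divisor must be < 32). */
uint32_t shr_pow2(uint32_t value, unsigned int log2_divisor)
{
    return value >> log2_divisor;
}
```

The equivalence does not carry over to negative signed values: signed division truncates toward zero, while right-shifting a negative value is implementation-defined in C.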