Re: [PATCH] drm/amdgpu: stop disable the scheduler during HW fini

2020-06-08 Thread Nirmoy

Eh, sorry. It is merged; I was looking at the wrong branch.


Regards,

Nirmoy

On 6/8/20 7:13 PM, Nirmoy wrote:


Hi Christian,


I realized we are still missing this patch while reading the dmesg of 
https://gitlab.freedesktop.org/drm/amd/-/issues/1158 



Regards,

Nirmoy

Re: [PATCH] drm/amdgpu: stop disable the scheduler during HW fini

2020-06-08 Thread Nirmoy

Hi Christian,


I realized we are still missing this patch while reading the dmesg of 
https://gitlab.freedesktop.org/drm/amd/-/issues/1158



Regards,

Nirmoy

RE: [PATCH] drm/amdgpu: stop disable the scheduler during HW fini

2020-02-28 Thread Li, Dennis
[AMD Public Use]

Looks good to me

Tested-by: Dennis Li <dennis...@amd.com>

Best Regards
Dennis Li
Re: [PATCH] drm/amdgpu: stop disable the scheduler during HW fini

2020-02-27 Thread Deucher, Alexander
[AMD Public Use]

Looks good to me.
Reviewed-by: Alex Deucher 

Re: [PATCH] drm/amdgpu: stop disable the scheduler during HW fini

2020-02-27 Thread Christian König

Alex any comment on this?

Am 25.02.20 um 14:16 schrieb Nirmoy:

Acked-by: Nirmoy Das 

Re: [PATCH] drm/amdgpu: stop disable the scheduler during HW fini

2020-02-25 Thread Nirmoy



On 2/25/20 2:16 PM, Nirmoy wrote:

Acked-by: Nirmoy Das 

Please change that to Tested-by: Nirmoy Das 
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: stop disable the scheduler during HW fini

2020-02-25 Thread Nirmoy

Hi Dennis,

Can you please test this patch on vega20 for SWDEV-223117?


Regards,

Nirmoy

On 2/25/20 2:07 PM, Christian König wrote:

When we stop the HW for example for GPU reset we should not stop the
front-end scheduler. Otherwise we run into intermediate failures during
command submission.

The scheduler should only be stopped in very few cases:
1. We can't get the hardware working in ring or IB test after a GPU reset.
2. The KIQ scheduler is not used in the front-end and should be disabled during 
GPU reset.
3. In amdgpu_ring_fini() when the driver unloads.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/cik_sdma.c  |  2 --
  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c |  8 
  drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c  |  5 -
  drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c  | 25 +
  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  |  7 ---
  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  |  9 -
  drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c |  3 ---
  drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c |  2 --
  drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c |  2 --
  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c |  4 
  drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c |  3 ---
  drivers/gpu/drm/amd/amdgpu/si_dma.c|  1 -
  drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c  |  3 ---
  drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c  |  3 ---
  drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c  |  3 ---
  drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c  |  7 ---
  drivers/gpu/drm/amd/amdgpu/vce_v4_0.c  |  4 
  drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c  |  3 ---
  drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c  |  9 -
  drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c  | 11 +--
  20 files changed, 10 insertions(+), 104 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c 
b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
index 4274ccf765de..cb3b3a0a1348 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
@@ -320,8 +320,6 @@ static void cik_sdma_gfx_stop(struct amdgpu_device *adev)
WREG32(mmSDMA0_GFX_RB_CNTL + sdma_offsets[i], rb_cntl);
WREG32(mmSDMA0_GFX_IB_CNTL + sdma_offsets[i], 0);
}
-   sdma0->sched.ready = false;
-   sdma1->sched.ready = false;
  }
  
  /**

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 7b6158320400..36ce67ce4800 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -2391,10 +2391,6 @@ static int gfx_v10_0_cp_gfx_enable(struct amdgpu_device 
*adev, bool enable)
tmp = REG_SET_FIELD(tmp, CP_ME_CNTL, ME_HALT, enable ? 0 : 1);
tmp = REG_SET_FIELD(tmp, CP_ME_CNTL, PFP_HALT, enable ? 0 : 1);
tmp = REG_SET_FIELD(tmp, CP_ME_CNTL, CE_HALT, enable ? 0 : 1);
-   if (!enable) {
-   for (i = 0; i < adev->gfx.num_gfx_rings; i++)
-   adev->gfx.gfx_ring[i].sched.ready = false;
-   }
WREG32_SOC15(GC, 0, mmCP_ME_CNTL, tmp);
  
  	for (i = 0; i < adev->usec_timeout; i++) {

@@ -2869,16 +2865,12 @@ static int gfx_v10_0_cp_gfx_resume(struct amdgpu_device 
*adev)
  
  static void gfx_v10_0_cp_compute_enable(struct amdgpu_device *adev, bool enable)

  {
-   int i;
-
if (enable) {
WREG32_SOC15(GC, 0, mmCP_MEC_CNTL, 0);
} else {
WREG32_SOC15(GC, 0, mmCP_MEC_CNTL,
 (CP_MEC_CNTL__MEC_ME1_HALT_MASK |
  CP_MEC_CNTL__MEC_ME2_HALT_MASK));
-   for (i = 0; i < adev->gfx.num_compute_rings; i++)
-   adev->gfx.compute_ring[i].sched.ready = false;
adev->gfx.kiq.ring.sched.ready = false;
}
udelay(50);
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c
index 31f44d05e606..e462a099dbda 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c
@@ -1950,7 +1950,6 @@ static int gfx_v6_0_ring_test_ib(struct amdgpu_ring 
*ring, long timeout)
  
  static void gfx_v6_0_cp_gfx_enable(struct amdgpu_device *adev, bool enable)

  {
-   int i;
if (enable) {
WREG32(mmCP_ME_CNTL, 0);
} else {
@@ -1958,10 +1957,6 @@ static void gfx_v6_0_cp_gfx_enable(struct amdgpu_device 
*adev, bool enable)
  CP_ME_CNTL__PFP_HALT_MASK |
  CP_ME_CNTL__CE_HALT_MASK));
WREG32(mmSCRATCH_UMSK, 0);
-   for (i = 0; i < adev->gfx.num_gfx_rings; i++)
-   adev->gfx.gfx_ring[i].sched.ready = false;
-   for (i = 0; i < adev->gfx.num_compute_rings; i++)
-   adev->gfx.compute_ring[i].sched.ready = false;
}
udelay(50);
  }
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index 8f20a5dd44fe..9bc8673c83ac 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers

Re: [PATCH] drm/amdgpu: stop disable the scheduler during HW fini

2020-02-25 Thread Nirmoy

Hi Christian,


I tested with amdgpu_test, which does a GPU reset as well because of the 
deadlock_tests. The reset was fine; I could run amdgpu_test multiple times.



dmesg:

Feb 25 14:32:20 brihaspati kernel: [drm:gfx_v9_0_priv_reg_irq [amdgpu]] 
*ERROR* Illegal register access in command stream
Feb 25 14:32:20 brihaspati kernel: [drm:amdgpu_job_timedout [amdgpu]] 
*ERROR* ring gfx timeout, signaled seq=290, emitted seq=291
Feb 25 14:32:20 brihaspati kernel: [drm:amdgpu_job_timedout [amdgpu]] 
*ERROR* Process information: process amdgpu_test pid 2401 thread 
amdgpu_test pid 2401

Feb 25 14:32:20 brihaspati kernel: amdgpu :09:00.0: GPU reset begin!
Feb 25 14:32:21 brihaspati kernel: amdgpu :09:00.0: GPU BACO reset
Feb 25 14:32:21 brihaspati kernel: amdgpu :09:00.0: GPU reset 
succeeded, trying to resume
Feb 25 14:32:21 brihaspati kernel: [drm] PCIE GART of 512M enabled 
(table at 0x00F40090).

Feb 25 14:32:21 brihaspati kernel: [drm] VRAM is lost due to GPU reset!
Feb 25 14:32:21 brihaspati kernel: [drm] PSP is resuming...
Feb 25 14:32:22 brihaspati kernel: [drm] reserve 0x40 from 
0xf5fe80 for PSP TMR

Feb 25 14:32:22 brihaspati kernel: [drm] kiq ring mec 2 pipe 1 q 0
Feb 25 14:32:22 brihaspati kernel: [drm] UVD and UVD ENC initialized 
successfully.

Feb 25 14:32:22 brihaspati kernel: [drm] VCE initialized successfully.
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: ring gfx uses VM 
inv eng 0 on hub 0
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: ring comp_1.0.0 
uses VM inv eng 1 on hub 0
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: ring comp_1.1.0 
uses VM inv eng 4 on hub 0
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: ring comp_1.2.0 
uses VM inv eng 5 on hub 0
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: ring comp_1.3.0 
uses VM inv eng 6 on hub 0
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: ring comp_1.0.1 
uses VM inv eng 7 on hub 0
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: ring comp_1.1.1 
uses VM inv eng 8 on hub 0
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: ring comp_1.2.1 
uses VM inv eng 9 on hub 0
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: ring comp_1.3.1 
uses VM inv eng 10 on hub 0
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: ring kiq_2.1.0 
uses VM inv eng 11 on hub 0
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: ring sdma0 uses 
VM inv eng 0 on hub 1
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: ring page0 uses 
VM inv eng 1 on hub 1
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: ring sdma1 uses 
VM inv eng 4 on hub 1
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: ring page1 uses 
VM inv eng 5 on hub 1
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: ring uvd_0 uses 
VM inv eng 6 on hub 1
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: ring uvd_enc_0.0 
uses VM inv eng 7 on hub 1
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: ring uvd_enc_0.1 
uses VM inv eng 8 on hub 1
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: ring vce0 uses 
VM inv eng 9 on hub 1
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: ring vce1 uses 
VM inv eng 10 on hub 1
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: ring vce2 uses 
VM inv eng 11 on hub 1

Feb 25 14:32:22 brihaspati kernel: [drm] ECC is not present.
Feb 25 14:32:22 brihaspati kernel: [drm] SRAM ECC is not present.
Feb 25 14:32:22 brihaspati kernel: [drm] recover vram bo from shadow start
Feb 25 14:32:22 brihaspati kernel: [drm] recover vram bo from shadow done
Feb 25 14:32:22 brihaspati kernel: [drm] Skip scheduling IBs!
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: GPU reset(2) 
succeeded!
Feb 25 14:32:22 brihaspati kernel: gmc_v9_0_process_interrupt: 45 
callbacks suppressed
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: [gfxhub0] retry 
page fault (src_id:0 ring:0 vmid:4 pasid:32769, for process amdgpu_test 
pid 2401 thread amdgpu_test pid 2401)
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0:   in page 
starting at address 0xdeadb000 from client 27
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: 
VM_L2_PROTECTION_FAULT_STATUS:0x00440C51

Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: MORE_FAULTS: 0x1
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: WALKER_ERROR: 0x0
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: 
PERMISSION_FAULTS: 0x5

Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: MAPPING_ERROR: 0x0
Feb 25 14:32:22 brihaspati kernel: amdgpu :09:00.0: RW: 0x1

Re: [PATCH] drm/amdgpu: stop disable the scheduler during HW fini

2020-02-25 Thread Christian König

On 25.02.20 at 14:16, Nirmoy wrote:

Acked-by: Nirmoy Das 


Could you test it as well? I only did a quick round of smoke tests, but 
somebody should probably run a gpu reset test as well.


Thanks in advance,
Christian.



On 2/25/20 2:07 PM, Christian König wrote:

When we stop the HW, for example for GPU reset, we should not stop the
front-end scheduler. Otherwise we run into intermediate failures during
command submission.

The scheduler should only be stopped in very few cases:
1. We can't get the hardware working in ring or IB test after a GPU reset.
2. The KIQ scheduler is not used in the front-end and should be disabled during GPU reset.
3. In amdgpu_ring_fini() when the driver unloads.

Re: [PATCH] drm/amdgpu: stop disable the scheduler during HW fini

2020-02-25 Thread Nirmoy

Acked-by: Nirmoy Das 


[PATCH] drm/amdgpu: stop disable the scheduler during HW fini

2020-02-25 Thread Christian König
When we stop the HW, for example for GPU reset, we should not stop the
front-end scheduler. Otherwise we run into intermediate failures during
command submission.

The scheduler should only be stopped in very few cases:
1. We can't get the hardware working in ring or IB test after a GPU reset.
2. The KIQ scheduler is not used in the front-end and should be disabled during GPU reset.
3. In amdgpu_ring_fini() when the driver unloads.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/cik_sdma.c  |  2 --
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c |  8 
 drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c  |  5 -
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c  | 25 +
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c  |  7 ---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  |  9 -
 drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c |  3 ---
 drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c |  2 --
 drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c |  2 --
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c |  4 
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c |  3 ---
 drivers/gpu/drm/amd/amdgpu/si_dma.c    |  1 -
 drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c  |  3 ---
 drivers/gpu/drm/amd/amdgpu/uvd_v5_0.c  |  3 ---
 drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c  |  3 ---
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c  |  7 ---
 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c  |  4 
 drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c  |  3 ---
 drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c  |  9 -
 drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c  | 11 +--
 20 files changed, 10 insertions(+), 104 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
index 4274ccf765de..cb3b3a0a1348 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
@@ -320,8 +320,6 @@ static void cik_sdma_gfx_stop(struct amdgpu_device *adev)
WREG32(mmSDMA0_GFX_RB_CNTL + sdma_offsets[i], rb_cntl);
WREG32(mmSDMA0_GFX_IB_CNTL + sdma_offsets[i], 0);
}
-   sdma0->sched.ready = false;
-   sdma1->sched.ready = false;
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 7b6158320400..36ce67ce4800 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -2391,10 +2391,6 @@ static int gfx_v10_0_cp_gfx_enable(struct amdgpu_device *adev, bool enable)
tmp = REG_SET_FIELD(tmp, CP_ME_CNTL, ME_HALT, enable ? 0 : 1);
tmp = REG_SET_FIELD(tmp, CP_ME_CNTL, PFP_HALT, enable ? 0 : 1);
tmp = REG_SET_FIELD(tmp, CP_ME_CNTL, CE_HALT, enable ? 0 : 1);
-   if (!enable) {
-   for (i = 0; i < adev->gfx.num_gfx_rings; i++)
-   adev->gfx.gfx_ring[i].sched.ready = false;
-   }
WREG32_SOC15(GC, 0, mmCP_ME_CNTL, tmp);
 
for (i = 0; i < adev->usec_timeout; i++) {
@@ -2869,16 +2865,12 @@ static int gfx_v10_0_cp_gfx_resume(struct amdgpu_device *adev)
 
 static void gfx_v10_0_cp_compute_enable(struct amdgpu_device *adev, bool enable)
 {
-   int i;
-
if (enable) {
WREG32_SOC15(GC, 0, mmCP_MEC_CNTL, 0);
} else {
WREG32_SOC15(GC, 0, mmCP_MEC_CNTL,
 (CP_MEC_CNTL__MEC_ME1_HALT_MASK |
  CP_MEC_CNTL__MEC_ME2_HALT_MASK));
-   for (i = 0; i < adev->gfx.num_compute_rings; i++)
-   adev->gfx.compute_ring[i].sched.ready = false;
adev->gfx.kiq.ring.sched.ready = false;
}
udelay(50);
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c
index 31f44d05e606..e462a099dbda 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c
@@ -1950,7 +1950,6 @@ static int gfx_v6_0_ring_test_ib(struct amdgpu_ring *ring, long timeout)
 
 static void gfx_v6_0_cp_gfx_enable(struct amdgpu_device *adev, bool enable)
 {
-   int i;
if (enable) {
WREG32(mmCP_ME_CNTL, 0);
} else {
@@ -1958,10 +1957,6 @@ static void gfx_v6_0_cp_gfx_enable(struct amdgpu_device *adev, bool enable)
  CP_ME_CNTL__PFP_HALT_MASK |
  CP_ME_CNTL__CE_HALT_MASK));
WREG32(mmSCRATCH_UMSK, 0);
-   for (i = 0; i < adev->gfx.num_gfx_rings; i++)
-   adev->gfx.gfx_ring[i].sched.ready = false;
-   for (i = 0; i < adev->gfx.num_compute_rings; i++)
-   adev->gfx.compute_ring[i].sched.ready = false;
}
udelay(50);
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index 8f20a5dd44fe..9bc8673c83ac 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -2431,15 +2431,12 @@ static int gfx_v7_0_ring_test_ib(struct amdgpu_ring *ring, long timeout)
  */
 static void gfx_v7_0_cp_gfx_en