[PATCH] drm/amdgpu: Return -EINVAL when whole gpu reset happened

2020-12-09 Thread Liu ChengZhe
If CS init returns -ECANCELED, the UMD will free the context and create
a new one, and jobs in the new context can continue executing. In the
case of BACO or mode 1 reset we can't allow this to happen: VRAM is lost
after a whole gpu reset, so the job can't be guaranteed to succeed.

Signed-off-by: Liu ChengZhe 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 85e48c29a57c..2a98f58134ed 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -120,6 +120,7 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser 
*p, union drm_amdgpu_cs
uint64_t *chunk_array;
unsigned size, num_ibs = 0;
uint32_t uf_offset = 0;
+   uint32_t vramlost_count = 0;
int i;
int ret;
 
@@ -140,7 +141,11 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser 
*p, union drm_amdgpu_cs
 
/* skip guilty context job */
if (atomic_read(&p->ctx->guilty) == 1) {
-   ret = -ECANCELED;
+   vramlost_count = atomic_read(&p->adev->vram_lost_counter);
+   if (p->ctx->vram_lost_counter != vramlost_count)
+   ret = -EINVAL;
+   else
+   ret = -ECANCELED;
goto free_chunk;
}
 
@@ -246,7 +251,7 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser 
*p, union drm_amdgpu_cs
goto free_all_kdata;
 
if (p->ctx->vram_lost_counter != p->job->vram_lost_counter) {
-   ret = -ECANCELED;
+   ret = -EINVAL;
goto free_all_kdata;
}
 
-- 
2.25.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
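
For readers following the first hunk, the error selection can be modelled in
user space roughly as below. This is only a sketch under stated assumptions:
struct ctx_model and cs_check_ctx() are hypothetical names that do not exist in
the driver, and the second hunk (context counter vs. job counter) is not
modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-context state the patch looks at. */
struct ctx_model {
	bool guilty;                /* context was blamed for a GPU hang */
	uint32_t vram_lost_counter; /* device counter value saved in the context */
};

/*
 * Returns 0 when the submission may proceed, otherwise a negative errno:
 * -EINVAL if VRAM was lost since the context saved its counter,
 * -ECANCELED if the context is guilty but VRAM contents survived.
 */
static int cs_check_ctx(const struct ctx_model *ctx,
			uint32_t dev_vram_lost_counter)
{
	assert(ctx != NULL);

	if (!ctx->guilty)
		return 0;
	if (ctx->vram_lost_counter != dev_vram_lost_counter)
		return -EINVAL;
	return -ECANCELED;
}
```

With this split, a caller that only recreates its context on -ECANCELED no
longer does so after a whole gpu reset, which is the behaviour the commit
message asks for.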


[PATCH] drm/amdgpu: Do gpu recovery when no job is running

2020-09-09 Thread Liu ChengZhe
In function flr_work, we should do gpu recovery when no job
is running. Fix the logic by inverting it.

v2: modify the description

Signed-off-by: Liu ChengZhe 
---
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c | 3 ++-
 drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c 
b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
index 9c07014d9bd6..f5ce9a9f4cf5 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
@@ -262,7 +262,8 @@ static void xgpu_ai_mailbox_flr_work(struct work_struct 
*work)
 
/* Trigger recovery for world switch failure if no TDR */
if (amdgpu_device_should_recover_gpu(adev)
-   && (amdgpu_device_has_job_running(adev) || adev->sdma_timeout 
== MAX_SCHEDULE_TIMEOUT))
+   && (!amdgpu_device_has_job_running(adev) ||
+   adev->sdma_timeout == MAX_SCHEDULE_TIMEOUT))
amdgpu_device_gpu_recover(adev, NULL);
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c 
b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
index 9c23abf9b140..666ed99cc14b 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
@@ -283,7 +283,7 @@ static void xgpu_nv_mailbox_flr_work(struct work_struct 
*work)
 
/* Trigger recovery for world switch failure if no TDR */
if (amdgpu_device_should_recover_gpu(adev)
-   && (amdgpu_device_has_job_running(adev) ||
+   && (!amdgpu_device_has_job_running(adev) ||
adev->sdma_timeout == MAX_SCHEDULE_TIMEOUT ||
adev->gfx_timeout == MAX_SCHEDULE_TIMEOUT ||
adev->compute_timeout == MAX_SCHEDULE_TIMEOUT ||
-- 
2.25.1
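
The inverted condition can be read as a small predicate. The sketch below is a
user-space model only: struct flr_state and flr_needs_recovery() are
hypothetical names, and timeout_disabled stands in for the
"*_timeout == MAX_SCHEDULE_TIMEOUT" comparisons in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the state tested in the flr_work handlers. */
struct flr_state {
	bool should_recover;   /* models amdgpu_device_should_recover_gpu() */
	bool job_running;      /* models amdgpu_device_has_job_running() */
	bool timeout_disabled; /* models a timeout == MAX_SCHEDULE_TIMEOUT */
};

/* True when the FLR work item itself should trigger gpu recovery. */
static bool flr_needs_recovery(const struct flr_state *s)
{
	assert(s != NULL);

	return s->should_recover &&
	       (!s->job_running || s->timeout_disabled);
}
```

When a job is running and the timeout is finite, the predicate is false and
recovery is left to the job timeout path.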



[PATCH] drm/amdgpu: Do gpu recovery when no job is running

2020-09-09 Thread Liu ChengZhe
In function flr_work, do gpu recovery when no job is running
instead of when some job is running, because if there is a job
in the list, amdgpu_job_timedout will do the gpu recovery.

Signed-off-by: Liu ChengZhe 
---
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c | 3 ++-
 drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c 
b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
index 9c07014d9bd6..f5ce9a9f4cf5 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
@@ -262,7 +262,8 @@ static void xgpu_ai_mailbox_flr_work(struct work_struct 
*work)
 
/* Trigger recovery for world switch failure if no TDR */
if (amdgpu_device_should_recover_gpu(adev)
-   && (amdgpu_device_has_job_running(adev) || adev->sdma_timeout 
== MAX_SCHEDULE_TIMEOUT))
+   && (!amdgpu_device_has_job_running(adev) ||
+   adev->sdma_timeout == MAX_SCHEDULE_TIMEOUT))
amdgpu_device_gpu_recover(adev, NULL);
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c 
b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
index 9c23abf9b140..666ed99cc14b 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
@@ -283,7 +283,7 @@ static void xgpu_nv_mailbox_flr_work(struct work_struct 
*work)
 
/* Trigger recovery for world switch failure if no TDR */
if (amdgpu_device_should_recover_gpu(adev)
-   && (amdgpu_device_has_job_running(adev) ||
+   && (!amdgpu_device_has_job_running(adev) ||
adev->sdma_timeout == MAX_SCHEDULE_TIMEOUT ||
adev->gfx_timeout == MAX_SCHEDULE_TIMEOUT ||
adev->compute_timeout == MAX_SCHEDULE_TIMEOUT ||
-- 
2.25.1



[PATCH] drm/amdgpu: Skip some registers config for SRIOV

2020-08-07 Thread Liu ChengZhe
Some registers are not accessible to virtual function setup, so
skip their initialization when in VF-SRIOV mode.

v2: move SRIOV VF check into specific functions;
modify commit description and comment.

Signed-off-by: Liu ChengZhe 
---
 drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c | 19 +++
 drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c  | 19 +++
 2 files changed, 38 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c 
b/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c
index 1f6112b7fa49..80c906a0383f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c
@@ -182,6 +182,12 @@ static void gfxhub_v2_1_init_cache_regs(struct 
amdgpu_device *adev)
 {
uint32_t tmp;
 
+   /* These registers are not accessible to VF-SRIOV.
+* The PF will program them instead.
+*/
+   if (amdgpu_sriov_vf(adev))
+   return;
+
/* Setup L2 cache */
tmp = RREG32_SOC15(GC, 0, mmGCVM_L2_CNTL);
tmp = REG_SET_FIELD(tmp, GCVM_L2_CNTL, ENABLE_L2_CACHE, 1);
@@ -237,6 +243,12 @@ static void gfxhub_v2_1_enable_system_domain(struct 
amdgpu_device *adev)
 
 static void gfxhub_v2_1_disable_identity_aperture(struct amdgpu_device *adev)
 {
+   /* These registers are not accessible to VF-SRIOV.
+* The PF will program them instead.
+*/
+   if (amdgpu_sriov_vf(adev))
+   return;
+
WREG32_SOC15(GC, 0, mmGCVM_L2_CONTEXT1_IDENTITY_APERTURE_LOW_ADDR_LO32,
 0x);
WREG32_SOC15(GC, 0, mmGCVM_L2_CONTEXT1_IDENTITY_APERTURE_LOW_ADDR_HI32,
@@ -373,6 +385,13 @@ void gfxhub_v2_1_set_fault_enable_default(struct 
amdgpu_device *adev,
  bool value)
 {
u32 tmp;
+
+   /* These registers are not accessible to VF-SRIOV.
+* The PF will program them instead.
+*/
+   if (amdgpu_sriov_vf(adev))
+   return;
+
tmp = RREG32_SOC15(GC, 0, mmGCVM_L2_PROTECTION_FAULT_CNTL);
tmp = REG_SET_FIELD(tmp, GCVM_L2_PROTECTION_FAULT_CNTL,
RANGE_PROTECTION_FAULT_ENABLE_DEFAULT, value);
diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c 
b/drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c
index d83912901f73..8acb3b625afe 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c
@@ -181,6 +181,12 @@ static void mmhub_v2_0_init_cache_regs(struct 
amdgpu_device *adev)
 {
uint32_t tmp;
 
+   /* These registers are not accessible to VF-SRIOV.
+* The PF will program them instead.
+*/
+   if (amdgpu_sriov_vf(adev))
+   return;
+
/* Setup L2 cache */
tmp = RREG32_SOC15(MMHUB, 0, mmMMVM_L2_CNTL);
tmp = REG_SET_FIELD(tmp, MMVM_L2_CNTL, ENABLE_L2_CACHE, 1);
@@ -236,6 +242,12 @@ static void mmhub_v2_0_enable_system_domain(struct 
amdgpu_device *adev)
 
 static void mmhub_v2_0_disable_identity_aperture(struct amdgpu_device *adev)
 {
+   /* These registers are not accessible to VF-SRIOV.
+* The PF will program them instead.
+*/
+   if (amdgpu_sriov_vf(adev))
+   return;
+
WREG32_SOC15(MMHUB, 0,
 mmMMVM_L2_CONTEXT1_IDENTITY_APERTURE_LOW_ADDR_LO32,
 0x);
@@ -365,6 +377,13 @@ void mmhub_v2_0_gart_disable(struct amdgpu_device *adev)
 void mmhub_v2_0_set_fault_enable_default(struct amdgpu_device *adev, bool 
value)
 {
u32 tmp;
+
+   /* These registers are not accessible to VF-SRIOV.
+* The PF will program them instead.
+*/
+   if (amdgpu_sriov_vf(adev))
+   return;
+
tmp = RREG32_SOC15(MMHUB, 0, mmMMVM_L2_PROTECTION_FAULT_CNTL);
tmp = REG_SET_FIELD(tmp, MMVM_L2_PROTECTION_FAULT_CNTL,
RANGE_PROTECTION_FAULT_ENABLE_DEFAULT, value);
-- 
2.25.1
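
The v2 approach puts the guard inside each register-programming helper rather
than at the call sites. A minimal user-space sketch of that pattern follows;
struct dev_model and init_cache_regs() are hypothetical, and the register is
modelled as a plain field rather than an MMIO access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device model: one register plus a write counter. */
struct dev_model {
	bool is_sriov_vf;    /* models amdgpu_sriov_vf(adev) */
	uint32_t l2_cntl;    /* stands in for an L2_CNTL style register */
	unsigned int writes; /* number of register writes performed */
};

/* Programs the register unless running as a VF, where the PF owns it. */
static void init_cache_regs(struct dev_model *dev)
{
	assert(dev != NULL);

	/* Not accessible to a VF: return before touching the register. */
	if (dev->is_sriov_vf)
		return;

	dev->l2_cntl |= 1u; /* models setting ENABLE_L2_CACHE */
	dev->writes++;
}
```

Keeping the check in the callee means any future caller gets the same
behaviour without repeating the condition.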



[PATCH] drm/amdgpu: Skip some registers config for SRIOV

2020-08-06 Thread Liu ChengZhe
For VF, the registers L2_CNTL, L2_CONTEXT1_IDENTITY_APERTURE and
L2_PROTECTION_FAULT_CNTL are not accessible, so skip their
configuration in SRIOV mode.

Signed-off-by: Liu ChengZhe 
---
 drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c | 12 ++--
 drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c  | 12 ++--
 2 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c 
b/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c
index 1f6112b7fa49..6b96f45fde2a 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v2_1.c
@@ -330,10 +330,13 @@ int gfxhub_v2_1_gart_enable(struct amdgpu_device *adev)
gfxhub_v2_1_init_gart_aperture_regs(adev);
gfxhub_v2_1_init_system_aperture_regs(adev);
gfxhub_v2_1_init_tlb_regs(adev);
-   gfxhub_v2_1_init_cache_regs(adev);
+   if (!amdgpu_sriov_vf(adev))
+   gfxhub_v2_1_init_cache_regs(adev);
 
gfxhub_v2_1_enable_system_domain(adev);
-   gfxhub_v2_1_disable_identity_aperture(adev);
+   if (!amdgpu_sriov_vf(adev))
+   gfxhub_v2_1_disable_identity_aperture(adev);
+
gfxhub_v2_1_setup_vmid_config(adev);
gfxhub_v2_1_program_invalidation(adev);
 
@@ -372,7 +375,12 @@ void gfxhub_v2_1_gart_disable(struct amdgpu_device *adev)
 void gfxhub_v2_1_set_fault_enable_default(struct amdgpu_device *adev,
  bool value)
 {
+   /*These regs are not accessible for VF, PF will program in SRIOV */
+   if (amdgpu_sriov_vf(adev))
+   return;
+
u32 tmp;
+
tmp = RREG32_SOC15(GC, 0, mmGCVM_L2_PROTECTION_FAULT_CNTL);
tmp = REG_SET_FIELD(tmp, GCVM_L2_PROTECTION_FAULT_CNTL,
RANGE_PROTECTION_FAULT_ENABLE_DEFAULT, value);
diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c 
b/drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c
index d83912901f73..9cfde9b81600 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c
@@ -321,10 +321,13 @@ int mmhub_v2_0_gart_enable(struct amdgpu_device *adev)
mmhub_v2_0_init_gart_aperture_regs(adev);
mmhub_v2_0_init_system_aperture_regs(adev);
mmhub_v2_0_init_tlb_regs(adev);
-   mmhub_v2_0_init_cache_regs(adev);
+   if (!amdgpu_sriov_vf(adev))
+   mmhub_v2_0_init_cache_regs(adev);
 
mmhub_v2_0_enable_system_domain(adev);
-   mmhub_v2_0_disable_identity_aperture(adev);
+   if (!amdgpu_sriov_vf(adev))
+   mmhub_v2_0_disable_identity_aperture(adev);
+
mmhub_v2_0_setup_vmid_config(adev);
mmhub_v2_0_program_invalidation(adev);
 
@@ -364,7 +367,12 @@ void mmhub_v2_0_gart_disable(struct amdgpu_device *adev)
  */
 void mmhub_v2_0_set_fault_enable_default(struct amdgpu_device *adev, bool 
value)
 {
+   /*These regs are not accessible for VF, PF will program in SRIOV */
+   if (amdgpu_sriov_vf(adev))
+   return;
+
u32 tmp;
+
tmp = RREG32_SOC15(MMHUB, 0, mmMMVM_L2_PROTECTION_FAULT_CNTL);
tmp = REG_SET_FIELD(tmp, MMVM_L2_PROTECTION_FAULT_CNTL,
RANGE_PROTECTION_FAULT_ENABLE_DEFAULT, value);
-- 
2.25.1



[PATCH] drm/amdgpu: fix PSP autoload twice in FLR

2020-07-29 Thread Liu ChengZhe
Assigning false to block->status.hw inside the init loop overwrites
PSP's previous hardware status, which causes the PSP to execute the
resume operation after hardware init.

Move this assignment out of the loop, clearing every block's status once
up front, and let the PSP execute the resume operation only when it is
told to.

v2: Remove the braces.
v3: Modify the description.

Signed-off-by: Liu ChengZhe 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 62ecac97fbd2..5d9affa1d35a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2574,6 +2574,9 @@ static int amdgpu_device_ip_reinit_early_sriov(struct 
amdgpu_device *adev)
AMD_IP_BLOCK_TYPE_IH,
};
 
+   for (i = 0; i < adev->num_ip_blocks; i++)
+   adev->ip_blocks[i].status.hw = false;
+
for (i = 0; i < ARRAY_SIZE(ip_order); i++) {
int j;
struct amdgpu_ip_block *block;
@@ -2581,7 +2584,6 @@ static int amdgpu_device_ip_reinit_early_sriov(struct 
amdgpu_device *adev)
for (j = 0; j < adev->num_ip_blocks; j++) {
block = &adev->ip_blocks[j];
 
-   block->status.hw = false;
if (block->version->type != ip_order[i] ||
!block->status.valid)
continue;
-- 
2.25.1
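
The effect of moving the assignment can be reproduced with a small model. This
is a sketch under assumptions: the block types, struct blk and both functions
are hypothetical, and "hw = true" stands in for a successful hw_init of the
matching block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum blk_type { BLK_COMMON, BLK_PSP, BLK_IH };

struct blk {
	enum blk_type type;
	bool hw; /* models block->status.hw */
};

/* Old flow: status is cleared inside the nested loop on every pass. */
static void reinit_old(struct blk *b, size_t n,
		       const enum blk_type *order, size_t n_order)
{
	assert(b != NULL && order != NULL);

	for (size_t i = 0; i < n_order; i++) {
		for (size_t j = 0; j < n; j++) {
			b[j].hw = false; /* clobbers blocks initialised earlier */
			if (b[j].type != order[i])
				continue;
			b[j].hw = true;
		}
	}
}

/* New flow: clear every status once, then initialise in order. */
static void reinit_new(struct blk *b, size_t n,
		       const enum blk_type *order, size_t n_order)
{
	assert(b != NULL && order != NULL);

	for (size_t j = 0; j < n; j++)
		b[j].hw = false;

	for (size_t i = 0; i < n_order; i++) {
		for (size_t j = 0; j < n; j++) {
			if (b[j].type != order[i])
				continue;
			b[j].hw = true;
		}
	}
}
```

In the old flow only the block handled in the last pass keeps hw set, so an
earlier block such as the PSP looks uninitialised to later code.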



[PATCH 1/2] drm amdgpu: Skip tmr load for SRIOV

2020-07-29 Thread Liu ChengZhe
1. For Navi12, CHIP_SIENNA_CICHLID, skip tmr load operation;
2. Check pointer before release firmware.

v2: use CHIP_SIENNA_CICHLID instead
v3: remove local "bool ret"; fix grammar issue
v4: use my name instead of "root"
v5: fix grammar issue and indent issue

Signed-off-by: Liu ChengZhe 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 35 -
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index a053b7af0680..c68369731b20 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -193,12 +193,18 @@ static int psp_sw_fini(void *handle)
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
psp_memory_training_fini(&adev->psp);
-   release_firmware(adev->psp.sos_fw);
-   adev->psp.sos_fw = NULL;
-   release_firmware(adev->psp.asd_fw);
-   adev->psp.asd_fw = NULL;
-   release_firmware(adev->psp.ta_fw);
-   adev->psp.ta_fw = NULL;
+   if (adev->psp.sos_fw) {
+   release_firmware(adev->psp.sos_fw);
+   adev->psp.sos_fw = NULL;
+   }
+   if (adev->psp.asd_fw) {
+   release_firmware(adev->psp.asd_fw);
+   adev->psp.asd_fw = NULL;
+   }
+   if (adev->psp.ta_fw) {
+   release_firmware(adev->psp.ta_fw);
+   adev->psp.ta_fw = NULL;
+   }
 
if (adev->asic_type == CHIP_NAVI10)
psp_sysfs_fini(adev);
@@ -409,11 +415,28 @@ static int psp_clear_vf_fw(struct psp_context *psp)
return ret;
 }
 
+static bool psp_skip_tmr(struct psp_context *psp)
+{
+   switch (psp->adev->asic_type) {
+   case CHIP_NAVI12:
+   case CHIP_SIENNA_CICHLID:
+   return true;
+   default:
+   return false;
+   }
+}
+
 static int psp_tmr_load(struct psp_context *psp)
 {
int ret;
struct psp_gfx_cmd_resp *cmd;
 
+   /* For Navi12 and CHIP_SIENNA_CICHLID SRIOV, do not set up TMR.
+* Already set up by host driver.
+*/
+   if (amdgpu_sriov_vf(psp->adev) && psp_skip_tmr(psp))
+   return 0;
+
cmd = kzalloc(sizeof(struct psp_gfx_cmd_resp), GFP_KERNEL);
if (!cmd)
return -ENOMEM;
-- 
2.25.1
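
The skip decision added to psp_tmr_load() combines two conditions: the device
is an SRIOV VF and the ASIC is on the skip list. A user-space sketch of that
decision is shown below; enum asic_model, struct psp_model and
tmr_load_is_skipped() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of ASIC types. */
enum asic_model { ASIC_NAVI10, ASIC_NAVI12, ASIC_SIENNA_CICHLID };

struct psp_model {
	enum asic_model asic_type;
	bool is_sriov_vf; /* models amdgpu_sriov_vf(psp->adev) */
};

/* Mirrors the shape of psp_skip_tmr(): a per-ASIC yes/no answer. */
static bool skip_tmr(enum asic_model asic)
{
	switch (asic) {
	case ASIC_NAVI12:
	case ASIC_SIENNA_CICHLID:
		return true;
	default:
		return false;
	}
}

/* True when the guest should leave TMR setup to the host driver. */
static bool tmr_load_is_skipped(const struct psp_model *psp)
{
	assert(psp != NULL);

	return psp->is_sriov_vf && skip_tmr(psp->asic_type);
}
```

Returning directly from each case is the simplification the v3 changelog
refers to when it drops the local "bool ret".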



[PATCH 1/2] drm amdgpu: Skip tmr load for SRIOV

2020-07-27 Thread Liu ChengZhe
1. For Navi12, CHIP_SIENNA_CICHLID, skip tmr load operation;
2. Check pointer before release firmware.

v2: use CHIP_SIENNA_CICHLID instead
v3: remove local "bool ret"; fix grammar issue
v4: use my name instead of "root"

Signed-off-by: Liu ChengZhe 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 35 -
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index a053b7af0680..7f18286a0cc2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -193,12 +193,18 @@ static int psp_sw_fini(void *handle)
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
psp_memory_training_fini(&adev->psp);
-   release_firmware(adev->psp.sos_fw);
-   adev->psp.sos_fw = NULL;
-   release_firmware(adev->psp.asd_fw);
-   adev->psp.asd_fw = NULL;
-   release_firmware(adev->psp.ta_fw);
-   adev->psp.ta_fw = NULL;
+   if (adev->psp.sos_fw) {
+   release_firmware(adev->psp.sos_fw);
+   adev->psp.sos_fw = NULL;
+   }
+   if (adev->psp.asd_fw) {
+   release_firmware(adev->psp.asd_fw);
+   adev->psp.asd_fw = NULL;
+   }
+   if (adev->psp.ta_fw) {
+   release_firmware(adev->psp.ta_fw);
+   adev->psp.ta_fw = NULL;
+   }
 
if (adev->asic_type == CHIP_NAVI10)
psp_sysfs_fini(adev);
@@ -409,11 +415,28 @@ static int psp_clear_vf_fw(struct psp_context *psp)
return ret;
 }
 
+static bool psp_skip_tmr(struct psp_context *psp)
+{
+   switch (psp->adev->asic_type) {
+   case CHIP_NAVI12:
+   case CHIP_SIENNA_CICHLID:
+   return true;
+   default:
+   return false;
+   }
+}
+
 static int psp_tmr_load(struct psp_context *psp)
 {
int ret;
struct psp_gfx_cmd_resp *cmd;
 
+   /* for Navi12 and CHIP_SIENNA_CICHLID SRIOV, do not set up TMR
+* (already setup by host driver)
+*/
+   if (amdgpu_sriov_vf(psp->adev) && psp_skip_tmr(psp))
+   return 0;
+
cmd = kzalloc(sizeof(struct psp_gfx_cmd_resp), GFP_KERNEL);
if (!cmd)
return -ENOMEM;
-- 
2.25.1



[PATCH 1/2] drm amdgpu: Skip tmr load for SRIOV

2020-07-27 Thread Liu ChengZhe
From: root 

1. For Navi12, CHIP_SIENNA_CICHLID, skip tmr load operation;
2. Check pointer before release firmware.

v2: use CHIP_SIENNA_CICHLID instead
v3: remove local "bool ret"; fix grammar issue
Signed-off-by: root 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 35 -
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index a053b7af0680..7f18286a0cc2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -193,12 +193,18 @@ static int psp_sw_fini(void *handle)
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
psp_memory_training_fini(&adev->psp);
-   release_firmware(adev->psp.sos_fw);
-   adev->psp.sos_fw = NULL;
-   release_firmware(adev->psp.asd_fw);
-   adev->psp.asd_fw = NULL;
-   release_firmware(adev->psp.ta_fw);
-   adev->psp.ta_fw = NULL;
+   if (adev->psp.sos_fw) {
+   release_firmware(adev->psp.sos_fw);
+   adev->psp.sos_fw = NULL;
+   }
+   if (adev->psp.asd_fw) {
+   release_firmware(adev->psp.asd_fw);
+   adev->psp.asd_fw = NULL;
+   }
+   if (adev->psp.ta_fw) {
+   release_firmware(adev->psp.ta_fw);
+   adev->psp.ta_fw = NULL;
+   }
 
if (adev->asic_type == CHIP_NAVI10)
psp_sysfs_fini(adev);
@@ -409,11 +415,28 @@ static int psp_clear_vf_fw(struct psp_context *psp)
return ret;
 }
 
+static bool psp_skip_tmr(struct psp_context *psp)
+{
+   switch (psp->adev->asic_type) {
+   case CHIP_NAVI12:
+   case CHIP_SIENNA_CICHLID:
+   return true;
+   default:
+   return false;
+   }
+}
+
 static int psp_tmr_load(struct psp_context *psp)
 {
int ret;
struct psp_gfx_cmd_resp *cmd;
 
+   /* for Navi12 and CHIP_SIENNA_CICHLID SRIOV, do not set up TMR
+* (already setup by host driver)
+*/
+   if (amdgpu_sriov_vf(psp->adev) && psp_skip_tmr(psp))
+   return 0;
+
cmd = kzalloc(sizeof(struct psp_gfx_cmd_resp), GFP_KERNEL);
if (!cmd)
return -ENOMEM;
-- 
2.25.1



[PATCH] drm/amdgpu: fix PSP autoload twice in FLR

2020-07-27 Thread Liu ChengZhe
The block->status.hw = false assignment will overwrite PSP's previous
hw status, which will cause PSP to execute the resume operation after
hw init.

v2: remove the braces

Signed-off-by: Liu ChengZhe 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 62ecac97fbd2..5d9affa1d35a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2574,6 +2574,9 @@ static int amdgpu_device_ip_reinit_early_sriov(struct 
amdgpu_device *adev)
AMD_IP_BLOCK_TYPE_IH,
};
 
+   for (i = 0; i < adev->num_ip_blocks; i++)
+   adev->ip_blocks[i].status.hw = false;
+
for (i = 0; i < ARRAY_SIZE(ip_order); i++) {
int j;
struct amdgpu_ip_block *block;
@@ -2581,7 +2584,6 @@ static int amdgpu_device_ip_reinit_early_sriov(struct 
amdgpu_device *adev)
for (j = 0; j < adev->num_ip_blocks; j++) {
block = &adev->ip_blocks[j];
 
-   block->status.hw = false;
if (block->version->type != ip_order[i] ||
!block->status.valid)
continue;
-- 
2.25.1



[PATCH] drm/amdgpu: fix PSP autoload twice in FLR

2020-07-27 Thread Liu ChengZhe
The block->status.hw = false assignment will overwrite PSP's previous
hw status, which will cause PSP to execute the resume operation after
hw init.

Signed-off-by: Liu ChengZhe 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 62ecac97fbd2..88c681957d39 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2574,6 +2574,10 @@ static int amdgpu_device_ip_reinit_early_sriov(struct 
amdgpu_device *adev)
AMD_IP_BLOCK_TYPE_IH,
};
 
+   for (i = 0; i < adev->num_ip_blocks; i++) {
+   adev->ip_blocks[i].status.hw = false;
+   }
+
for (i = 0; i < ARRAY_SIZE(ip_order); i++) {
int j;
struct amdgpu_ip_block *block;
@@ -2581,7 +2585,6 @@ static int amdgpu_device_ip_reinit_early_sriov(struct 
amdgpu_device *adev)
for (j = 0; j < adev->num_ip_blocks; j++) {
block = &adev->ip_blocks[j];
 
-   block->status.hw = false;
if (block->version->type != ip_order[i] ||
!block->status.valid)
continue;
-- 
2.25.1



[PATCH 1/2] drm amdgpu: Skip tmr load for SRIOV

2020-07-27 Thread Liu ChengZhe
From: root 

1. For Navi12, CHIP_SIENNA_CICHLID, skip tmr load operation;
2. Check pointer before release firmware.

Signed-off-by: root 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 40 +
 1 file changed, 34 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index a053b7af0680..a9481e112cb3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -193,12 +193,18 @@ static int psp_sw_fini(void *handle)
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
psp_memory_training_fini(&adev->psp);
-   release_firmware(adev->psp.sos_fw);
-   adev->psp.sos_fw = NULL;
-   release_firmware(adev->psp.asd_fw);
-   adev->psp.asd_fw = NULL;
-   release_firmware(adev->psp.ta_fw);
-   adev->psp.ta_fw = NULL;
+   if (adev->psp.sos_fw) {
+   release_firmware(adev->psp.sos_fw);
+   adev->psp.sos_fw = NULL;
+   }
+   if (adev->psp.asd_fw) {
+   release_firmware(adev->psp.asd_fw);
+   adev->psp.asd_fw = NULL;
+   }
+   if (adev->psp.ta_fw) {
+   release_firmware(adev->psp.ta_fw);
+   adev->psp.ta_fw = NULL;
+   }
 
if (adev->asic_type == CHIP_NAVI10)
psp_sysfs_fini(adev);
@@ -409,11 +415,33 @@ static int psp_clear_vf_fw(struct psp_context *psp)
return ret;
 }
 
+static bool psp_skip_tmr(struct psp_context *psp)
+{
+   bool ret = false;
+
+   switch (psp->adev->asic_type) {
+   case CHIP_NAVI12:
+   case CHIP_SIENNA_CICHLID:
+   ret = true;
+   break;
+   default:
+   return false;
+   }
+
+   return ret;
+}
+
 static int psp_tmr_load(struct psp_context *psp)
 {
int ret;
struct psp_gfx_cmd_resp *cmd;
 
+   /* for Navi12 and CHIP_SIENNA_CICHLID SRIOV, do not setup TMR
+* (already setup by host driver)
+*/
+   if (amdgpu_sriov_vf(psp->adev) && psp_skip_tmr(psp))
+   return 0;
+
cmd = kzalloc(sizeof(struct psp_gfx_cmd_resp), GFP_KERNEL);
if (!cmd)
return -ENOMEM;
-- 
2.25.1



[PATCH] drm amdgpu: Skip tmr load for SRIOV

2020-07-27 Thread Liu ChengZhe
From: root 

1. For Navi12, Navi21, skip tmr load operation;
2. Check pointer before release firmware.

Signed-off-by: root 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 40 +
 1 file changed, 34 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index a053b7af0680..b0717b16b5d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -193,12 +193,18 @@ static int psp_sw_fini(void *handle)
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
psp_memory_training_fini(&adev->psp);
-   release_firmware(adev->psp.sos_fw);
-   adev->psp.sos_fw = NULL;
-   release_firmware(adev->psp.asd_fw);
-   adev->psp.asd_fw = NULL;
-   release_firmware(adev->psp.ta_fw);
-   adev->psp.ta_fw = NULL;
+   if (adev->psp.sos_fw) {
+   release_firmware(adev->psp.sos_fw);
+   adev->psp.sos_fw = NULL;
+   }
+   if (adev->psp.asd_fw) {
+   release_firmware(adev->psp.asd_fw);
+   adev->psp.asd_fw = NULL;
+   }
+   if (adev->psp.ta_fw) {
+   release_firmware(adev->psp.ta_fw);
+   adev->psp.ta_fw = NULL;
+   }
 
if (adev->asic_type == CHIP_NAVI10)
psp_sysfs_fini(adev);
@@ -409,11 +415,33 @@ static int psp_clear_vf_fw(struct psp_context *psp)
return ret;
 }
 
+static bool psp_skip_tmr(struct psp_context *psp)
+{
+   bool ret = false;
+
+   switch (psp->adev->asic_type) {
+   case CHIP_NAVI12:
+   case CHIP_SIENNA_CICHLID:
+   ret = true;
+   break;
+   default:
+   return false;
+   }
+
+   return ret;
+}
+
 static int psp_tmr_load(struct psp_context *psp)
 {
int ret;
struct psp_gfx_cmd_resp *cmd;
 
+   /* for Navi12 and Navi21 SRIOV, do not setup TMR
+* (already setup by host driver)
+*/
+   if (amdgpu_sriov_vf(psp->adev) && psp_skip_tmr(psp))
+   return 0;
+
cmd = kzalloc(sizeof(struct psp_gfx_cmd_resp), GFP_KERNEL);
if (!cmd)
return -ENOMEM;
-- 
2.25.1



[PATCH] drm/amdgpu: fix PSP autoload twice in FLR

2020-07-24 Thread Liu ChengZhe
The block->status.hw = false assignment will overwrite PSP's previous
hw status, which will cause PSP to execute the resume operation after
hw init.

Signed-off-by: Liu ChengZhe 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 62ecac97fbd2..88c681957d39 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2574,6 +2574,10 @@ static int amdgpu_device_ip_reinit_early_sriov(struct 
amdgpu_device *adev)
AMD_IP_BLOCK_TYPE_IH,
};
 
+   for (i = 0; i < adev->num_ip_blocks; i++) {
+   adev->ip_blocks[i].status.hw = false;
+   }
+
for (i = 0; i < ARRAY_SIZE(ip_order); i++) {
int j;
struct amdgpu_ip_block *block;
@@ -2581,7 +2585,6 @@ static int amdgpu_device_ip_reinit_early_sriov(struct 
amdgpu_device *adev)
for (j = 0; j < adev->num_ip_blocks; j++) {
block = &adev->ip_blocks[j];
 
-   block->status.hw = false;
if (block->version->type != ip_order[i] ||
!block->status.valid)
continue;
-- 
2.25.1



[PATCH] drm/amd/amdgpu: handle return value of amdgpu_driver_load_kms

2020-06-09 Thread Liu ChengZhe
If the guest driver fails to enter full GPU access, amdgpu_driver_load_kms
unloads kms and frees dev->dev_private, so drm_dev_register would then
access a NULL pointer. The driver enters an error state and can't be unloaded.

Signed-off-by: Liu ChengZhe 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 667aad1f15c0..9c81a3d0b546 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1165,7 +1165,9 @@ static int amdgpu_pci_probe(struct pci_dev *pdev,
 
pci_set_drvdata(pdev, dev);
 
-   amdgpu_driver_load_kms(dev, ent->driver_data);
+   ret = amdgpu_driver_load_kms(dev, ent->driver_data);
+   if (ret)
+   goto err_pci;
 
 retry_init:
ret = drm_dev_register(dev, ent->driver_data);
-- 
2.25.1
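
The control flow after this change can be modelled in a few lines. The sketch
below is a user-space illustration only: probe_model() and its parameters are
hypothetical, and the "registered" flag stands in for reaching
drm_dev_register().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * load_ret models the return value of amdgpu_driver_load_kms().
 * *registered is set to 1 only if the registration step is reached.
 */
static int probe_model(int load_ret, int *registered)
{
	int ret;

	assert(registered != NULL);
	*registered = 0;

	ret = load_ret;
	if (ret)
		goto err; /* bail out before touching freed private data */

	*registered = 1; /* stands in for drm_dev_register() */
	return 0;

err:
	return ret;
}
```

Propagating the error means the registration step never runs against state
that the failed load has already torn down.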
