Hi Christian, I have talked with hw team for the reason why adding the masks. the answer is "bits 24-27 of the VCE_VCPU_CACHE_OFFSETx registers should be set to the cache window # (0 for window 0, 1 for window 1, etc.)"
Best Regards, Frank -----邮件原件----- 发件人: Min, Frank 发送时间: 2017年11月23日 12:08 收件人: Koenig, Christian <christian.koe...@amd.com>; amd-gfx@lists.freedesktop.org; Liu, Leo <leo....@amd.com> 主题: RE: [PATCH] drm/amdgpu: correct vce4.0 fw config for SRIOV (V2) Hi Leo, Would you please comments on Christian's questions? Best Regards, Frank -----Original Message----- From: Min, Frank Sent: Wednesday, November 22, 2017 4:04 PM To: Koenig, Christian <christian.koe...@amd.com>; amd-gfx@lists.freedesktop.org; Liu, Leo <leo....@amd.com> Subject: RE: [PATCH] drm/amdgpu: correct vce4.0 fw config for SRIOV (V2) Hi Christian, Thanks again for your review. And for the mask change my understanding is it is to be used for mark different part of fw (1<<24 is for stack and 2<<24 is for the data). And more detailed background would need Leo give us. Best Regards, Frank -----Original Message----- From: Koenig, Christian Sent: Wednesday, November 22, 2017 3:57 PM To: Min, Frank <frank....@amd.com>; amd-gfx@lists.freedesktop.org; Liu, Leo <leo....@amd.com> Subject: Re: [PATCH] drm/amdgpu: correct vce4.0 fw config for SRIOV (V2) Hi Frank, thanks, the patch looks much better now. > The masks using is to programming the stack and data part for vce fw. And > this part of code is borrowed from the non-sriov sequences. In this case Leo can you explain this strange masks used for the VCE_VCPU_CACHE_OFFSET* registers? > MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > mmVCE_VCPU_CACHE_OFFSET0), > - offset & 0x7FFFFFFF); > + offset & ~0x0f000000); ... > MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > mmVCE_VCPU_CACHE_OFFSET1), > - offset & 0x7FFFFFFF); > + (offset & ~0x0f000000) | (1 << 24)); Using ~0x0f000000 looks really odd here and what should the "| (1 << 24)" part be about? Thanks, Christian. Am 22.11.2017 um 06:11 schrieb Min, Frank: > Hi Christian, > Patch updated according to your suggestions. > The masks using is to programming the stack and data part for vce fw. And > this part of code is borrowed from the non-sriov sequences. > > Best Regards, > Frank > > 1. program vce 4.0 fw with 48 bit address 2. correct vce 4.0 fw stack > and date offset > > Change-Id: Ic1bc49c21d3a90c477d11162f9d6d9e2073fbbd3 > Signed-off-by: Frank Min <frank....@amd.com> > --- > drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 38 > +++++++++++++++++++++++------------ > 1 file changed, 25 insertions(+), 13 deletions(-) mode change > 100644 => 100755 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c > > diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c > b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c > old mode 100644 > new mode 100755 > index 7574554..024a1be > --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c > +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c > @@ -243,37 +243,49 @@ static int vce_v4_0_sriov_start(struct amdgpu_device > *adev) > MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > mmVCE_LMI_VM_CTRL), 0); > > if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) { > - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > mmVCE_LMI_VCPU_CACHE_40BIT_BAR0), > - > adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8); > - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > mmVCE_LMI_VCPU_CACHE_40BIT_BAR1), > - > adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8); > - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > mmVCE_LMI_VCPU_CACHE_40BIT_BAR2), > + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > + > mmVCE_LMI_VCPU_CACHE_40BIT_BAR0), > > adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8); > + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > + > mmVCE_LMI_VCPU_CACHE_64BIT_BAR0), > + > (adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 40) & > +0xff); > } else { > - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > mmVCE_LMI_VCPU_CACHE_40BIT_BAR0), > + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > + > mmVCE_LMI_VCPU_CACHE_40BIT_BAR0), > adev->vce.gpu_addr >> 8); > - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > mmVCE_LMI_VCPU_CACHE_40BIT_BAR1), > + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > + > mmVCE_LMI_VCPU_CACHE_64BIT_BAR0), > + (adev->vce.gpu_addr >> 40) & > 0xff); > + } > + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > + > mmVCE_LMI_VCPU_CACHE_40BIT_BAR1), > adev->vce.gpu_addr >> 8); > - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > mmVCE_LMI_VCPU_CACHE_40BIT_BAR2), > + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > + > mmVCE_LMI_VCPU_CACHE_64BIT_BAR1), > + (adev->vce.gpu_addr >> 40) & > 0xff); > + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > + > mmVCE_LMI_VCPU_CACHE_40BIT_BAR2), > adev->vce.gpu_addr >> 8); > - } > + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > + > mmVCE_LMI_VCPU_CACHE_64BIT_BAR2), > + (adev->vce.gpu_addr >> 40) & > 0xff); > > offset = AMDGPU_VCE_FIRMWARE_OFFSET; > size = VCE_V4_0_FW_SIZE; > MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > mmVCE_VCPU_CACHE_OFFSET0), > - offset & 0x7FFFFFFF); > + offset & ~0x0f000000); > MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > mmVCE_VCPU_CACHE_SIZE0), size); > > - offset += size; > + offset = (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) ? > offset > ++ size : 0; > size = VCE_V4_0_STACK_SIZE; > MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > mmVCE_VCPU_CACHE_OFFSET1), > - offset & 0x7FFFFFFF); > + (offset & ~0x0f000000) | (1 << 24)); > MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > mmVCE_VCPU_CACHE_SIZE1), size); > > offset += size; > size = VCE_V4_0_DATA_SIZE; > MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > mmVCE_VCPU_CACHE_OFFSET2), > - offset & 0x7FFFFFFF); > + (offset & ~0x0f000000) | (2 << 24)); > MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, > mmVCE_VCPU_CACHE_SIZE2), size); > > MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, > mmVCE_LMI_CTRL2), ~0x100, 0); > -- > 1.9.1 > > -----邮件原件----- > 发件人: Christian König [mailto:ckoenig.leichtzumer...@gmail.com] > 发送时间: 2017年11月21日 22:45 > 收件人: Min, Frank <frank....@amd.com>; amd-gfx@lists.freedesktop.org > 主题: Re: [PATCH] drm/amdgpu: correct vce4.0 fw config for SRIOV (V2) > > Am 21.11.2017 um 11:23 schrieb Frank Min: >> 1. program vce 4.0 fw with 48 bit address 2. correct vce 4.0 fw stack >> and date offset >> >> Change-Id: I835f3f52f3b29f996812a3948aabede9f2d9b056 >> Signed-off-by: Frank Min <frank....@amd.com> >> --- >> drivers/gpu/drm/amd/amdgpu/vce_v4_0.c | 97 >> ++++++++++++++++++++++------------- >> 1 file changed, 62 insertions(+), 35 deletions(-) >> mode change 100644 => 100755 drivers/gpu/drm/amd/amdgpu/vce_v4_0.c >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c >> b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c >> old mode 100644 >> new mode 100755 >> index 7574554..dc7b615 >> --- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c >> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c >> @@ -243,59 +243,86 @@ static int vce_v4_0_sriov_start(struct amdgpu_device >> *adev) >> MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> mmVCE_LMI_VM_CTRL), 0); >> >> if (adev->firmware.load_type == AMDGPU_FW_LOAD_PSP) { >> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> mmVCE_LMI_VCPU_CACHE_40BIT_BAR0), >> - >> adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8); >> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> mmVCE_LMI_VCPU_CACHE_40BIT_BAR1), >> - >> adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8); >> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> mmVCE_LMI_VCPU_CACHE_40BIT_BAR2), >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + >> mmVCE_LMI_VCPU_CACHE_40BIT_BAR0), >> >> adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 8); >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + >> mmVCE_LMI_VCPU_CACHE_64BIT_BAR0), >> + >> (adev->firmware.ucode[AMDGPU_UCODE_ID_VCE].mc_addr >> 40) & >> +0xff); >> } else { >> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> mmVCE_LMI_VCPU_CACHE_40BIT_BAR0), >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + >> mmVCE_LMI_VCPU_CACHE_40BIT_BAR0), >> adev->vce.gpu_addr >> 8); >> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> mmVCE_LMI_VCPU_CACHE_40BIT_BAR1), >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + >> mmVCE_LMI_VCPU_CACHE_64BIT_BAR0), >> + (adev->vce.gpu_addr >> 40) & >> 0xff); >> + } >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + >> mmVCE_LMI_VCPU_CACHE_40BIT_BAR1), >> adev->vce.gpu_addr >> 8); >> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> mmVCE_LMI_VCPU_CACHE_40BIT_BAR2), >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + >> mmVCE_LMI_VCPU_CACHE_64BIT_BAR1), >> + (adev->vce.gpu_addr >> 40) & >> 0xff); >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + >> mmVCE_LMI_VCPU_CACHE_40BIT_BAR2), >> adev->vce.gpu_addr >> 8); >> - } >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + >> mmVCE_LMI_VCPU_CACHE_64BIT_BAR2), >> + (adev->vce.gpu_addr >> 40) & >> 0xff); >> >> offset = AMDGPU_VCE_FIRMWARE_OFFSET; >> size = VCE_V4_0_FW_SIZE; >> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> mmVCE_VCPU_CACHE_OFFSET0), >> - offset & 0x7FFFFFFF); >> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> mmVCE_VCPU_CACHE_SIZE0), size); >> - >> - offset += size; >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_VCPU_CACHE_OFFSET0), >> + offset & ~0x0f000000); >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_VCPU_CACHE_SIZE0), size); >> + >> + offset = (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) ? >> + offset + size : 0; >> size = VCE_V4_0_STACK_SIZE; >> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> mmVCE_VCPU_CACHE_OFFSET1), >> - offset & 0x7FFFFFFF); >> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> mmVCE_VCPU_CACHE_SIZE1), size); >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_VCPU_CACHE_OFFSET1), >> + (offset & ~0x0f000000) | (1 << 24)); > That mask still looks incorrect to me. > >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_VCPU_CACHE_SIZE1), size); >> >> offset += size; >> size = VCE_V4_0_DATA_SIZE; >> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> mmVCE_VCPU_CACHE_OFFSET2), >> - offset & 0x7FFFFFFF); >> - MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> mmVCE_VCPU_CACHE_SIZE2), size); >> - >> - MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, >> mmVCE_LMI_CTRL2), ~0x100, 0); >> - MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, >> mmVCE_SYS_INT_EN), >> - >> VCE_SYS_INT_EN__VCE_SYS_INT_TRAP_INTERRUPT_EN_MASK, >> - >> VCE_SYS_INT_EN__VCE_SYS_INT_TRAP_INTERRUPT_EN_MASK); >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_VCPU_CACHE_OFFSET2), >> + (offset & ~0x0f000000) | (2 << 24)); > Dito. > >> + MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_VCPU_CACHE_SIZE2), size); >> + >> + MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_LMI_CTRL2), ~0x100, 0); >> + MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_SYS_INT_EN), >> + >> VCE_SYS_INT_EN__VCE_SYS_INT_TRAP_INTERRUPT_EN_MASK, >> + >> VCE_SYS_INT_EN__VCE_SYS_INT_TRAP_INTERRUPT_EN_MASK); >> >> /* end of MC_RESUME */ >> - MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, >> mmVCE_STATUS), >> - VCE_STATUS__JOB_BUSY_MASK, >> ~VCE_STATUS__JOB_BUSY_MASK); >> - MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, >> mmVCE_VCPU_CNTL), >> - ~0x200001, >> VCE_VCPU_CNTL__CLK_EN_MASK); >> - MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, >> mmVCE_SOFT_RESET), >> - >> ~VCE_SOFT_RESET__ECPU_SOFT_RESET_MASK, 0); >> + MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_STATUS), >> + VCE_STATUS__JOB_BUSY_MASK, >> + ~VCE_STATUS__JOB_BUSY_MASK); >> + MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_VCPU_CNTL), >> + ~0x200001, >> + VCE_VCPU_CNTL__CLK_EN_MASK); >> + MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_SOFT_RESET), >> + ~VCE_SOFT_RESET__ECPU_SOFT_RESET_MASK, >> 0); > Unrelated coding style change, please concentrate on the functional change > for this patch. > >> >> MMSCH_V1_0_INSERT_DIRECT_POLL(SOC15_REG_OFFSET(VCE, 0, >> mmVCE_STATUS), >> - >> VCE_STATUS_VCPU_REPORT_FW_LOADED_MASK, >> - >> VCE_STATUS_VCPU_REPORT_FW_LOADED_MASK); >> + VCE_STATUS_VCPU_REPORT_FW_LOADED_MASK, >> + VCE_STATUS_VCPU_REPORT_FW_LOADED_MASK); > Here the indentation is wrong. Looks like it was correct before the change. > > Regards, > Christian. > >> >> /* clear BUSY flag */ >> - MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, >> mmVCE_STATUS), >> - ~VCE_STATUS__JOB_BUSY_MASK, >> 0); >> + MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, >> + mmVCE_STATUS), >> + ~VCE_STATUS__JOB_BUSY_MASK, 0); >> >> /* add end packet */ >> memcpy((void *)init_table, &end, sizeof(struct >> mmsch_v1_0_cmd_end)); > _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx