Hi Alex
We can avoid this VM flush failure by stitching together the part that
sends the REQ and the part that reads the ACK; this is confirmed at
least for the compute ring.
I believe the same could be achieved for the SDMA ring (and even the UVD/VCE rings).
But Christian Koenig <christian.koe...@amd.com> insists that
stitching the REQ and ACK parts together is not a proper fix for the
issue, only a workaround, and I cannot argue with that.
What worries me more is that, as Alex said, there may be more registers
that behave like CC_RB_BACKEND_DISABLE.
Since we don't know their names (they are too hard to filter out), we
cannot remove them all from the SR list.
So I still think we need a plan B to handle that case, i.e. use one
package for the REQ and ACK job.
/Monk
*From:*Deucher, Alexander
*Sent:* March 8, 2018 10:53
*To:* Liu, Monk <monk....@amd.com>; Koenig, Christian
<christian.koe...@amd.com>; Mao, David <david....@amd.com>
*Cc:* amd-gfx@lists.freedesktop.org; Jin, Jian-Rong
<jian-rong....@amd.com>
*Subject:* Re: deprecated register issues
I think there are more than just CC_RB_BACKEND_DISABLE that could
cause this problem. IIRC, an entire class of gfx registers could
cause it; it just happened that this was one of the only ones we
read back via MMIO. Also, for the save and restore list, I think the
RLC uses a different interface to read back the registers, so it may
not be affected the same way.
Alex
------------------------------------------------------------------------
*From:*Liu, Monk
*Sent:* Wednesday, March 7, 2018 9:42:41 PM
*To:* Deucher, Alexander; Koenig, Christian; Mao, David
*Cc:* amd-gfx@lists.freedesktop.org; Jin, Jian-Rong
*Subject:* RE: deprecated register issues
Hi guys
According to Christian's findings, reading this register can make the VM
hub fail to finish a VM flush request. For example, SDMA does a VM
flush by first writing data to vm_invalidate_req and then reading the
result from vm_invalidate_ack, but the driver never gets the
correct value from vm_invalidate_ack if BIF is reading
CC_RB_BACKEND_DISABLE in the meantime.
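To make the request/acknowledge sequence concrete, here is a minimal sketch of a flush that writes a request and then polls for the ack with a bounded number of reads. Everything in it is a hypothetical model: `struct fake_vmhub`, `vmhub_write_req`, `vmhub_read_ack` and `vm_flush` are illustration names, not the real register interface, and the `stuck` flag only imitates the symptom described above (the ack never arriving).

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated VM hub: a hypothetical model, not the real register interface. */
struct fake_vmhub {
	uint32_t req;         /* pending request bits */
	uint32_t ack;         /* bit N set once engine N's flush has completed */
	unsigned int latency; /* reads of the ack register before the ack appears */
	bool stuck;           /* imitates the failure: the ack never arrives */
};

static void vmhub_write_req(struct fake_vmhub *hub, uint32_t engine)
{
	hub->req = 1u << engine;
	hub->ack &= ~(1u << engine);
}

static uint32_t vmhub_read_ack(struct fake_vmhub *hub)
{
	if (!hub->stuck && hub->req) {
		if (hub->latency > 0) {
			hub->latency--;
		} else {
			hub->ack |= hub->req;
			hub->req = 0;
		}
	}
	return hub->ack;
}

/* Returns 0 on success, -1 if no ack was seen within max_polls reads. */
static int vm_flush(struct fake_vmhub *hub, uint32_t engine, unsigned int max_polls)
{
	unsigned int i;

	vmhub_write_req(hub, engine);
	for (i = 0; i < max_polls; i++) {
		if (vmhub_read_ack(hub) & (1u << engine))
			return 0;
	}
	return -1;
}
```

With a bounded poll, a hub that never acknowledges shows up as an error return instead of an endless wait; the sketch assumes `engine` is below 32.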
The SR-IOV world switch may run into similar trouble; see the
save_restore_list below (during a world switch, RLCV saves the current
VF's registers according to this list and restores all of them when
loading this VF back):
uint32 register_restore[] = {
(uint32)((0x3000 << 18) | mmPA_SC_FIFO_SIZE), /* SC */
0x00000001,
(uint32)((0x3000 << 18) | mmCC_RB_BACKEND_DISABLE), /* SC SC PER_SE */
0x00000000,
(uint32)((0x3400 << 18) | mmCC_RB_BACKEND_DISABLE), /* SC SC PER_SE */
0x00000000,
(uint32)((0x3800 << 18) | mmCC_RB_BACKEND_DISABLE), /* SC SC PER_SE */
0x00000000,
(uint32)((0x3c00 << 18) | mmCC_RB_BACKEND_DISABLE), /* SC SC PER_SE */
0x00000000,
(uint32)((0x3000 << 18) | mmVGT_VTX_VECT_EJECT_REG),
0x00000001,
(uint32)((0x3000 << 18) | mmVGT_DMA_DATA_FIFO_DEPTH), /* IA WD */
0x00000001,
(uint32)((0x3000 << 18) | mmVGT_DMA_REQ_FIFO_DEPTH), /* WD */
0x00000001,
(uint32)((0x3000 << 18) | mmVGT_DRAW_INIT_FIFO_DEPTH), /* WD */
0x00000001,
(uint32)((0x3000 << 18) | mmVGT_CACHE_INVALIDATION), /* IA */
0x00000001,
(uint32)((0x3000 << 18) | mmVGT_RESET_DEBUG), /* WD */
0x00000001,
(uint32)((0x3000 << 18) | mmVGT_FIFO_DEPTHS),
I will run some tests against CC_RB_BACKEND_DISABLE to see whether
the VM flush failure can be avoided by removing those four
register entries from the SR list.
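As a rough illustration of removing entries from such a list, the sketch below compacts an array in place and drops every pair that refers to a given register. It rests on two assumptions read off the excerpt above, not on any documented format: that the list is a sequence of (header, data) word pairs, and that the low 18 bits of the header hold the register offset. `sr_list_drop_reg` and `SR_REG_OFFSET_MASK` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: the low 18 bits of each header word hold the register offset. */
#define SR_REG_OFFSET_MASK ((1u << 18) - 1)

/*
 * Compact the list in place, dropping every (header, data) pair whose
 * header refers to reg_offset. Returns the new length in 32-bit words.
 * A trailing unpaired word, if any, is dropped as well.
 */
static size_t sr_list_drop_reg(uint32_t *list, size_t nwords, uint32_t reg_offset)
{
	size_t in, out = 0;

	for (in = 0; in + 1 < nwords; in += 2) {
		if ((list[in] & SR_REG_OFFSET_MASK) == reg_offset)
			continue;
		list[out++] = list[in];
		list[out++] = list[in + 1];
	}
	return out;
}
```

Matching on the offset alone removes all four per-SE entries in one pass, whatever value sits in the upper bits of the header.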
Thanks
/Monk
*From:*Deucher, Alexander
*Sent:* March 7, 2018 23:13
*To:* Koenig, Christian <christian.koe...@amd.com>; Mao, David
<david....@amd.com>; Liu, Monk <monk....@amd.com>
*Cc:* amd-gfx@lists.freedesktop.org; Jin, Jian-Rong
<jian-rong....@amd.com>
*Subject:* Re: deprecated register issues
Right. We ran into issues with reading back that register at runtime
when UMDs queried it when other stuff was in flight, so we just read
it once at startup and cache the results. Now when UMDs request it, we
return the cached value.
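The read-once-and-cache pattern described here could look roughly like the sketch below. It is not the driver's code: `fake_mmio_read_rb_backend_disable`, `struct rb_config_cache` and the returned value are made up for illustration, and the read counter exists only to show that the query path does not touch the hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an MMIO read; counts accesses for illustration. */
static unsigned int mmio_read_count;

static uint32_t fake_mmio_read_rb_backend_disable(void)
{
	mmio_read_count++;
	return 0x00030000u; /* made-up register contents */
}

struct rb_config_cache {
	bool valid;
	uint32_t rb_backend_disable;
};

/* Called once during init, before other work is in flight. */
static void rb_config_cache_init(struct rb_config_cache *cache)
{
	cache->rb_backend_disable = fake_mmio_read_rb_backend_disable();
	cache->valid = true;
}

/* Runtime query path: serves the cached value, never reads the register. */
static bool rb_config_query(const struct rb_config_cache *cache, uint32_t *out)
{
	if (!cache->valid)
		return false;
	*out = cache->rb_backend_disable;
	return true;
}
```

However many times a client queries the value at runtime, the register is read exactly once, at init.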
Alex
------------------------------------------------------------------------
*From:*Koenig, Christian
*Sent:* Wednesday, March 7, 2018 9:31:13 AM
*To:* Mao, David; Liu, Monk
*Cc:* Deucher, Alexander; amd-gfx@lists.freedesktop.org; Jin, Jian-Rong
*Subject:* Re: deprecated register issues
Hi David,
Well, I just figured out that this is a misunderstanding.
Accessing this register and some other deprecated registers can cause
problems when invalidating VMHUBs.
This register itself isn't deprecated; the wording in a patch fixing
things is just a bit unclear.
The question is: is that register still accessed regularly, or is its
value cached after startup?
Regards,
Christian.
On 07.03.2018 at 15:25, Mao, David wrote:
We require the base driver to provide the mask of disabled RBs.
This is why the kernel reads CC_RB_BACKEND_DISABLE to collect the
harvest configuration.
Where did you learn that the register is deprecated?
I think it should still be there.
Best Regards,
David
On Mar 7, 2018, at 9:49 PM, Liu, Monk <monk....@amd.com> wrote:
+ UMD guys
Hi David
Do you know if GC_USER_RB_BACKEND_DISABLE still exists for
gfx9/vega10?
We found that CC_RB_BACKEND_DISABLE was deprecated, but it looks like
it is still in use in the KMD, so
I want to check both of the above registers with you.
Thanks
/Monk
*From:* amd-gfx
[mailto:amd-gfx-boun...@lists.freedesktop.org] *On Behalf
Of* Christian König
*Sent:* March 7, 2018 20:26
*To:* Liu, Monk <monk....@amd.com>;
Deucher, Alexander <alexander.deuc...@amd.com>
*Cc:* amd-gfx@lists.freedesktop.org
*Subject:* Re: deprecated register issues
Hi Monk,
I honestly don't have the slightest idea why we are still
accessing CC_RB_BACKEND_DISABLE. Maybe it still contains some
useful values?
Key point was that we needed to stop accessing it all the time
to avoid triggering problems.
Regards,
Christian.
On 07.03.2018 at 13:11, Liu, Monk wrote:
Hi Christian
I remember you and AlexD mentioned that a handful of
registers are deprecated for greenland (gfx9),
e.g. CC_RB_BACKEND_DISABLE.
Do you know why we still have this routine?
static u32 gfx_v9_0_get_rb_active_bitmap(struct amdgpu_device *adev)
{
	u32 data, mask;

	data = RREG32_SOC15(GC, 0, mmCC_RB_BACKEND_DISABLE);
	data |= RREG32_SOC15(GC, 0, mmGC_USER_RB_BACKEND_DISABLE);

	data &= CC_RB_BACKEND_DISABLE__BACKEND_DISABLE_MASK;
	data >>= GC_USER_RB_BACKEND_DISABLE__BACKEND_DISABLE__SHIFT;

	mask = amdgpu_gfx_create_bitmask(adev->gfx.config.max_backends_per_se /
					 adev->gfx.config.max_sh_per_se);

	return (~data) & mask;
}
Note that it still reads CC_RB_BACKEND_DISABLE.
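For readers following the arithmetic in that routine, here is a standalone sketch of its last step: inverting the disable bits and masking them to the number of RBs per SH. `create_bitmask` is assumed to behave like `amdgpu_gfx_create_bitmask()` (low `bit_width` bits set), and `rb_active_bitmap` is a hypothetical helper that takes the already masked and shifted disable field rather than reading any register.

```c
#include <stdint.h>

/* Assumed to mirror amdgpu_gfx_create_bitmask(): the low bit_width bits set. */
static uint32_t create_bitmask(unsigned int bit_width)
{
	return (uint32_t)((1ULL << bit_width) - 1);
}

/*
 * disable_field: combined BACKEND_DISABLE field, already masked and
 *                shifted down; a set bit means that RB is disabled.
 * rbs_per_sh:    max_backends_per_se / max_sh_per_se
 */
static uint32_t rb_active_bitmap(uint32_t disable_field, unsigned int rbs_per_sh)
{
	return ~disable_field & create_bitmask(rbs_per_sh);
}
```

For example, with four RBs per SH and RB 1 disabled (`disable_field == 0x2`), the active bitmap comes out as `0xD`.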
thanks
/Monk
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx