[AMD Public Use]

We need to figure out what the root cause is then. If we can't figure it out soon, we should revert the change for navi1x and continue to debug it until we can find the root cause and we can safely re-enable it.

Alex

________________________________
From: Chen, Guchun <guchun.c...@amd.com>
Sent: Sunday, November 29, 2020 2:22 AM
To: Bas Nieuwenhuizen <b...@basnieuwenhuizen.nl>; Kuehling, Felix <felix.kuehl...@amd.com>
Cc: Gui, Jack <jack....@amd.com>; Zhou1, Tao <tao.zh...@amd.com>; amd-gfx mailing list <amd-gfx@lists.freedesktop.org>; Huang, Ray <ray.hu...@amd.com>; Deucher, Alexander <alexander.deuc...@amd.com>; Zhang, Hawking <hawking.zh...@amd.com>
Subject: RE: [PATCH v3] drm/amd/amdgpu: set the default value of noretry to 1 for some dGPUs

[AMD Public Use]

Hi Bas Nieuwenhuizen,

I don't think a direct revert is the right approach, though it would fix your problem. noretry=0 will cause other test failures on several ASICs.

Regards,
Guchun

-----Original Message-----
From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> On Behalf Of Bas Nieuwenhuizen
Sent: Sunday, November 29, 2020 8:38 AM
To: Kuehling, Felix <felix.kuehl...@amd.com>
Cc: Gui, Jack <jack....@amd.com>; Chen, Guchun <guchun.c...@amd.com>; Zhou1, Tao <tao.zh...@amd.com>; amd-gfx mailing list <amd-gfx@lists.freedesktop.org>; Huang, Ray <ray.hu...@amd.com>; Deucher, Alexander <alexander.deuc...@amd.com>; Zhang, Hawking <hawking.zh...@amd.com>
Subject: Re: [PATCH v3] drm/amd/amdgpu: set the default value of noretry to 1 for some dGPUs

Can we revert this patch to fix https://gitlab.freedesktop.org/drm/amd/-/issues/1374?

On Thu, Oct 15, 2020 at 4:30 PM Felix Kuehling <felix.kuehl...@amd.com> wrote:
>
> On 2020-10-14 at 11:35 p.m., Chengming Gui wrote:
> > noretry = 0 cause some dGPU's kfd page fault tests fail, so set
> > noretry to 1 for these special ASICs:
> > vega20/navi10/navi14/ARCTURUS
> >
> > v2: merge raven and default case due to the same setting
> > v3: remove ARCTURUS
> >
> > Signed-off-by: Chengming Gui <jack....@amd.com>
> > Change-Id: I3be70f463a49b0cd5c56456431d6c2cb98b13872
> > Acked-by: Felix Kuhling <felix.kuehl...@amd.com>
> >
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 23 +++++++++++++++--------
> >  1 file changed, 15 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > index 36604d751d62..f26eb4e54b12 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > @@ -425,20 +425,27 @@ void amdgpu_gmc_noretry_set(struct amdgpu_device *adev)
> >  	struct amdgpu_gmc *gmc = &adev->gmc;
> >
> >  	switch (adev->asic_type) {
> > -	case CHIP_RAVEN:
> > -		/* Raven currently has issues with noretry
> > -		 * regardless of what we decide for other
> > -		 * asics, we should leave raven with
> > -		 * noretry = 0 until we root cause the
> > -		 * issues.
> > +	case CHIP_VEGA20:
> > +	case CHIP_NAVI10:
> > +	case CHIP_NAVI14:
> > +		/*
> > +		 * noretry = 0 will cause kfd page fault tests fail
> > +		 * for some ASICs, so set default to 1 for these ASICs.
> >  		 */
> >  		if (amdgpu_noretry == -1)
> > -			gmc->noretry = 0;
> > +			gmc->noretry = 1;
> >  		else
> >  			gmc->noretry = amdgpu_noretry;
> >  		break;
> > +	case CHIP_RAVEN:
> >  	default:
> > -		/* default this to 0 for now, but we may want
> > +		/* Raven currently has issues with noretry
> > +		 * regardless of what we decide for other
> > +		 * asics, we should leave raven with
> > +		 * noretry = 0 until we root cause the
> > +		 * issues.
> > +		 *
> > +		 * default this to 0 for now, but we may want
> >  		 * to change this in the future for certain
> >  		 * GPUs as it can increase performance in
> >  		 * certain cases.
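
For readers following the revert discussion, the defaulting behaviour the quoted v3 patch introduces can be modelled outside the kernel roughly as below. This is only a sketch under assumptions: the enum is a hypothetical stand-in for the kernel's ASIC identifiers, noretry_default() is an illustrative helper that does not exist in amdgpu, and the param argument plays the role of the amdgpu_noretry module parameter, where -1 means "let the driver choose".

```c
/* Hypothetical stand-ins for the kernel's ASIC identifiers (illustration only). */
enum asic_type {
	CHIP_RAVEN,
	CHIP_VEGA20,
	CHIP_NAVI10,
	CHIP_NAVI14,
	CHIP_ARCTURUS
};

/*
 * Illustrative helper, not part of amdgpu: returns the effective noretry
 * value for an ASIC. param mirrors amdgpu_noretry: -1 means "auto",
 * any other value is an explicit user choice and always wins.
 */
static int noretry_default(enum asic_type asic, int param)
{
	if (param != -1)
		return param;

	switch (asic) {
	case CHIP_VEGA20:
	case CHIP_NAVI10:
	case CHIP_NAVI14:
		/* The v3 patch defaults these ASICs to noretry = 1. */
		return 1;
	case CHIP_RAVEN:
	default:
		/* Raven and everything else (including Arcturus in v3) stay at 0. */
		return 0;
	}
}
```

Under this model, noretry_default(CHIP_NAVI10, -1) yields 1, while an explicit noretry_default(CHIP_NAVI10, 0) keeps the user's 0, which is the per-user override of the new navi1x default that the patch leaves available.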
_______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx