[Public]


> -----Original Message-----
> From: Sider, Graham <graham.si...@amd.com>
> Sent: Tuesday, April 4, 2023 12:00 PM
> To: Russell, Kent <kent.russ...@amd.com>; Somasekharan, Sreekant
> <sreekant.somasekha...@amd.com>; amd-gfx@lists.freedesktop.org
> Cc: Somasekharan, Sreekant <sreekant.somasekha...@amd.com>
> Subject: RE: [PATCH] drm/amdkfd: On GFX11 check PCIe atomics support and set
> CP_HQD_HQ_STATUS0[29]
> 
> [Public]
> 
> > -----Original Message-----
> > From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> On Behalf Of
> > Russell, Kent
> > Sent: Tuesday, April 4, 2023 9:43 AM
> > To: Somasekharan, Sreekant <sreekant.somasekha...@amd.com>; amd-
> > g...@lists.freedesktop.org
> > Cc: Somasekharan, Sreekant <sreekant.somasekha...@amd.com>
> > Subject: RE: [PATCH] drm/amdkfd: On GFX11 check PCIe atomics support and
> > set CP_HQD_HQ_STATUS0[29]
> >
> > Caution: This message originated from an External Source. Use proper
> > caution when opening attachments, clicking links, or responding.
> >
> >
> > [AMD Official Use Only - General]
> >
> > Comments inline
> >
> > > -----Original Message-----
> > > From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> On Behalf Of
> > > Sreekant Somasekharan
> > > Sent: Monday, April 3, 2023 3:59 PM
> > > To: amd-gfx@lists.freedesktop.org
> > > Cc: Somasekharan, Sreekant <sreekant.somasekha...@amd.com>
> > > Subject: [PATCH] drm/amdkfd: On GFX11 check PCIe atomics support and
> > > set CP_HQD_HQ_STATUS0[29]
> > >
> > > On GFX11, CP_HQD_HQ_STATUS0[29] bit will be used by CPFW to
> > > acknowledge whether PCIe atomics are supported. The default value of
> > > this bit is set to 0. Driver will check whether PCIe atomics are
> > > supported and set the bit to 1 if supported. This will force CPFW to use 
> > > real
> > atomic ops.
> > > If the bit is not set, CPFW will default to read/modify/write using
> > > the firmware itself.
> > >
> > > This is applicable only to RS64 based GFX11 with MEC FW greater than
> > > or equal to 509. If MEC FW is less than 509, PCIe atomics needs to be
> > > supported, else it will skip the device.
> > >
> > > This commit also involves moving amdgpu_amdkfd_device_probe()
> > function
> > > call after per-IP early_init loop in amdgpu_device_ip_early_init()
> > > function so as to check for RS64 enabled device.
> > >
> > > Signed-off-by: Sreekant Somasekharan
> > <sreekant.somasekha...@amd.com>
> > > ---
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c       |  2 +-
> > >  drivers/gpu/drm/amd/amdkfd/kfd_device.c          | 11 +++++++++++
> > >  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c |  9 +++++++++
> > >  3 files changed, 21 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > index 7116119ed038..b3a754ca0923 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > @@ -2150,7 +2150,6 @@ static int amdgpu_device_ip_early_init(struct
> > > amdgpu_device *adev)
> > >               adev->has_pr3 = parent ? pci_pr3_present(parent) : false;
> > >       }
> > >
> > > -     amdgpu_amdkfd_device_probe(adev);
> > >
> > >       adev->pm.pp_feature = amdgpu_pp_feature_mask;
> > >       if (amdgpu_sriov_vf(adev) || sched_policy ==
> > > KFD_SCHED_POLICY_NO_HWS)
> > > @@ -2206,6 +2205,7 @@ static int amdgpu_device_ip_early_init(struct
> > > amdgpu_device *adev)
> > >       if (!total)
> > >               return -ENODEV;
> > >
> > > +     amdgpu_amdkfd_device_probe(adev);
> > >       adev->cg_flags &= amdgpu_cg_mask;
> > >       adev->pg_flags &= amdgpu_pg_mask;
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> > > b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> > > index 521dfa88aad8..64a295a35d37 100644
> > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> > > @@ -204,6 +204,17 @@ static void kfd_device_info_init(struct kfd_dev
> > *kfd,
> > >                       /* Navi1x+ */
> > >                       if (gc_version >= IP_VERSION(10, 1, 1))
> > >                               kfd->device_info.needs_pci_atomics =
> > > true;
> > > +             } else if (gc_version < IP_VERSION(12, 0, 0)) {
> >
> >
> > What if we get a GFX9 with MEC v509? Wouldn't that trigger this too?
> > Wondering if this should be if (gc_version>=IP_VERSION(11,0,0) &&
> > gc_version < IP_VERSION(12,0,0)) thus ensuring it's only GFX11. Or maybe
> > there is some better check than that. Either way, checking that it's < GFX11
> > might false-positive on GFX10- too, so we should probably be explicit in our
> > GFX check that it's only GFX11.
> 
> The previous condition is for gc_version < IP_VERSION(11, 0, 0), so that
> condition will (and currently is) taken for gfx9/gfx10/etc.
> 
> That's to say the logic after this change will look like:
> 
> If (KFD_IS_SOC15(kfd)) {
>       <...>
>       If (gc_version < IP_VERSION(11, 0, 0)) {
>               <...>
>       } else if (gc_version < IP_VERSION(12, 0, 0)) {
>               <...>
>       }
> }
> 
> So this new path will only be taken for gfx11.

Thanks Graham. I should've pulled up the file and looked at it in context. 
Ignore my comment then.

 Kent
> 
> Best,
> Graham
> 
> >
> >  Kent
> >
> > > +                     /* On GFX11 running on RS64, MEC FW version must
> > > + be
> > > greater than
> > > +                      * or equal to version 509 to support
> > > + acknowledging
> > > whether
> > > +                      * PCIe atomics are supported. Before MEC
> > > + version 509,
> > > PCIe
> > > +                      * atomics are required. After that, the FW's
> > > + use of
> > > atomics
> > > +                      * is controlled by CP_HQD_HQ_STATUS0[29].
> > > +                      * This will fail on GFX11 when PCIe atomics are
> > > + not
> > > supported
> > > +                      * and MEC FW version < 509 for RS64 based CPFW.
> > > +                      */
> > > +                     kfd->device_info.needs_pci_atomics = true;
> > > +                     kfd->device_info.no_atomic_fw_version =
> > > + kfd->adev-
> > > >gfx.rs64_enable ? 509 : 0;
> > >               }
> > >       } else {
> > >               kfd->device_info.doorbell_size = 4; diff --git
> > > a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
> > > b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
> > > index 4a9af800b1f1..c5ea594abbf6 100644
> > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
> > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c
> > > @@ -143,6 +143,15 @@ static void init_mqd(struct mqd_manager *mm,
> > void
> > > **mqd,
> > >                       1 << CP_HQD_QUANTUM__QUANTUM_SCALE__SHIFT
> > > |
> > >                       1 <<
> > > CP_HQD_QUANTUM__QUANTUM_DURATION__SHIFT;
> > >
> > > +     /*
> > > +      * If PCIe atomics are supported, set CP_HQD_HQ_STATUS0[29] == 1
> > > +      * to force CPFW to use atomics. This is supported only on MEC FW
> > > +      * version >= 509 and on RS64 based CPFW only. On previous versions,
> > > +      * platforms running on GFX11 must support atomics else will
> > > + skip the
> > > device.
> > > +      */
> > > +     if (amdgpu_amdkfd_have_atomics_support((mm->dev->adev)))
> > > +             m->cp_hqd_hq_status0 |= 1 << 29;
> > > +
> > >       if (q->format == KFD_QUEUE_FORMAT_AQL) {
> > >               m->cp_hqd_aql_control =
> > >                       1 << CP_HQD_AQL_CONTROL__CONTROL0__SHIFT;
> > > --
> > > 2.25.1

Reply via email to