RE: [PATCH] drm/amdkfd: report atomics support in io_links over xgmi

2021-04-30 Thread Kim, Jonathan
[AMD Official Use Only - Internal Distribution Only]

> -Original Message-
> From: Kuehling, Felix 
> Sent: Friday, April 30, 2021 2:55 PM
> To: Zeng, Oak ; Kim, Jonathan
> ; amd-gfx@lists.freedesktop.org; Cornwall, Jay
> 
> Cc: Errabolu, Ramesh 
> Subject: Re: [PATCH] drm/amdkfd: report atomics support in io_links over
> xgmi
>
> Am 2021-04-30 um 2:16 p.m. schrieb Zeng, Oak:
> > Sorry I should be clearer.
> >
> > dev->gpu->pci_atomic_requested is set to the value of adev-
> >have_atomics_support - see function
> amdgpu_amdkfd_have_atomics_support. adev->have_atomics_support is
> set through function pci_enable_atomic_ops_to_root currently in
> amdgpu_device_init - I don't think this logic is correct for xgmi connected
> devices. For xgmi connected devices, adev->have_atomics_support should
> always set to true. +@Cornwall, Jay to comment as the original author of this
> codes.
> >
> > The codes Jon put below refers dev->gpu->pci_atomic_requested to set
> the io link properties. I think we should fix adev->have_atomics_support
> which is the source of dev->gpu->pci_atomic_requested. Once adev-
> >have_atomics_support is fixed, part of the code in kfd_topology.c doesn't
> need to be changed.
>
> I disagree. You're basically just changing the name of a variable. That 
> doesn't
> change any of the logic.
>
>
> >  Currently kfd_topology.c is the only consumer of adev-
> >have_atomics_support and seems we only use such information for gpu-
> gpu io-link not for gpu-cpu iolink properties. But I still think we fix it 
> from the
> source is better because in the future there might be code refers to adev-
> >have_atomics_support. The code I put below is wrong though, it should be:
> > If (!adev->gmc.xgmi.num_physical_nodes)
> > r = pci_enable_atomic_ops_to_root(adev->pdev,
> >
> PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> >
> PCI_EXP_DEVCAP2_ATOMIC_COMP64);
>
> No. pci_enable_atomic_ops_to_root enables the GPU to perform atomic
> ops on system memory over PCIe. The number of nodes in an XGMI hive has
> no influence on this. The only situation where we don't need this is on GPUs
> that connect to the host via XGMI. So the condition if
> (!adev->gmc.xgmi.connected_to_cpu) is correct.


Then it seems like this was missed in the review and a follow on fix will be 
needed to be applied (basically remove this condition and move the denial of 
the pci_atomic_requested check under the kfd topology !connected_to_cpu 
condition that currently only denies pcie_capability_read_dword checks for cpu 
POV flags:

> > +if (!adev->gmc.xgmi.num_physical_nodes) {
> > +if (!dev->gpu->pci_atomic_requested ||
> > +dev->gpu->device_info->asic_family ==
> > +CHIP_HAWAII)
> > +flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
> > +CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
> > +}

Thanks,

Jon

>
> Regards,
>   Felix
>
>
> >
> > Regards,
> > Oak
> >
> >
> >
> > On 2021-04-29, 10:45 PM, "Kuehling, Felix" 
> wrote:
> >
> >
> > Am 2021-04-29 um 9:12 p.m. schrieb Zeng, Oak:
> > > I think part of this can be done more clean in amdgpu_device_init:
> > >
> > > r = 0;
> > > If (!adev->gmc.xgmi.connected_to_cpu)
> > > /* enable PCIE atomic ops */
> > > r = pci_enable_atomic_ops_to_root(adev->pdev,
> > >
> PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> > >
> PCI_EXP_DEVCAP2_ATOMIC_COMP64);
> > > if (r) {
> > > adev->have_atomics_support = false;
> > > DRM_INFO("PCIE atomic ops is not supported\n");
> > > } else {
> > > adev->have_atomics_support = true;
> > > }
> >
> > This code is already in amdgpu_device_init. I'm not sure what's your
> > point. Are you suggesting that the topology flags should be updated in
> > amdgpu_device_init? That's not really possible because the KFD topology
> > data structures don't exist at that time (they do only after the call to
> > amdgpu_device_ip_init) and the data structures are not known in
> > amdgpu_device.c. It's cleaner to keep this compartmentalized in
> > kfd_topology.c.
> >
> > Regards,
> >   Felix
> >
> >
> > > Regards,
> > > Oak
> > >
> > >
> > >
> > > On 2021-04-29, 5:36 AM, "Kim, Jonathan" 
> wrote:
> > >
> > > Link atomics support over xGMI sho

Re: [PATCH] drm/amdkfd: report atomics support in io_links over xgmi

2021-04-30 Thread Felix Kuehling
Am 2021-04-30 um 2:16 p.m. schrieb Zeng, Oak:
> Sorry I should be clearer. 
>
> dev->gpu->pci_atomic_requested is set to the value of 
> adev->have_atomics_support - see function amdgpu_amdkfd_have_atomics_support. 
> adev->have_atomics_support is set through function 
> pci_enable_atomic_ops_to_root currently in amdgpu_device_init - I don't think 
> this logic is correct for xgmi connected devices. For xgmi connected devices, 
> adev->have_atomics_support should always set to true. +@Cornwall, Jay to 
> comment as the original author of this codes.
>
> The codes Jon put below refers dev->gpu->pci_atomic_requested to set the io 
> link properties. I think we should fix adev->have_atomics_support which is 
> the source of dev->gpu->pci_atomic_requested. Once adev->have_atomics_support 
> is fixed, part of the code in kfd_topology.c doesn't need to be changed.

I disagree. You're basically just changing the name of a variable. That
doesn't change any of the logic.


>  Currently kfd_topology.c is the only consumer of adev->have_atomics_support 
> and seems we only use such information for gpu-gpu io-link not for gpu-cpu 
> iolink properties. But I still think we fix it from the source is better 
> because in the future there might be code refers to 
> adev->have_atomics_support. The code I put below is wrong though, it should 
> be: 
> If (!adev->gmc.xgmi.num_physical_nodes)
>   r = pci_enable_atomic_ops_to_root(adev->pdev,
> PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> PCI_EXP_DEVCAP2_ATOMIC_COMP64);

No. pci_enable_atomic_ops_to_root enables the GPU to perform atomic ops
on system memory over PCIe. The number of nodes in an XGMI hive has no
influence on this. The only situation where we don't need this is on
GPUs that connect to the host via XGMI. So the condition if
(!adev->gmc.xgmi.connected_to_cpu) is correct.

Regards,
  Felix


>
> Regards,
> Oak 
>
>  
>
> On 2021-04-29, 10:45 PM, "Kuehling, Felix"  wrote:
>
>
> Am 2021-04-29 um 9:12 p.m. schrieb Zeng, Oak:
> > I think part of this can be done more clean in amdgpu_device_init:
> >
> > r = 0;
> > If (!adev->gmc.xgmi.connected_to_cpu)
> > /* enable PCIE atomic ops */
> > r = pci_enable_atomic_ops_to_root(adev->pdev,
> >   PCI_EXP_DEVCAP2_ATOMIC_COMP32 
> |
> >   
> PCI_EXP_DEVCAP2_ATOMIC_COMP64);
> > if (r) {
> > adev->have_atomics_support = false;
> > DRM_INFO("PCIE atomic ops is not supported\n");
> > } else {
> > adev->have_atomics_support = true;
> > }
>
> This code is already in amdgpu_device_init. I'm not sure what's your
> point. Are you suggesting that the topology flags should be updated in
> amdgpu_device_init? That's not really possible because the KFD topology
> data structures don't exist at that time (they do only after the call to
> amdgpu_device_ip_init) and the data structures are not known in
> amdgpu_device.c. It's cleaner to keep this compartmentalized in
> kfd_topology.c.
>
> Regards,
>   Felix
>
>
> > Regards,
> > Oak 
> >
> >  
> >
> > On 2021-04-29, 5:36 AM, "Kim, Jonathan"  wrote:
> >
> > Link atomics support over xGMI should be reported independently of 
> PCIe.
> >
> > Signed-off-by: Jonathan Kim 
> > Tested-by: Ramesh Errabolu 
> > ---
> >  drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 29 
> ++-
> >  1 file changed, 18 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> > index 083ac9babfa8..30430aefcfc7 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> > @@ -1196,6 +1196,7 @@ static void 
> kfd_fill_iolink_non_crat_info(struct kfd_topology_device *dev)
> >  {
> > struct kfd_iolink_properties *link, *cpu_link;
> > struct kfd_topology_device *cpu_dev;
> > +   struct amdgpu_device *adev;
> > uint32_t cap;
> > uint32_t cpu_flag = CRAT_IOLINK_FLAGS_ENABLED;
> > uint32_t flag = CRAT_IOLINK_FLAGS_ENABLED;
> > @@ -1203,18 +1204,24 @@ static void 
> kfd_fill_iolink_non_crat_info(struct kfd_topology_device *dev)
> > if (!dev || !dev->gpu)
> > return;
> >
> > -   pcie_capability_read_dword(dev->gpu->pdev,
> > -   PCI_EXP_DEVCAP2, );
> > -
> > -   if (!(cap & (PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> > -PCI_EXP_DEVCAP2_ATOMIC_COMP64)))
> > -   cpu_flag |= 

Re: [PATCH] drm/amdkfd: report atomics support in io_links over xgmi

2021-04-30 Thread Zeng, Oak
Sorry I should be clearer. 

dev->gpu->pci_atomic_requested is set to the value of 
adev->have_atomics_support - see function amdgpu_amdkfd_have_atomics_support. 
adev->have_atomics_support is set through function 
pci_enable_atomic_ops_to_root currently in amdgpu_device_init - I don't think 
this logic is correct for xgmi connected devices. For xgmi connected devices, 
adev->have_atomics_support should always set to true. +@Cornwall, Jay to 
comment as the original author of this codes.

The codes Jon put below refers dev->gpu->pci_atomic_requested to set the io 
link properties. I think we should fix adev->have_atomics_support which is the 
source of dev->gpu->pci_atomic_requested. Once adev->have_atomics_support is 
fixed, part of the code in kfd_topology.c doesn't need to be changed. Currently 
kfd_topology.c is the only consumer of adev->have_atomics_support and seems we 
only use such information for gpu-gpu io-link not for gpu-cpu iolink 
properties. But I still think we fix it from the source is better because in 
the future there might be code refers to adev->have_atomics_support. The code I 
put below is wrong though, it should be: 
If (!adev->gmc.xgmi.num_physical_nodes)
r = pci_enable_atomic_ops_to_root(adev->pdev,
  PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
  PCI_EXP_DEVCAP2_ATOMIC_COMP64);

Regards,
Oak 

 

On 2021-04-29, 10:45 PM, "Kuehling, Felix"  wrote:


Am 2021-04-29 um 9:12 p.m. schrieb Zeng, Oak:
> I think part of this can be done more clean in amdgpu_device_init:
>
>   r = 0;
>   If (!adev->gmc.xgmi.connected_to_cpu)
>   /* enable PCIE atomic ops */
>   r = pci_enable_atomic_ops_to_root(adev->pdev,
> PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> PCI_EXP_DEVCAP2_ATOMIC_COMP64);
>   if (r) {
>   adev->have_atomics_support = false;
>   DRM_INFO("PCIE atomic ops is not supported\n");
>   } else {
>   adev->have_atomics_support = true;
>   }

This code is already in amdgpu_device_init. I'm not sure what's your
point. Are you suggesting that the topology flags should be updated in
amdgpu_device_init? That's not really possible because the KFD topology
data structures don't exist at that time (they do only after the call to
amdgpu_device_ip_init) and the data structures are not known in
amdgpu_device.c. It's cleaner to keep this compartmentalized in
kfd_topology.c.

Regards,
  Felix


> Regards,
> Oak 
>
>  
>
> On 2021-04-29, 5:36 AM, "Kim, Jonathan"  wrote:
>
> Link atomics support over xGMI should be reported independently of 
PCIe.
>
> Signed-off-by: Jonathan Kim 
> Tested-by: Ramesh Errabolu 
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 29 
++-
>  1 file changed, 18 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> index 083ac9babfa8..30430aefcfc7 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> @@ -1196,6 +1196,7 @@ static void 
kfd_fill_iolink_non_crat_info(struct kfd_topology_device *dev)
>  {
>   struct kfd_iolink_properties *link, *cpu_link;
>   struct kfd_topology_device *cpu_dev;
> + struct amdgpu_device *adev;
>   uint32_t cap;
>   uint32_t cpu_flag = CRAT_IOLINK_FLAGS_ENABLED;
>   uint32_t flag = CRAT_IOLINK_FLAGS_ENABLED;
> @@ -1203,18 +1204,24 @@ static void 
kfd_fill_iolink_non_crat_info(struct kfd_topology_device *dev)
>   if (!dev || !dev->gpu)
>   return;
>
> - pcie_capability_read_dword(dev->gpu->pdev,
> - PCI_EXP_DEVCAP2, );
> -
> - if (!(cap & (PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> -  PCI_EXP_DEVCAP2_ATOMIC_COMP64)))
> - cpu_flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
> - CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
> + adev = (struct amdgpu_device *)(dev->gpu->kgd);
> + if (!adev->gmc.xgmi.connected_to_cpu) {
> + pcie_capability_read_dword(dev->gpu->pdev,
> + PCI_EXP_DEVCAP2, );
> +
> + if (!(cap & (PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> +  PCI_EXP_DEVCAP2_ATOMIC_COMP64)))
> + cpu_flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT 
|
> + CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
> + }
>
> - if (!dev->gpu->pci_atomic_requested ||
> - 

Re: [PATCH] drm/amdkfd: report atomics support in io_links over xgmi

2021-04-29 Thread Felix Kuehling

Am 2021-04-29 um 9:12 p.m. schrieb Zeng, Oak:
> I think part of this can be done more clean in amdgpu_device_init:
>
>   r = 0;
>   If (!adev->gmc.xgmi.connected_to_cpu)
>   /* enable PCIE atomic ops */
>   r = pci_enable_atomic_ops_to_root(adev->pdev,
> PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> PCI_EXP_DEVCAP2_ATOMIC_COMP64);
>   if (r) {
>   adev->have_atomics_support = false;
>   DRM_INFO("PCIE atomic ops is not supported\n");
>   } else {
>   adev->have_atomics_support = true;
>   }

This code is already in amdgpu_device_init. I'm not sure what's your
point. Are you suggesting that the topology flags should be updated in
amdgpu_device_init? That's not really possible because the KFD topology
data structures don't exist at that time (they do only after the call to
amdgpu_device_ip_init) and the data structures are not known in
amdgpu_device.c. It's cleaner to keep this compartmentalized in
kfd_topology.c.

Regards,
  Felix


> Regards,
> Oak 
>
>  
>
> On 2021-04-29, 5:36 AM, "Kim, Jonathan"  wrote:
>
> Link atomics support over xGMI should be reported independently of PCIe.
>
> Signed-off-by: Jonathan Kim 
> Tested-by: Ramesh Errabolu 
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 29 ++-
>  1 file changed, 18 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> index 083ac9babfa8..30430aefcfc7 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> @@ -1196,6 +1196,7 @@ static void kfd_fill_iolink_non_crat_info(struct 
> kfd_topology_device *dev)
>  {
>   struct kfd_iolink_properties *link, *cpu_link;
>   struct kfd_topology_device *cpu_dev;
> + struct amdgpu_device *adev;
>   uint32_t cap;
>   uint32_t cpu_flag = CRAT_IOLINK_FLAGS_ENABLED;
>   uint32_t flag = CRAT_IOLINK_FLAGS_ENABLED;
> @@ -1203,18 +1204,24 @@ static void kfd_fill_iolink_non_crat_info(struct 
> kfd_topology_device *dev)
>   if (!dev || !dev->gpu)
>   return;
>
> - pcie_capability_read_dword(dev->gpu->pdev,
> - PCI_EXP_DEVCAP2, );
> -
> - if (!(cap & (PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> -  PCI_EXP_DEVCAP2_ATOMIC_COMP64)))
> - cpu_flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
> - CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
> + adev = (struct amdgpu_device *)(dev->gpu->kgd);
> + if (!adev->gmc.xgmi.connected_to_cpu) {
> + pcie_capability_read_dword(dev->gpu->pdev,
> + PCI_EXP_DEVCAP2, );
> +
> + if (!(cap & (PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> +  PCI_EXP_DEVCAP2_ATOMIC_COMP64)))
> + cpu_flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
> + CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
> + }
>
> - if (!dev->gpu->pci_atomic_requested ||
> - dev->gpu->device_info->asic_family == CHIP_HAWAII)
> - flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
> - CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
> + if (!adev->gmc.xgmi.num_physical_nodes) {
> + if (!dev->gpu->pci_atomic_requested ||
> + dev->gpu->device_info->asic_family ==
> + CHIP_HAWAII)
> + flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
> + CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
> + }
>
>   /* GPU only creates direct links so apply flags setting to all */
>   list_for_each_entry(link, >io_link_props, list) {
> -- 
> 2.17.1
>
>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdkfd: report atomics support in io_links over xgmi

2021-04-29 Thread Zeng, Oak
I think part of this can be done more clean in amdgpu_device_init:

r = 0;
If (!adev->gmc.xgmi.connected_to_cpu)
/* enable PCIE atomic ops */
r = pci_enable_atomic_ops_to_root(adev->pdev,
  PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
  PCI_EXP_DEVCAP2_ATOMIC_COMP64);
if (r) {
adev->have_atomics_support = false;
DRM_INFO("PCIE atomic ops is not supported\n");
} else {
adev->have_atomics_support = true;
}

Regards,
Oak 

 

On 2021-04-29, 5:36 AM, "Kim, Jonathan"  wrote:

Link atomics support over xGMI should be reported independently of PCIe.

Signed-off-by: Jonathan Kim 
Tested-by: Ramesh Errabolu 
---
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 29 ++-
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index 083ac9babfa8..30430aefcfc7 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -1196,6 +1196,7 @@ static void kfd_fill_iolink_non_crat_info(struct 
kfd_topology_device *dev)
 {
struct kfd_iolink_properties *link, *cpu_link;
struct kfd_topology_device *cpu_dev;
+   struct amdgpu_device *adev;
uint32_t cap;
uint32_t cpu_flag = CRAT_IOLINK_FLAGS_ENABLED;
uint32_t flag = CRAT_IOLINK_FLAGS_ENABLED;
@@ -1203,18 +1204,24 @@ static void kfd_fill_iolink_non_crat_info(struct 
kfd_topology_device *dev)
if (!dev || !dev->gpu)
return;

-   pcie_capability_read_dword(dev->gpu->pdev,
-   PCI_EXP_DEVCAP2, );
-
-   if (!(cap & (PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
-PCI_EXP_DEVCAP2_ATOMIC_COMP64)))
-   cpu_flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
-   CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
+   adev = (struct amdgpu_device *)(dev->gpu->kgd);
+   if (!adev->gmc.xgmi.connected_to_cpu) {
+   pcie_capability_read_dword(dev->gpu->pdev,
+   PCI_EXP_DEVCAP2, );
+
+   if (!(cap & (PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
+PCI_EXP_DEVCAP2_ATOMIC_COMP64)))
+   cpu_flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
+   CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
+   }

-   if (!dev->gpu->pci_atomic_requested ||
-   dev->gpu->device_info->asic_family == CHIP_HAWAII)
-   flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
-   CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
+   if (!adev->gmc.xgmi.num_physical_nodes) {
+   if (!dev->gpu->pci_atomic_requested ||
+   dev->gpu->device_info->asic_family ==
+   CHIP_HAWAII)
+   flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
+   CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
+   }

/* GPU only creates direct links so apply flags setting to all */
list_for_each_entry(link, >io_link_props, list) {
-- 
2.17.1


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdkfd: report atomics support in io_links over xgmi

2021-04-29 Thread Kim, Jonathan
[AMD Official Use Only - Internal Distribution Only]

> -Original Message-
> From: Kuehling, Felix 
> Sent: Thursday, April 29, 2021 11:18 AM
> To: Kim, Jonathan ; amd-
> g...@lists.freedesktop.org
> Cc: Zeng, Oak ; Errabolu, Ramesh
> 
> Subject: Re: [PATCH] drm/amdkfd: report atomics support in io_links over
> xgmi
>
> Am 2021-04-29 um 11:04 a.m. schrieb Kim, Jonathan:
> > [AMD Official Use Only - Internal Distribution Only]
> >
> >> -Original Message-
> >> From: Kuehling, Felix 
> >> Sent: Thursday, April 29, 2021 10:49 AM
> >> To: Kim, Jonathan ; amd-
> >> g...@lists.freedesktop.org
> >> Cc: Zeng, Oak ; Errabolu, Ramesh
> >> 
> >> Subject: Re: [PATCH] drm/amdkfd: report atomics support in io_links
> >> over xgmi
> >>
> >> Am 2021-04-29 um 5:36 a.m. schrieb Jonathan Kim:
> >>> Link atomics support over xGMI should be reported independently of
> PCIe.
> >> I don't understand this change. I don't see any code that gets
> >> executed if (adev->gmc.xgmi.connected_to_cpu). Where is the code that
> >> reports atomics support for this case?
> > The atomics aren't set but rather NO_ATOMICS are set if non-xgmi and non
> PCIe supported:
> > cpu_flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
> > CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
> > adev->gmc.xgmi.connected_to_cpu == true would bypass this flag
> NO_ATOMICS setting.
> >
> >> Also, the PCIe code doesn't clear any atomic flags. It only sets
> >> flags that would be set for XGMI anyway. So I don't see why you need
> >> to make that code conditional.
> > As mentioned above they set NO_ATOMICS if not PCIe supported.
> > This has been observed on Aldebaran with AMD systems.
>
> OK. I missed that these flags are negative logic. Thanks, the change makes
> sense now. But the patch description is a bit misleading compared to the
> code. A different wording would make it clearer, e.g.: "Don't set
> NO_ATOMICS flags on XGMI links between CPU and GPU."

Thanks for the review Felix.  Will do.

Jon

>
> With that fixed, the patch is
>
> Reviewed-by: Felix Kuehling 
>
> Regards,
>   Felix
>
>
> >
> > Thanks,
> >
> > Jon
> >
> >> Regards,
> >>   Felix
> >>
> >>
> >>> Signed-off-by: Jonathan Kim 
> >>> Tested-by: Ramesh Errabolu 
> >>> ---
> >>>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 29
> >>> ++-
> >>>  1 file changed, 18 insertions(+), 11 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> >>> b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> >>> index 083ac9babfa8..30430aefcfc7 100644
> >>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> >>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> >>> @@ -1196,6 +1196,7 @@ static void
> >>> kfd_fill_iolink_non_crat_info(struct
> >>> kfd_topology_device *dev)  {
> >>>  struct kfd_iolink_properties *link, *cpu_link;  struct
> >>> kfd_topology_device *cpu_dev;
> >>> +struct amdgpu_device *adev;
> >>>  uint32_t cap;
> >>>  uint32_t cpu_flag = CRAT_IOLINK_FLAGS_ENABLED;  uint32_t flag =
> >>> CRAT_IOLINK_FLAGS_ENABLED; @@ -1203,18
> >> +1204,24 @@
> >>> static void kfd_fill_iolink_non_crat_info(struct kfd_topology_device
> >>> *dev)  if (!dev || !dev->gpu)  return;
> >>>
> >>> -pcie_capability_read_dword(dev->gpu->pdev,
> >>> -PCI_EXP_DEVCAP2, );
> >>> -
> >>> -if (!(cap & (PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> >>> - PCI_EXP_DEVCAP2_ATOMIC_COMP64)))
> >>> -cpu_flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
> >>> -CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
> >>> +adev = (struct amdgpu_device *)(dev->gpu->kgd); if
> >>> +(!adev->gmc.xgmi.connected_to_cpu) {
> >>> +pcie_capability_read_dword(dev->gpu->pdev,
> >>> +PCI_EXP_DEVCAP2, );
> >>> +
> >>> +if (!(cap & (PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> >>> + PCI_EXP_DEVCAP2_ATOMIC_COMP64))) cpu_flag |=
> >> CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
> >>> +CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
> >>> +}
> >>>
> >>> -if (!dev->gpu->pci_atomic_requested ||
> >>> -dev->gpu->device_info->asic_family == CHIP_HAWAII)
> >>> -flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
> >>> -CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
> >>> +if (!adev->gmc.xgmi.num_physical_nodes) { if
> >>> +(!dev->gpu->pci_atomic_requested ||
> >>> +dev->gpu->device_info->asic_family ==
> >>> +CHIP_HAWAII)
> >>> +flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
> >>> +CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
> >>> +}
> >>>
> >>>  /* GPU only creates direct links so apply flags setting to all */
> >>> list_for_each_entry(link, >io_link_props, list) {
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdkfd: report atomics support in io_links over xgmi

2021-04-29 Thread Felix Kuehling
Am 2021-04-29 um 11:04 a.m. schrieb Kim, Jonathan:
> [AMD Official Use Only - Internal Distribution Only]
>
>> -Original Message-
>> From: Kuehling, Felix 
>> Sent: Thursday, April 29, 2021 10:49 AM
>> To: Kim, Jonathan ; amd-
>> g...@lists.freedesktop.org
>> Cc: Zeng, Oak ; Errabolu, Ramesh
>> 
>> Subject: Re: [PATCH] drm/amdkfd: report atomics support in io_links over
>> xgmi
>>
>> Am 2021-04-29 um 5:36 a.m. schrieb Jonathan Kim:
>>> Link atomics support over xGMI should be reported independently of PCIe.
>> I don't understand this change. I don't see any code that gets executed if
>> (adev->gmc.xgmi.connected_to_cpu). Where is the code that reports
>> atomics support for this case?
> The atomics aren't set but rather NO_ATOMICS are set if non-xgmi and non PCIe 
> supported:
> cpu_flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT | 
> CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
> adev->gmc.xgmi.connected_to_cpu == true would bypass this flag NO_ATOMICS 
> setting.
>
>> Also, the PCIe code doesn't clear any atomic flags. It only sets flags that
>> would be set for XGMI anyway. So I don't see why you need to make that
>> code conditional.
> As mentioned above they set NO_ATOMICS if not PCIe supported.
> This has been observed on Aldebaran with AMD systems.

OK. I missed that these flags are negative logic. Thanks, the change
makes sense now. But the patch description is a bit misleading compared
to the code. A different wording would make it clearer, e.g.: "Don't set
NO_ATOMICS flags on XGMI links between CPU and GPU."

With that fixed, the patch is

Reviewed-by: Felix Kuehling 

Regards,
  Felix


>
> Thanks,
>
> Jon
>
>> Regards,
>>   Felix
>>
>>
>>> Signed-off-by: Jonathan Kim 
>>> Tested-by: Ramesh Errabolu 
>>> ---
>>>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 29
>>> ++-
>>>  1 file changed, 18 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>> index 083ac9babfa8..30430aefcfc7 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>> @@ -1196,6 +1196,7 @@ static void kfd_fill_iolink_non_crat_info(struct
>>> kfd_topology_device *dev)  {
>>>  struct kfd_iolink_properties *link, *cpu_link;
>>>  struct kfd_topology_device *cpu_dev;
>>> +struct amdgpu_device *adev;
>>>  uint32_t cap;
>>>  uint32_t cpu_flag = CRAT_IOLINK_FLAGS_ENABLED;
>>>  uint32_t flag = CRAT_IOLINK_FLAGS_ENABLED; @@ -1203,18
>> +1204,24 @@
>>> static void kfd_fill_iolink_non_crat_info(struct kfd_topology_device *dev)
>>>  if (!dev || !dev->gpu)
>>>  return;
>>>
>>> -pcie_capability_read_dword(dev->gpu->pdev,
>>> -PCI_EXP_DEVCAP2, );
>>> -
>>> -if (!(cap & (PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
>>> - PCI_EXP_DEVCAP2_ATOMIC_COMP64)))
>>> -cpu_flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
>>> -CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
>>> +adev = (struct amdgpu_device *)(dev->gpu->kgd);
>>> +if (!adev->gmc.xgmi.connected_to_cpu) {
>>> +pcie_capability_read_dword(dev->gpu->pdev,
>>> +PCI_EXP_DEVCAP2, );
>>> +
>>> +if (!(cap & (PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
>>> + PCI_EXP_DEVCAP2_ATOMIC_COMP64)))
>>> +cpu_flag |=
>> CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
>>> +CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
>>> +}
>>>
>>> -if (!dev->gpu->pci_atomic_requested ||
>>> -dev->gpu->device_info->asic_family == CHIP_HAWAII)
>>> -flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
>>> -CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
>>> +if (!adev->gmc.xgmi.num_physical_nodes) {
>>> +if (!dev->gpu->pci_atomic_requested ||
>>> +dev->gpu->device_info->asic_family ==
>>> +CHIP_HAWAII)
>>> +flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
>>> +CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
>>> +}
>>>
>>>  /* GPU only creates direct links so apply flags setting to all */
>>>  list_for_each_entry(link, >io_link_props, list) {
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdkfd: report atomics support in io_links over xgmi

2021-04-29 Thread Kim, Jonathan
[AMD Official Use Only - Internal Distribution Only]

> -Original Message-
> From: Kuehling, Felix 
> Sent: Thursday, April 29, 2021 10:49 AM
> To: Kim, Jonathan ; amd-
> g...@lists.freedesktop.org
> Cc: Zeng, Oak ; Errabolu, Ramesh
> 
> Subject: Re: [PATCH] drm/amdkfd: report atomics support in io_links over
> xgmi
>
> Am 2021-04-29 um 5:36 a.m. schrieb Jonathan Kim:
> > Link atomics support over xGMI should be reported independently of PCIe.
>
> I don't understand this change. I don't see any code that gets executed if
> (adev->gmc.xgmi.connected_to_cpu). Where is the code that reports
> atomics support for this case?

The atomics aren't set but rather NO_ATOMICS are set if non-xgmi and non PCIe 
supported:
cpu_flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT | 
CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
adev->gmc.xgmi.connected_to_cpu == true would bypass this flag NO_ATOMICS 
setting.

>
> Also, the PCIe code doesn't clear any atomic flags. It only sets flags that
> would be set for XGMI anyway. So I don't see why you need to make that
> code conditional.

As mentioned above they set NO_ATOMICS if not PCIe supported.
This has been observed on Aldebaran with AMD systems.

Thanks,

Jon

>
> Regards,
>   Felix
>
>
> >
> > Signed-off-by: Jonathan Kim 
> > Tested-by: Ramesh Errabolu 
> > ---
> >  drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 29
> > ++-
> >  1 file changed, 18 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> > b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> > index 083ac9babfa8..30430aefcfc7 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> > @@ -1196,6 +1196,7 @@ static void kfd_fill_iolink_non_crat_info(struct
> > kfd_topology_device *dev)  {
> >  struct kfd_iolink_properties *link, *cpu_link;
> >  struct kfd_topology_device *cpu_dev;
> > +struct amdgpu_device *adev;
> >  uint32_t cap;
> >  uint32_t cpu_flag = CRAT_IOLINK_FLAGS_ENABLED;
> >  uint32_t flag = CRAT_IOLINK_FLAGS_ENABLED; @@ -1203,18
> +1204,24 @@
> > static void kfd_fill_iolink_non_crat_info(struct kfd_topology_device *dev)
> >  if (!dev || !dev->gpu)
> >  return;
> >
> > -pcie_capability_read_dword(dev->gpu->pdev,
> > -PCI_EXP_DEVCAP2, );
> > -
> > -if (!(cap & (PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> > - PCI_EXP_DEVCAP2_ATOMIC_COMP64)))
> > -cpu_flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
> > -CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
> > +adev = (struct amdgpu_device *)(dev->gpu->kgd);
> > +if (!adev->gmc.xgmi.connected_to_cpu) {
> > +pcie_capability_read_dword(dev->gpu->pdev,
> > +PCI_EXP_DEVCAP2, );
> > +
> > +if (!(cap & (PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> > + PCI_EXP_DEVCAP2_ATOMIC_COMP64)))
> > +cpu_flag |=
> CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
> > +CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
> > +}
> >
> > -if (!dev->gpu->pci_atomic_requested ||
> > -dev->gpu->device_info->asic_family == CHIP_HAWAII)
> > -flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
> > -CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
> > +if (!adev->gmc.xgmi.num_physical_nodes) {
> > +if (!dev->gpu->pci_atomic_requested ||
> > +dev->gpu->device_info->asic_family ==
> > +CHIP_HAWAII)
> > +flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
> > +CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
> > +}
> >
> >  /* GPU only creates direct links so apply flags setting to all */
> >  list_for_each_entry(link, >io_link_props, list) {
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdkfd: report atomics support in io_links over xgmi

2021-04-29 Thread Felix Kuehling
Am 2021-04-29 um 5:36 a.m. schrieb Jonathan Kim:
> Link atomics support over xGMI should be reported independently of PCIe.

I don't understand this change. I don't see any code that gets executed
if (adev->gmc.xgmi.connected_to_cpu). Where is the code that reports
atomics support for this case?

Also, the PCIe code doesn't clear any atomic flags. It only sets flags
that would be set for XGMI anyway. So I don't see why you need to make
that code conditional.

Regards,
  Felix


>
> Signed-off-by: Jonathan Kim 
> Tested-by: Ramesh Errabolu 
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 29 ++-
>  1 file changed, 18 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> index 083ac9babfa8..30430aefcfc7 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> @@ -1196,6 +1196,7 @@ static void kfd_fill_iolink_non_crat_info(struct 
> kfd_topology_device *dev)
>  {
>   struct kfd_iolink_properties *link, *cpu_link;
>   struct kfd_topology_device *cpu_dev;
> + struct amdgpu_device *adev;
>   uint32_t cap;
>   uint32_t cpu_flag = CRAT_IOLINK_FLAGS_ENABLED;
>   uint32_t flag = CRAT_IOLINK_FLAGS_ENABLED;
> @@ -1203,18 +1204,24 @@ static void kfd_fill_iolink_non_crat_info(struct 
> kfd_topology_device *dev)
>   if (!dev || !dev->gpu)
>   return;
>  
> - pcie_capability_read_dword(dev->gpu->pdev,
> - PCI_EXP_DEVCAP2, );
> -
> - if (!(cap & (PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> -  PCI_EXP_DEVCAP2_ATOMIC_COMP64)))
> - cpu_flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
> - CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
> + adev = (struct amdgpu_device *)(dev->gpu->kgd);
> + if (!adev->gmc.xgmi.connected_to_cpu) {
> + pcie_capability_read_dword(dev->gpu->pdev,
> + PCI_EXP_DEVCAP2, );
> +
> + if (!(cap & (PCI_EXP_DEVCAP2_ATOMIC_COMP32 |
> +  PCI_EXP_DEVCAP2_ATOMIC_COMP64)))
> + cpu_flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
> + CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
> + }
>  
> - if (!dev->gpu->pci_atomic_requested ||
> - dev->gpu->device_info->asic_family == CHIP_HAWAII)
> - flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
> - CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
> + if (!adev->gmc.xgmi.num_physical_nodes) {
> + if (!dev->gpu->pci_atomic_requested ||
> + dev->gpu->device_info->asic_family ==
> + CHIP_HAWAII)
> + flag |= CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT |
> + CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT;
> + }
>  
>   /* GPU only creates direct links so apply flags setting to all */
>   list_for_each_entry(link, >io_link_props, list) {
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx