On 26/01/2024 10:40, Richard Biener wrote:
The following avoids selecting an unsupported agent early, avoiding
later asserts when we rely on it being supported.

tested on x86_64-unknown-linux-gnu -> amdhsa-gcn on gfx1060

that's the alternative to the other patch.  I do indeed seem to get
the other (unsupported) agent selected somehow after the other supported
agent finished a kernel run.  Not sure if it's the CPU or the IGPU though.

OK?  Which variant?

So, looking at it again, the original intent of the assert was to alert toolchain developers that they missed adding a new name when porting to a new device, but I concur that it's not ideal when the assert encounters an unknown device in the wild.

However, if we're trying to do something more useful than merely fixing an ugly error message, maybe we should look at removing unsupported devices in "suitable_hsa_agent_p" instead? Unsupported GPUs wouldn't be assigned a device number at all.

Probably devices that are GPUs but skipped because they are unsupported should be mentioned on GOMP_DEBUG (as well as GCN_DEBUG)?

The goal should be that folks with your twin-GPU setup shouldn't have to work around it, but I don't really want to remove the message for people who only have one device but don't realize it is unsupported.

On the other hand, if a user has two devices that *are* supported, but the second one is preferred, they'll have to set OMP_DEFAULT_DEVICE explicitly, and is this so different?

As a user, WDYT?

Andrew


libgomp/
        * plugin/plugin-gcn.c (get_agent_info): When the agent isn't supported
        return NULL.
---
  libgomp/plugin/plugin-gcn.c | 7 +++++++
  1 file changed, 7 insertions(+)

diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index d8c3907c108..f453f630e06 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -1036,6 +1036,8 @@ print_kernel_dispatch (struct kernel_dispatch *dispatch, 
unsigned indent)
  /* }}}  */
  /* {{{ Utility functions  */
+static const char* isa_hsa_name (int isa);
+
  /* Cast the thread local storage to gcn_thread.  */
static inline struct gcn_thread *
@@ -1589,6 +1591,11 @@ get_agent_info (int n)
        GOMP_PLUGIN_error ("Attempt to use an uninitialized GCN agent.");
        return NULL;
      }
+  if (!isa_hsa_name (hsa_context.agents[n].device_isa))
+    {
+      GOMP_PLUGIN_error ("Attempt to use an unsupported GCN agent.");
+      return NULL;
+    }
    return &hsa_context.agents[n];
  }

Reply via email to