Hi Frederic-Emmanuel,

On 2024-09-27 05:51, Frederic-Emmanuel Picca wrote:
> I just installed rocm-opencl-icd and tried to run the clpeak program.
>
> But I get this error message:
>
> $ clpeak
> : CommandLine Error: Option 'sanitizer-early-opt-ep' registered more than once!
> LLVM ERROR: inconsistency in registered CommandLine options
> Abandon

I think this sort of error message occurs when two different versions of the clang libraries are loaded into the same process, but I'm not clear on what's pulling them in. IIRC, Kari had to mark mesa-opencl-icd as conflicting with the rocm-opencl-icd package due to a problem like this one. What surprised me is that merely having both packages installed on the system was enough to trigger the problem. I'm still not sure why that is.
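
One rough way to test the "two copies of the libraries in one process" hypothesis is to scan a memory-map listing in the /proc/<pid>/maps format and see whether more than one distinct LLVM or clang shared object shows up. The sketch below is only an illustration: the function name, the library-name pattern, and the sample paths are my assumptions, not anything taken from the ROCm or Mesa packaging.

```python
import re
from collections import defaultdict

# Assumed naming pattern for LLVM/clang shared objects
# (e.g. libLLVM-17.so.1, libclang-cpp.so.17). Adjust as needed.
_LLVM_LIB = re.compile(r"^(libLLVM|libclang(?:-cpp)?)\b")


def llvm_libs_in_maps(maps_text):
    """Group the distinct LLVM-like library paths found in a maps listing.

    maps_text is text in the /proc/<pid>/maps format. Returns a dict
    mapping a library family (e.g. "libLLVM") to a sorted list of paths.
    """
    found = defaultdict(set)
    for line in maps_text.splitlines():
        # Columns: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        path = fields[5].strip()
        name = path.rsplit("/", 1)[-1]
        match = _LLVM_LIB.match(name)
        if match:
            found[match.group(1)].add(path)
    return {family: sorted(paths) for family, paths in found.items()}


if __name__ == "__main__":
    # Hypothetical listing for illustration only; these are not real
    # paths from the affected system.
    sample = (
        "7f00-7f01 r-xp 00000000 08:01 123 /usr/lib/x86_64-linux-gnu/libLLVM-17.so.1\n"
        "7f02-7f03 r-xp 00000000 08:01 124 /usr/lib/x86_64-linux-gnu/libLLVM-15.so.1\n"
        "7f04-7f05 rw-p 00000000 00:00 0\n"
    )
    for family, paths in llvm_libs_in_maps(sample).items():
        status = "possible conflict" if len(paths) > 1 else "ok"
        print(f"{family}: {status}")
        for p in paths:
            print(f"  {p}")
```

If a family lists more than one path, that would be consistent with the duplicate option registration, although it would not by itself explain which package pulled in the second copy.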

In any case, it works on my machine. I cannot reproduce the bug, so it will require more investigation to determine why you're encountering this problem and I am not.

cgmb@scorbunny:~$ clpeak

Platform: AMD Accelerated Parallel Processing
  Device: gfx906:sramecc+:xnack-
    Driver version  : 3590.0 (HSA1.1,LC) (Linux x64)
    Compute units   : 60
    Clock frequency : 1801 MHz

    Global memory bandwidth (GBPS)
      float   : 767.18
      float2  : 804.05
      float4  : 785.04
      float8  : 779.17
      float16 : 596.06

    Single-precision compute (GFLOPS)
      float   : 13597.88
      float2  : 13077.49
      float4  : 12752.28
      float8  : 12658.15
      float16 : 12519.22

    Half-precision compute (GFLOPS)
      half   : 6663.12
      half2  : 24404.30
      half4  : 24140.63
      half8  : 23903.13
      half16 : 24237.96

    Double-precision compute (GFLOPS)
      double   : 3312.10
      double2  : 3221.78
      double4  : 3217.00
      double8  : 3176.58
      double16 : 3170.24

    Integer compute (GIOPS)
      int   : 4355.16
      int2  : 4351.48
      int4  : 4365.67
      int8  : 4352.32
      int16 : 4311.61

    Integer compute Fast 24bit (GIOPS)
      int   : 12316.54
      int2  : 11368.70
      int4  : 10943.03
      int8  : 10985.45
      int16 : 10581.34

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 16.92
      enqueueReadBuffer               : 16.69
      enqueueWriteBuffer non-blocking : 16.78
      enqueueReadBuffer non-blocking  : 16.64
      enqueueMapBuffer(for read)      : 244032.22
        memcpy from mapped ptr        : 16.61
      enqueueUnmap(after write)       : 357913.94
        memcpy to mapped ptr          : 16.49

    Kernel launch latency : -1493179648.00 us

Sincerely,
Cory Bloor
