[lld] [flang] [llvm] [clang] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-01-25 Thread Konstantin Zhuravlyov via cfe-commits


@@ -49,6 +49,11 @@ constexpr uint32_t VersionMajorV5 = 1;
 /// HSA metadata minor version for code object V5.
 constexpr uint32_t VersionMinorV5 = 2;
 
+/// HSA metadata major version for code object V6.
+constexpr uint32_t VersionMajorV6 = 1;
+/// HSA metadata minor version for code object V6.
+constexpr uint32_t VersionMinorV6 = 3;

kzhuravl wrote:

@AlexVlx, this "HSA Metadata" is AMD-specific and is not part of the HSA
standards. The comment should probably be updated to say so.

I'd also prefer not to bump the metadata version unless the metadata itself
changes.
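
A minimal sketch of what both suggestions could look like together, assuming
the V6 metadata format is identical to V5. The comment wording and the aliasing
of the V6 constants to the V5 ones are hypothetical illustrations, not what the
PR contains:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch only: the comment wording and the V6-aliases-V5 choice
// are assumptions made for illustration, not the text of the PR.

/// AMD-specific HSA metadata major version for code object V5
/// (not part of the HSA standards).
constexpr uint32_t VersionMajorV5 = 1;
/// AMD-specific HSA metadata minor version for code object V5.
constexpr uint32_t VersionMinorV5 = 2;

/// Code object V6 reuses the V5 metadata version for as long as the
/// metadata format itself is unchanged.
constexpr uint32_t VersionMajorV6 = VersionMajorV5;
constexpr uint32_t VersionMinorV6 = VersionMinorV5;

// Compile-time check that V6 has not silently diverged from V5.
static_assert(VersionMajorV6 == VersionMajorV5 &&
                  VersionMinorV6 == VersionMinorV5,
              "bump the V6 metadata version only when the metadata changes");
```

Aliasing the constants keeps the "no change, no bump" rule visible in the code
rather than only in review history.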

https://github.com/llvm/llvm-project/pull/76955
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[lld] [flang] [llvm] [clang] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-01-17 Thread Tony Tye via cfe-commits


@@ -520,6 +520,106 @@ Every processor supports every OS ABI (see :ref:`amdgpu-os`) with the following
 
  === ===  = = === === ==
 
+Generic processors also exist. They group multiple processors into one,
+allowing to build code once and run it on multiple targets at the cost
+of less features being available.
+
+Generic processors are only available on Code Object V6 and up.
+
+  .. table:: AMDGPU Generic Processors
+     :name: amdgpu-generic-processor-table
+
+     ==================== ============== ================= ==================================
+     Processor            Target         Supported         Target Features
+                          Triple         Processors        Restrictions
+                          Architecture
+
+     ==================== ============== ================= ==================================
+     ``gfx9-generic``     ``amdgcn``     - ``gfx900``      - ``v_mad_mix`` instructions
+                                         - ``gfx902``        are not available on
+                                         - ``gfx904``        ``gfx900``, ``gfx902``,
+                                         - ``gfx906``        ``gfx909``, ``gfx90c``
+                                         - ``gfx909``      - ``v_fma_mix`` instructions
+                                         - ``gfx90c``        are not available on ``gfx904``
+                                                           - sramecc is not available on
+                                                             ``gfx906``
+                                                           - The following instructions
+                                                             are not available on ``gfx906``:
+
+                                                             - ``v_fmac_f32``
+                                                             - ``v_xnor_b32``
+                                                             - ``v_dot4_i32_i8``
+                                                             - ``v_dot8_i32_i4``
+                                                             - ``v_dot2_i32_i16``
+                                                             - ``v_dot2_u32_u16``
+                                                             - ``v_dot4_u32_u8``
+                                                             - ``v_dot8_u32_u4``
+                                                             - ``v_dot2_f32_f16``
+
+
+     ``gfx10.1-generic``  ``amdgcn``     - ``gfx1010``     - The following instructions are
+                                         - ``gfx1011``       not available on ``gfx1011``
+                                         - ``gfx1012``       and ``gfx1012``
+                                         - ``gfx1013``
+                                                             - ``v_dot4_i32_i8``
+                                                             - ``v_dot8_i32_i4``
+                                                             - ``v_dot2_i32_i16``
+                                                             - ``v_dot2_u32_u16``
+                                                             - ``v_dot2c_f32_f16``
+                                                             - ``v_dot4c_i32_i8``
+                                                             - ``v_dot4_u32_u8``
+                                                             - ``v_dot8_u32_u4``
+                                                             - ``v_dot2_f32_f16``
+
+                                                           - BVH Ray Tracing instructions
+                                                             are not available on
+                                                             ``gfx1013``
+
+
+     ``gfx10.3-generic``  ``amdgcn``     - ``gfx1030``     No restrictions.
+                                         - ``gfx1031``
+                                         - ``gfx1032``
+                                         - ``gfx1033``
+                                         - ``gfx1034``
+                                         - ``gfx1035``
+                                         - ``gfx1036``
+
+
+     ``gfx11-generic``    ``amdgcn``     - ``gfx1100``     Various codegen pessimizations
+                                         - ``gfx1101``     are applied to all targets to
+                                         - ``gfx1102``     work around hardware bugs on one

t-tye wrote:

I do not think we should state in public documentation that hardware bugs
exist. We can simply say that less efficient code sequences are generated in
various cases. I am not sure we should list them.

Do we use msaa-load-dst-sel-bug, valu-trans-use-hazard, user-sgpr-init16-bug 
elsewhere in the code? Not sure we 
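
The grouping in the quoted table can be sketched as a simple lookup from a
concrete processor to the generic target that covers it. This is a hypothetical
illustration, not LLVM code: `GenericTarget`, `genericTargets`, and
`genericTargetFor` are invented names, and the `gfx11-generic` entry lists only
the processors visible in the quoted hunk, which is cut off at that row.

```cpp
#include <algorithm>
#include <cassert>
#include <string_view>
#include <vector>

// Hypothetical illustration only; none of these names are LLVM APIs.
struct GenericTarget {
  std::string_view Name;
  std::vector<std::string_view> Processors;
};

// Processor lists copied from the quoted table. The gfx11-generic row is
// truncated in the quoted hunk, so only the visible entries appear here.
inline const std::vector<GenericTarget> &genericTargets() {
  static const std::vector<GenericTarget> Targets = {
      {"gfx9-generic",
       {"gfx900", "gfx902", "gfx904", "gfx906", "gfx909", "gfx90c"}},
      {"gfx10.1-generic", {"gfx1010", "gfx1011", "gfx1012", "gfx1013"}},
      {"gfx10.3-generic",
       {"gfx1030", "gfx1031", "gfx1032", "gfx1033", "gfx1034", "gfx1035",
        "gfx1036"}},
      {"gfx11-generic", {"gfx1100", "gfx1101", "gfx1102"}},
  };
  return Targets;
}

// Returns the name of the generic target that lists Processor, or an empty
// string_view when no generic target in the table covers it.
inline std::string_view genericTargetFor(std::string_view Processor) {
  for (const GenericTarget &T : genericTargets()) {
    if (std::find(T.Processors.begin(), T.Processors.end(), Processor) !=
        T.Processors.end())
      return T.Name;
  }
  return {};
}
```

For example, `genericTargetFor("gfx906")` yields `gfx9-generic`, while a
processor absent from the table, such as `gfx90a`, yields an empty result.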

[lld] [flang] [llvm] [clang] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-01-10 Thread Pierre van Houtryve via cfe-commits

https://github.com/Pierre-vh edited https://github.com/llvm/llvm-project/pull/76955