@@ -520,6 +520,106 @@ Every processor supports every OS ABI (see
:ref:`amdgpu-os`) with the following
=== === = =
=== === ==
+Generic processors also exist. They group multiple processors into one,
+allowing code to be built once and run on multiple targets, at the cost
+of fewer features being available.
+
+Generic processors are only available on Code Object V6 and up.
+
+  .. table:: AMDGPU Generic Processors
+     :name: amdgpu-generic-processor-table
+
+     ==================== ============== ================= ====================================
+     Processor            Target         Supported         Target Features
+                          Triple         Processors        Restrictions
+                          Architecture
+     ==================== ============== ================= ====================================
+     ``gfx9-generic``     ``amdgcn``     - ``gfx900``      - ``v_mad_mix`` instructions
+                                         - ``gfx902``        are not available on
+                                         - ``gfx904``        ``gfx900``, ``gfx902``,
+                                         - ``gfx906``        ``gfx909``, ``gfx90c``
+                                         - ``gfx909``      - ``v_fma_mix`` instructions
+                                         - ``gfx90c``        are not available on ``gfx904``
+                                                           - sramecc is not available on
+                                                             ``gfx906``
+                                                           - The following instructions
+                                                             are not available on ``gfx906``:
+
+                                                             - ``v_fmac_f32``
+                                                             - ``v_xnor_b32``
+                                                             - ``v_dot4_i32_i8``
+                                                             - ``v_dot8_i32_i4``
+                                                             - ``v_dot2_i32_i16``
+                                                             - ``v_dot2_u32_u16``
+                                                             - ``v_dot4_u32_u8``
+                                                             - ``v_dot8_u32_u4``
+                                                             - ``v_dot2_f32_f16``
+
+     ``gfx10.1-generic``  ``amdgcn``     - ``gfx1010``     - The following instructions are
+                                         - ``gfx1011``       not available on ``gfx1011``
+                                         - ``gfx1012``       and ``gfx1012``
+                                         - ``gfx1013``
+                                                             - ``v_dot4_i32_i8``
+                                                             - ``v_dot8_i32_i4``
+                                                             - ``v_dot2_i32_i16``
+                                                             - ``v_dot2_u32_u16``
+                                                             - ``v_dot2c_f32_f16``
+                                                             - ``v_dot4c_i32_i8``
+                                                             - ``v_dot4_u32_u8``
+                                                             - ``v_dot8_u32_u4``
+                                                             - ``v_dot2_f32_f16``
+
+                                                           - BVH Ray Tracing instructions
+                                                             are not available on
+                                                             ``gfx1013``
+
+     ``gfx10.3-generic``  ``amdgcn``     - ``gfx1030``     No restrictions.
+                                         - ``gfx1031``
+                                         - ``gfx1032``
+                                         - ``gfx1033``
+                                         - ``gfx1034``
+                                         - ``gfx1035``
+                                         - ``gfx1036``
+
+     ``gfx11-generic``    ``amdgcn``     - ``gfx1100``     Various codegen pessimizations
+                                         - ``gfx1101``     are applied to all targets to
+                                         - ``gfx1102``     work around hardware bugs on one
t-tye wrote:
I do not think we should be stating hardware bugs exist in public
documentation. We can simply say less efficient code sequences are generated in
various cases. Not sure we should list them.
Do we use msaa-load-dst-sel-bug, valu-trans-use-hazard, user-sgpr-init16-bug
elsewhere in the code? Not sure we