================
@@ -1284,7 +1284,29 @@ The AMDGPU backend implements the following LLVM IR intrinsics.
    | ``// 5 MFMA``
    | ``__builtin_amdgcn_sched_group_barrier(8, 5, 0)``

-  llvm.amdgcn.iglp_opt    An **experimental** intrinsic for instruction group level parallelism. The intrinsic
+  llvm.amdgcn.sched.group.barrier.rule    It has the same behavior as sched.group.barrier, except the intrinsic includes a fourth argument:
+
+    - RuleMask : The bitmask of rules which are applied to the SchedGroup.
+
+    The RuleMask is handled as a 64 bit integer, so 64 rules are encodable with a single mask.
+
+    Users can access the intrinsic by specifying the optional fourth argument in sched_group_barrier builtin
+
+    | ``// 1 VMEM read invoking rules 1 and 2``
+    | ``__builtin_amdgcn_sched_group_barrier(32, 1, 0, 3)``
+
+    Currently available rules are:
+    - 0x0000: No rule.
+    - 0x0001: Instructions in the SchedGroup must not write to the same register
+      that a previously occuring V_CNDMASK_B32_e64 reads from.
+    - 0x0002: Instructions in the SchedGroup must not write to the same register
+      that a previously occuring V_PERM_B32_e64 reads from.
+    - 0x0004: Instructions in the SchedGroup must require data produced by a
+      V_CNDMASK_B32_e64.
+    - 0x0008: Instructions in the SchedGroup must require data produced by a
+      V_PERM_B32_e64.
+
----------------
arsenm wrote:

These scheduling rules seem way too specific, especially since they single out specific instruction encodings by their internal pseudoinstruction names.

https://github.com/llvm/llvm-project/pull/85304
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits