On 28/07/2025 10:45, Tobias Burnus wrote:
Tiny cleanup patch - as fallout of trying to understand MI300 and
its fails better

(A) Replace 's_nop 0x0; s_nop 0x0; s_nop 0x0' by 's_nop 0x2'.

Advantage: fewer instructions - this helps on the hardware side by
permitting more follow-up instructions in the instruction cache and
it reduces the file size a bit. The latter is nice by itself but as
too many instructions will cause issues with debugging, it avoids some
assembler errors - admittedly only in near-edge cases, but still.
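
As a rough illustration of the merge in (A), here is a minimal Python sketch. It assumes that 's_nop N' stands in for N+1 back-to-back 's_nop 0x0', which is what the replacement above relies on, and that the operand range is 0x0 to 0xf as mentioned in the PS. The helper name merged_nops is hypothetical and not part of the patch.

```python
def merged_nops(wait_states, max_operand=0xF):
    """Return the s_nop lines needed to cover `wait_states` wait states.

    Assumption: 's_nop N' covers N + 1 wait states, so three
    's_nop 0x0' collapse into a single 's_nop 0x2'.
    """
    lines = []
    while wait_states > 0:
        # One instruction can cover at most max_operand + 1 wait states.
        chunk = min(wait_states, max_operand + 1)
        lines.append(f"s_nop {chunk - 1:#x}")
        wait_states -= chunk
    return lines


if __name__ == "__main__":
    print(merged_nops(3))
```

Under these assumptions, three wait states need one instruction instead of three, and anything beyond sixteen wait states spills into a second 's_nop'.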

The latter issue is https://gcc.gnu.org/PR119367 which is about
location views that are computed as, e.g., '.2byte .LM6-.LM5' and if
this overflows, there is an assembler error.
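
To make the overflow concrete, a small Python sketch follows. It assumes the label difference is a byte count that must fit in an unsigned 16-bit field, and that every 's_nop' occupies one 32-bit instruction word. The instruction count used below is invented for illustration and is not taken from the PR.

```python
S_NOP_BYTES = 4  # assumption: each s_nop is a single 32-bit instruction word


def fits_2byte(delta):
    """True if a label difference fits the assumed unsigned 16-bit field."""
    return 0 <= delta <= 0xFFFF


def nop_padding_bytes(n_insns, nops_per_insn):
    """Bytes of s_nop padding added between two labels (hypothetical model)."""
    return n_insns * nops_per_insn * S_NOP_BYTES


if __name__ == "__main__":
    n = 6000  # hypothetical number of padded instructions between two labels
    print(fits_2byte(nop_padding_bytes(n, 3)))  # three separate 's_nop 0x0'
    print(fits_2byte(nop_padding_bytes(n, 1)))  # one merged 's_nop 0x2'
```

In this model the padding alone shrinks to a third, which is why merging the nops moves the overflow further out without removing it entirely.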

[I hit this issue during debugging when adding at least two 's_nop'
between all instructions - doing so with at least two or (at least) five
will trigger this for libgomp's complex matmul, only. Workaround:
compile those files without '-g'.]

(B) Extend two comments:

Add a reminder (comment) that 'sc0' (scope 0) has a special
meaning for atomics (same as 'glc' in MI200). This is to
prevent messing around with TARGET_GLC_NAME by, e.g., replacing
'sc0' by 'sc1' - which has unintended effects for atomics ...

For template '%' formatting: avoid accidental reuse of letters
'V' and 'R' and make it quicker to understand them when encountered.
Granted, those appear as 'case' in the switch statement and
are documented there; still, it helps both with understanding a
'%R' more quickly and turns a compile-time failure for already
written code into handling it correctly while coding.

OK for mainline?

Tobias

PS: I checked that s_nop 0x0 to 0xf is documented also for old Vega
and for RDNA; additionally, I don't see bootstrap issues
due to llvm-mc complaining about the assembly - neither for a standard
build nor for a gfx942 build.

PPS: I still have to send a fixed version of the second 's_nop' patch,
addressing the review comments.

OK.

Combining the nops has been on the low-priority to-do list since forever. Thanks for cleaning this up.

Andrew
