> On 28 Jul 2025, at 17:10, Remi Machet <rmac...@nvidia.com> wrote: > > > On 7/28/25 17:02, Kyrylo Tkachov wrote: >> External email: Use caution opening links or attachments >> >> >> Hi Spencer, >> >>> On 28 Jul 2025, at 16:25, Spencer Abson <spencer.ab...@arm.com> wrote: >>> >>> Streaming-compatible functions can be compiled without SME enabled, but need >>> to use "SMSTART SM" and "SMSTOP SM" to temporarily switch into the streaming >>> mode of a callee. These switches are conditional on the current mode being >>> opposite to the target mode, so no SME instructions are executed if SME is >>> not >>> available. >>> >>> However, in GAS, "SMSTART SM" and "SMSTOP SM" always require +sme. A call >>> from a streaming-compatible function, compiled without SME enabled, to a non >>> -streaming function will be rejected as: >> I guess this is an argument for allowing SMTART/SMSTOP in gas >> unconditionally? >> Though it would need to be consistent with LLVM and we’d still need a >> workaround for older binutils, so okay... >> >> >>> Error: selected processor does not support `smstop sm'.. >>> >>> To work around this, we make use of the .inst directive to insert the >>> literal >>> encodings of "SMSTART SM" and "SMSTOP SM". >>> >>> gcc/ChangeLog: >>> PR target/121028 >>> * config/aarch64/aarch64-sme.md (aarch64_smstart_sm): Use the .inst >>> directive if !TARGET_SME. >>> (aarch64_smstop_sm): Likewise. >>> >>> gcc/testsuite/ChangeLog: >>> PR target/121028 >>> * gcc.target/aarch64/sme/call_sm_switch_1.c: Tell check-function >>> -bodies not to ignore .inst directives, and replace the test for >>> "smstart sm" with one for it's encoding. >>> * gcc.target/aarch64/sme/call_sm_switch_11.c: Likewise. >>> * gcc.target/aarch64/sme/pr121028.c: New test. >>> --- >>> gcc/config/aarch64/aarch64-sme.md | 12 ++++- >>> .../gcc.target/aarch64/sme/call_sm_switch_1.c | 4 +- >>> .../aarch64/sme/call_sm_switch_11.c | 5 +- >>> .../gcc.target/aarch64/sme/pr121028.c | 46 +++++++++++++++++++ >>> 4 files changed, 61 insertions(+), 6 deletions(-) >>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/pr121028.c >>> >>> diff --git a/gcc/config/aarch64/aarch64-sme.md >>> b/gcc/config/aarch64/aarch64-sme.md >>> index 6b3f4390943..5264bf75c0b 100644 >>> --- a/gcc/config/aarch64/aarch64-sme.md >>> +++ b/gcc/config/aarch64/aarch64-sme.md >>> @@ -62,6 +62,10 @@ >>> ;; (b) they are sometimes used conditionally, particularly in streaming- >>> ;; compatible code. >>> ;; >>> +;; To prevent the latter from upsetting the assembler, we emit the literal >>> +;; encodings of "SMSTART SM" and "SMSTOP SM" when compiling without >>> +;; TARGET_SME. >>> +;; >>> ;; ========================================================================= >>> >>> ;; ------------------------------------------------------------------------- >>> @@ -161,7 +165,9 @@ >>> (clobber (reg:VNx16BI P14_REGNUM)) >>> (clobber (reg:VNx16BI P15_REGNUM))] >>> "" >>> - "smstart\tsm" >>> + { >>> + return TARGET_SME ? "smstart\tsm" : ".inst 0xd503437f"; >>> + } >> Usually when we emit such .inst directives we also emit the mnemonic as an >> assembly comment to aid human assembly debugging. >> Ok by me with these changes, but maybe Richard is best-placed to comment on >> the SME parts. >> Thanks, >> Kyrill >> >> >>> ) >>> >>> ;; Turn off streaming mode. This clobbers all SVE state. >>> @@ -196,7 +202,9 @@ >>> (clobber (reg:VNx16BI P14_REGNUM)) >>> (clobber (reg:VNx16BI P15_REGNUM))] >>> "" >>> - "smstop\tsm" >>> + { >>> + return TARGET_SME ? "smstop\tsm" : ".inst 0xd503427f"; >>> + } >>> ) >>> >>> ;; ------------------------------------------------------------------------- >>> diff --git a/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_1.c >>> b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_1.c >>> index 98922aaeae0..7d514bd46c1 100644 >>> --- a/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_1.c >>> +++ b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_1.c >>> @@ -1,5 +1,5 @@ >>> // { dg-options "-O -fomit-frame-pointer -fno-optimize-sibling-calls >>> -funwind-tables" } >>> -// { dg-final { check-function-bodies "**" "" } } >>> +// { dg-final { check-function-bodies "**" "" "" { target "*-*-*" } >>> {\t\.inst} } } >>> >>> void ns_callee (); >>> void s_callee () [[arm::streaming]]; >>> @@ -218,7 +218,7 @@ sc_caller_x1 (int *ptr, int a) >>> [[arm::streaming_compatible]] >>> ** bl ns_callee_stack >>> ** ldr x16, \[x29, #?16\] >>> ** tbz x16, 0, .* >>> -** smstart sm >>> +** .inst 0xd503437f >>> ** ... >>> */ >>> void >>> diff --git a/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_11.c >>> b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_11.c >>> index ee6f98737d9..7f1eab716eb 100644 >>> --- a/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_11.c >>> +++ b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_11.c >>> @@ -1,5 +1,6 @@ >>> // { dg-options "-O -fomit-frame-pointer -fno-optimize-sibling-calls >>> -funwind-tables -mtrack-speculation" } >>> -// { dg-final { check-function-bodies "**" "" } } >>> +// { dg-final { check-function-bodies "**" "" "" { target "*-*-*" } >>> {\t\.inst} } } >>> + >>> >>> void ns_callee (); >>> void s_callee () [[arm::streaming]]; >>> @@ -196,7 +197,7 @@ sc_caller_x1 (int *ptr, int a) >>> [[arm::streaming_compatible]] >>> ** tst x16, #?1 >>> ** beq [^\n]* >>> ** csel x15, x15, xzr, ne >>> -** smstart sm >>> +** .inst 0xd503437f >>> ** ... >>> */ >>> void >>> diff --git a/gcc/testsuite/gcc.target/aarch64/sme/pr121028.c >>> b/gcc/testsuite/gcc.target/aarch64/sme/pr121028.c >>> new file mode 100644 >>> index 00000000000..fa36d4ec170 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.target/aarch64/sme/pr121028.c >>> @@ -0,0 +1,46 @@ >>> +// PR121028 >>> +// { dg-do assemble { target aarch64_asm_sme_ok } } >>> +// { dg-options "-O --save-temps" } >>> +// { dg-final { check-function-bodies "**" "" "" { target "*-*-*" } >>> {\t\.inst} } } >>> + >>> +void ns_callee (); >>> + >>> +/* >>> +** sc_caller_sme: >>> +** ... >>> +** mrs x16, svcr >>> +** str x16, \[x29, #?16\] >>> +** ldr x16, \[x29, #?16\] >>> +** tbz x16, 0, .* >>> +** smstop sm >>> +** bl ns_callee >>> +** ldr x16, \[x29, #?16\] >>> +** tbz x16, 0, .* >>> +** smstart sm >>> +** ... >>> +*/ >>> +void sc_caller_sme() __arm_streaming_compatible >>> +{ >>> + ns_callee (); >>> +} >>> + >>> +#pragma GCC target "+nosme" >>> + >>> +/* >>> +** sc_caller_nosme: >>> +** ... >>> +** bl __arm_sme_state >>> +** str x0, \[x29, #?16\] >>> +** ldr x16, \[x29, #?16\] >>> +** tbz x16, 0, .* >>> +** .inst 0xd503427f >>> +** bl ns_callee >>> +** ldr x16, \[x29, #?16\] >>> +** tbz x16, 0, .* >>> +** .inst 0xd503437f >>> +** ... >>> +*/ >>> +void sc_caller_nosme() __arm_streaming_compatible >>> +{ >>> + ns_callee (); >>> +} >>> -- >>> 2.34.1 >>> > Hi Spencer, > > I don't see a signed-off-by, did I miss it?
AFAIK Spencer and other @arm.com people contribute under a copyright assignment to the FSF scheme, which doesn’t require a signed-off-by tag. The tag is only needed for people who want to contribute their code under DCO: https://gcc.gnu.org/dco.html Thanks, Kyrill > > Remi >