https://github.com/vossjannik created https://github.com/llvm/llvm-project/pull/188463
Without constrained FP semantics, the optimizer may speculatively execute FP
operations past guard branches (via if-conversion / branch folding), setting
sticky FPSCR bits even when the guarding condition is false. Users who read
FPSCR sticky exception bits can now pass -ffp-exception-behavior=maytrap on
ARM32 to prevent this.

Two coordinated changes:

1. Set HasStrictFP = true in ARMTargetInfo (ARM.cpp) so the frontend accepts
   -ffp-exception-behavior on ARM32, matching AArch64/x86/etc.
2. Mark STRICT_FADD/FSUB/FMUL/FDIV/FSQRT/FMA as Expand in ARMISelLowering so
   mutateStrictFPToFP converts them to normal VFP ops while preserving the
   chain ordering that prevents speculation.

No default behavior is changed.

Update pragma-fp-warn.c (thumbv7 now supports strict FP pragmas) and add two
new regression tests: a cc1 + IR test and a backend CodeGen test.

>From 33a11fe8e81583fa44277e622fe9e0481ac6469a Mon Sep 17 00:00:00 2001
From: Jannik Voss <[email protected]>
Date: Wed, 4 Mar 2026 09:34:44 +0100
Subject: [PATCH] [ARM] Enable -ffp-exception-behavior=maytrap support on ARM32

Without constrained FP semantics, the optimizer may speculatively execute FP
operations past guard branches (via if-conversion / branch folding), setting
sticky FPSCR bits even when the guarding condition is false. Users who read
FPSCR sticky exception bits can now pass -ffp-exception-behavior=maytrap on
ARM32 to prevent this.

Two coordinated changes:

1. Set HasStrictFP = true in ARMTargetInfo (ARM.cpp) so the frontend accepts
   -ffp-exception-behavior on ARM32, matching AArch64/x86/etc.
2. Mark STRICT_FADD/FSUB/FMUL/FDIV/FSQRT/FMA as Expand in ARMISelLowering so
   mutateStrictFPToFP converts them to normal VFP ops while preserving the
   chain ordering that prevents speculation.

No default behavior is changed.

Update pragma-fp-warn.c (thumbv7 now supports strict FP pragmas) and add two
new regression tests: a cc1 + IR test and a backend CodeGen test.
---
 clang/lib/Basic/Targets/ARM.cpp               |  7 ++
 clang/test/CodeGen/arm-fp-exception-default.c | 20 ++++
 clang/test/CodeGen/arm-fp-maytrap-ifconv.c    | 41 ++++++++
 clang/test/Parser/pragma-fp-warn.c            |  2 +-
 llvm/lib/Target/ARM/ARMISelLowering.cpp       | 32 +++++++
 llvm/test/CodeGen/ARM/fp-maytrap-default.ll   | 96 +++++++++++++++
 6 files changed, 197 insertions(+), 1 deletion(-)
 create mode 100644 clang/test/CodeGen/arm-fp-exception-default.c
 create mode 100644 clang/test/CodeGen/arm-fp-maytrap-ifconv.c
 create mode 100644 llvm/test/CodeGen/ARM/fp-maytrap-default.ll

diff --git a/clang/lib/Basic/Targets/ARM.cpp b/clang/lib/Basic/Targets/ARM.cpp
index f21e9ebbc903a..e8666a580ad6c 100644
--- a/clang/lib/Basic/Targets/ARM.cpp
+++ b/clang/lib/Basic/Targets/ARM.cpp
@@ -330,6 +330,13 @@ ARMTargetInfo::ARMTargetInfo(const llvm::Triple &Triple,
                     : "\01mcount";
   SoftFloatABI = llvm::is_contained(Opts.FeaturesAsWritten, "+soft-float-abi");
+
+  // Enable strict floating-point support so that -ffp-exception-behavior=maytrap
+  // is honored. The ARM backend handles STRICT_F* nodes via the
+  // mutateStrictFPToFP expansion path, which converts them to normal FP ops at
+  // instruction selection time while preserving the ordering side-effects that
+  // prevent speculative execution of FP operations past branches.
+  HasStrictFP = true;
 }
 
 StringRef ARMTargetInfo::getABI() const { return ABI; }
diff --git a/clang/test/CodeGen/arm-fp-exception-default.c b/clang/test/CodeGen/arm-fp-exception-default.c
new file mode 100644
index 0000000000000..9b1fbb2da4d7d
--- /dev/null
+++ b/clang/test/CodeGen/arm-fp-exception-default.c
@@ -0,0 +1,20 @@
+// REQUIRES: arm-registered-target
+//
+// Test that -ffp-exception-behavior=maytrap is accepted on ARM32 and
+// produces constrained FP intrinsics. HasStrictFP is set to true in
+// ARMTargetInfo so the frontend no longer rejects this flag on ARM.
+// The STRICT_F* nodes are handled via the mutateStrictFPToFP expansion path
+// in the backend (see fp-maytrap-default.ll for the codegen half).
+
+// RUN: %clang_cc1 -triple armv7a-none-eabi -target-cpu cortex-a9 \
+// RUN:   -ffp-exception-behavior=maytrap \
+// RUN:   -disable-O0-optnone -emit-llvm %s -o - \
+// RUN:   | FileCheck -check-prefix=IR %s
+
+float guarded_div(float a, float b) {
+  // IR-LABEL: define {{.*}} @guarded_div(
+  // IR: call float @llvm.experimental.constrained.fdiv.f32(
+  // IR-SAME: metadata !"fpexcept.maytrap"
+  // IR: attributes {{.*}} = { {{.*}}strictfp{{.*}} }
+  return a / b;
+}
diff --git a/clang/test/CodeGen/arm-fp-maytrap-ifconv.c b/clang/test/CodeGen/arm-fp-maytrap-ifconv.c
new file mode 100644
index 0000000000000..91a1f8c97e1d9
--- /dev/null
+++ b/clang/test/CodeGen/arm-fp-maytrap-ifconv.c
@@ -0,0 +1,41 @@
+// REQUIRES: arm-registered-target
+//
+// End-to-end test: compile C to ARM assembly and verify that
+// -ffp-exception-behavior=maytrap prevents if-conversion of FP operations,
+// while the default (no flag) still allows if-conversion as before.
+//
+// The function "pick" has cheap FP ops in both branches, which is the classic
+// pattern that triggers ARM if-conversion into a branchless predicated block.
+//
+// With maytrap: FP ops are constrained (have side-effects on FPSCR), so the
+// optimizer must preserve the branch. Only the taken path executes its FP op.
+//
+// Without maytrap: the optimizer if-converts both paths into a single block
+// with conditional moves, speculatively executing all FP ops.
+
+// RUN: %clang -target armv7a-none-eabi -mcpu=cortex-a9 -mfloat-abi=hard -O2 \
+// RUN:   -ffp-exception-behavior=maytrap -S -o - %s \
+// RUN:   | FileCheck -check-prefix=MAYTRAP %s
+
+// RUN: %clang -target armv7a-none-eabi -mcpu=cortex-a9 -mfloat-abi=hard -O2 \
+// RUN:   -S -o - %s \
+// RUN:   | FileCheck -check-prefix=DEFAULT %s
+
+// --- maytrap: branch preserved, FP ops in separate basic blocks ---
+// MAYTRAP-LABEL: pick:
+// MAYTRAP: beq
+// MAYTRAP: vadd.f32
+// MAYTRAP: vsub.f32
+
+// --- default: if-converted, no branch, branchless predicated block ---
+// DEFAULT-LABEL: pick:
+// DEFAULT-NOT: beq
+// DEFAULT: vadd.f32
+// DEFAULT-NOT: vsub.f32
+
+float pick(int flag, float a, float b) {
+  if (flag)
+    return a + b;
+  else
+    return a - b;
+}
diff --git a/clang/test/Parser/pragma-fp-warn.c b/clang/test/Parser/pragma-fp-warn.c
index c52bd4e4805ab..ad16510573059 100644
--- a/clang/test/Parser/pragma-fp-warn.c
+++ b/clang/test/Parser/pragma-fp-warn.c
@@ -1,6 +1,6 @@
 // RUN: %clang_cc1 -triple wasm32 -fsyntax-only -Wno-unknown-pragmas -Wignored-pragmas -verify %s
-// RUN: %clang_cc1 -triple thumbv7 -fsyntax-only -Wno-unknown-pragmas -Wignored-pragmas -verify %s
+// RUN: %clang_cc1 -DEXPOK -triple thumbv7 -fsyntax-only -Wno-unknown-pragmas -Wignored-pragmas -verify %s
 // RUN: %clang_cc1 -DEXPOK -triple aarch64 -fsyntax-only -Wno-unknown-pragmas -Wignored-pragmas -verify %s
 // RUN: %clang_cc1 -DEXPOK -triple x86_64 -fsyntax-only -Wno-unknown-pragmas -Wignored-pragmas -verify %s
 // RUN: %clang_cc1 -DEXPOK -triple systemz -fsyntax-only -Wno-unknown-pragmas -Wignored-pragmas -verify %s
diff --git a/llvm/lib/Target/ARM/ARMISelLowering.cpp b/llvm/lib/Target/ARM/ARMISelLowering.cpp
index 4fd845fbc07ac..2381d39c58869 100644
--- a/llvm/lib/Target/ARM/ARMISelLowering.cpp
+++ b/llvm/lib/Target/ARM/ARMISelLowering.cpp
@@ -851,6 +851,14 @@ ARMTargetLowering::ARMTargetLowering(const TargetMachine &TM_,
     setOperationAction(ISD::FNEG, MVT::f64, Expand);
     setOperationAction(ISD::FABS, MVT::f64, Expand);
     setOperationAction(ISD::FSQRT, MVT::f64, Expand);
+    // Strict FP counterparts must also be Expand so the legalizer converts
+    // them to library calls (via ConvertNodeToLibcall).
+    setOperationAction(ISD::STRICT_FADD, MVT::f64, Expand);
+    setOperationAction(ISD::STRICT_FSUB, MVT::f64, Expand);
+    setOperationAction(ISD::STRICT_FMUL, MVT::f64, Expand);
+    setOperationAction(ISD::STRICT_FMA, MVT::f64, Expand);
+    setOperationAction(ISD::STRICT_FDIV, MVT::f64, Expand);
+    setOperationAction(ISD::STRICT_FSQRT, MVT::f64, Expand);
     setOperationAction(ISD::FSIN, MVT::f64, Expand);
     setOperationAction(ISD::FCOS, MVT::f64, Expand);
     setOperationAction(ISD::FPOW, MVT::f64, Expand);
@@ -1229,6 +1237,8 @@ ARMTargetLowering::ARMTargetLowering(const TargetMachine &TM_,
   if (!Subtarget->hasVFP4Base()) {
     setOperationAction(ISD::FMA, MVT::f64, Expand);
     setOperationAction(ISD::FMA, MVT::f32, Expand);
+    setOperationAction(ISD::STRICT_FMA, MVT::f64, Expand);
+    setOperationAction(ISD::STRICT_FMA, MVT::f32, Expand);
   }
 
   // Various VFP goodness
@@ -1256,6 +1266,28 @@ ARMTargetLowering::ARMTargetLowering(const TargetMachine &TM_,
     setOperationAction(ISD::STRICT_FSETCCS, MVT::f32, Custom);
     setOperationAction(ISD::STRICT_FSETCC, MVT::f64, Custom);
     setOperationAction(ISD::STRICT_FSETCCS, MVT::f64, Custom);
+
+    // Mark strict FP arithmetic as Expand so that mutateStrictFPToFP (in
+    // SelectionDAGISel) converts them to normal FP opcodes before instruction
+    // selection. This preserves the chain side-effects that prevent the
+    // optimizer from speculating FP operations past branches, while still
+    // using the existing VFP/NEON instruction patterns for codegen.
+    // Note: we do NOT set IsStrictFPEnabled, so the mutation path fires.
+    for (auto Op : {ISD::STRICT_FADD, ISD::STRICT_FSUB, ISD::STRICT_FMUL,
+                    ISD::STRICT_FDIV, ISD::STRICT_FSQRT}) {
+      setOperationAction(Op, MVT::f32, Expand);
+      if (Subtarget->hasFP64())
+        setOperationAction(Op, MVT::f64, Expand);
+      if (Subtarget->hasFullFP16())
+        setOperationAction(Op, MVT::f16, Expand);
+    }
+    if (Subtarget->hasVFP4Base()) {
+      setOperationAction(ISD::STRICT_FMA, MVT::f32, Expand);
+      if (Subtarget->hasFP64())
+        setOperationAction(ISD::STRICT_FMA, MVT::f64, Expand);
+      if (Subtarget->hasFullFP16())
+        setOperationAction(ISD::STRICT_FMA, MVT::f16, Expand);
+    }
   }
 
   setOperationAction(ISD::FSINCOS, MVT::f64, Expand);
diff --git a/llvm/test/CodeGen/ARM/fp-maytrap-default.ll b/llvm/test/CodeGen/ARM/fp-maytrap-default.ll
new file mode 100644
index 0000000000000..62c46ec0e5f0f
--- /dev/null
+++ b/llvm/test/CodeGen/ARM/fp-maytrap-default.ll
@@ -0,0 +1,96 @@
+; RUN: llc -mtriple=armv7a-none-eabi -mattr=vfp4 %s -o - | FileCheck %s
+; RUN: llc -mtriple=thumbv7m-none-eabi -mattr=vfp4 %s -o - | FileCheck %s
+
+; Test that STRICT_F* nodes (produced by constrained FP intrinsics with
+; fpexcept.maytrap) lower to native VFP instructions via the
+; mutateStrictFPToFP path in SelectionDAGISel. The STRICT_F* operations
+; have ISD action = Expand in ARMISelLowering, which triggers the mutation
+; to non-strict FP nodes that then match normal VFP patterns.
+;
+; If a future upstream change breaks the mutation path (e.g. by adding
+; explicit STRICT_F* patterns or changing the Expand fall-through logic),
+; these CHECK lines will catch it.
+
+; CHECK-LABEL: test_fadd:
+; CHECK: vadd.f32
+define float @test_fadd(float %a, float %b) #0 {
+  %r = call float @llvm.experimental.constrained.fadd.f32(
+      float %a, float %b,
+      metadata !"round.tonearest", metadata !"fpexcept.maytrap") #0
+  ret float %r
+}
+
+; CHECK-LABEL: test_fsub:
+; CHECK: vsub.f32
+define float @test_fsub(float %a, float %b) #0 {
+  %r = call float @llvm.experimental.constrained.fsub.f32(
+      float %a, float %b,
+      metadata !"round.tonearest", metadata !"fpexcept.maytrap") #0
+  ret float %r
+}
+
+; CHECK-LABEL: test_fmul:
+; CHECK: vmul.f32
+define float @test_fmul(float %a, float %b) #0 {
+  %r = call float @llvm.experimental.constrained.fmul.f32(
+      float %a, float %b,
+      metadata !"round.tonearest", metadata !"fpexcept.maytrap") #0
+  ret float %r
+}
+
+; CHECK-LABEL: test_fdiv:
+; CHECK: vdiv.f32
+define float @test_fdiv(float %a, float %b) #0 {
+  %r = call float @llvm.experimental.constrained.fdiv.f32(
+      float %a, float %b,
+      metadata !"round.tonearest", metadata !"fpexcept.maytrap") #0
+  ret float %r
+}
+
+; CHECK-LABEL: test_fsqrt:
+; CHECK: vsqrt.f32
+define float @test_fsqrt(float %a) #0 {
+  %r = call float @llvm.experimental.constrained.sqrt.f32(
+      float %a,
+      metadata !"round.tonearest", metadata !"fpexcept.maytrap") #0
+  ret float %r
+}
+
+; CHECK-LABEL: test_fma:
+; CHECK: vfma.f32
+define float @test_fma(float %a, float %b, float %c) #0 {
+  %r = call float @llvm.experimental.constrained.fma.f32(
+      float %a, float %b, float %c,
+      metadata !"round.tonearest", metadata !"fpexcept.maytrap") #0
+  ret float %r
+}
+
+; Double-precision (f64) tests - require VFP with double support
+; CHECK-LABEL: test_fadd_f64:
+; CHECK: vadd.f64
+define double @test_fadd_f64(double %a, double %b) #0 {
+  %r = call double @llvm.experimental.constrained.fadd.f64(
+      double %a, double %b,
+      metadata !"round.tonearest", metadata !"fpexcept.maytrap") #0
+  ret double %r
+}
+
+; CHECK-LABEL: test_fdiv_f64:
+; CHECK: vdiv.f64
+define double @test_fdiv_f64(double %a, double %b) #0 {
+  %r = call double @llvm.experimental.constrained.fdiv.f64(
+      double %a, double %b,
+      metadata !"round.tonearest", metadata !"fpexcept.maytrap") #0
+  ret double %r
+}
+
+declare float @llvm.experimental.constrained.fadd.f32(float, float, metadata, metadata)
+declare float @llvm.experimental.constrained.fsub.f32(float, float, metadata, metadata)
+declare float @llvm.experimental.constrained.fmul.f32(float, float, metadata, metadata)
+declare float @llvm.experimental.constrained.fdiv.f32(float, float, metadata, metadata)
+declare float @llvm.experimental.constrained.sqrt.f32(float, metadata, metadata)
+declare float @llvm.experimental.constrained.fma.f32(float, float, float, metadata, metadata)
+declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)
+declare double @llvm.experimental.constrained.fdiv.f64(double, double, metadata, metadata)
+
+attributes #0 = { strictfp }

_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
