llvmbot wrote:
<!--LLVM PR SUMMARY COMMENT--> @llvm/pr-subscribers-llvm-ir Author: Matt Arsenault (arsenm) <details> <summary>Changes</summary> Convert "denormal-fp-math" and "denormal-fp-math-f32" into a first class denormal_fpenv attribute. Previously the query for the effective deormal mode involved two string attribute queries with parsing. I'm introducing more uses of this, so it makes sense to convert this to a more efficient encoding. The old representation was also awkward since it was split across two separate attributes. The new encoding just stores the default and float modes as bitfields, largely avoiding the need to consider if the other mode is set. The syntax in the common cases looks like this: `denormal_fpenv(preservesign,preservesign)` `denormal_fpenv(float: preservesign,preservesign)` `denormal_fpenv(dynamic,dynamic float: preservesign,preservesign)` I wasn't sure about reusing the float type name instead of adding a new keyword. It's parsed as a type but only accepts float. I'm also debating switching the name to subnormal to match the current preferred IEEE terminology (also used by nofpclass and other contexts). This has a behavior change when using the command flag debug options to set the denormal mode. The behavior of the flag ignored functions with an explicit attribute set, per the default and f32 version. Now that these are one attribute, the flag logic can't distinguish which of the two components were explicitly set on the function. Only one test appeared to rely on this behavior, so I just avoided using the flags in it. This also does not perform all the code cleanups this enables. In particular the attributor handling could be cleaned up. I also guessed at how to support this in MLIR. I followed MemoryEffects as a reference; it appears bitfields are expanded into arguments to attributes, so the representation there is a bit uglier with the 2 2-element fields flattened into 4 arguments. --- Patch is 439.46 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/174293.diff 219 Files Affected: - (modified) clang/lib/CodeGen/CGCall.cpp (+8-18) - (modified) clang/lib/CodeGen/CGCall.h (+1-1) - (modified) clang/test/CodeGen/denormalfpmode-f32.c (+33-27) - (modified) clang/test/CodeGen/denormalfpmode.c (+4-4) - (modified) clang/test/CodeGenCUDA/flush-denormals.cu (+3-3) - (modified) clang/test/CodeGenCUDA/link-builtin-bitcode-denormal-fp-mode.cu (+15-15) - (modified) clang/test/CodeGenCUDA/propagate-attributes.cu (+3-9) - (modified) clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl (+41-41) - (modified) clang/test/CodeGenOpenCL/cl20-device-side-enqueue-attributes.cl (+3-3) - (modified) llvm/docs/LangRef.rst (+18-18) - (modified) llvm/docs/ReleaseNotes.md (+3) - (modified) llvm/include/llvm/ADT/FloatingPointMode.h (+81-12) - (modified) llvm/include/llvm/Analysis/ConstantFolding.h (+1-1) - (modified) llvm/include/llvm/AsmParser/LLParser.h (+2) - (modified) llvm/include/llvm/AsmParser/LLToken.h (+6) - (modified) llvm/include/llvm/Bitcode/LLVMBitCodes.h (+1) - (modified) llvm/include/llvm/IR/Attributes.h (+14) - (modified) llvm/include/llvm/IR/Attributes.td (+4-3) - (modified) llvm/include/llvm/IR/Function.h (+3-8) - (modified) llvm/lib/AsmParser/LLLexer.cpp (+6) - (modified) llvm/lib/AsmParser/LLParser.cpp (+101) - (modified) llvm/lib/Bitcode/Reader/BitcodeReader.cpp (+6) - (modified) llvm/lib/Bitcode/Writer/BitcodeWriter.cpp (+2) - (modified) llvm/lib/CodeGen/CommandFlags.cpp (+7-14) - (modified) llvm/lib/IR/Attributes.cpp (+30-7) - (modified) llvm/lib/IR/AutoUpgrade.cpp (+38-2) - (modified) llvm/lib/IR/Function.cpp (+8-22) - (modified) llvm/lib/IR/Verifier.cpp (+3) - (modified) llvm/lib/Support/FloatingPointMode.cpp (+16) - (modified) llvm/lib/Target/AMDGPU/SIModeRegisterDefaults.cpp (+3-13) - (modified) llvm/lib/Target/ARM/ARMAsmPrinter.cpp (+8-10) - (modified) llvm/lib/Target/ARM/ARMTargetMachine.cpp (+1-1) - (modified) llvm/lib/Transforms/IPO/AttributorAttributes.cpp (+13-21) - (modified) llvm/lib/Transforms/Utils/CodeExtractor.cpp (+1) - (modified) llvm/test/Analysis/CostModel/AMDGPU/fdiv.ll (+2-2) - (added) llvm/test/Assembler/denormal_fpenv.ll (+297) - (added) llvm/test/Assembler/invalid_denormal_fpenv.ll (+187) - (added) llvm/test/Bitcode/auto_upgrade_denormal_fp_math.ll (+324) - (modified) llvm/test/Bitcode/compatibility.ll (+228-2) - (modified) llvm/test/CodeGen/AArch64/sqrt-fastmath.ll (+1-1) - (modified) llvm/test/CodeGen/AArch64/stack-tagging-ex-1.ll (+1-1) - (modified) llvm/test/CodeGen/AArch64/stack-tagging-untag-placement.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/fdiv.f32.ll (+36-32) - (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/fmamix-constant-bus-violation.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/fp-atomics-gfx942.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/fp64-atomics-gfx90a.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/frem.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.fmul.legacy.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/madmix-constant-bus-violation.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-fdiv.f64.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/amdpal-msgpack-denormal.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/atomics-hw-remarks-gfx90a.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/clamp-modifier.ll (+3-3) - (modified) llvm/test/CodeGen/AMDGPU/clamp.ll (+4-4) - (modified) llvm/test/CodeGen/AMDGPU/dagcombine-fma-fmad.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/default-fp-mode.ll (+15-14) - (modified) llvm/test/CodeGen/AMDGPU/fabs-known-signbit-combine-fast-fdiv-lowering.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/fcanonicalize-elimination.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/fcanonicalize.bf16.ll (+3-3) - (modified) llvm/test/CodeGen/AMDGPU/fcanonicalize.f16.ll (+3-3) - (modified) llvm/test/CodeGen/AMDGPU/fcanonicalize.ll (+7-7) - (modified) llvm/test/CodeGen/AMDGPU/fdiv-nofpexcept.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/fdiv.ll (+3-3) - (modified) llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fadd.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fmax.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fmin.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fsub.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/fmaxnum.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/fminnum.f64.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/fminnum.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/fneg-combines.f16.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/fneg-combines.legal.f16.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/fneg-combines.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/fneg-combines.new.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/fp-atomics-gfx942.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/fp64-atomics-gfx90a.ll (+4-4) - (modified) llvm/test/CodeGen/AMDGPU/frem.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/fsub-as-fneg-src-modifier.ll (+6-6) - (modified) llvm/test/CodeGen/AMDGPU/global-atomicrmw-fadd-wrong-subtarget.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/global-atomicrmw-fadd.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/global-atomicrmw-fmax.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/global-atomicrmw-fmin.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/global-atomicrmw-fsub.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/global-atomics-fp-wrong-subtarget.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/global_atomic_optimizer_fp_rtn.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/global_atomics_optimizer_fp_no_rtn.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/global_atomics_scan_fadd.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/global_atomics_scan_fmax.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/global_atomics_scan_fmin.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/global_atomics_scan_fsub.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/hsa-fp-mode.ll (+4-4) - (modified) llvm/test/CodeGen/AMDGPU/known-never-snan.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.fmul.legacy.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.rcp.ll (+4-4) - (modified) llvm/test/CodeGen/AMDGPU/llvm.exp.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/llvm.exp10.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/llvm.exp2.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/llvm.is.fpclass.bf16.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/llvm.is.fpclass.f16.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/llvm.log.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/llvm.log10.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/llvm.log2.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/llvm.maxnum.f16.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/llvm.minnum.f16.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/mad-mix-bf16.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/mad-mix-hi-bf16.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/mad-mix-hi.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/mad-mix-lo-bf16.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/mad-mix-lo.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/mad-mix.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/madak.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/madmk.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/mul24-pass-ordering.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/omod.ll (+5-5) - (modified) llvm/test/CodeGen/AMDGPU/operand-folding.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/pal-metadata-3.0-dvgpr.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/pal-metadata-3.0.gfx1250.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/pal-metadata-3.0.gfx950.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/prevent-fmul-hoist-ir.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/rcp-pattern.ll (+4-4) - (modified) llvm/test/CodeGen/AMDGPU/rcp_iflag.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/repeated-divisor.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/rsq.f32-safe.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/rsq.f32.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/sdwa-peephole.ll (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/sgpr-spill-overlap-wwm-reserve.mir (+1-1) - (modified) llvm/test/CodeGen/AMDGPU/udivrem24.ll (+3-3) - (modified) llvm/test/CodeGen/AMDGPU/v_mac.ll (+3-3) - (modified) llvm/test/CodeGen/AMDGPU/v_mac_f16.ll (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/v_madak_f16.ll (+1-1) - (modified) llvm/test/CodeGen/ARM/build-attributes-fn-attr3.ll (+1-1) - (modified) llvm/test/CodeGen/ARM/build-attributes-fn-attr4.ll (+1-1) - (modified) llvm/test/CodeGen/ARM/build-attributes-fn-attr5.ll (+1-1) - (modified) llvm/test/CodeGen/ARM/build-attributes-fn-attr6.ll (+2-2) - (modified) llvm/test/CodeGen/ARM/clang-section.ll (+2-2) - (modified) llvm/test/CodeGen/ARM/cmse-clear-float-bigend.mir (+1-1) - (modified) llvm/test/CodeGen/ARM/softfp-constant-comparison.ll (+1-1) - (modified) llvm/test/CodeGen/Generic/denormal-fp-math-cl-opt.ll (+1-1) - (modified) llvm/test/CodeGen/NVPTX/div.ll (+1-1) - (modified) llvm/test/CodeGen/NVPTX/f32x2-instructions.ll (+1-1) - (modified) llvm/test/CodeGen/NVPTX/fast-math.ll (+1-1) - (modified) llvm/test/CodeGen/NVPTX/fexp2.ll (+1-1) - (modified) llvm/test/CodeGen/NVPTX/flog2.ll (+1-1) - (modified) llvm/test/CodeGen/NVPTX/math-intrins-sm80-ptx70-instcombine.ll (+2-2) - (modified) llvm/test/CodeGen/NVPTX/math-intrins.ll (+1-1) - (modified) llvm/test/CodeGen/NVPTX/nvptx-prec-divf32-flag.ll (+1-1) - (modified) llvm/test/CodeGen/NVPTX/rsqrt-opt.ll (+1-1) - (modified) llvm/test/CodeGen/NVPTX/sqrt-approx.ll (+1-1) - (modified) llvm/test/CodeGen/PowerPC/fmf-propagation.ll (+2-2) - (modified) llvm/test/CodeGen/PowerPC/recipest.ll (+1-1) - (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/skip-vpt-debug.mir (+1-1) - (modified) llvm/test/CodeGen/Thumb2/mve-vpt-2-blocks-1-pred.mir (+1-1) - (modified) llvm/test/CodeGen/Thumb2/pacbti-m-outliner-4.ll (+1-1) - (modified) llvm/test/CodeGen/X86/clang-section-coff.ll (+2-2) - (modified) llvm/test/CodeGen/X86/is_fpclass.ll (+2-2) - (modified) llvm/test/CodeGen/X86/pow.ll (+1-1) - (modified) llvm/test/CodeGen/X86/sqrt-fastmath-mir.ll (+2-2) - (modified) llvm/test/CodeGen/X86/sqrt-fastmath-tune.ll (+2-2) - (modified) llvm/test/CodeGen/X86/sqrt-fastmath.ll (+5-5) - (modified) llvm/test/DebugInfo/COFF/fortran-contained-proc.ll (+2-2) - (modified) llvm/test/Instrumentation/NumericalStabilitySanitizer/basic.ll (+1-1) - (modified) llvm/test/Instrumentation/NumericalStabilitySanitizer/non_float_store.ll (+1-1) - (modified) llvm/test/Instrumentation/NumericalStabilitySanitizer/scalable_vector.ll (+1-1) - (modified) llvm/test/Other/opt-override-denormal-fp-math-f32.ll (+5-5) - (modified) llvm/test/Other/opt-override-denormal-fp-math-mixed.ll (+11-11) - (modified) llvm/test/Other/opt-override-denormal-fp-math.ll (+5-5) - (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-rmw-fadd.ll (+5-5) - (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-simplify-cfg-CAS-block.ll (+1-1) - (modified) llvm/test/Transforms/Attributor/AMDGPU/nofpclass-amdgcn-log.ll (+5-5) - (modified) llvm/test/Transforms/Attributor/AMDGPU/nofpclass-amdgcn-rcp.ll (+2-2) - (modified) llvm/test/Transforms/Attributor/AMDGPU/nofpclass-amdgcn-rsq.ll (+2-2) - (modified) llvm/test/Transforms/Attributor/denormal-fp-math.ll (+44-24) - (modified) llvm/test/Transforms/Attributor/nofpclass-canonicalize.ll (+57-57) - (modified) llvm/test/Transforms/Attributor/nofpclass-fdiv.ll (+4-4) - (modified) llvm/test/Transforms/Attributor/nofpclass-frem.ll (+4-4) - (modified) llvm/test/Transforms/Attributor/nofpclass-frexp.ll (+6-6) - (modified) llvm/test/Transforms/Attributor/nofpclass-ldexp.ll (+9-9) - (modified) llvm/test/Transforms/Attributor/nofpclass-log.ll (+7-7) - (modified) llvm/test/Transforms/Attributor/nofpclass-minimum-maximum.ll (+8-8) - (modified) llvm/test/Transforms/Attributor/nofpclass-minimumnum-maximumnum.ll (+8-8) - (modified) llvm/test/Transforms/Attributor/nofpclass-minnum-maxnum.ll (+8-8) - (modified) llvm/test/Transforms/Attributor/nofpclass-nan-fmul.ll (+4-4) - (modified) llvm/test/Transforms/Attributor/nofpclass-powi.ll (+4-4) - (modified) llvm/test/Transforms/Attributor/nofpclass-sqrt.ll (+7-7) - (modified) llvm/test/Transforms/Attributor/nofpclass.ll (+59-59) - (modified) llvm/test/Transforms/Attributor/reduced/register_benchmark_test.ll (+15-15) - (modified) llvm/test/Transforms/EarlyCSE/cannot-be-negative-zero-assert.ll (+1-1) - (modified) llvm/test/Transforms/IndVarSimplify/addrec_no_exec_on_every_iteration.ll (+1-1) - (modified) llvm/test/Transforms/InferAddressSpaces/AMDGPU/global-atomicrmw-fadd.ll (+1-1) - (modified) llvm/test/Transforms/Inline/AMDGPU/inline-denormal-fp-math.ll (+13-13) - (modified) llvm/test/Transforms/InstCombine/NVPTX/nvvm-intrins.ll (+2-2) - (modified) llvm/test/Transforms/InstCombine/combine-is.fpclass-and-fcmp.ll (+2-2) - (modified) llvm/test/Transforms/InstCombine/create-class-from-logic-fcmp.ll (+2-2) - (modified) llvm/test/Transforms/InstCombine/fcmp-denormals-are-zero.ll (+4-4) - (modified) llvm/test/Transforms/InstCombine/fcmp.ll (+3-3) - (modified) llvm/test/Transforms/InstCombine/fmod.ll (+2-2) - (modified) llvm/test/Transforms/InstCombine/is_fpclass.ll (+28-28) - (modified) llvm/test/Transforms/InstCombine/log-to-intrinsic.ll (+2-2) - (modified) llvm/test/Transforms/InstCombine/simplify-demanded-fpclass-canonicalize.ll (+3-3) - (modified) llvm/test/Transforms/InstSimplify/canonicalize.ll (+48-48) - (modified) llvm/test/Transforms/InstSimplify/constant-fold-fp-denormal.ll (+13-13) - (modified) llvm/test/Transforms/InstSimplify/floating-point-compare.ll (+9-9) - (modified) llvm/test/Transforms/SCCP/float-denormal-simplification.ll (+2-2) - (modified) llvm/test/Transforms/SCCP/no-fold-fcmp-dynamic-denormal-mode-issue114947.ll (+1-1) - (added) llvm/test/Verifier/denormal_fpenv.ll (+10) - (modified) llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/various_ir_values_dbgrecords.ll (+1-1) - (modified) llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/various_ir_values_dbgrecords.ll.expected (+1-1) - (modified) llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/various_ir_values_dbgrecords.ll.funcsig.expected (+1-1) - (modified) llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/various_ir_values_dbgrecords.ll.funcsig.globals.expected (+2-2) - (modified) llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/various_ir_values_dbgrecords.ll.funcsig.noglobals.expected (+1-1) - (modified) llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/various_ir_values_dbgrecords.ll.funcsig.transitiveglobals.expected (+1-1) - (modified) llvm/unittests/ADT/FloatingPointMode.cpp (+10) - (modified) mlir/include/mlir/Dialect/LLVMIR/LLVMAttrDefs.td (+18) - (modified) mlir/include/mlir/Dialect/LLVMIR/LLVMEnums.td (+26) - (modified) mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td (+1-2) - (modified) mlir/lib/Target/LLVMIR/ModuleImport.cpp (+18-10) - (modified) mlir/lib/Target/LLVMIR/ModuleTranslation.cpp (+24-7) - (modified) mlir/test/Target/LLVMIR/Import/function-attributes.ll (+18-4) - (modified) mlir/test/Target/LLVMIR/fp-math-function-attributes.mlir (+22-4) ``````````diff diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp index 4a9025b6e0b0f..c4e2e334134e6 100644 --- a/clang/lib/CodeGen/CGCall.cpp +++ b/clang/lib/CodeGen/CGCall.cpp @@ -1941,11 +1941,9 @@ static bool HasStrictReturn(const CodeGenModule &Module, QualType RetTy, static void addDenormalModeAttrs(llvm::DenormalMode FPDenormalMode, llvm::DenormalMode FP32DenormalMode, llvm::AttrBuilder &FuncAttrs) { - if (FPDenormalMode != llvm::DenormalMode::getDefault()) - FuncAttrs.addAttribute("denormal-fp-math", FPDenormalMode.str()); - - if (FP32DenormalMode != FPDenormalMode && FP32DenormalMode.isValid()) - FuncAttrs.addAttribute("denormal-fp-math-f32", FP32DenormalMode.str()); + llvm::DenormalFPEnv FPEnv(FPDenormalMode, FP32DenormalMode); + if (FPEnv != llvm::DenormalFPEnv::getDefault()) + FuncAttrs.addDenormalFPEnvAttr(FPEnv); } /// Add default attributes to a function, which have merge semantics under @@ -2178,24 +2176,16 @@ void CodeGen::mergeDefaultFunctionDefinitionAttributes( CodeGenOpts.FP32DenormalMode.mergeCalleeMode(DenormModeToMergeF32); } - if (Merged == llvm::DenormalMode::getDefault()) { - AttrsToRemove.addAttribute("denormal-fp-math"); - } else if (Merged != DenormModeToMerge) { - // Overwrite existing attribute - FuncAttrs.addAttribute("denormal-fp-math", - CodeGenOpts.FPDenormalMode.str()); - } + llvm::DenormalFPEnv MergedFPEnv(Merged, MergedF32); - if (MergedF32 == llvm::DenormalMode::getDefault()) { - AttrsToRemove.addAttribute("denormal-fp-math-f32"); - } else if (MergedF32 != DenormModeToMergeF32) { + if (MergedFPEnv == llvm::DenormalFPEnv::getDefault()) { + AttrsToRemove.addAttribute(llvm::Attribute::DenormalFPMath); + } else { // Overwrite existing attribute - FuncAttrs.addAttribute("denormal-fp-math-f32", - CodeGenOpts.FP32DenormalMode.str()); + FuncAttrs.addDenormalFPEnvAttr(MergedFPEnv); } F.removeFnAttrs(AttrsToRemove); - addDenormalModeAttrs(Merged, MergedF32, FuncAttrs); overrideFunctionFeaturesWithTargetFeatures(FuncAttrs, F, TargetOpts); diff --git a/clang/lib/CodeGen/CGCall.h b/clang/lib/CodeGen/CGCall.h index 4a86d58895dd9..145992652934f 100644 --- a/clang/lib/CodeGen/CGCall.h +++ b/clang/lib/CodeGen/CGCall.h @@ -410,7 +410,7 @@ class ReturnValueSlot { /// This is useful for adding attrs to bitcode modules that you want to link /// with but don't control, such as CUDA's libdevice. When linking with such /// a bitcode library, you might want to set e.g. its functions' -/// "denormal-fp-math" attribute to match the attr of the functions you're +/// denormal_fp_math attribute to match the attr of the functions you're /// codegen'ing. Otherwise, LLVM will interpret the bitcode module's lack of /// denormal-fp-math attrs as tantamount to denormal-fp-math=ieee, and then LLVM /// will propagate denormal-fp-math=ieee up to every transitive caller of a diff --git a/clang/test/CodeGen/denormalfpmode-f32.c b/clang/test/CodeGen/denormalfpmode-f32.c index 312d1c9277722..9f0340a0f55a8 100644 --- a/clang/test/CodeGen/denormalfpmode-f32.c +++ b/clang/test/CodeGen/denormalfpmode-f32.c @@ -1,48 +1,54 @@ -// RUN: %clang_cc1 %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-NONE -// RUN: %clang_cc1 -fdenormal-fp-math=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-NONE -// RUN: %clang_cc1 -fdenormal-fp-math=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PS,CHECK-F32-NONE -// RUN: %clang_cc1 -fdenormal-fp-math=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ,CHECK-F32-NONE -// RUN: %clang_cc1 -fdenormal-fp-math=dynamic %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-DYNAMIC,CHECK-F32-NONE +// RUN: %clang_cc1 %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE +// RUN: %clang_cc1 -fdenormal-fp-math=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE +// RUN: %clang_cc1 -fdenormal-fp-math=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PS +// RUN: %clang_cc1 -fdenormal-fp-math=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ +// RUN: %clang_cc1 -fdenormal-fp-math=dynamic %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-DYNAMIC -// RUN: %clang_cc1 -fdenormal-fp-math-f32=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-NONE -// RUN: %clang_cc1 -fdenormal-fp-math=ieee -fdenormal-fp-math-f32=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-NONE -// RUN: %clang_cc1 -fdenormal-fp-math=preserve-sign -fdenormal-fp-math-f32=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PS,CHECK-F32-IEEE -// RUN: %clang_cc1 -fdenormal-fp-math=positive-zero -fdenormal-fp-math-f32=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ,CHECK-F32-IEEE -// RUN: %clang_cc1 -fdenormal-fp-math=positive-zero -fdenormal-fp-math-f32=dynamic %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ,CHECK-F32-DYNAMIC +// RUN: %clang_cc1 -fdenormal-fp-math-f32=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE +// RUN: %clang_cc1 -fdenormal-fp-math=ieee -fdenormal-fp-math-f32=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE +// RUN: %clang_cc1 -fdenormal-fp-math=preserve-sign -fdenormal-fp-math-f32=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PS-F32-IEEE +// RUN: %clang_cc1 -fdenormal-fp-math=positive-zero -fdenormal-fp-math-f32=ieee %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ-F32-IEEE +// RUN: %clang_cc1 -fdenormal-fp-math=positive-zero -fdenormal-fp-math-f32=dynamic %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ-F32-DYNAMIC // RUN: %clang_cc1 -fdenormal-fp-math-f32=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-PS // RUN: %clang_cc1 -fdenormal-fp-math=ieee -fdenormal-fp-math-f32=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-PS -// RUN: %clang_cc1 -fdenormal-fp-math=preserve-sign -fdenormal-fp-math-f32=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PS,CHECK-F32-NONE -// RUN: %clang_cc1 -fdenormal-fp-math=positive-zero -fdenormal-fp-math-f32=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ,CHECK-F32-PS +// RUN: %clang_cc1 -fdenormal-fp-math=preserve-sign -fdenormal-fp-math-f32=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PS +// RUN: %clang_cc1 -fdenormal-fp-math=positive-zero -fdenormal-fp-math-f32=preserve-sign %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ-F32-PS // RUN: %clang_cc1 -fdenormal-fp-math=ieee -fdenormal-fp-math-f32=dynamic %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-DYNAMIC // RUN: %clang_cc1 -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-PZ // RUN: %clang_cc1 -fdenormal-fp-math-f32=dynamic %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-DYNAMIC -// RUN: %clang_cc1 -fdenormal-fp-math=ieee -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-NONE,CHECK-F32-PZ -// RUN: %clang_cc1 -fdenormal-fp-math=dynamic -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-DYNAMIC,CHECK-F32-PZ -// RUN: %clang_cc1 -fdenormal-fp-math=preserve-sign -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PS,CHECK-F32-PZ -// RUN: %clang_cc1 -fdenormal-fp-math=positive-zero -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ,CHECK-F32-NONE -// RUN: %clang_cc1 -fdenormal-fp-math=dynamic -fdenormal-fp-math-f32=dynamic %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-DYNAMIC,CHECK-F32-NONE +// RUN: %clang_cc1 -fdenormal-fp-math=ieee -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-F32-PZ +// RUN: %clang_cc1 -fdenormal-fp-math=dynamic -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-DYNAMIC-F32-PZ +// RUN: %clang_cc1 -fdenormal-fp-math=preserve-sign -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PS-F32-PZ +// RUN: %clang_cc1 -fdenormal-fp-math=positive-zero -fdenormal-fp-math-f32=positive-zero %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-PZ +// RUN: %clang_cc1 -fdenormal-fp-math=dynamic -fdenormal-fp-math-f32=dynamic %s -emit-llvm -o - | FileCheck %s --check-prefixes=CHECK-ATTR,CHECK-DYNAMIC // CHECK-LABEL: main // CHECK-ATTR: attributes #0 = -// CHECK-NONE-NOT:"denormal-fp-math" -// CHECK-IEEE: "denormal-fp-math"="ieee,ieee" -// CHECK-PS: "denormal-fp-math"="preserve-sign,preserve-sign" -// CHECK-PZ: "denormal-fp-math"="positive-zero,positive-zero" -// CHECK-DYNAMIC: "denormal-fp-math"="dynamic,dynamic" +// CHECK-NONE-NOT: denormal_fpenv -// CHECK-F32-NONE-NOT:"denormal-fp-math-f32" -// CHECK-F32-IEEE: "denormal-fp-math-f32"="ieee,ieee" -// CHECK-F32-PS: "denormal-fp-math-f32"="preserve-sign,preserve-sign" -// CHECK-F32-PZ: "denormal-fp-math-f32"="positive-zero,positive-zero" +// CHECK-IEEE: denormal_fpenv(ieee,ieee) +// CHECK-PS: denormal_fpenv(preservesign,preservesign) +// CHECK-PZ: denormal_fpenv(positivezero,positivezero) +// CHECK-DYNAMIC: denormal_fpenv(dynamic,dynamic) -// CHECK-F32-DYNAMIC: "denormal-fp-math-f32"="dynamic,dynamic" +// CHECK-PS-F32-IEEE: denormal_fpenv(preservesign,preservesign float: ieee,ieee) +// CHECK-PZ-F32-IEEE: denormal_fpenv(positivezero,positivezero float: ieee,ieee) +// CHECK-PZ-F32-DYNAMIC: denormal_fpenv(positivezero,positivezero float: dynamic,dynamic) +// CHECK-PZ-F32-PS: denormal_fpenv(positivezero,positivezero float: preservesign,preservesign) +// CHECK-DYNAMIC-F32-PZ: denormal_fpenv(dynamic,dynamic float: positivezero,positivezero) +// CHECK: CHECK-PS-F32-PZ: denormal_fpenv(preservesign,preservesign float: positivezero,positivezero) + +// CHECK-F32-IEEE: denormal_fpenv(float: ieee,ieee) +// CHECK-F32-PS: denormal_fpenv(float: preservesign,preservesign) +// CHECK-F32-PZ: denormal_fpenv(float: positivezero,positivezero) +// CHECK-F32-DYNAMIC: denormal_fpenv(float: dynamic,dynamic) int main(void) { return 0; diff --git a/clang/test/CodeGen/denormalfpmode.c b/clang/test/CodeGen/denormalfpmode.c index cffff90d6fbe7..8d0ce644c4da1 100644 --- a/clang/test/CodeGen/denormalfpmode.c +++ b/clang/test/CodeGen/denormalfpmode.c @@ -6,10 +6,10 @@ // CHECK-LABEL: main // The ieee,ieee is the default, so omit the attribute -// CHECK-IEEE-NOT:"denormal-fp-math" -// CHECK-PS: attributes #0 = {{.*}}"denormal-fp-math"="preserve-sign,preserve-sign"{{.*}} -// CHECK-PZ: attributes #0 = {{.*}}"denormal-fp-math"="positive-zero,positive-zero"{{.*}} -// CHECK-DYNAMIC: attributes #0 = {{.*}}"denormal-fp-math"="dynamic,dynamic"{{.*}} +// CHECK-IEEE-NOT:denormal_fpenv +// CHECK-PS: attributes #0 = {{.*}}denormal_fpenv(preservesign,preservesign){{.*}} +// CHECK-PZ: attributes #0 = {{.*}}denormal_fpenv(positivezero,positivezero){{.*}} +// CHECK-DYNAMIC: attributes #0 = {{.*}}denormal_fpenv(dynamic,dynamic){{.*}} int main(void) { return 0; diff --git a/clang/test/CodeGenCUDA/flush-denormals.cu b/clang/test/CodeGenCUDA/flush-denormals.cu index b5abc29dea14b..7e1700ac64331 100644 --- a/clang/test/CodeGenCUDA/flush-denormals.cu +++ b/clang/test/CodeGenCUDA/flush-denormals.cu @@ -24,7 +24,7 @@ #include "Inputs/cuda.h" -// Checks that device function calls get emitted with the "denormal-fp-math-f32" +// Checks that device function calls get emitted with the denormal_fpenv // attribute set when we compile CUDA device code with // -fdenormal-fp-math-f32. Further, check that we reflect the presence or // absence of -fgpu-flush-denormals-to-zero in a module flag. @@ -41,8 +41,8 @@ // CHECK-LABEL: define void @foo() #0 extern "C" __device__ void foo() {} -// FTZ: attributes #0 = {{.*}} "denormal-fp-math-f32"="preserve-sign,preserve-sign" -// NOFTZ-NOT: "denormal-fp-math-f32" +// FTZ: attributes #0 = {{.*}} denormal_fpenv(float: preservesign,preservesign) +// NOFTZ-NOT: denormal_fpenv // PTXFTZ:!llvm.module.flags = !{{{.*}}[[MODFLAG:![0-9]+]]} // PTXFTZ:[[MODFLAG]] = !{i32 4, !"nvvm-reflect-ftz", i32 1} diff --git a/clang/test/CodeGenCUDA/link-builtin-bitcode-denormal-fp-mode.cu b/clang/test/CodeGenCUDA/link-builtin-bitcode-denormal-fp-mode.cu index ef02668c3697b..39460ad92d2b2 100644 --- a/clang/test/CodeGenCUDA/link-builtin-bitcode-denormal-fp-mode.cu +++ b/clang/test/CodeGenCUDA/link-builtin-bitcode-denormal-fp-mode.cu @@ -127,39 +127,39 @@ __global__ void kernel_f64(double* out, double* a, double* b, double* c) { // We should not be littering call sites with the attribute // Everything should use the default ieee with no explicit attribute -// FIXME: Should check-not "denormal-fp-math" within the denormal-fp-math-f32 +// FIXME: Should check-not denormal_fpenv within the denormal-fp-math-f32 // lines. // Default mode relies on the implicit check-not for the denormal-fp-math. -// PSZ: #[[$KERNELATTR]] = { {{.*}} "denormal-fp-math"="preserve-sign,preserve-sign" +// PSZ: #[[$KERNELATTR]] = { {{.*}} denormal_fpenv(preservesign,preservesign) // PSZ-SAME: "target-cpu"="gfx803" -// PSZ: #[[$FUNCATTR]] = { {{.*}} "denormal-fp-math-f32"="preserve-sign,preserve-sign" +// PSZ: #[[$FUNCATTR]] = { {{.*}} denormal_fpenv(float: preservesign,preservesign) // PSZ-SAME: "target-cpu"="gfx803" -// PSZ: #[[$WEAK_FUNCATTR]] = { {{.*}} "denormal-fp-math-f32"="preserve-sign,preserve-sign" +// PSZ: #[[$WEAK_FUNCATTR]] = { {{.*}} denormal_fpenv(float: preservesign,preservesign) // PSZ-SAME: "target-cpu"="gfx803" -// FIXME: Should check-not "denormal-fp-math" within the line -// IEEEF64-PSZF32: #[[$KERNELATTR]] = { {{.*}} "denormal-fp-math-f32"="preserve-sign,preserve-sign" +// FIXME: Should check-not denormal_fpenv within the line +// IEEEF64-PSZF32: #[[$KERNELATTR]] = { {{.*}} denormal_fpenv(float: preservesign,preservesign) // IEEEF64-PSZF32-SAME: "target-cpu"="gfx803" -// IEEEF64-PSZF32: #[[$FUNCATTR]] = { {{.*}} "denormal-fp-math-f32"="preserve-sign,preserve-sign" +// IEEEF64-PSZF32: #[[$FUNCATTR]] = { {{.*}} denormal_fpenv(float: preservesign,preservesign) // IEEEF64-PSZF32-SAME: "target-cpu"="gfx803" -// IEEEF64-PSZF32: #[[$WEAK_FUNCATTR]] = { {{.*}} "denormal-fp-math-f32"="preserve-sign,preserve-sign" +// IEEEF64-PSZF32: #[[$WEAK_FUNCATTR]] = { {{.*}} denormal_fpenv(float: preservesign,preservesign) // IEEEF64-PSZF32-SAME: "target-cpu"="gfx803" -// IEEEF32-PSZF64-DYNF32: #[[$KERNELATTR]] = { {{.*}} "denormal-fp-math"="preserve-sign,preserve-sign" "denormal-fp-math-f32"="ieee,ieee" {{.*}} "target-cpu"="gfx803" {{.*}} } +// IEEEF32-PSZF64-DYNF32: #[[$KERNELATTR]] = { {{.*}} denormal_fpenv(preservesign,preservesign float: ieee,ieee) {{.*}} "target-cpu"="gfx803" {{.*}} } // implicit check-not // implicit check-not -// IEEEF32-PSZF64-DYNFULL: #[[$KERNELATTR]] = { {{.*}} "denormal-fp-math"="preserve-sign,preserve-sign" "denormal-fp-math-f32"="ieee,ieee" +// IEEEF32-PSZF64-DYNFULL: #[[$KERNELATTR]] = { {{.*}} denormal_fpenv(preservesign,preservesign float: ieee,ieee) // IEEEF32-PSZF64-DYNFULL-SAME: "target-cpu"="gfx803" -// IEEEF32-PSZF64-DYNFULL: #[[$FUNCATTR]] = { {{.*}} "denormal-fp-math"="preserve-sign,preserve-sign" "denormal-fp-math-f32"="ieee,ieee" +// IEEEF32-PSZF64-DYNFULL: #[[$FUNCATTR]] = { {{.*}} denormal_fpenv(preservesign,preservesign float: ieee,ieee) // IEEEF32-PSZF64-DYNFULL-SAME: "target-cpu"="gfx803" -// IEEEF32-PSZF64-DYNFULL: #[[$WEAK_FUNCATTR]] = { {{.*}} "denormal-fp-math"="preserve-sign,preserve-sign" "denormal-fp-math-f32"="ieee,ieee" +// IEEEF32-PSZF64-DYNFULL: #[[$WEAK_FUNCATTR]] = { {{.*}} denormal_fpenv(preservesign,preservesign float: ieee,ieee) // IEEEF32-PSZF64-DYNFULL-SAME: "target-cpu"="gfx803" // -mlink-bitcode-file doesn't internalize or propagate attributes. -// NOINTERNALIZE-IEEEF32-PSZF64-DYNFULL: #[[$KERNELATTR]] = { {{.*}} "denormal-fp-math"="preserve-sign,preserve-sign" "denormal-fp-math-f32"="ieee,ieee" {{.*}} "target-cpu"="gfx803" {{.*}} } -// NOINTERNALIZE-IEEEF32-PSZF64-DYNFULL: #[[$FUNCATTR]] = { {{.*}} "denormal-fp-math"="dynamic,dynamic" {{.*}} } -// NOINTERNALIZE-IEEEF32-PSZF64-DYNFULL: #[[$WEAK_FUNCATTR]] = { {{.*}} "denormal-fp-math"="dynamic,dynamic" {{.*}} } +// NOINTERNALIZE-IEEEF32-PSZF64-DYNFULL: #[[$KERNELATTR]] = { {{.*}} denormal_fpenv(preservesign,preservesign float: ieee,ieee) {{.*}} "target-cpu"="gfx803" {{.*}} } +// NOINTERNALIZE-IEEEF32-PSZF64-DYNFULL: #[[$FUNCATTR]] = { {{.*}} denormal_fpenv(dynamic,dynamic) {{.*}} } +// NOINTERNALIZE-IEEEF32-PSZF64-DYNFULL: #[[$WEAK_FUNCATTR]] = { {{.*}} denormal_fpenv(dynamic,dynamic) {{.*}} } diff --git a/clang/test/CodeGenCUDA/propagate-attributes.cu b/clang/test/CodeGenCUDA/propagate-attributes.cu index a7e6b09ff97db..40a8c32e53d21 100644 --- a/clang/test/CodeGenCUDA/propagate-attributes.cu +++ b/clang/test/CodeGenCUDA/propagate-attributes.cu @@ -53,14 +53,14 @@ __global__ void kernel() { lib_fn(); } // line. // Check the attribute list for kernel. +// NOFTZ-NOT: denormal_fpenv + // CHECK: attributes [[kattr]] = { // CHECK-SAME: convergent // CHECK-SAME: norecurse -// FTZ-NOT: "denormal-fp-math" -// FTZ-SAME: "denormal-fp-math-f32"="preserve-sign,preserve-sign" -// NOFTZ-NOT: "denormal-fp-math-f32" +// FTZ-SAME: denormal_fpenv(float: preservesign,preservesign) // CHECK-SAME: "no-trapping-math"="true" @@ -70,10 +70,4 @@ __global__ void kernel() { lib_fn(); } // CHECK-SAME: convergent // CHECK-NOT: norecurse -// FTZ-NOT: "denormal-fp-math" -// NOFTZ-NOT: "denormal-fp-math" - -// FTZ-SAME: "denormal-fp-math-f32"="preserve-sign,preserve-sign" -// NOFTZ-NOT: "denormal-fp-math-f32" - // CHECK-SAME: "no-trapping-math"="true" diff --git a/clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl b/clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl index 2cbc9787a04b0..8fe3404fed366 100644 --- a/clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl +++ b/clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl @@ -69,7 +69,7 @@ kernel void test_target_features_kernel(global int *i) { // CHECK: @__test_target_features_kernel_block_invoke_kernel.runtime.handle = internal addrspace(1) externally_initialized constant %block.runtime.handle.t.3 zeroinitializer, section ".amdgpu.kernel.runtime.handle" // CHECK: @llvm.used = appending addrspace(1) global [10 x ptr] [ptr @__test_block_invoke_kernel, ptr addrspacecast (ptr addrspace(1) @__test_block_invoke_kernel.runtime.handle to ptr), ptr @__test_block_invoke_2_kernel, ptr addrspacecast (ptr addrspace(1) @__test_block_invoke_2_kernel.runtime.handle to ptr), ptr @__test_block_invoke_3_kernel, ptr addrspacecast (ptr addrspace(1) @__test_block_invoke_3_kernel.runtime.handle to ptr), ptr @__test_block_invoke_4_kernel, ptr addrspacecast (ptr addrspace(1) @__test_block_invoke_4_kernel.runtime.handle to ptr), ptr @__test_target_features_kernel_block_invoke_kernel, ptr addrspacecast (ptr addrspace(1) @__test_target_features_kernel_block_invoke_kernel.runtime.handle to ptr)], section "llvm.metadata" //. -// NOCPU: Function Attrs: convergent noinline norecurse nounwind optnone +// NOCPU: Function Attrs: convergent noinline norecurse nounwind optnone denormal_fpenv(float: preservesign,preservesign) // NOCPU-LABEL: define dso_local void @callee( // NOCPU-SAME: i64 noundef [[ID:%.*]], ptr addrspace(1) noundef [[OUT:%.*]]) #[[ATTR1:[0-9]+]] { // NOCPU-NEXT: [[ENTRY:.*:]] @@ -87,7 +87,7 @@ kernel void test_target_features_kernel(global int *i) { // NOCPU-NEXT: ret void // // -// NOCPU: Function Attrs: convergent noinline norecurse nounwind optnone +// NOCPU: Function Attrs: convergent noinline norecurse nounwind optnone denormal_fpenv(float: preservesign,prese... [truncated] `````````` </details> https://github.com/llvm/llvm-project/pull/174293 _______________________________________________ llvm-branch-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
