[clang] [llvm] [LLVM] Add `__builtin_readsteadycounter` intrinsic and builtin for realtime clocks (PR #81331)
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/81331

>From 30341079e795c2668588b791f2c97b44006e7a04 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Fri, 9 Feb 2024 16:13:42 -0600
Subject: [PATCH] [WIP][LLVM] Add `__builtin_readsteadycounter` intrinsic and
 builtin

Summary:
This patch adds a new intrinsic and builtin function mirroring the
existing `__builtin_readcyclecounter`. The difference is that this
implementation targets a separate counter, available on some targets,
that ticks at a fixed frequency and can therefore be used to determine
elapsed time. This differs from the cycle counter, which often has a
variable frequency. This is currently only valid for the NVPTX and
AMDGPU targets.
---
 clang/docs/LanguageExtensions.rst             | 31 ++
 clang/include/clang/Basic/Builtins.td         | 6 ++
 clang/lib/CodeGen/CGBuiltin.cpp               | 4 ++
 clang/test/CodeGen/builtins.c                 | 6 ++
 llvm/include/llvm/CodeGen/ISDOpcodes.h        | 6 ++
 llvm/include/llvm/IR/Intrinsics.td            | 2 +
 llvm/include/llvm/Support/TargetOpcodes.def   | 3 +
 llvm/include/llvm/Target/GenericOpcodes.td    | 6 ++
 .../Target/GlobalISel/SelectionDAGCompat.td   | 1 +
 .../include/llvm/Target/TargetSelectionDAG.td | 3 +
 llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp  | 2 +
 llvm/lib/CodeGen/IntrinsicLowering.cpp        | 6 ++
 llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp | 6 +-
 .../SelectionDAG/LegalizeIntegerTypes.cpp     | 7 ++-
 llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h | 2 +-
 .../SelectionDAG/SelectionDAGBuilder.cpp      | 8 +++
 .../SelectionDAG/SelectionDAGDumper.cpp       | 1 +
 llvm/lib/CodeGen/TargetLoweringBase.cpp       | 3 +
 .../lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp | 2 +
 .../Target/AMDGPU/AMDGPURegisterBankInfo.cpp  | 1 +
 llvm/lib/Target/AMDGPU/SIISelLowering.cpp     | 4 ++
 llvm/lib/Target/AMDGPU/SMInstructions.td      | 14 +
 llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp   | 3 +
 llvm/lib/Target/NVPTX/NVPTXInstrInfo.td       | 1 -
 llvm/lib/Target/NVPTX/NVPTXIntrinsics.td      | 4 ++
 .../GlobalISel/legalizer-info-validation.mir  | 3 +
 llvm/test/CodeGen/AMDGPU/readsteadycounter.ll | 24 +++
 llvm/test/CodeGen/NVPTX/intrinsics.ll         | 12
 .../builtins/match-table-replacerreg.td       | 24 +++
 .../match-table-imms.td                       | 32 +-
 .../match-table-intrinsics.td                 | 5 +-
 .../match-table-patfrag-root.td               | 4 +-
 .../GlobalISelCombinerEmitter/match-table.td  | 62 +--
 llvm/test/TableGen/GlobalISelEmitter.td       | 2 +-
 34 files changed, 228 insertions(+), 72 deletions(-)
 create mode 100644 llvm/test/CodeGen/AMDGPU/readsteadycounter.ll

diff --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst
index e91156837290f7..4cc73599f9bae0 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -2764,6 +2764,37 @@ Query for this feature with ``__has_builtin(__builtin_readcyclecounter)``. Note
 that even if present, its use may depend on run-time privilege or other OS
 controlled state.
 
+``__builtin_readsteadycounter``
+-------------------------------
+
+``__builtin_readsteadycounter`` is used to access the fixed frequency counter
+register (or a similar steady-rate clock) on those targets that support it.
+The function is similar to ``__builtin_readcyclecounter`` above except that the
+frequency is fixed, making it suitable for measuring elapsed time.
+
+**Syntax**:
+
+.. code-block:: c++
+
+  __builtin_readsteadycounter()
+
+**Example of Use**:
+
+.. code-block:: c++
+
+  unsigned long long t0 = __builtin_readsteadycounter();
+  do_something();
+  unsigned long long t1 = __builtin_readsteadycounter();
+  unsigned long long secs_to_do_something = (t1 - t0) / tick_rate;
+
+**Description**:
+
+The ``__builtin_readsteadycounter()`` builtin returns the frequency counter value.
+When not supported by the target, the return value is always zero. This builtin
+takes no arguments and produces an unsigned long long result.
+
+Query for this feature with ``__has_builtin(__builtin_readsteadycounter)``.
+
 ``__builtin_dump_struct``
 -------------------------
 
diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td
index 31a2bdeb2d3e5e..193d5851f9f29f 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -1110,6 +1110,12 @@ def ReadCycleCounter : Builtin {
   let Prototype = "unsigned long long int()";
 }
 
+def ReadSteadyCounter : Builtin {
+  let Spellings = ["__builtin_readsteadycounter"];
+  let Attributes = [NoThrow];
+  let Prototype = "unsigned long long int()";
+}
+
 def Trap : Builtin {
   let Spellings = ["__builtin_trap"];
   let Attributes = [NoThrow, NoReturn];
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index a7a410dab1a018..ee0b7504769622 100644
---
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/81331

>From 109939223e7944472363134d72a223524e1e3f0a Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Fri, 9 Feb 2024 16:13:42 -0600
Subject: [PATCH] [WIP][LLVM] Add `__builtin_readsteadycounter` intrinsic and
 builtin

Summary:
This patch adds a new intrinsic and builtin function mirroring the
existing `__builtin_readcyclecounter`. The difference is that this
implementation targets a separate counter, available on some targets,
that ticks at a fixed frequency and can therefore be used to determine
elapsed time. This differs from the cycle counter, which often has a
variable frequency. This is currently only valid for the NVPTX and
AMDGPU targets.
---
 clang/docs/LanguageExtensions.rst             | 31 +++
 clang/include/clang/Basic/Builtins.td         | 6
 clang/lib/CodeGen/CGBuiltin.cpp               | 4 +++
 clang/test/CodeGen/builtins.c                 | 6
 llvm/include/llvm/CodeGen/ISDOpcodes.h        | 6
 llvm/include/llvm/IR/Intrinsics.td            | 2 ++
 llvm/include/llvm/Support/TargetOpcodes.def   | 3 ++
 llvm/include/llvm/Target/GenericOpcodes.td    | 6
 .../Target/GlobalISel/SelectionDAGCompat.td   | 1 +
 .../include/llvm/Target/TargetSelectionDAG.td | 3 ++
 llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp  | 2 ++
 llvm/lib/CodeGen/IntrinsicLowering.cpp        | 6
 llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp | 6 ++--
 .../SelectionDAG/LegalizeIntegerTypes.cpp     | 7 +++--
 llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h | 2 +-
 .../SelectionDAG/SelectionDAGBuilder.cpp      | 8 +
 .../SelectionDAG/SelectionDAGDumper.cpp       | 1 +
 llvm/lib/CodeGen/TargetLoweringBase.cpp       | 3 ++
 .../lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp | 2 ++
 .../Target/AMDGPU/AMDGPURegisterBankInfo.cpp  | 1 +
 llvm/lib/Target/AMDGPU/SIISelLowering.cpp     | 4 +++
 llvm/lib/Target/AMDGPU/SMInstructions.td      | 14 +
 llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp   | 3 ++
 llvm/lib/Target/NVPTX/NVPTXInstrInfo.td       | 1 -
 llvm/lib/Target/NVPTX/NVPTXIntrinsics.td      | 4 +++
 llvm/test/CodeGen/AMDGPU/readsteadycounter.ll | 24 ++
 llvm/test/CodeGen/NVPTX/intrinsics.ll         | 12 +++
 27 files changed, 161 insertions(+), 7 deletions(-)
 create mode 100644 llvm/test/CodeGen/AMDGPU/readsteadycounter.ll

diff --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst
index e91156837290f7..4cc73599f9bae0 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -2764,6 +2764,37 @@ Query for this feature with ``__has_builtin(__builtin_readcyclecounter)``. Note
 that even if present, its use may depend on run-time privilege or other OS
 controlled state.
 
+``__builtin_readsteadycounter``
+-------------------------------
+
+``__builtin_readsteadycounter`` is used to access the fixed frequency counter
+register (or a similar steady-rate clock) on those targets that support it.
+The function is similar to ``__builtin_readcyclecounter`` above except that the
+frequency is fixed, making it suitable for measuring elapsed time.
+
+**Syntax**:
+
+.. code-block:: c++
+
+  __builtin_readsteadycounter()
+
+**Example of Use**:
+
+.. code-block:: c++
+
+  unsigned long long t0 = __builtin_readsteadycounter();
+  do_something();
+  unsigned long long t1 = __builtin_readsteadycounter();
+  unsigned long long secs_to_do_something = (t1 - t0) / tick_rate;
+
+**Description**:
+
+The ``__builtin_readsteadycounter()`` builtin returns the frequency counter value.
+When not supported by the target, the return value is always zero. This builtin
+takes no arguments and produces an unsigned long long result.
+
+Query for this feature with ``__has_builtin(__builtin_readsteadycounter)``.
+
 ``__builtin_dump_struct``
 -------------------------
 
diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td
index 31a2bdeb2d3e5e..193d5851f9f29f 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -1110,6 +1110,12 @@ def ReadCycleCounter : Builtin {
   let Prototype = "unsigned long long int()";
 }
 
+def ReadSteadyCounter : Builtin {
+  let Spellings = ["__builtin_readsteadycounter"];
+  let Attributes = [NoThrow];
+  let Prototype = "unsigned long long int()";
+}
+
 def Trap : Builtin {
   let Spellings = ["__builtin_trap"];
   let Attributes = [NoThrow, NoReturn];
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index a7a410dab1a018..ee0b7504769622 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -3443,6 +3443,10 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
     Function *F = CGM.getIntrinsic(Intrinsic::readcyclecounter);
     return RValue::get(Builder.CreateCall(F));
   }
+  case Builtin::BI__builtin_readsteadycounter: {
+    Function *F =
jhuber6 wrote:

Added clang test and renamed to `readsteadycounter` as I think it's more descriptive and matches the existing `readcyclecounter` better.

https://github.com/llvm/llvm-project/pull/81331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/81331
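
For readers following the thread, the usage pattern described in the patch's `LanguageExtensions.rst` text can be sketched roughly as below. This is only an illustration, not code from the PR: `read_steady_ticks` and `ticks_to_seconds` are hypothetical helper names, the `std::chrono::steady_clock` fallback is an assumption added so the sketch also builds with compilers that lack the builtin, and `tick_rate` has to come from the caller because the patch does not say how to obtain it. Going by the patch's documentation, the builtin may also return zero on targets that do not support it, even when `__has_builtin` reports it as available.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Older compilers may not provide __has_builtin at all; treat that as
// "builtin not available".
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical helper: read a steady tick count.
// Uses __builtin_readsteadycounter when the compiler knows it; otherwise
// falls back to std::chrono::steady_clock (an assumption of this sketch,
// not something the patch does).
inline std::uint64_t read_steady_ticks() {
#if __has_builtin(__builtin_readsteadycounter)
  return __builtin_readsteadycounter();
#else
  return static_cast<std::uint64_t>(
      std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}

// Hypothetical helper: convert a tick interval to seconds.
// tick_rate is the counter frequency in ticks per second and must be
// supplied by the caller.
inline double ticks_to_seconds(std::uint64_t t0, std::uint64_t t1,
                               std::uint64_t tick_rate) {
  assert(tick_rate != 0 && "tick_rate must be non-zero");
  return static_cast<double>(t1 - t0) / static_cast<double>(tick_rate);
}
```

Unlike the integer division in the documentation example, the sketch returns a `double` so that intervals shorter than one second are not truncated to zero.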