[clang] [llvm] [LLVM] Add `__builtin_readsteadycounter` intrinsic and builtin for realtime clocks (PR #81331)

2024-02-10 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/81331

From 30341079e795c2668588b791f2c97b44006e7a04 Mon Sep 17 00:00:00 2001
From: Joseph Huber 
Date: Fri, 9 Feb 2024 16:13:42 -0600
Subject: [PATCH] [WIP][LLVM] Add `__builtin_readsteadycounter` intrinsic and
 builtin

Summary:
This patch adds a new intrinsic and builtin function mirroring the
existing `__builtin_readcyclecounter`. The difference is that this
implementation targets a separate counter, available on some targets,
that returns a fixed-frequency clock suitable for measuring elapsed
time. The cycle counter, by contrast, often has a variable frequency.
This is currently only valid for the NVPTX and AMDGPU targets.
---
 clang/docs/LanguageExtensions.rst | 31 ++
 clang/include/clang/Basic/Builtins.td |  6 ++
 clang/lib/CodeGen/CGBuiltin.cpp   |  4 ++
 clang/test/CodeGen/builtins.c |  6 ++
 llvm/include/llvm/CodeGen/ISDOpcodes.h|  6 ++
 llvm/include/llvm/IR/Intrinsics.td|  2 +
 llvm/include/llvm/Support/TargetOpcodes.def   |  3 +
 llvm/include/llvm/Target/GenericOpcodes.td|  6 ++
 .../Target/GlobalISel/SelectionDAGCompat.td   |  1 +
 .../include/llvm/Target/TargetSelectionDAG.td |  3 +
 llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp  |  2 +
 llvm/lib/CodeGen/IntrinsicLowering.cpp|  6 ++
 llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp |  6 +-
 .../SelectionDAG/LegalizeIntegerTypes.cpp |  7 ++-
 llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h |  2 +-
 .../SelectionDAG/SelectionDAGBuilder.cpp  |  8 +++
 .../SelectionDAG/SelectionDAGDumper.cpp   |  1 +
 llvm/lib/CodeGen/TargetLoweringBase.cpp   |  3 +
 .../lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp |  2 +
 .../Target/AMDGPU/AMDGPURegisterBankInfo.cpp  |  1 +
 llvm/lib/Target/AMDGPU/SIISelLowering.cpp |  4 ++
 llvm/lib/Target/AMDGPU/SMInstructions.td  | 14 +
 llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp   |  3 +
 llvm/lib/Target/NVPTX/NVPTXInstrInfo.td   |  1 -
 llvm/lib/Target/NVPTX/NVPTXIntrinsics.td  |  4 ++
 .../GlobalISel/legalizer-info-validation.mir  |  3 +
 llvm/test/CodeGen/AMDGPU/readsteadycounter.ll | 24 +++
 llvm/test/CodeGen/NVPTX/intrinsics.ll | 12 
 .../builtins/match-table-replacerreg.td   | 24 +++
 .../match-table-imms.td   | 32 +-
 .../match-table-intrinsics.td |  5 +-
 .../match-table-patfrag-root.td   |  4 +-
 .../GlobalISelCombinerEmitter/match-table.td  | 62 +--
 llvm/test/TableGen/GlobalISelEmitter.td   |  2 +-
 34 files changed, 228 insertions(+), 72 deletions(-)
 create mode 100644 llvm/test/CodeGen/AMDGPU/readsteadycounter.ll

diff --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst
index e91156837290f7..4cc73599f9bae0 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -2764,6 +2764,37 @@ Query for this feature with ``__has_builtin(__builtin_readcyclecounter)``. Note
 that even if present, its use may depend on run-time privilege or other OS
 controlled state.
 
+``__builtin_readsteadycounter``
+-------------------------------
+
+``__builtin_readsteadycounter`` is used to access the fixed frequency counter
+register (or a similar steady-rate clock) on those targets that support it.
+The function is similar to ``__builtin_readcyclecounter`` above except that the
+frequency is fixed, making it suitable for measuring elapsed time.
+
+**Syntax**:
+
+.. code-block:: c++
+
+  __builtin_readsteadycounter()
+
+**Example of Use**:
+
+.. code-block:: c++
+
+  unsigned long long t0 = __builtin_readsteadycounter();
+  do_something();
+  unsigned long long t1 = __builtin_readsteadycounter();
+  unsigned long long secs_to_do_something = (t1 - t0) / tick_rate;
+
+**Description**:
+
+The ``__builtin_readsteadycounter()`` builtin returns the frequency counter value.
+When not supported by the target, the return value is always zero. This builtin
+takes no arguments and produces an unsigned long long result.
+
+Query for this feature with ``__has_builtin(__builtin_readsteadycounter)``.
+
 ``__builtin_dump_struct``
 -------------------------
 
diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td
index 31a2bdeb2d3e5e..193d5851f9f29f 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -1110,6 +1110,12 @@ def ReadCycleCounter : Builtin {
   let Prototype = "unsigned long long int()";
 }
 
+def ReadSteadyCounter : Builtin {
+  let Spellings = ["__builtin_readsteadycounter"];
+  let Attributes = [NoThrow];
+  let Prototype = "unsigned long long int()";
+}
+
 def Trap : Builtin {
   let Spellings = ["__builtin_trap"];
   let Attributes = [NoThrow, NoReturn];
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index a7a410dab1a018..ee0b7504769622 100644
--- 

[clang] [llvm] [LLVM] Add `__builtin_readsteadycounter` intrinsic and builtin for realtime clocks (PR #81331)

2024-02-10 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/81331

From 109939223e7944472363134d72a223524e1e3f0a Mon Sep 17 00:00:00 2001
From: Joseph Huber 
Date: Fri, 9 Feb 2024 16:13:42 -0600
Subject: [PATCH] [WIP][LLVM] Add `__builtin_readsteadycounter` intrinsic and
 builtin

Summary:
This patch adds a new intrinsic and builtin function mirroring the
existing `__builtin_readcyclecounter`. The difference is that this
implementation targets a separate counter, available on some targets,
that returns a fixed-frequency clock suitable for measuring elapsed
time. The cycle counter, by contrast, often has a variable frequency.
This is currently only valid for the NVPTX and AMDGPU targets.
---
 clang/docs/LanguageExtensions.rst | 31 +++
 clang/include/clang/Basic/Builtins.td |  6 
 clang/lib/CodeGen/CGBuiltin.cpp   |  4 +++
 clang/test/CodeGen/builtins.c |  6 
 llvm/include/llvm/CodeGen/ISDOpcodes.h|  6 
 llvm/include/llvm/IR/Intrinsics.td|  2 ++
 llvm/include/llvm/Support/TargetOpcodes.def   |  3 ++
 llvm/include/llvm/Target/GenericOpcodes.td|  6 
 .../Target/GlobalISel/SelectionDAGCompat.td   |  1 +
 .../include/llvm/Target/TargetSelectionDAG.td |  3 ++
 llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp  |  2 ++
 llvm/lib/CodeGen/IntrinsicLowering.cpp|  6 
 llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp |  6 ++--
 .../SelectionDAG/LegalizeIntegerTypes.cpp |  7 +++--
 llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h |  2 +-
 .../SelectionDAG/SelectionDAGBuilder.cpp  |  8 +
 .../SelectionDAG/SelectionDAGDumper.cpp   |  1 +
 llvm/lib/CodeGen/TargetLoweringBase.cpp   |  3 ++
 .../lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp |  2 ++
 .../Target/AMDGPU/AMDGPURegisterBankInfo.cpp  |  1 +
 llvm/lib/Target/AMDGPU/SIISelLowering.cpp |  4 +++
 llvm/lib/Target/AMDGPU/SMInstructions.td  | 14 +
 llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp   |  3 ++
 llvm/lib/Target/NVPTX/NVPTXInstrInfo.td   |  1 -
 llvm/lib/Target/NVPTX/NVPTXIntrinsics.td  |  4 +++
 llvm/test/CodeGen/AMDGPU/readsteadycounter.ll | 24 ++
 llvm/test/CodeGen/NVPTX/intrinsics.ll | 12 +++
 27 files changed, 161 insertions(+), 7 deletions(-)
 create mode 100644 llvm/test/CodeGen/AMDGPU/readsteadycounter.ll

diff --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst
index e91156837290f7..4cc73599f9bae0 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -2764,6 +2764,37 @@ Query for this feature with ``__has_builtin(__builtin_readcyclecounter)``. Note
 that even if present, its use may depend on run-time privilege or other OS
 controlled state.
 
+``__builtin_readsteadycounter``
+-------------------------------
+
+``__builtin_readsteadycounter`` is used to access the fixed frequency counter
+register (or a similar steady-rate clock) on those targets that support it.
+The function is similar to ``__builtin_readcyclecounter`` above except that the
+frequency is fixed, making it suitable for measuring elapsed time.
+
+**Syntax**:
+
+.. code-block:: c++
+
+  __builtin_readsteadycounter()
+
+**Example of Use**:
+
+.. code-block:: c++
+
+  unsigned long long t0 = __builtin_readsteadycounter();
+  do_something();
+  unsigned long long t1 = __builtin_readsteadycounter();
+  unsigned long long secs_to_do_something = (t1 - t0) / tick_rate;
+
+**Description**:
+
+The ``__builtin_readsteadycounter()`` builtin returns the frequency counter value.
+When not supported by the target, the return value is always zero. This builtin
+takes no arguments and produces an unsigned long long result.
+
+Query for this feature with ``__has_builtin(__builtin_readsteadycounter)``.
+
 ``__builtin_dump_struct``
 -------------------------
 
diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td
index 31a2bdeb2d3e5e..193d5851f9f29f 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -1110,6 +1110,12 @@ def ReadCycleCounter : Builtin {
   let Prototype = "unsigned long long int()";
 }
 
+def ReadSteadyCounter : Builtin {
+  let Spellings = ["__builtin_readsteadycounter"];
+  let Attributes = [NoThrow];
+  let Prototype = "unsigned long long int()";
+}
+
 def Trap : Builtin {
   let Spellings = ["__builtin_trap"];
   let Attributes = [NoThrow, NoReturn];
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index a7a410dab1a018..ee0b7504769622 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -3443,6 +3443,10 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
     Function *F = CGM.getIntrinsic(Intrinsic::readcyclecounter);
     return RValue::get(Builder.CreateCall(F));
   }
+  case Builtin::BI__builtin_readsteadycounter: {
+Function *F = 

[clang] [llvm] [LLVM] Add `__builtin_readsteadycounter` intrinsic and builtin for realtime clocks (PR #81331)

2024-02-10 Thread Joseph Huber via cfe-commits

jhuber6 wrote:

Added a clang test and renamed the builtin to `readsteadycounter`, as I think
it's more descriptive and better matches the existing `readcyclecounter`.
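
For readers following the thread, here is a minimal sketch of how calling code might guard and use the builtin. It is not taken from the patch. The helper names `read_steady_ticks`, `elapsed_ticks`, and `ticks_to_seconds` are hypothetical, as is the `tick_rate_hz` parameter: the builtin does not report the counter frequency, so the caller is assumed to obtain it from target documentation or a runtime query. The fallback to `std::chrono::steady_clock` is likewise an assumption for compilers that lack the builtin.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Older compilers may not provide __has_builtin at all.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical helper (not part of the patch): read the steady counter when
// the compiler provides the builtin; otherwise fall back to the host's
// monotonic clock, expressed in nanoseconds.
inline std::uint64_t read_steady_ticks() {
#if __has_builtin(__builtin_readsteadycounter)
  return __builtin_readsteadycounter();
#else
  using namespace std::chrono;
  return static_cast<std::uint64_t>(
      duration_cast<nanoseconds>(steady_clock::now().time_since_epoch())
          .count());
#endif
}

// Hypothetical helper: tick difference between two readings. Unsigned
// subtraction stays correct across a single counter wraparound.
inline std::uint64_t elapsed_ticks(std::uint64_t t0, std::uint64_t t1) {
  return t1 - t0;
}

// Hypothetical helper: convert ticks to seconds. tick_rate_hz is an
// assumption supplied by the caller; the builtin itself does not expose it.
inline double ticks_to_seconds(std::uint64_t ticks,
                               std::uint64_t tick_rate_hz) {
  assert(tick_rate_hz != 0 && "tick rate must be known and nonzero");
  return static_cast<double>(ticks) / static_cast<double>(tick_rate_hz);
}
```

Converting in floating point avoids the truncation that the integer division in the documentation example would produce for intervals shorter than one second. Per the documentation in the patch, a target without support returns zero, so a successful `__has_builtin` check alone does not guarantee a meaningful reading.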

https://github.com/llvm/llvm-project/pull/81331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [LLVM] Add `__builtin_readsteadycounter` intrinsic and builtin for realtime clocks (PR #81331)

2024-02-10 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/81331