[clang] [HLSL] standardize builtin unit tests (PR #83340)

2024-02-28 Thread Greg Roth via cfe-commits

pow2clk wrote:

Do we have the suggestions that this is responding to written down somewhere? I 
think it would be useful to have those guidelines for anyone who might want to 
contribute HLSL tests. At any rate, I'd like to know the ones that this is in 
response to. 

https://github.com/llvm/llvm-project/pull/83340
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][sema] consolidate diags for incompatible_vector_* (PR #83609)

2024-03-01 Thread Greg Roth via cfe-commits


@@ -5218,15 +5218,15 @@ bool CheckVectorElementCallArgs(Sema *S, CallExpr *TheCall) {
 // Note: type promotion is intended to be handeled via the intrinsics
 //  and not the builtin itself.
 S->Diag(TheCall->getBeginLoc(),
-diag::err_vec_builtin_incompatible_vector_all)
-<< TheCall->getDirectCallee()
+diag::err_vec_builtin_incompatible_vector)
+<< TheCall->getDirectCallee() << /*all args*/ true
 << SourceRange(A.get()->getBeginLoc(), B.get()->getEndLoc());
 retValue = true;
   }
   if (VecTyA->getNumElements() != VecTyB->getNumElements()) {
 // if we get here a HLSLVectorTruncation is needed.

pow2clk wrote:

This comment seems inaccurate: we're not inserting a truncation here, we're failing with an error.

https://github.com/llvm/llvm-project/pull/83609


[clang] [clang][sema] consolidate diags for incompatible_vector_* (PR #83609)

2024-03-01 Thread Greg Roth via cfe-commits

https://github.com/pow2clk edited 
https://github.com/llvm/llvm-project/pull/83609


[clang] [clang][sema] consolidate diags for incompatible_vector_* (PR #83609)

2024-03-01 Thread Greg Roth via cfe-commits

https://github.com/pow2clk commented:

Just some nitpicks. I also find it much harder to re-review a change after a 
force push has gone through. I don't know how things are done around here, but 
I'd prefer that force pushes be avoided where possible.

https://github.com/llvm/llvm-project/pull/83609


[clang] [clang][sema] consolidate diags for incompatible_vector_* (PR #83609)

2024-03-01 Thread Greg Roth via cfe-commits


@@ -22,7 +22,7 @@ float test_dot_vector_size_mismatch(float3 p0, float2 p1) {
 
 float test_dot_builtin_vector_size_mismatch(float3 p0, float2 p1) {
   return __builtin_hlsl_dot(p0, p1);
-  // expected-error@-1 {{all arguments to '__builtin_hlsl_dot' must have vectors of the same type}}
+  // expected-error@-1 {{all arguments to '__builtin_hlsl_dot' must have the same type}}

pow2clk wrote:

I don't think this error ever fires on non-vectors. I find the "vectors of the 
same type" wording to be clearer, but I don't feel strongly about it. 

https://github.com/llvm/llvm-project/pull/83609


[clang] [clang][sema] consolidate diags for incompatible_vector_* (PR #83609)

2024-03-01 Thread Greg Roth via cfe-commits


@@ -5241,8 +5241,8 @@ bool CheckVectorElementCallArgs(Sema *S, CallExpr *TheCall) {
 
   // Note: if we get here one of the args is a scalar which
   // requires a VectorSplat on Arg0 or Arg1
-  S->Diag(BuiltinLoc, diag::err_vec_builtin_non_vector_all)
-  << TheCall->getDirectCallee()
+  S->Diag(BuiltinLoc, diag::err_vec_builtin_non_vector)
+  << TheCall->getDirectCallee() << /*all args*/ true

pow2clk wrote:

I appreciate the comment about what the parameter means, but there is some 
inconsistency in how it's labeled. After this point, it's always 
`/*isMoreThanTwoArgs*/`, which I think is better. 
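
To illustrate why the label on that boolean matters, here is a minimal sketch of one diagnostic whose wording is selected by a flag. It is only a stand-in: `formatIncompatibleVectorDiag` is a hypothetical helper, not clang's diagnostic engine, and the "first two arguments" wording is an assumption made for the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a consolidated diagnostic with a boolean
// "select" argument. The real diagnostic text lives in clang's .td files;
// the "first two arguments" wording here is assumed for illustration.
std::string formatIncompatibleVectorDiag(const std::string &Callee,
                                         bool IsMoreThanTwoArgs) {
  const std::string Which =
      IsMoreThanTwoArgs ? "all arguments" : "first two arguments";
  return Which + " to '" + Callee + "' must have vectors of the same type";
}
```

A call such as `formatIncompatibleVectorDiag("__builtin_hlsl_dot", /*IsMoreThanTwoArgs=*/true)` reads unambiguously at the call site, which is the point of using one label consistently.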

https://github.com/llvm/llvm-project/pull/83609


[clang] [HLSL] set alwaysinline on HLSL functions (PR #106588)

2024-09-04 Thread Greg Roth via cfe-commits


@@ -414,9 +414,20 @@ void CGHLSLRuntime::emitEntryFunction(const FunctionDecl *FD,
 
 void CGHLSLRuntime::setHLSLFunctionAttributes(const FunctionDecl *FD,
                                               llvm::Function *Fn) {
-  if (FD->isInExportDeclContext()) {
-    const StringRef ExportAttrKindStr = "hlsl.export";
-    Fn->addFnAttr(ExportAttrKindStr);
+  if (FD) { // "explicit" functions with declarations
+    if (FD->isInExportDeclContext()) {
+      const StringRef ExportAttrKindStr = "hlsl.export";
+      Fn->addFnAttr(ExportAttrKindStr);
+    }
+    // Respect noinline if the explicit functions use it
+    // otherwise default to alwaysinline
+    if (!Fn->hasFnAttribute(Attribute::NoInline))
+      Fn->addFnAttr(llvm::Attribute::AlwaysInline);
+  } else { // "implicit" autogenerated functions with no declaration
+    // Implicit functions might get marked as noinline by default
+    // but we override that for HLSL
+    Fn->removeFnAttr(Attribute::NoInline);

pow3clk wrote:

I was trying not to litter too many `if(getLangOpts().HLSL)` conditionals 
throughout the code. Something similar to this is done in 
[a](https://github.com/llvm/llvm-project/blob/3ebd79751f2d5e1c54047409865c051daba0a21b/clang/lib/CodeGen/CGOpenMPRuntime.cpp#L3222-L3224)
[few](https://github.com/llvm/llvm-project/blob/3ebd79751f2d5e1c54047409865c051daba0a21b/clang/lib/CodeGen/CGOpenMPRuntime.cpp#L1100-L1102)
[places](https://github.com/llvm/llvm-project/blob/3ebd79751f2d5e1c54047409865c051daba0a21b/clang/lib/CodeGen/CGStmtOpenMP.cpp#L582-L583)
for OpenMP. I don't feel strongly about it though. It's trivial to drop an _if 
HLSL_ conditional there.
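
For readers following the hunk above, the branch structure can be modeled in isolation. This is a toy sketch, not the clang implementation: `ToyFunction` and `setToyHLSLFunctionAttributes` are hypothetical names, attributes are plain strings, and `HasDecl` / `InExportContext` stand in for `FD != nullptr` and `FD->isInExportDeclContext()`. The final `alwaysinline` insertion in the implicit branch follows the full patch rather than the truncated quote.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy model of an IR function: just a set of attribute names.
struct ToyFunction {
  std::set<std::string> Attrs;
  bool has(const std::string &A) const { return Attrs.count(A) != 0; }
};

// Mirrors the branches of the quoted hunk using the toy types above.
void setToyHLSLFunctionAttributes(bool HasDecl, bool InExportContext,
                                  ToyFunction &Fn) {
  if (HasDecl) { // "explicit" functions with declarations
    if (InExportContext)
      Fn.Attrs.insert("hlsl.export");
    // Respect an existing noinline, otherwise default to alwaysinline.
    if (!Fn.has("noinline"))
      Fn.Attrs.insert("alwaysinline");
  } else { // "implicit" autogenerated functions with no declaration
    // Override any default noinline marking.
    Fn.Attrs.erase("noinline");
    Fn.Attrs.insert("alwaysinline");
  }
}
```

In this model an explicit function that already carries `noinline` keeps it, while an implicit function always ends up `alwaysinline`.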

https://github.com/llvm/llvm-project/pull/106588


[clang] [HLSL] set alwaysinline on HLSL functions (PR #106588)

2024-09-04 Thread Greg Roth via cfe-commits

https://github.com/pow3clk commented:

On my laptop account. I assure you it's who you think it is.

https://github.com/llvm/llvm-project/pull/106588


[clang] [HLSL] set alwaysinline on HLSL functions (PR #106588)

2024-09-04 Thread Greg Roth via cfe-commits

https://github.com/pow3clk edited 
https://github.com/llvm/llvm-project/pull/106588


[clang] [HLSL] set alwaysinline on HLSL functions (PR #106588)

2024-09-04 Thread Greg Roth via cfe-commits


@@ -290,8 +290,6 @@ struct BuiltinTypeDeclBuilder {
  SourceLocation()));
 MethodDecl->setLexicalDeclContext(Record);
 MethodDecl->setAccess(AccessSpecifier::AS_public);
-MethodDecl->addAttr(AlwaysInlineAttr::CreateImplicit(
-AST, SourceRange(), AlwaysInlineAttr::CXX11_clang_always_inline));

pow3clk wrote:

Would that involve generating just the AST then? The DXC analog would still 
generate LLVM IR, which this setting would catch, as it does for the 
inline-constructors.hlsl test among others.

As above, I care more about getting this in than disputing this. The setting 
here is at worst redundant.

https://github.com/llvm/llvm-project/pull/106588


[clang] [HLSL] set alwaysinline on HLSL functions (PR #106588)

2024-09-04 Thread Greg Roth via cfe-commits

https://github.com/pow2clk updated 
https://github.com/llvm/llvm-project/pull/106588

>From 12253818bd47aa8c324f6222586965f356b11c90 Mon Sep 17 00:00:00 2001
From: Greg Roth 
Date: Wed, 24 Jul 2024 16:49:19 -0600
Subject: [PATCH 1/3] [HLSL] set alwaysinline on HLSL functions

HLSL inlines all its functions by default. This uses the alwaysinline
attribute to force that in the corresponding pass for user functions
by default and overrides the default noinline of some implicit functions.
This makes an instance of explicit inlining for buffer subscripts unnecessary.

Adds tests for function and constructor inlining and augments some existing
tests to verify correct inlining of implicitly created functions as well.

incidentally restore RUN line that I believe was mistakenly removed as part of 
#88918

fixes #89282
---
 clang/lib/CodeGen/CGHLSLRuntime.cpp   |  17 ++-
 clang/lib/CodeGen/CodeGenFunction.cpp |   4 +-
 clang/lib/Sema/HLSLExternalSemaSource.cpp |   2 -
 .../GlobalConstructorFunction.hlsl|  31 +++--
 .../CodeGenHLSL/GlobalConstructorLib.hlsl |  23 +++-
 clang/test/CodeGenHLSL/GlobalDestructors.hlsl |  51 +---
 .../builtins/RWBuffer-constructor.hlsl|   1 +
 .../builtins/RWBuffer-subscript.hlsl  |   5 +-
 .../test/CodeGenHLSL/inline-constructors.hlsl |  74 
 clang/test/CodeGenHLSL/inline-functions.hlsl  | 114 ++
 10 files changed, 279 insertions(+), 43 deletions(-)
 create mode 100644 clang/test/CodeGenHLSL/inline-constructors.hlsl
 create mode 100644 clang/test/CodeGenHLSL/inline-functions.hlsl

diff --git a/clang/lib/CodeGen/CGHLSLRuntime.cpp b/clang/lib/CodeGen/CGHLSLRuntime.cpp
index 4bd7b6ba58de0d..24d126ced0d9f7 100644
--- a/clang/lib/CodeGen/CGHLSLRuntime.cpp
+++ b/clang/lib/CodeGen/CGHLSLRuntime.cpp
@@ -414,9 +414,20 @@ void CGHLSLRuntime::emitEntryFunction(const FunctionDecl *FD,
 
 void CGHLSLRuntime::setHLSLFunctionAttributes(const FunctionDecl *FD,
                                               llvm::Function *Fn) {
-  if (FD->isInExportDeclContext()) {
-    const StringRef ExportAttrKindStr = "hlsl.export";
-    Fn->addFnAttr(ExportAttrKindStr);
+  if (FD) { // "explicit" functions with declarations
+    if (FD->isInExportDeclContext()) {
+      const StringRef ExportAttrKindStr = "hlsl.export";
+      Fn->addFnAttr(ExportAttrKindStr);
+    }
+    // Respect noinline if the explicit functions use it
+    // otherwise default to alwaysinline
+    if (!Fn->hasFnAttribute(Attribute::NoInline))
+      Fn->addFnAttr(llvm::Attribute::AlwaysInline);
+  } else { // "implicit" autogenerated functions with no declaration
+    // Implicit functions might get marked as noinline by default
+    // but we override that for HLSL
+    Fn->removeFnAttr(Attribute::NoInline);
+    Fn->addFnAttr(Attribute::AlwaysInline);
   }
 }
 
diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp
index a5747283e98058..aceeed0e66d130 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -1239,9 +1239,9 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
   if (getLangOpts().OpenMP && CurCodeDecl)
     CGM.getOpenMPRuntime().emitFunctionProlog(*this, CurCodeDecl);
 
-  if (FD && getLangOpts().HLSL) {
+  if (getLangOpts().HLSL) {
     // Handle emitting HLSL entry functions.
-    if (FD->hasAttr<HLSLShaderAttr>()) {
+    if (FD && FD->hasAttr<HLSLShaderAttr>()) {
       CGM.getHLSLRuntime().emitEntryFunction(FD, Fn);
     }
     CGM.getHLSLRuntime().setHLSLFunctionAttributes(FD, Fn);
diff --git a/clang/lib/Sema/HLSLExternalSemaSource.cpp b/clang/lib/Sema/HLSLExternalSemaSource.cpp
index 9aacbe4ad9548e..0a534d94192560 100644
--- a/clang/lib/Sema/HLSLExternalSemaSource.cpp
+++ b/clang/lib/Sema/HLSLExternalSemaSource.cpp
@@ -290,8 +290,6 @@ struct BuiltinTypeDeclBuilder {
  SourceLocation()));
 MethodDecl->setLexicalDeclContext(Record);
 MethodDecl->setAccess(AccessSpecifier::AS_public);
-MethodDecl->addAttr(AlwaysInlineAttr::CreateImplicit(
-AST, SourceRange(), AlwaysInlineAttr::CXX11_clang_always_inline));
 Record->addDecl(MethodDecl);
 
 return *this;
diff --git a/clang/test/CodeGenHLSL/GlobalConstructorFunction.hlsl b/clang/test/CodeGenHLSL/GlobalConstructorFunction.hlsl
index f954c9d2f029f2..b39311ad67cd62 100644
--- a/clang/test/CodeGenHLSL/GlobalConstructorFunction.hlsl
+++ b/clang/test/CodeGenHLSL/GlobalConstructorFunction.hlsl
@@ -1,4 +1,5 @@
-// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -emit-llvm -disable-llvm-passes %s -o - | FileCheck %s
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -emit-llvm -disable-llvm-passes %s -o - | FileCheck %s --check-prefixes=CHECK,NOINLINE
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -emit-llvm -O0 %s -o - | FileCheck %s --check-prefixes=CHECK,INLINE
 
 int i;
 
@@ -7,7 +8,7 @@ __attribute__((constructor)) void call_me

[clang] [HLSL] set alwaysinline on HLSL functions (PR #106588)

2024-09-04 Thread Greg Roth via cfe-commits

https://github.com/pow2clk updated 
https://github.com/llvm/llvm-project/pull/106588

>From 12253818bd47aa8c324f6222586965f356b11c90 Mon Sep 17 00:00:00 2001
From: Greg Roth 
Date: Wed, 24 Jul 2024 16:49:19 -0600
Subject: [PATCH 1/4] [HLSL] set alwaysinline on HLSL functions


[clang] [HLSL] set alwaysinline on HLSL functions (PR #106588)

2024-09-05 Thread Greg Roth via cfe-commits

https://github.com/pow3clk edited 
https://github.com/llvm/llvm-project/pull/106588


[clang] [HLSL] set alwaysinline on HLSL functions (PR #106588)

2024-09-09 Thread Greg Roth via cfe-commits

https://github.com/pow2clk commented:

Thanks Damyan!

https://github.com/llvm/llvm-project/pull/106588


[clang] [HLSL] set alwaysinline on HLSL functions (PR #106588)

2024-09-09 Thread Greg Roth via cfe-commits

https://github.com/pow2clk edited 
https://github.com/llvm/llvm-project/pull/106588


[clang] [HLSL] set alwaysinline on HLSL functions (PR #106588)

2024-09-09 Thread Greg Roth via cfe-commits


@@ -0,0 +1,114 @@
+// RUN: %clang_cc1 -x hlsl -triple  dxil-pc-shadermodel6.3-library %s -emit-llvm -disable-llvm-passes -o - | FileCheck %s --check-prefixes=CHECK,NOINLINE
+// RUN: %clang_cc1 -x hlsl -triple  dxil-pc-shadermodel6.3-library %s -emit-llvm -O0 -o - | FileCheck %s --check-prefixes=CHECK,INLINE
+// RUN: %clang_cc1 -x hlsl -triple  dxil-pc-shadermodel6.0-compute %s -emit-llvm -disable-llvm-passes -o - | FileCheck %s --check-prefixes=CHECK,NOINLINE
+// RUN: %clang_cc1 -x hlsl -triple  dxil-pc-shadermodel6.0-compute %s -emit-llvm -O0 -o - | FileCheck %s --check-prefixes=CHECK,INLINE
+
+// Tests that user functions will always be inlined.
+// This includes exported functions and mangled entry point implementation functions.
+// The unmangled entry functions must not be alwaysinlined.
+
+#define MAX 100
+
+float nums[MAX];
+
+// Verify that all functions have the alwaysinline attribute
+// CHECK: Function Attrs: alwaysinline
+// CHECK: define void @"?swap@@YAXY0GE@III@Z"(ptr noundef byval([100 x i32]) align 4 %Buf, i32 noundef %ix1, i32 noundef %ix2) [[IntAttr:\#[0-9]+]]

pow2clk wrote:

Some of these won't be removed because the `-disable-llvm-passes` flag skips 
the optimizations that remove trivially dead functions.

This comment did make me realize there is a bug in waiting, though. We don't 
yet remove trivially dead functions like this at all, because the internal 
linkage marking takes place after any pass that would remove them. I have a 
tentative fix in #106146 that depends on this change. Once this is in, I'll 
focus on that. Once that's in, these won't show up in the INLINE case, so I 
moved them to be NOINLINE exclusive.
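
The ordering problem described here can be shown with a toy model. This sketch is not LLVM's pass pipeline: `ToyFn`, `removeDeadInternal`, `markInternal`, and `makeToyModule` are hypothetical names, and the only assumption carried over from the comment is that a dead-function cleanup removes only functions that already have internal linkage.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy function record: name, linkage, and whether any call sites remain.
struct ToyFn {
  std::string Name;
  bool Internal;
  bool HasCallers;
};

// Toy cleanup: drops functions that are internal and have no callers left.
void removeDeadInternal(std::vector<ToyFn> &Module) {
  std::vector<ToyFn> Kept;
  for (const ToyFn &F : Module)
    if (!F.Internal || F.HasCallers)
      Kept.push_back(F);
  Module = Kept;
}

// Toy linkage marking: everything except the entry point becomes internal.
void markInternal(std::vector<ToyFn> &Module, const std::string &EntryName) {
  for (ToyFn &F : Module)
    if (F.Name != EntryName)
      F.Internal = true;
}

// A module after inlining: "swap" has no remaining callers.
std::vector<ToyFn> makeToyModule() {
  std::vector<ToyFn> Module;
  Module.push_back(ToyFn{"main", false, false});
  Module.push_back(ToyFn{"swap", false, false});
  return Module;
}
```

Running `removeDeadInternal` before `markInternal` leaves `swap` in the module, while the reverse order removes it, which is the shape of the bug being described.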

https://github.com/llvm/llvm-project/pull/106588


[clang] [HLSL] set alwaysinline on HLSL functions (PR #106588)

2024-09-09 Thread Greg Roth via cfe-commits


@@ -1239,9 +1239,9 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
   if (getLangOpts().OpenMP && CurCodeDecl)
     CGM.getOpenMPRuntime().emitFunctionProlog(*this, CurCodeDecl);
 
-  if (FD && getLangOpts().HLSL) {
+  if (getLangOpts().HLSL) {
     // Handle emitting HLSL entry functions.
-    if (FD->hasAttr<HLSLShaderAttr>()) {
+    if (FD && FD->hasAttr<HLSLShaderAttr>()) {
       CGM.getHLSLRuntime().emitEntryFunction(FD, Fn);
     }
     CGM.getHLSLRuntime().setHLSLFunctionAttributes(FD, Fn);

pow2clk wrote:

Yeah, the `emitFunctionProlog` form is how it looked in an earlier version of a 
change Helena made. I think it was #102275, which introduced 
`setHLSLFunctionAttributes`. I decided to follow her lead, but I don't know the 
reasoning behind it. 

This becomes moot with the new approach I'm taking. 

https://github.com/llvm/llvm-project/pull/106588


[clang] [HLSL] set alwaysinline on HLSL functions (PR #106588)

2024-09-09 Thread Greg Roth via cfe-commits


@@ -2474,7 +2474,9 @@ void CodeGenModule::SetLLVMFunctionAttributesForDefinition(const Decl *D,
 // If we don't have a declaration to control inlining, the function isn't
 // explicitly marked as alwaysinline for semantic reasons, and inlining is
 // disabled, mark the function as noinline.
+// HLSL functions must be always inlined
 if (!F->hasFnAttribute(llvm::Attribute::AlwaysInline) &&
+!getLangOpts().HLSL &&

pow2clk wrote:

I investigated this in some depth. The autogenerated initialization and 
destruction functions get their attributes set through this call and others 
before their bodies are generated, and it is body generation that calls 
`setHLSLFunctionAttributes`. User functions, by contrast, generate their bodies 
and make the HLSL attributes call before this one. This seems error-prone to 
me, but the code has been around long enough that I'm not ready to go changing 
it, even though some initial experiments showed no regressions. 

In the meantime, since so much inline attribute logic is in this function, I've 
concluded that the best solution is to add our inline logic here and not in 
`setHLSLFunctionAttributes` at all. That way it doesn't matter as much where in 
the IR function creation process it takes place, and any consequences can be 
accounted for along with the existing logic. That's what my latest commit does. 

https://github.com/llvm/llvm-project/pull/106588


[clang] [HLSL] set alwaysinline on HLSL functions (PR #106588)

2024-09-09 Thread Greg Roth via cfe-commits

https://github.com/pow2clk updated 
https://github.com/llvm/llvm-project/pull/106588

>From 12253818bd47aa8c324f6222586965f356b11c90 Mon Sep 17 00:00:00 2001
From: Greg Roth 
Date: Wed, 24 Jul 2024 16:49:19 -0600
Subject: [PATCH 1/6] [HLSL] set alwaysinline on HLSL functions


[clang] [HLSL] set alwaysinline on HLSL functions (PR #106588)

2024-09-09 Thread Greg Roth via cfe-commits

https://github.com/pow2clk updated 
https://github.com/llvm/llvm-project/pull/106588

>From 12253818bd47aa8c324f6222586965f356b11c90 Mon Sep 17 00:00:00 2001
From: Greg Roth 
Date: Wed, 24 Jul 2024 16:49:19 -0600
Subject: [PATCH 1/7] [HLSL] set alwaysinline on HLSL functions


[clang] [HLSL] set alwaysinline on HLSL functions (PR #106588)

2024-09-09 Thread Greg Roth via cfe-commits


@@ -2471,11 +2471,14 @@ void CodeGenModule::SetLLVMFunctionAttributesForDefinition(const Decl *D,
     B.addAttribute(llvm::Attribute::StackProtectReq);
 
   if (!D) {
+    // HLSL functions must always be inlined
+    if (getLangOpts().HLSL && !F->hasFnAttribute("hlsl.shader"))

pow2clk wrote:

In earlier discussion, we resolved to make this the default and not to worry
about noinline yet, since there are a lot of issues to resolve before it can
work properly. In a previous version, I had to check for noinline because it
was a sign that a function had already passed through this function, but my
preference is not to respect it until we've worked out the issues that prevent
it from working correctly.

It wouldn't simplify the logic, since the entry function doesn't get marked
inline until it is passed into this function.
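
To make the policy described above concrete, here is a minimal sketch that
models it without any LLVM dependencies. It is not the code in the patch:
`FnAttrs` and `applyDefaultInlining` are hypothetical names, and the
`IsShaderEntry` flag only stands in for the `"hlsl.shader"` attribute check in
the hunk under review.

```cpp
#include <cassert>

// Hypothetical stand-in for the LLVM function attributes under discussion.
struct FnAttrs {
  bool IsShaderEntry; // models the "hlsl.shader" string attribute
  bool NoInline;
  bool AlwaysInline;
};

// Assumed policy: every HLSL function except the shader entry is forced to
// alwaysinline, and an existing noinline is deliberately not respected.
inline FnAttrs applyDefaultInlining(FnAttrs A) {
  if (!A.IsShaderEntry) {
    A.NoInline = false;
    A.AlwaysInline = true;
  }
  return A;
}
```

Under these assumptions a helper that arrives with noinline set leaves with
alwaysinline instead, while the entry function is left untouched.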

https://github.com/llvm/llvm-project/pull/106588
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [HLSL] set alwaysinline on HLSL functions (PR #106588)

2024-09-09 Thread Greg Roth via cfe-commits

https://github.com/pow2clk updated 
https://github.com/llvm/llvm-project/pull/106588

>From 12253818bd47aa8c324f6222586965f356b11c90 Mon Sep 17 00:00:00 2001
From: Greg Roth 
Date: Wed, 24 Jul 2024 16:49:19 -0600
Subject: [PATCH 1/8] [HLSL] set alwaysinline on HLSL functions

HLSL inlines all its functions by default. This uses the alwaysinline
attribute to force that in the corresponding pass for user functions
by default and overrides the default noinline of some implicit functions.
This makes an instance of explicit inlining for buffer subscripts unnecessary.

Adds tests for function and constructor inlining and augments some existing
tests to verify correct inlining of implicitly created functions as well.

Incidentally restores a RUN line that I believe was mistakenly removed as part
of #88918.

Fixes #89282
---
 clang/lib/CodeGen/CGHLSLRuntime.cpp           |  17 ++-
 clang/lib/CodeGen/CodeGenFunction.cpp         |   4 +-
 clang/lib/Sema/HLSLExternalSemaSource.cpp     |   2 -
 .../GlobalConstructorFunction.hlsl            |  31 +++--
 .../CodeGenHLSL/GlobalConstructorLib.hlsl     |  23 +++-
 clang/test/CodeGenHLSL/GlobalDestructors.hlsl |  51 +---
 .../builtins/RWBuffer-constructor.hlsl        |   1 +
 .../builtins/RWBuffer-subscript.hlsl          |   5 +-
 .../test/CodeGenHLSL/inline-constructors.hlsl |  74 
 clang/test/CodeGenHLSL/inline-functions.hlsl  | 114 ++
 10 files changed, 279 insertions(+), 43 deletions(-)
 create mode 100644 clang/test/CodeGenHLSL/inline-constructors.hlsl
 create mode 100644 clang/test/CodeGenHLSL/inline-functions.hlsl

diff --git a/clang/lib/CodeGen/CGHLSLRuntime.cpp b/clang/lib/CodeGen/CGHLSLRuntime.cpp
index 4bd7b6ba58de0d..24d126ced0d9f7 100644
--- a/clang/lib/CodeGen/CGHLSLRuntime.cpp
+++ b/clang/lib/CodeGen/CGHLSLRuntime.cpp
@@ -414,9 +414,20 @@ void CGHLSLRuntime::emitEntryFunction(const FunctionDecl *FD,
 
 void CGHLSLRuntime::setHLSLFunctionAttributes(const FunctionDecl *FD,
                                               llvm::Function *Fn) {
-  if (FD->isInExportDeclContext()) {
-    const StringRef ExportAttrKindStr = "hlsl.export";
-    Fn->addFnAttr(ExportAttrKindStr);
+  if (FD) { // "explicit" functions with declarations
+    if (FD->isInExportDeclContext()) {
+      const StringRef ExportAttrKindStr = "hlsl.export";
+      Fn->addFnAttr(ExportAttrKindStr);
+    }
+    // Respect noinline if the explicit functions use it
+    // otherwise default to alwaysinline
+    if (!Fn->hasFnAttribute(Attribute::NoInline))
+      Fn->addFnAttr(llvm::Attribute::AlwaysInline);
+  } else { // "implicit" autogenerated functions with no declaration
+    // Implicit functions might get marked as noinline by default
+    // but we override that for HLSL
+    Fn->removeFnAttr(Attribute::NoInline);
+    Fn->addFnAttr(Attribute::AlwaysInline);
   }
 }
 
diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp
index a5747283e98058..aceeed0e66d130 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -1239,9 +1239,9 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
   if (getLangOpts().OpenMP && CurCodeDecl)
     CGM.getOpenMPRuntime().emitFunctionProlog(*this, CurCodeDecl);
 
-  if (FD && getLangOpts().HLSL) {
+  if (getLangOpts().HLSL) {
     // Handle emitting HLSL entry functions.
-    if (FD->hasAttr<HLSLShaderAttr>()) {
+    if (FD && FD->hasAttr<HLSLShaderAttr>()) {
       CGM.getHLSLRuntime().emitEntryFunction(FD, Fn);
     }
     CGM.getHLSLRuntime().setHLSLFunctionAttributes(FD, Fn);
diff --git a/clang/lib/Sema/HLSLExternalSemaSource.cpp b/clang/lib/Sema/HLSLExternalSemaSource.cpp
index 9aacbe4ad9548e..0a534d94192560 100644
--- a/clang/lib/Sema/HLSLExternalSemaSource.cpp
+++ b/clang/lib/Sema/HLSLExternalSemaSource.cpp
@@ -290,8 +290,6 @@ struct BuiltinTypeDeclBuilder {
   SourceLocation()));
     MethodDecl->setLexicalDeclContext(Record);
     MethodDecl->setAccess(AccessSpecifier::AS_public);
-    MethodDecl->addAttr(AlwaysInlineAttr::CreateImplicit(
-        AST, SourceRange(), AlwaysInlineAttr::CXX11_clang_always_inline));
     Record->addDecl(MethodDecl);
 
     return *this;
diff --git a/clang/test/CodeGenHLSL/GlobalConstructorFunction.hlsl b/clang/test/CodeGenHLSL/GlobalConstructorFunction.hlsl
index f954c9d2f029f2..b39311ad67cd62 100644
--- a/clang/test/CodeGenHLSL/GlobalConstructorFunction.hlsl
+++ b/clang/test/CodeGenHLSL/GlobalConstructorFunction.hlsl
@@ -1,4 +1,5 @@
-// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -emit-llvm -disable-llvm-passes %s -o - | FileCheck %s
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -emit-llvm -disable-llvm-passes %s -o - | FileCheck %s --check-prefixes=CHECK,NOINLINE
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -emit-llvm -O0 %s -o - | FileCheck %s --check-prefixes=CHECK,INLINE
 
 int i;
 
@@ -7,7 +8,7 @@ __attribute__((constructor)) void call_me

[clang] [HLSL] set alwaysinline on HLSL functions (PR #106588)

2024-09-10 Thread Greg Roth via cfe-commits


@@ -2471,11 +2471,14 @@ void CodeGenModule::SetLLVMFunctionAttributesForDefinition(const Decl *D,
     B.addAttribute(llvm::Attribute::StackProtectReq);
 
   if (!D) {
+    // HLSL functions must always be inlined
+    if (getLangOpts().HLSL && !F->hasFnAttribute("hlsl.shader"))

pow2clk wrote:

Just an update. In the waning minutes of a design discussion, we resolved to
produce a warning when users apply noinline and to explicitly set noinline on
the outermost entry function. That would allow checking for noinline in place
of hlsl.shader here. It might also allow graceful resolution of any other
functions that happen to get noinline applied here, which we could add an
assert for.
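
One possible reading of that plan, sketched as a small model rather than as
the eventual implementation. Everything here is an assumption made for
illustration: `FnInfo`, `markEntry`, and `setInlineAttr` are hypothetical
names, and the assert encodes the guess that only the entry function is
expected to carry noinline.

```cpp
#include <cassert>

// Hypothetical model of the per-function state involved.
struct FnInfo {
  bool IsEntry;      // outermost shader entry function
  bool NoInline;     // set explicitly on the entry under the proposal
  bool AlwaysInline;
};

// Assumed step 1: the entry function is marked noinline when it is emitted.
inline void markEntry(FnInfo &F) {
  F.IsEntry = true;
  F.NoInline = true;
}

// Assumed step 2: later attribute setup checks noinline instead of a
// dedicated "hlsl.shader" marker.
inline void setInlineAttr(FnInfo &F) {
  if (F.NoInline) {
    // Only the entry function is expected to arrive here with noinline.
    assert(F.IsEntry && "unexpected noinline on a non-entry HLSL function");
    return;
  }
  F.AlwaysInline = true;
}
```

In this model the entry keeps noinline and every other function ends up
alwaysinline; a user-applied noinline would be diagnosed earlier, which the
sketch does not attempt to show.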

https://github.com/llvm/llvm-project/pull/106588


[clang] [HLSL] set alwaysinline on HLSL functions (PR #106588)

2024-09-10 Thread Greg Roth via cfe-commits

https://github.com/pow2clk updated 
https://github.com/llvm/llvm-project/pull/106588

>From 12253818bd47aa8c324f6222586965f356b11c90 Mon Sep 17 00:00:00 2001
From: Greg Roth 
Date: Wed, 24 Jul 2024 16:49:19 -0600
Subject: [PATCH 1/9] [HLSL] set alwaysinline on HLSL functions


[clang] [HLSL] set alwaysinline on HLSL functions (PR #106588)

2024-09-10 Thread Greg Roth via cfe-commits

https://github.com/pow2clk updated 
https://github.com/llvm/llvm-project/pull/106588

>From 12253818bd47aa8c324f6222586965f356b11c90 Mon Sep 17 00:00:00 2001
From: Greg Roth 
Date: Wed, 24 Jul 2024 16:49:19 -0600
Subject: [PATCH 01/10] [HLSL] set alwaysinline on HLSL functions


[clang] [HLSL] set alwaysinline on HLSL functions (PR #106588)

2024-09-11 Thread Greg Roth via cfe-commits

https://github.com/pow2clk updated 
https://github.com/llvm/llvm-project/pull/106588

>From 12253818bd47aa8c324f6222586965f356b11c90 Mon Sep 17 00:00:00 2001
From: Greg Roth 
Date: Wed, 24 Jul 2024 16:49:19 -0600
Subject: [PATCH 01/11] [HLSL] set alwaysinline on HLSL functions


[clang] [llvm] [HLSL][DXIL][SPIRV] Create llvm dot intrinsic and use for HLSL (PR #102872)

2024-08-12 Thread Greg Roth via cfe-commits

https://github.com/pow2clk created 
https://github.com/llvm/llvm-project/pull/102872

Per https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294
`dot` should be an LLVM intrinsic. This adds the llvm intrinsics
and updates HLSL builtin codegen to emit them.

Removed some stale comments that gave the obsolete impression that
type conversions should be expected to match overloads.

With dot moving into an LLVM intrinsic, the lowering to dx-specific
operations doesn't take place until DXIL intrinsic expansion. This
moves the introduction of arity-specific DX opcodes to DXIL
intrinsic expansion.

The new LLVM integer intrinsics replace the previous dx intrinsics.
This updates the DXIL intrinsic expansion code and tests to use and
expect the new integer intrinsics and the flattened DX floating
vector size variants only after op lowering.

Use the new LLVM dot intrinsics to build SPIRV instructions.
This involves generating multiply and add operations for integers
and the existing OpDot operation for floating point. This includes
adding some generic opcodes for signed, unsigned and floats.
These require updating an existing test for all such opcodes.

New tests for generating SPIRV float and integer dot intrinsics are
added as well.

Fixes #88056
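
For reference, the arithmetic the intrinsics are meant to express can be
sketched in plain C++. This is only an illustration of the semantics described
above (a multiply for scalars, multiplies and adds for vectors), not the code
the lowering emits; `dotScalar` and `dotVector` are hypothetical names.

```cpp
#include <array>
#include <cstddef>

// Scalar case: the dot product of two scalars is a plain multiply.
template <typename T> T dotScalar(T A, T B) { return A * B; }

// Vector case: multiply element-wise, then accumulate with adds.
template <typename T, std::size_t N>
T dotVector(const std::array<T, N> &A, const std::array<T, N> &B) {
  static_assert(N > 0, "dot product needs at least one element");
  T Sum = A[0] * B[0];
  for (std::size_t I = 1; I < N; ++I)
    Sum += A[I] * B[I];
  return Sum;
}
```

For example, the vectors (1, 2, 3) and (4, 5, 6) give 1*4 + 2*5 + 3*6 = 32.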

>From 6fde4bc98d0156024cf7acc27e2e986b9bec3993 Mon Sep 17 00:00:00 2001
From: Greg Roth 
Date: Fri, 2 Aug 2024 20:10:04 -0600
Subject: [PATCH 1/3] Create llvm dot intrinsic

Per https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294
`dot` should be an LLVM intrinsic. This adds the llvm intrinsics
and updates HLSL builtin codegen to emit them.

Removed some stale comments that gave the obsolete impression that
type conversions should be expected to match overloads.

With dot moving into an LLVM intrinsic, the lowering to dx-specific
operations doesn't take place until DXIL intrinsic expansion. This
moves the introduction of arity-specific DX opcodes to DXIL
intrinsic expansion.

Part of #88056
---
 clang/lib/CodeGen/CGBuiltin.cpp   |  47 +++--
 .../CodeGenHLSL/builtins/dot-builtin.hlsl |  12 +-
 clang/test/CodeGenHLSL/builtins/dot.hlsl  | 160 +-
 llvm/include/llvm/IR/Intrinsics.td|   9 +
 .../Target/DirectX/DXILIntrinsicExpansion.cpp |  61 +--
 5 files changed, 159 insertions(+), 130 deletions(-)

diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 7fe80b0cbdfbfa..67148e32014ed2 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -18470,22 +18470,14 @@ llvm::Value *CodeGenFunction::EmitScalarOrConstFoldImmArg(unsigned ICEArguments,
   return Arg;
 }
 
-Intrinsic::ID getDotProductIntrinsic(QualType QT, int elementCount) {
-  if (QT->hasFloatingRepresentation()) {
-    switch (elementCount) {
-    case 2:
-      return Intrinsic::dx_dot2;
-    case 3:
-      return Intrinsic::dx_dot3;
-    case 4:
-      return Intrinsic::dx_dot4;
-    }
-  }
-  if (QT->hasSignedIntegerRepresentation())
-    return Intrinsic::dx_sdot;
-
-  assert(QT->hasUnsignedIntegerRepresentation());
-  return Intrinsic::dx_udot;
+// Return dot product intrinsic that corresponds to the QT scalar type
+Intrinsic::ID getDotProductIntrinsic(QualType QT) {
+  if (QT->isFloatingType())
+    return Intrinsic::fdot;
+  if (QT->isSignedIntegerType())
+    return Intrinsic::sdot;
+  assert(QT->isUnsignedIntegerType());
+  return Intrinsic::udot;
 }
 
 Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID,
@@ -18528,37 +18520,38 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID,
     Value *Op1 = EmitScalarExpr(E->getArg(1));
     llvm::Type *T0 = Op0->getType();
     llvm::Type *T1 = Op1->getType();
+
+    // If the arguments are scalars, just emit a multiply
     if (!T0->isVectorTy() && !T1->isVectorTy()) {
       if (T0->isFloatingPointTy())
-        return Builder.CreateFMul(Op0, Op1, "dx.dot");
+        return Builder.CreateFMul(Op0, Op1, "dot");
 
       if (T0->isIntegerTy())
-        return Builder.CreateMul(Op0, Op1, "dx.dot");
+        return Builder.CreateMul(Op0, Op1, "dot");
 
-      // Bools should have been promoted
       llvm_unreachable(
           "Scalar dot product is only supported on ints and floats.");
     }
+    // For vectors, validate types and emit the appropriate intrinsic
+
     // A VectorSplat should have happened
     assert(T0->isVectorTy() && T1->isVectorTy() &&
            "Dot product of vector and scalar is not supported.");
 
-    // A vector sext or sitofp should have happened
-    assert(T0->getScalarType() == T1->getScalarType() &&
-           "Dot product of vectors need the same element types.");
-
     auto *VecTy0 = E->getArg(0)->getType()->getAs<VectorType>();
     [[maybe_unused]] auto *VecTy1 =
         E->getArg(1)->getType()->getAs<VectorType>();
-    // A HLSLVectorTruncation should have happend
+
+    assert(VecTy0->getElementType() == VecTy1->getElementType() &&
+           "Dot product of vectors need the same element ty

[clang] [llvm] [HLSL][DXIL][SPIRV] Create llvm dot intrinsic and use for HLSL (PR #102872)

2024-08-12 Thread Greg Roth via cfe-commits

https://github.com/pow2clk commented:

The three commits are independently committable, but this is the grouping 
@farzonl and I agreed on. Reviewing them individually still might make this 
easier: 

1. Create llvm dot intrinsic (6fde4bc98d0156024cf7acc27e2e986b9bec3993)
2. Update DX intrinsic expansion for new llvm intrinsics (7ca6bc5940321c18f5634bb960fa795366097e45)
3. Add SPIRV generation for HLSL dot (490c0c05c5762a622d037b472c85234ce3f39c96)

https://github.com/llvm/llvm-project/pull/102872


[clang] [llvm] [HLSL][DXIL][SPIRV] Create llvm dot intrinsic and use for HLSL (PR #102872)

2024-08-12 Thread Greg Roth via cfe-commits

https://github.com/pow2clk edited 
https://github.com/llvm/llvm-project/pull/102872


[clang] [llvm] [HLSL][DXIL][SPIRV] Create llvm dot intrinsic and use for HLSL (PR #102872)

2024-08-12 Thread Greg Roth via cfe-commits


@@ -18528,37 +18520,38 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID,
     Value *Op1 = EmitScalarExpr(E->getArg(1));
     llvm::Type *T0 = Op0->getType();
     llvm::Type *T1 = Op1->getType();
+
+    // If the arguments are scalars, just emit a multiply
     if (!T0->isVectorTy() && !T1->isVectorTy()) {
       if (T0->isFloatingPointTy())
-        return Builder.CreateFMul(Op0, Op1, "dx.dot");
+        return Builder.CreateFMul(Op0, Op1, "dot");
 
       if (T0->isIntegerTy())
-        return Builder.CreateMul(Op0, Op1, "dx.dot");
+        return Builder.CreateMul(Op0, Op1, "dot");
 
-      // Bools should have been promoted
       llvm_unreachable(
           "Scalar dot product is only supported on ints and floats.");
     }
+    // For vectors, validate types and emit the appropriate intrinsic
+
     // A VectorSplat should have happened
     assert(T0->isVectorTy() && T1->isVectorTy() &&
            "Dot product of vector and scalar is not supported.");
 
-    // A vector sext or sitofp should have happened
-    assert(T0->getScalarType() == T1->getScalarType() &&
-           "Dot product of vectors need the same element types.");
-
     auto *VecTy0 = E->getArg(0)->getType()->getAs<VectorType>();
     [[maybe_unused]] auto *VecTy1 =
         E->getArg(1)->getType()->getAs<VectorType>();
-    // A HLSLVectorTruncation should have happend
+
+    assert(VecTy0->getElementType() == VecTy1->getElementType() &&

pow2clk wrote:

Switched to Clang types so that the check also matches the signedness of
integers, which LLVM integer types do not carry.
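
A small model of why the comparison moved, offered as a sketch only. The
structs below are hypothetical stand-ins, not Clang or LLVM classes: the LLVM
side records just a bit width because LLVM integer types are signless, while
the Clang side also records signedness.

```cpp
#include <cassert>

// Hypothetical stand-in for an LLVM integer type: width only, no sign.
struct LLVMIntTy {
  unsigned Bits;
};

// Hypothetical stand-in for a Clang integer type: width plus signedness.
struct ClangIntTy {
  unsigned Bits;
  bool IsSigned;
};

// Lowering to the LLVM-level model drops the signedness.
inline LLVMIntTy lower(ClangIntTy T) { return LLVMIntTy{T.Bits}; }

inline bool sameLLVMType(LLVMIntTy A, LLVMIntTy B) { return A.Bits == B.Bits; }

inline bool sameClangType(ClangIntTy A, ClangIntTy B) {
  return A.Bits == B.Bits && A.IsSigned == B.IsSigned;
}
```

With these definitions a 32-bit signed and a 32-bit unsigned element compare
equal after lowering but differ at the Clang level, which is the mismatch the
new assert is able to catch.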

https://github.com/llvm/llvm-project/pull/102872


[clang] [llvm] [HLSL][DXIL][SPIRV] Create llvm dot intrinsic and use for HLSL (PR #102872)

2024-08-12 Thread Greg Roth via cfe-commits


@@ -659,7 +659,7 @@ def Dot3 :  DXILOp<55, dot3> {
 
 def Dot4 :  DXILOp<56, dot4> {
   let Doc = "dot product of two float vectors Dot(a,b) = a[0]*b[0] + ... + "
-"a[n]*b[n] where n is between 0 and 3";
+"a[n]*b[n] where n is 0 to 3 inclusive";

pow2clk wrote:

Just something incidental as I found these descriptions misleading since the 
only numbers "between" 0 and 3 are 1 and 2.
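
The inclusive range can be spelled out as a loop. This is just a sketch of the
formula in the description, with `dot4` as a hypothetical helper name rather
than anything in the DXIL backend.

```cpp
#include <array>
#include <cstddef>

// Dot(a, b) = a[0]*b[0] + ... + a[n]*b[n], with n running from 0 to 3
// inclusive, so all four elements contribute.
inline float dot4(const std::array<float, 4> &A,
                  const std::array<float, 4> &B) {
  float Sum = 0.0f;
  for (std::size_t N = 0; N <= 3; ++N)
    Sum += A[N] * B[N];
  return Sum;
}
```

Reading "between 0 and 3" strictly would leave only indices 1 and 2, which is
not what the operation computes.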

https://github.com/llvm/llvm-project/pull/102872


[clang] [llvm] [HLSL][DXIL][SPIRV] Create llvm dot intrinsic and use for HLSL (PR #102872)

2024-08-12 Thread Greg Roth via cfe-commits


@@ -7,155 +7,155 @@
 // RUN:   -o - | FileCheck %s --check-prefixes=CHECK,NO_HALF
 
 #ifdef __HLSL_ENABLE_16_BIT
-// NATIVE_HALF: %dx.dot = mul i16 %0, %1
-// NATIVE_HALF: ret i16 %dx.dot
+// NATIVE_HALF: %dot = mul i16 %0, %1
+// NATIVE_HALF: ret i16 %dot
 int16_t test_dot_short(int16_t p0, int16_t p1) { return dot(p0, p1); }
 
-// NATIVE_HALF: %dx.dot = call i16 @llvm.dx.sdot.v2i16(<2 x i16> %0, <2 x i16> %1)
-// NATIVE_HALF: ret i16 %dx.dot
+// NATIVE_HALF: %dot = call i16 @llvm.sdot.v2i16(<2 x i16> %0, <2 x i16> %1)
+// NATIVE_HALF: ret i16 %dot
 int16_t test_dot_short2(int16_t2 p0, int16_t2 p1) { return dot(p0, p1); }
 
-// NATIVE_HALF: %dx.dot = call i16 @llvm.dx.sdot.v3i16(<3 x i16> %0, <3 x i16> %1)
-// NATIVE_HALF: ret i16 %dx.dot
+// NATIVE_HALF: %dot = call i16 @llvm.sdot.v3i16(<3 x i16> %0, <3 x i16> %1)
+// NATIVE_HALF: ret i16 %dot
 int16_t test_dot_short3(int16_t3 p0, int16_t3 p1) { return dot(p0, p1); }
 
-// NATIVE_HALF: %dx.dot = call i16 @llvm.dx.sdot.v4i16(<4 x i16> %0, <4 x i16> %1)
-// NATIVE_HALF: ret i16 %dx.dot
+// NATIVE_HALF: %dot = call i16 @llvm.sdot.v4i16(<4 x i16> %0, <4 x i16> %1)
+// NATIVE_HALF: ret i16 %dot
 int16_t test_dot_short4(int16_t4 p0, int16_t4 p1) { return dot(p0, p1); }
 
-// NATIVE_HALF: %dx.dot = mul i16 %0, %1
-// NATIVE_HALF: ret i16 %dx.dot
+// NATIVE_HALF: %dot = mul i16 %0, %1
+// NATIVE_HALF: ret i16 %dot
 uint16_t test_dot_ushort(uint16_t p0, uint16_t p1) { return dot(p0, p1); }
 
-// NATIVE_HALF: %dx.dot = call i16 @llvm.dx.udot.v2i16(<2 x i16> %0, <2 x i16> %1)
-// NATIVE_HALF: ret i16 %dx.dot
+// NATIVE_HALF: %dot = call i16 @llvm.udot.v2i16(<2 x i16> %0, <2 x i16> %1)
+// NATIVE_HALF: ret i16 %dot
 uint16_t test_dot_ushort2(uint16_t2 p0, uint16_t2 p1) { return dot(p0, p1); }
 
-// NATIVE_HALF: %dx.dot = call i16 @llvm.dx.udot.v3i16(<3 x i16> %0, <3 x i16> %1)
-// NATIVE_HALF: ret i16 %dx.dot
+// NATIVE_HALF: %dot = call i16 @llvm.udot.v3i16(<3 x i16> %0, <3 x i16> %1)
+// NATIVE_HALF: ret i16 %dot
 uint16_t test_dot_ushort3(uint16_t3 p0, uint16_t3 p1) { return dot(p0, p1); }
 
-// NATIVE_HALF: %dx.dot = call i16 @llvm.dx.udot.v4i16(<4 x i16> %0, <4 x i16> %1)
-// NATIVE_HALF: ret i16 %dx.dot
+// NATIVE_HALF: %dot = call i16 @llvm.udot.v4i16(<4 x i16> %0, <4 x i16> %1)
+// NATIVE_HALF: ret i16 %dot
 uint16_t test_dot_ushort4(uint16_t4 p0, uint16_t4 p1) { return dot(p0, p1); }
 #endif
 
-// CHECK: %dx.dot = mul i32 %0, %1
-// CHECK: ret i32 %dx.dot
+// CHECK: %dot = mul i32 %0, %1
+// CHECK: ret i32 %dot
 int test_dot_int(int p0, int p1) { return dot(p0, p1); }
 
-// CHECK: %dx.dot = call i32 @llvm.dx.sdot.v2i32(<2 x i32> %0, <2 x i32> %1)
-// CHECK: ret i32 %dx.dot
+// CHECK: %dot = call i32 @llvm.sdot.v2i32(<2 x i32> %0, <2 x i32> %1)
+// CHECK: ret i32 %dot
 int test_dot_int2(int2 p0, int2 p1) { return dot(p0, p1); }
 
-// CHECK: %dx.dot = call i32 @llvm.dx.sdot.v3i32(<3 x i32> %0, <3 x i32> %1)
-// CHECK: ret i32 %dx.dot
+// CHECK: %dot = call i32 @llvm.sdot.v3i32(<3 x i32> %0, <3 x i32> %1)
+// CHECK: ret i32 %dot
 int test_dot_int3(int3 p0, int3 p1) { return dot(p0, p1); }
 
-// CHECK: %dx.dot = call i32 @llvm.dx.sdot.v4i32(<4 x i32> %0, <4 x i32> %1)
-// CHECK: ret i32 %dx.dot
+// CHECK: %dot = call i32 @llvm.sdot.v4i32(<4 x i32> %0, <4 x i32> %1)
+// CHECK: ret i32 %dot
 int test_dot_int4(int4 p0, int4 p1) { return dot(p0, p1); }
 
-// CHECK: %dx.dot = mul i32 %0, %1
-// CHECK: ret i32 %dx.dot
+// CHECK: %dot = mul i32 %0, %1
+// CHECK: ret i32 %dot
 uint test_dot_uint(uint p0, uint p1) { return dot(p0, p1); }
 
-// CHECK: %dx.dot = call i32 @llvm.dx.udot.v2i32(<2 x i32> %0, <2 x i32> %1)
-// CHECK: ret i32 %dx.dot
+// CHECK: %dot = call i32 @llvm.udot.v2i32(<2 x i32> %0, <2 x i32> %1)
+// CHECK: ret i32 %dot
 uint test_dot_uint2(uint2 p0, uint2 p1) { return dot(p0, p1); }
 
-// CHECK: %dx.dot = call i32 @llvm.dx.udot.v3i32(<3 x i32> %0, <3 x i32> %1)
-// CHECK: ret i32 %dx.dot
+// CHECK: %dot = call i32 @llvm.udot.v3i32(<3 x i32> %0, <3 x i32> %1)
+// CHECK: ret i32 %dot
 uint test_dot_uint3(uint3 p0, uint3 p1) { return dot(p0, p1); }
 
-// CHECK: %dx.dot = call i32 @llvm.dx.udot.v4i32(<4 x i32> %0, <4 x i32> %1)
-// CHECK: ret i32 %dx.dot
+// CHECK: %dot = call i32 @llvm.udot.v4i32(<4 x i32> %0, <4 x i32> %1)
+// CHECK: ret i32 %dot
 uint test_dot_uint4(uint4 p0, uint4 p1) { return dot(p0, p1); }
 
-// CHECK: %dx.dot = mul i64 %0, %1
-// CHECK: ret i64 %dx.dot
+// CHECK: %dot = mul i64 %0, %1
+// CHECK: ret i64 %dot
 int64_t test_dot_long(int64_t p0, int64_t p1) { return dot(p0, p1); }
 
-// CHECK: %dx.dot = call i64 @llvm.dx.sdot.v2i64(<2 x i64> %0, <2 x i64> %1)
-// CHECK: ret i64 %dx.dot
+// CHECK: %dot = call i64 @llvm.sdot.v2i64(<2 x i64> %0, <2 x i64> %1)
+// CHECK: ret i64 %dot
 int64_t test_dot_long2(int64_t2 p0, int64_t2 p1) { return dot(p0, p1); }
 
-// CHECK: %dx.dot = call i64 @llvm.dx.sdot.v3i64(<3 x i64> %0, <3 x i64> %1)
-// CHECK: ret i64 %dx.dot
+// CHECK: %dot = call i64 @llvm.sdot.v3i64(<3 x i64> %

[clang] [llvm] [HLSL][DXIL][SPIRV] Create llvm dot intrinsic and use for HLSL (PR #102872)

2024-08-12 Thread Greg Roth via cfe-commits


@@ -380,6 +383,20 @@ bool SPIRVInstructionSelector::spvSelect(Register ResVReg,
   MIB.addImm(V);
 return MIB.constrainAllUses(TII, TRI, RBI);
   }
+
+  case TargetOpcode::G_FDOTPROD: {
+MachineBasicBlock &BB = *I.getParent();
+return BuildMI(BB, I, I.getDebugLoc(), TII.get(SPIRV::OpDot))
+.addDef(ResVReg)
+.addUse(GR.getSPIRVTypeID(ResType))
+.addUse(I.getOperand(1).getReg())
+.addUse(I.getOperand(2).getReg())
+.constrainAllUses(TII, TRI, RBI);
+  }

pow2clk wrote:

There is a similar implementation here:
https://github.com/llvm/llvm-project/blob/a0241e710fcae9f439e57d3a294b1ace97c6906c/llvm/lib/Target/SPIRV/SPIRVBuiltins.cpp#L1524,
but I'm not sure if they are mergeable and this is what was discussed.

https://github.com/llvm/llvm-project/pull/102872


[clang] [llvm] [HLSL][DXIL][SPIRV] Create llvm dot intrinsic and use for HLSL (PR #102872)

2024-08-12 Thread Greg Roth via cfe-commits


@@ -1057,6 +1057,27 @@ def G_FTANH : GenericInstruction {
   let hasSideEffects = false;
 }
 
+/// Floating point vector dot product
+def G_FDOTPROD : GenericInstruction {
+  let OutOperandList = (outs type0:$dst);
+  let InOperandList = (ins type0:$src1, type0:$src2);
+  let hasSideEffects = false;
+}
+
+/// Signed integer vector dot product
+def G_SDOTPROD : GenericInstruction {
+  let OutOperandList = (outs type0:$dst);
+  let InOperandList = (ins type0:$src1, type0:$src2);
+  let hasSideEffects = false;
+}
+
+/// Unsigned integer vector dot product
+def G_UDOTPROD : GenericInstruction {

pow2clk wrote:

The unwieldy names are needed because G_UDOT and G_SDOT clash with existing
AArch64 intrinsics that take three arguments, one of which is an accumulated
inout parameter.
https://github.com/llvm/llvm-project/blob/908c89e04b6019bdb08bb5f1c861af42046db623/llvm/lib/Target/AArch64/AArch64InstrGISel.td#L254
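
To illustrate the difference in shape, here is a minimal C++ sketch. It is not LLVM code: `udotprod` and `udotAccumulate` are hypothetical names, and the scalar loop is a simplified model that only shows the operand count, not the lane-wise behavior of the real AArch64 instructions.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Two-operand form (hypothetical model of the shape G_UDOTPROD describes):
// takes two vectors and returns a single scalar.
template <std::size_t N>
std::uint32_t udotprod(const std::array<std::uint32_t, N> &a,
                       const std::array<std::uint32_t, N> &b) {
  std::uint32_t sum = 0;
  for (std::size_t i = 0; i < N; ++i)
    sum += a[i] * b[i];
  return sum;
}

// Three-operand form (hypothetical model of an accumulating dot product):
// the extra first operand is an accumulator that the result is added to.
template <std::size_t N>
std::uint32_t udotAccumulate(std::uint32_t acc,
                             const std::array<std::uint32_t, N> &a,
                             const std::array<std::uint32_t, N> &b) {
  return acc + udotprod(a, b);
}
```

Under this reading, the two forms have different operand lists, so they could not share one opcode name.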

https://github.com/llvm/llvm-project/pull/102872


[clang] [llvm] [HLSL][DXIL][SPIRV] Create llvm dot intrinsic and use for HLSL (PR #102872)

2024-08-12 Thread Greg Roth via cfe-commits

https://github.com/pow2clk updated 
https://github.com/llvm/llvm-project/pull/102872

>From 6fde4bc98d0156024cf7acc27e2e986b9bec3993 Mon Sep 17 00:00:00 2001
From: Greg Roth 
Date: Fri, 2 Aug 2024 20:10:04 -0600
Subject: [PATCH 1/3] Create llvm dot intrinsic

Per https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294
`dot` should be an LLVM intrinsic. This adds the llvm intrinsics
and updates HLSL builtin codegen to emit them.

Removed some stale comments that gave the obsolete impression that
type conversions should be expected to match overloads.

With dot moving into an LLVM intrinsic, the lowering to dx-specific
operations doesn't take place until DXIL intrinsic expansion. This
moves the introduction of arity-specific DX opcodes to DXIL
intrinsic expansion.

Part of #88056
---
 clang/lib/CodeGen/CGBuiltin.cpp   |  47 +++--
 .../CodeGenHLSL/builtins/dot-builtin.hlsl |  12 +-
 clang/test/CodeGenHLSL/builtins/dot.hlsl  | 160 +-
 llvm/include/llvm/IR/Intrinsics.td|   9 +
 .../Target/DirectX/DXILIntrinsicExpansion.cpp |  61 +--
 5 files changed, 159 insertions(+), 130 deletions(-)

diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 7fe80b0cbdfbfa..67148e32014ed2 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -18470,22 +18470,14 @@ llvm::Value *CodeGenFunction::EmitScalarOrConstFoldImmArg(unsigned ICEArguments,
   return Arg;
 }
 
-Intrinsic::ID getDotProductIntrinsic(QualType QT, int elementCount) {
-  if (QT->hasFloatingRepresentation()) {
-switch (elementCount) {
-case 2:
-  return Intrinsic::dx_dot2;
-case 3:
-  return Intrinsic::dx_dot3;
-case 4:
-  return Intrinsic::dx_dot4;
-}
-  }
-  if (QT->hasSignedIntegerRepresentation())
-return Intrinsic::dx_sdot;
-
-  assert(QT->hasUnsignedIntegerRepresentation());
-  return Intrinsic::dx_udot;
+// Return dot product intrinsic that corresponds to the QT scalar type
+Intrinsic::ID getDotProductIntrinsic(QualType QT) {
+  if (QT->isFloatingType())
+return Intrinsic::fdot;
+  if (QT->isSignedIntegerType())
+return Intrinsic::sdot;
+  assert(QT->isUnsignedIntegerType());
+  return Intrinsic::udot;
 }
 
 Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID,
@@ -18528,37 +18520,38 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID,
 Value *Op1 = EmitScalarExpr(E->getArg(1));
 llvm::Type *T0 = Op0->getType();
 llvm::Type *T1 = Op1->getType();
+
+// If the arguments are scalars, just emit a multiply
 if (!T0->isVectorTy() && !T1->isVectorTy()) {
   if (T0->isFloatingPointTy())
-return Builder.CreateFMul(Op0, Op1, "dx.dot");
+return Builder.CreateFMul(Op0, Op1, "dot");
 
   if (T0->isIntegerTy())
-return Builder.CreateMul(Op0, Op1, "dx.dot");
+return Builder.CreateMul(Op0, Op1, "dot");
 
-  // Bools should have been promoted
   llvm_unreachable(
   "Scalar dot product is only supported on ints and floats.");
 }
+// For vectors, validate types and emit the appropriate intrinsic
+
 // A VectorSplat should have happened
 assert(T0->isVectorTy() && T1->isVectorTy() &&
"Dot product of vector and scalar is not supported.");
 
-// A vector sext or sitofp should have happened
-assert(T0->getScalarType() == T1->getScalarType() &&
-   "Dot product of vectors need the same element types.");
-
 auto *VecTy0 = E->getArg(0)->getType()->getAs<VectorType>();
 [[maybe_unused]] auto *VecTy1 =
 E->getArg(1)->getType()->getAs<VectorType>();
-// A HLSLVectorTruncation should have happend
+
+assert(VecTy0->getElementType() == VecTy1->getElementType() &&
+   "Dot product of vectors need the same element types.");
+
 assert(VecTy0->getNumElements() == VecTy1->getNumElements() &&
"Dot product requires vectors to be of the same size.");
 
 return Builder.CreateIntrinsic(
 /*ReturnType=*/T0->getScalarType(),
-getDotProductIntrinsic(E->getArg(0)->getType(),
-   VecTy0->getNumElements()),
-ArrayRef{Op0, Op1}, nullptr, "dx.dot");
+getDotProductIntrinsic(VecTy0->getElementType()),
+ArrayRef{Op0, Op1}, nullptr, "dot");
   } break;
   case Builtin::BI__builtin_hlsl_lerp: {
 Value *X = EmitScalarExpr(E->getArg(0));
diff --git a/clang/test/CodeGenHLSL/builtins/dot-builtin.hlsl b/clang/test/CodeGenHLSL/builtins/dot-builtin.hlsl
index b0b95074c972d5..6036f9430db4f0 100644
--- a/clang/test/CodeGenHLSL/builtins/dot-builtin.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/dot-builtin.hlsl
@@ -2,8 +2,8 @@
 
 // CHECK-LABEL: builtin_bool_to_float_type_promotion
 // CHECK: %conv1 = uitofp i1 %loadedv to double
-// CHECK: %dx.dot = fmul double %conv, %conv1
-// CHECK: %conv2 = fptrunc double %dx.dot to float
+// CHECK: %dot = fmul double %conv, %conv1
+// CHECK: %conv2 = fptrunc double %dot to float
 

[clang] [llvm] [HLSL][DXIL][SPIRV] Create llvm dot intrinsic and use for HLSL (PR #102872)

2024-08-12 Thread Greg Roth via cfe-commits


@@ -1366,6 +1383,67 @@ bool SPIRVInstructionSelector::selectRsqrt(Register ResVReg,
   .constrainAllUses(TII, TRI, RBI);
 }
 
+// Since there is no integer dot implementation, expand by piecewise multiplying

pow2clk wrote:

Those are fairly recent SPIRV extensions. I didn't think incorporating them was 
within scope. 
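
For readers unfamiliar with the expansion being discussed, here is a minimal C++ sketch of the idea. It is not the SPIR-V backend code: `expandIntegerDot` is a hypothetical helper, and the summation step is my assumption, since the quoted comment is cut off after "piecewise multiplying".

```cpp
#include <array>
#include <cstddef>
#include <type_traits>

// Hypothetical helper: computes an integer dot product as N element-wise
// multiplies followed by a chain of N-1 adds, instead of one dot instruction.
template <typename T, std::size_t N>
T expandIntegerDot(const std::array<T, N> &a, const std::array<T, N> &b) {
  static_assert(std::is_integral_v<T>, "integer element types only");
  static_assert(N > 0, "need at least one element");

  // Step 1: piecewise multiply the two vectors.
  std::array<T, N> products{};
  for (std::size_t i = 0; i < N; ++i)
    products[i] = static_cast<T>(a[i] * b[i]);

  // Step 2: start from the first product and add the remaining ones in order.
  T result = products[0];
  for (std::size_t i = 1; i < N; ++i)
    result = static_cast<T>(result + products[i]);
  return result;
}
```

The sketch ignores overflow behavior, which the real lowering would have to define for signed types.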

https://github.com/llvm/llvm-project/pull/102872


[clang] [llvm] [HLSL][DXIL][SPIRV] Create llvm dot intrinsic and use for HLSL (PR #102872)

2024-08-12 Thread Greg Roth via cfe-commits


@@ -18470,22 +18470,14 @@ llvm::Value *CodeGenFunction::EmitScalarOrConstFoldImmArg(unsigned ICEArguments,
   return Arg;
 }
 
-Intrinsic::ID getDotProductIntrinsic(QualType QT, int elementCount) {
-  if (QT->hasFloatingRepresentation()) {
-switch (elementCount) {
-case 2:
-  return Intrinsic::dx_dot2;
-case 3:
-  return Intrinsic::dx_dot3;
-case 4:
-  return Intrinsic::dx_dot4;
-}
-  }
-  if (QT->hasSignedIntegerRepresentation())
-return Intrinsic::dx_sdot;
-
-  assert(QT->hasUnsignedIntegerRepresentation());
-  return Intrinsic::dx_udot;
+// Return dot product intrinsic that corresponds to the QT scalar type
+Intrinsic::ID getDotProductIntrinsic(QualType QT) {
+  if (QT->isFloatingType())
+return Intrinsic::fdot;
+  if (QT->isSignedIntegerType())
+return Intrinsic::sdot;

pow2clk wrote:

Justin's proposal didn't say explicitly, but it linked to the [HLSL
documentation](https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-dot),
which explicitly includes integers. In the discussion, there was an [explicit
request](https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294/3) for
integer versions that no one objected to there.
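
As a rough illustration of the three-way split in the hunk above, here is a self-contained C++ sketch that mirrors the float/signed/unsigned selection using standard type traits. `DotKind` and `selectDotKind` are hypothetical names for illustration only; the actual patch works on Clang `QualType` values, not C++ template parameters.

```cpp
#include <type_traits>

// Hypothetical stand-ins for the fdot, sdot, and udot intrinsic IDs.
enum class DotKind { FDot, SDot, UDot };

// Pick the dot product flavor from the element type, checking floating point
// first, then signed integer, and finally requiring unsigned integer.
template <typename T>
constexpr DotKind selectDotKind() {
  if constexpr (std::is_floating_point_v<T>) {
    return DotKind::FDot;
  } else if constexpr (std::is_signed_v<T>) {
    return DotKind::SDot;
  } else {
    static_assert(std::is_unsigned_v<T>,
                  "dot product needs a float or integer element type");
    return DotKind::UDot;
  }
}
```

The order of the checks matters in this sketch because `std::is_signed_v` is also true for floating-point types.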

https://github.com/llvm/llvm-project/pull/102872


[clang] [llvm] [HLSL][DXIL][SPIRV] Create llvm dot intrinsic and use for HLSL (PR #102872)

2024-08-12 Thread Greg Roth via cfe-commits


@@ -1045,6 +1045,15 @@ let IntrProperties = [IntrNoMem, IntrSpeculatable, IntrWillReturn] in {
   def int_nearbyint : DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>]>;
   def int_round : DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>]>;
   def int_roundeven: DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>]>;
+  def int_udot : Intrinsic<[LLVMVectorElementType<0>],
+   [llvm_anyint_ty, LLVMScalarOrSameVectorWidth<0, LLVMVectorElementType<0>>],
+   [IntrNoMem, IntrWillReturn, Commutative] >;
+  def int_sdot : Intrinsic<[LLVMVectorElementType<0>],
+   [llvm_anyint_ty, LLVMScalarOrSameVectorWidth<0, LLVMVectorElementType<0>>],
+   [IntrNoMem, IntrWillReturn, Commutative] >;
+  def int_fdot : Intrinsic<[LLVMVectorElementType<0>],

pow2clk wrote:

The default properties seem to be 

* `IntrNoCallBack`
* `IntrNoSync`
* `IntrNoFree`
* `IntrWillReturn`

It seems we need to set `IntrNoMem` too then.

https://github.com/llvm/llvm-project/pull/102872


[clang] [llvm] [HLSL][DXIL][SPIRV] Create llvm dot intrinsic and use for HLSL (PR #102872)

2024-08-12 Thread Greg Roth via cfe-commits

https://github.com/pow2clk updated 
https://github.com/llvm/llvm-project/pull/102872


[clang] [llvm] [HLSL][DXIL][SPIRV] Create llvm dot intrinsic and use for HLSL (PR #102872)

2024-08-16 Thread Greg Roth via cfe-commits

https://github.com/pow2clk updated 
https://github.com/llvm/llvm-project/pull/102872


[clang] Correct confusing header in HLSLDocs (PR #100017)

2024-07-22 Thread Greg Roth via cfe-commits

https://github.com/pow2clk updated 
https://github.com/llvm/llvm-project/pull/100017

>From f325499de6336807b0d56696356a3e11c7a26ac3 Mon Sep 17 00:00:00 2001
From: Greg Roth 
Date: Mon, 22 Jul 2024 17:39:47 -0600
Subject: [PATCH 1/2] Correct confusing header in HLSLDocs

Because AvailabilityDiagnostics.rst mistakenly overlined the "Examples"
section, it was included in the generated HLSLDocs page.
By demoting it to a subheader, it shouldn't show up as a top-level
HLSLDocs page.
---
 clang/docs/HLSL/AvailabilityDiagnostics.rst | 1 -
 1 file changed, 1 deletion(-)

diff --git a/clang/docs/HLSL/AvailabilityDiagnostics.rst b/clang/docs/HLSL/AvailabilityDiagnostics.rst
index bb9d02f21dde6..7ce82c1946b87 100644
--- a/clang/docs/HLSL/AvailabilityDiagnostics.rst
+++ b/clang/docs/HLSL/AvailabilityDiagnostics.rst
@@ -52,7 +52,6 @@ If the compilation target is a shader library, only availability based on shader
 
 As a result, availability based on specific shader stage will only be diagnosed in code that is reachable from a shader entry point or library export function. It also means that function bodies might be scanned multiple time. When that happens, care should be taken not to produce duplicated diagnostics.
 
-
 Examples
 
 

>From 9351ea72d247c8f5db513e3ea18116e4df9aad93 Mon Sep 17 00:00:00 2001
From: Greg Roth 
Date: Mon, 22 Jul 2024 17:57:35 -0600
Subject: [PATCH 2/2] Add another case with multiple top-level headers

ExpectedDifference sections were bleeding into the top page too
---
 clang/docs/HLSL/ExpectedDifferences.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/clang/docs/HLSL/ExpectedDifferences.rst b/clang/docs/HLSL/ExpectedDifferences.rst
index a29b6348e0b8e..4782eb3cda754 100644
--- a/clang/docs/HLSL/ExpectedDifferences.rst
+++ b/clang/docs/HLSL/ExpectedDifferences.rst
@@ -1,4 +1,4 @@
-
+===
 Expected Differences vs DXC and FXC
 ===
 



[clang] Correct confusing headers in HLSLDocs (PR #100017)

2024-07-22 Thread Greg Roth via cfe-commits

https://github.com/pow2clk edited 
https://github.com/llvm/llvm-project/pull/100017


[clang] [HLSL] (DRAFT) Another way to implement #92071: [HLSL] Default linkage of HLSL function should be internal (PR #95331)

2024-07-24 Thread Greg Roth via cfe-commits


@@ -353,6 +353,23 @@ llvm::Value *CGHLSLRuntime::emitInputSemantic(IRBuilder<> &B,
   return nullptr;
 }
 
+void CGHLSLRuntime::emitFunctionProlog(const FunctionDecl *FD,
+   llvm::Function *Fn) {
+  if (!FD || !Fn)
+return;
+
+  if (FD->hasAttr<HLSLShaderAttr>()) {
+emitEntryFunction(FD, Fn);
+  } else {
+// HLSL functions declared in the current translation unit without
+// body have external linkage by default.
+if (!FD->isDefined())
+  Fn->setLinkage(GlobalValue::ExternalLinkage);
+
+// FIXME: also set external linkage on exported functions

pow2clk wrote:

I don't know if you want to mention patch constant functions here as well. They 
will need to be external in the end as well. 

https://github.com/llvm/llvm-project/pull/95331


[clang] [HLSL] (DRAFT) Another way to implement #92071: [HLSL] Default linkage of HLSL function should be internal (PR #95331)

2024-07-25 Thread Greg Roth via cfe-commits


@@ -119,3 +119,16 @@ behavior between Clang and DXC. Some examples include:
   diagnostic notifying the user of the conversion rather than silently altering
   precision relative to the other overloads (as FXC does) or generating code
   that will fail validation (as DXC does).
+
+Correctness improvements (bug fixes)
+
+
+Entry point functions & ``static`` keyword
+--
+Marking a shader entry point function ``static`` will result in an error.
+
+This is identical to DXC behavior when an entry point is specified as compiler
+argument. However, DXC does not report an error when compiling a shader library
+that has an entry point function with ``[shader("stage")]`` attribute that is
+also marked ``static``. Additionally, this function definition is not included
+in the final DXIL.

pow2clk wrote:

I'm not sure this is accurate yet. In my experiments, with or without this
change applied, if I declare an entry function called "main" static, I get a
very helpful error, but if I call it "csmain", I get warnings about the useless
`shader` and `numthreads` attributes and otherwise it does just what DXC does:
it generates an empty library. I think it _should_ do what is said here. There
aren't any tests for that included here. It does seem separable from the rest
of this change if you were so inclined.

https://github.com/llvm/llvm-project/pull/95331


[clang] [HLSL] (DRAFT) Another way to implement #92071: [HLSL] Default linkage of HLSL function should be internal (PR #95331)

2024-07-25 Thread Greg Roth via cfe-commits

https://github.com/pow2clk edited 
https://github.com/llvm/llvm-project/pull/95331


[clang] [HLSL] (DRAFT) Another way to implement #92071: [HLSL] Default linkage of HLSL function should be internal (PR #95331)

2024-07-25 Thread Greg Roth via cfe-commits

https://github.com/pow2clk commented:

Linking functions as internal should mean that they won't be included in the 
final output if they don't have any calls. A test that verifies that behavior 
would be useful.

Even if they are called, a test that verifies that they have the "internal" 
attribute applied would be useful too. 

https://github.com/llvm/llvm-project/pull/95331


[clang] [HLSL] (DRAFT) Another way to implement #92071: [HLSL] Default linkage of HLSL function should be internal (PR #95331)

2024-07-25 Thread Greg Roth via cfe-commits


@@ -12363,6 +12363,11 @@ bool ASTContext::DeclMustBeEmitted(const Decl *D) {
   if (D->hasAttr<AliasAttr>() || D->hasAttr<UsedAttr>())
 return true;
 
+  // HLSL entry functiona are required.

pow2clk wrote:

```suggestion
  // HLSL entry functions are required.
```
typo

https://github.com/llvm/llvm-project/pull/95331


[clang] [HLSL] (DRAFT) Another way to implement #92071: [HLSL] Default linkage of HLSL function should be internal (PR #95331)

2024-07-25 Thread Greg Roth via cfe-commits


@@ -158,7 +158,8 @@ def FunctionTmpl
 
 def HLSLEntry
 : SubsetSubjectisExternallyVisible() && !isa(S)}],
+   [{S->getDeclContext()->getRedeclContext()->isFileContext() &&
+S->getStorageClass() != SC_Static}],

pow2clk wrote:

Perhaps this is what's expected to enforce that entry functions not be static?
From [this description](https://github.com/llvm/llvm-project/blob/a737b8704c031310460d492cef90eee5054cabd7/clang/include/clang/Basic/Attr.td#L84),
it sounds to me like it would prevent a static function from being considered
an entry function, which is appropriate, but it doesn't produce any error when
that happens.

https://github.com/llvm/llvm-project/pull/95331


[clang] [HLSL] (DRAFT) Another way to implement #92071: [HLSL] Default linkage of HLSL function should be internal (PR #95331)

2024-07-25 Thread Greg Roth via cfe-commits


@@ -119,3 +119,16 @@ behavior between Clang and DXC. Some examples include:
   diagnostic notifying the user of the conversion rather than silently altering
   precision relative to the other overloads (as FXC does) or generating code
   that will fail validation (as DXC does).
+
+Correctness improvements (bug fixes)
+
+
+Entry point functions & ``static`` keyword
+--
+Marking a shader entry point function ``static`` will result in an error.
+
+This is identical to DXC behavior when an entry point is specified as compiler

pow2clk wrote:

What is the expected error? What I see in DXC is "Cannot find entry function 
main". That's better than cheerfully producing an empty library, but it could 
be more helpful. 

https://github.com/llvm/llvm-project/pull/95331


[clang] Correct confusing header in HLSLDocs (PR #100017)

2024-07-22 Thread Greg Roth via cfe-commits

https://github.com/pow2clk created 
https://github.com/llvm/llvm-project/pull/100017

Because AvailabilityDiagnostics.rst mistakenly overlined the "Examples" 
section, it was included in the generated HLSLDocs page. By demoting it to a 
subheader, it shouldn't show up as a top-level HLSLDocs page.

>From f325499de6336807b0d56696356a3e11c7a26ac3 Mon Sep 17 00:00:00 2001
From: Greg Roth 
Date: Mon, 22 Jul 2024 17:39:47 -0600
Subject: [PATCH] Correct confusing header in HLSLDocs

Because AvailabilityDiagnostics.rst mistakenly overlined the "Examples"
section, it was included in the generated HLSLDocs page.
By demoting it to a subheader, it shouldn't show up as a top-level
HLSLDocs page.
---
 clang/docs/HLSL/AvailabilityDiagnostics.rst | 1 -
 1 file changed, 1 deletion(-)

diff --git a/clang/docs/HLSL/AvailabilityDiagnostics.rst 
b/clang/docs/HLSL/AvailabilityDiagnostics.rst
index bb9d02f21dde6..7ce82c1946b87 100644
--- a/clang/docs/HLSL/AvailabilityDiagnostics.rst
+++ b/clang/docs/HLSL/AvailabilityDiagnostics.rst
@@ -52,7 +52,6 @@ If the compilation target is a shader library, only 
availability based on shader
 
 As a result, availability based on specific shader stage will only be 
diagnosed in code that is reachable from a shader entry point or library export 
function. It also means that function bodies might be scanned multiple time. 
When that happens, care should be taken not to produce duplicated diagnostics.
 
-
 Examples
 
 



[clang] Correct confusing header in HLSLDocs (PR #100017)

2024-07-22 Thread Greg Roth via cfe-commits

pow2clk wrote:

See https://clang.llvm.org/docs/HLSL/HLSLDocs.html to see the effect

https://github.com/llvm/llvm-project/pull/100017


[clang] [HLSL] Implement '__builtin_hlsl_is_intangible' type trait (PR #104544)

2024-08-26 Thread Greg Roth via cfe-commits

https://github.com/pow2clk commented:

Looks good overall. I have some areas where I still feel confused, though.

https://github.com/llvm/llvm-project/pull/104544


[clang] [HLSL] Implement '__builtin_hlsl_is_intangible' type trait (PR #104544)

2024-08-26 Thread Greg Roth via cfe-commits


@@ -1154,3 +1156,70 @@ bool SemaHLSL::CheckBuiltinFunctionCall(unsigned 
BuiltinID, CallExpr *TheCall) {
   }
   return false;
 }
+
+static bool calculateIsIntangibleType(QualType Ty) {
+  Ty = Ty->getCanonicalTypeUnqualified();
+  if (Ty->isBuiltinType())
+return Ty->isHLSLSpecificType();
+
+  llvm::SmallVector TypesToScan;
+  TypesToScan.push_back(Ty);
+  while (!TypesToScan.empty()) {
+QualType T = TypesToScan.pop_back_val();
+assert(T == T->getCanonicalTypeUnqualified() && "expected sugar-free 
type");
+assert(!isa(T) && "Matrix types not yet supported in HLSL");
+
+if (const auto *AT = dyn_cast(T)) {
+  QualType ElTy = AT->getElementType()->getCanonicalTypeUnqualified();
+  if (ElTy->isBuiltinType())
+return ElTy->isHLSLSpecificType();
+  TypesToScan.push_back(ElTy);
+  continue;
+}
+
+if (const auto *VT = dyn_cast(T)) {
+  QualType ElTy = VT->getElementType()->getCanonicalTypeUnqualified();
+  assert(ElTy->isBuiltinType() && "vectors can only contain builtin 
types");
+  if (ElTy->isHLSLSpecificType())

pow2clk wrote:

Expanding vectors to be able to contain intangible types seems pretty 
disruptive. Is there a proposal to do that? 

https://github.com/llvm/llvm-project/pull/104544


[clang] [HLSL] Implement '__builtin_hlsl_is_intangible' type trait (PR #104544)

2024-08-26 Thread Greg Roth via cfe-commits


@@ -1154,3 +1156,70 @@ bool SemaHLSL::CheckBuiltinFunctionCall(unsigned 
BuiltinID, CallExpr *TheCall) {
   }
   return false;
 }
+
+static bool calculateIsIntangibleType(QualType Ty) {
+  Ty = Ty->getCanonicalTypeUnqualified();
+  if (Ty->isBuiltinType())
+return Ty->isHLSLSpecificType();
+
+  llvm::SmallVector TypesToScan;
+  TypesToScan.push_back(Ty);
+  while (!TypesToScan.empty()) {
+QualType T = TypesToScan.pop_back_val();
+assert(T == T->getCanonicalTypeUnqualified() && "expected sugar-free 
type");
+assert(!isa(T) && "Matrix types not yet supported in HLSL");

pow2clk wrote:

I don't think matrices will be intangible types nor can they contain them 
anyway?

https://github.com/llvm/llvm-project/pull/104544


[clang] [HLSL] Implement '__builtin_hlsl_is_intangible' type trait (PR #104544)

2024-08-26 Thread Greg Roth via cfe-commits


@@ -1154,3 +1156,70 @@ bool SemaHLSL::CheckBuiltinFunctionCall(unsigned 
BuiltinID, CallExpr *TheCall) {
   }
   return false;
 }
+
+static bool calculateIsIntangibleType(QualType Ty) {
+  Ty = Ty->getCanonicalTypeUnqualified();
+  if (Ty->isBuiltinType())
+return Ty->isHLSLSpecificType();
+
+  llvm::SmallVector TypesToScan;
+  TypesToScan.push_back(Ty);
+  while (!TypesToScan.empty()) {
+QualType T = TypesToScan.pop_back_val();
+assert(T == T->getCanonicalTypeUnqualified() && "expected sugar-free 
type");
+assert(!isa(T) && "Matrix types not yet supported in HLSL");
+
+if (const auto *AT = dyn_cast(T)) {
+  QualType ElTy = AT->getElementType()->getCanonicalTypeUnqualified();
+  if (ElTy->isBuiltinType())
+return ElTy->isHLSLSpecificType();

pow2clk wrote:

Currently `isHLSLSpecificType` seems to be synonymous with 
`isHLSLIntangibleType`, but the name suggests it might extend beyond intangible 
types in the future. If it is meant to only check for intangible types, perhaps 
it should be renamed.

https://github.com/llvm/llvm-project/pull/104544


[clang] [HLSL] Implement '__builtin_hlsl_is_intangible' type trait (PR #104544)

2024-08-26 Thread Greg Roth via cfe-commits

https://github.com/pow2clk edited 
https://github.com/llvm/llvm-project/pull/104544


[clang] [HLSL] Implement '__builtin_hlsl_is_intangible' type trait (PR #104544)

2024-08-26 Thread Greg Roth via cfe-commits


@@ -1154,3 +1156,70 @@ bool SemaHLSL::CheckBuiltinFunctionCall(unsigned 
BuiltinID, CallExpr *TheCall) {
   }
   return false;
 }
+
+static bool calculateIsIntangibleType(QualType Ty) {
+  Ty = Ty->getCanonicalTypeUnqualified();
+  if (Ty->isBuiltinType())
+return Ty->isHLSLSpecificType();
+
+  llvm::SmallVector TypesToScan;
+  TypesToScan.push_back(Ty);
+  while (!TypesToScan.empty()) {
+QualType T = TypesToScan.pop_back_val();
+assert(T == T->getCanonicalTypeUnqualified() && "expected sugar-free 
type");

pow2clk wrote:

Is a sugar-free type an artificial sweetener type? 😆

https://github.com/llvm/llvm-project/pull/104544


[clang] [HLSL] Implement '__builtin_hlsl_is_intangible' type trait (PR #104544)

2024-08-26 Thread Greg Roth via cfe-commits


@@ -5683,6 +5685,14 @@ static bool EvaluateUnaryTypeTrait(Sema &Self, TypeTrait 
UTT,
 return true;
 return false;
   }
+  case UTT_IsIntangibleType:
+if (!T->isVoidType() && !T->isIncompleteArrayType())
+  if (Self.RequireCompleteType(TInfo->getTypeLoc().getBeginLoc(), T,
+   diag::err_incomplete_type))
+return false;
+DiagnoseVLAInCXXTypeTrait(Self, TInfo,
+  tok::kw___builtin_hlsl_is_intangible);
+return Self.HLSL().IsIntangibleType(T);

pow2clk wrote:

I'm afraid that this and the change above are confusing me. Can you explain a 
bit what purpose these functions serve? I'm not even sure what returning true 
really means. This is why I favor negatable verbs at the start of functions 
returning bool like `has`* or `is`* . I don't know what to expect from `Check`* 
or `Evaluate`* 😒
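
To illustrate the naming concern, here is a small sketch with hypothetical functions (none of these are Clang APIs). A predicate named `is*` reads unambiguously at the call site, while a `check*` function returning `bool` needs a comment to say whether `true` means success or that an error was diagnosed.

```cpp
#include <cassert>
#include <string>

// Predicate: `true` plainly means "yes, the name is empty".
bool isEmptyName(const std::string &Name) { return Name.empty(); }

// Ambiguous without documentation. This sketch returns `true` when an error
// was found, which the name alone does not convey.
bool checkName(const std::string &Name, std::string &ErrorOut) {
  if (isEmptyName(Name)) {
    ErrorOut = "name must not be empty";
    return true; // error diagnosed
  }
  ErrorOut.clear();
  return false; // no error
}
```

At a call site, `if (isEmptyName(N))` needs no explanation, whereas `if (checkName(N, Err))` only makes sense once the reader has looked up the return convention.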

https://github.com/llvm/llvm-project/pull/104544


[clang] [HLSL] Implement '__builtin_hlsl_is_intangible' type trait (PR #104544)

2024-08-26 Thread Greg Roth via cfe-commits


@@ -1154,3 +1156,70 @@ bool SemaHLSL::CheckBuiltinFunctionCall(unsigned 
BuiltinID, CallExpr *TheCall) {
   }
   return false;
 }
+
+static bool calculateIsIntangibleType(QualType Ty) {
+  Ty = Ty->getCanonicalTypeUnqualified();
+  if (Ty->isBuiltinType())
+return Ty->isHLSLSpecificType();
+
+  llvm::SmallVector TypesToScan;
+  TypesToScan.push_back(Ty);
+  while (!TypesToScan.empty()) {
+QualType T = TypesToScan.pop_back_val();
+assert(T == T->getCanonicalTypeUnqualified() && "expected sugar-free 
type");
+assert(!isa(T) && "Matrix types not yet supported in HLSL");
+
+if (const auto *AT = dyn_cast(T)) {
+  QualType ElTy = AT->getElementType()->getCanonicalTypeUnqualified();
+  if (ElTy->isBuiltinType())
+return ElTy->isHLSLSpecificType();
+  TypesToScan.push_back(ElTy);
+  continue;
+}
+
+if (const auto *VT = dyn_cast(T)) {
+  QualType ElTy = VT->getElementType()->getCanonicalTypeUnqualified();
+  assert(ElTy->isBuiltinType() && "vectors can only contain builtin 
types");
+  if (ElTy->isHLSLSpecificType())
+return true;
+  continue;
+}
+
+if (const auto *RT = dyn_cast(T)) {
+  const RecordDecl *RD = RT->getDecl();
+  for (const auto *FD : RD->fields()) {
+QualType FieldTy = FD->getType()->getCanonicalTypeUnqualified();
+if (FieldTy->isBuiltinType()) {
+  if (FieldTy->isHLSLSpecificType())

pow2clk wrote:

It's a very minor nit, but given the repetition of this check, we could just 
test this at the top of the loop after popping it off the work queue instead of 
testing it before insertion. 
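
A minimal sketch of that restructuring, using a toy type model rather than Clang's `QualType` (the `ToyType` struct, its fields, and `containsIntangible` are hypothetical names for illustration only). The intangible check happens once, right after popping from the work queue, so the aggregate branches only push their children.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for a canonical type; NOT Clang's QualType.
struct ToyType {
  bool IsBuiltin = false;                // leaf type with no sub-types
  bool IsIntangible = false;             // only meaningful for builtin leaves
  std::vector<const ToyType *> Children; // element type or record fields
};

// Returns true if Ty is, or transitively contains, an intangible builtin.
bool containsIntangible(const ToyType &Ty) {
  std::vector<const ToyType *> Worklist{&Ty};
  while (!Worklist.empty()) {
    const ToyType *T = Worklist.back();
    Worklist.pop_back();
    assert(T && "null type in worklist");

    // Single check at the top of the loop, after popping.
    if (T->IsBuiltin) {
      if (T->IsIntangible)
        return true;
      continue;
    }

    // Aggregates just enqueue their children without inspecting them.
    for (const ToyType *Child : T->Children)
      Worklist.push_back(Child);
  }
  return false;
}
```

The trade-off is that builtin leaves are pushed onto the queue before being examined, which costs a few extra push/pop operations in exchange for a single copy of the check.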

https://github.com/llvm/llvm-project/pull/104544


[clang] [HLSL] Implement '__builtin_hlsl_is_intangible' type trait (PR #104544)

2024-08-26 Thread Greg Roth via cfe-commits


@@ -1154,3 +1156,70 @@ bool SemaHLSL::CheckBuiltinFunctionCall(unsigned 
BuiltinID, CallExpr *TheCall) {
   }
   return false;
 }
+
+static bool calculateIsIntangibleType(QualType Ty) {
+  Ty = Ty->getCanonicalTypeUnqualified();
+  if (Ty->isBuiltinType())
+return Ty->isHLSLSpecificType();
+
+  llvm::SmallVector TypesToScan;
+  TypesToScan.push_back(Ty);
+  while (!TypesToScan.empty()) {
+QualType T = TypesToScan.pop_back_val();
+assert(T == T->getCanonicalTypeUnqualified() && "expected sugar-free 
type");
+assert(!isa(T) && "Matrix types not yet supported in HLSL");
+
+if (const auto *AT = dyn_cast(T)) {
+  QualType ElTy = AT->getElementType()->getCanonicalTypeUnqualified();
+  if (ElTy->isBuiltinType())
+return ElTy->isHLSLSpecificType();

pow2clk wrote:

This duplicates the
[comment](https://github.com/llvm/llvm-project/pull/104544#discussion_r1731610819)
Justin made.

https://github.com/llvm/llvm-project/pull/104544


[clang] [llvm] Tentative fix for not removing newly internal functions (PR #106146)

2024-08-26 Thread Greg Roth via cfe-commits

https://github.com/pow2clk created 
https://github.com/llvm/llvm-project/pull/106146

Functions are not removed even when made internal by DXILFinalizeLinkage. The 
removal code is called from alwaysinliner and globalopt, which are invoked too 
early to remove functions made internal by this pass.

This adds a check similar to that in alwaysinliner that removes trivially dead 
functions after being marked internal. It refactors that code a bit to make it 
simpler including reversing what is stored in the work queue.

Not sure how to test this. To test all the interactions between alwaysinliner, 
DXILfinalizelinkage and any other optimization passes, it kinda needs to be 
end-to-end.
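
The refactoring described above can be sketched outside of LLVM with a toy module model. This is only an illustration under stated assumptions: `ToyFunction`, `ToyModule`, and `finalizeLinkageSketch` are hypothetical names, and `UseCount == 0` stands in for what `isDefTriviallyDead()` checks in the real pass.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Toy stand-ins for llvm::Function and llvm::Module; NOT the LLVM API.
struct ToyFunction {
  std::string Name;
  bool IsEntryOrExport = false; // models the hlsl.shader / hlsl.export attrs
  bool IsExternal = true;       // models ExternalLinkage
  int UseCount = 0;             // models remaining uses of the function
};

struct ToyModule {
  std::list<ToyFunction> Functions;
};

// Collect the functions to internalize first, then internalize each one and
// erase it if nothing refers to it any more. Returns true if M changed.
bool finalizeLinkageSketch(ToyModule &M) {
  // The work queue holds everything that is NOT an entry point or export.
  std::vector<std::list<ToyFunction>::iterator> Funcs;
  for (auto It = M.Functions.begin(); It != M.Functions.end(); ++It)
    if (!It->IsEntryOrExport)
      Funcs.push_back(It);

  bool Changed = false;
  for (auto It : Funcs) {
    assert(!It->IsEntryOrExport && "entries and exports are never queued");
    if (!It->IsExternal)
      continue;
    It->IsExternal = false; // now internal
    Changed = true;
    if (It->UseCount == 0)
      M.Functions.erase(It); // std::list::erase leaves other iterators valid
  }
  return Changed;
}
```

Gathering the candidates before mutating the module keeps the erase out of the loop that walks the function list, which is the reason the sketch uses a separate queue.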

Fixes #106139

>From 20bf1f85d8aa2786ecf874203b9759aa42be9627 Mon Sep 17 00:00:00 2001
From: Greg Roth 
Date: Sun, 25 Aug 2024 12:00:03 -0600
Subject: [PATCH] Tentative fix for not removing newly internal functions

Functions are not removed even when made internal by DXILFinalizeLinkage
The removal code is called from alwaysinliner and globalopt, which are
invoked too early to remove functions made internal by this pass.

This adds a check similar to that in alwaysinliner that removes
trivially dead functions after being marked internal. It refactors
that code a bit to make it simpler including reversing what is
stored in the work queue.

Not sure how to test this. To test all the interactions between
alwaysinliner, DXILfinalizelinkage and any other optimization passes,
it kinda needs to be end-to-end.

Fixes #106139
---
 .../CodeGenHLSL/remove-internal-unused.hlsl | 17 +
 llvm/lib/Target/DirectX/DXILFinalizeLinkage.cpp | 15 ---
 2 files changed, 25 insertions(+), 7 deletions(-)
 create mode 100644 clang/test/CodeGenHLSL/remove-internal-unused.hlsl

diff --git a/clang/test/CodeGenHLSL/remove-internal-unused.hlsl 
b/clang/test/CodeGenHLSL/remove-internal-unused.hlsl
new file mode 100644
index 00..6ec08060e24dd2
--- /dev/null
+++ b/clang/test/CodeGenHLSL/remove-internal-unused.hlsl
@@ -0,0 +1,17 @@
+// RUN: %clang_dxc -T cs_6_0 %s | Filecheck %s
+
+// Verify that internal linkage unused functions are removed
+
+RWBuffer buf;
+
+// CHECK-NOT: define{{.*}}donothing
+void donothing() {
+ buf[1] = 1; // never called, does nothing!
+}
+
+
+[numthreads(1,1,1)]
+[shader("compute")]
+void main() {
+ buf[0] = 0;// I'm doing something!!! 
+}
\ No newline at end of file
diff --git a/llvm/lib/Target/DirectX/DXILFinalizeLinkage.cpp 
b/llvm/lib/Target/DirectX/DXILFinalizeLinkage.cpp
index c02eb768cdf49b..6508258cdd197a 100644
--- a/llvm/lib/Target/DirectX/DXILFinalizeLinkage.cpp
+++ b/llvm/lib/Target/DirectX/DXILFinalizeLinkage.cpp
@@ -18,19 +18,20 @@
 using namespace llvm;
 
 static bool finalizeLinkage(Module &M) {
-  SmallPtrSet EntriesAndExports;
+  SmallPtrSet Funcs;
 
   // Find all entry points and export functions
   for (Function &EF : M.functions()) {
-if (!EF.hasFnAttribute("hlsl.shader") && !EF.hasFnAttribute("hlsl.export"))
+if (EF.hasFnAttribute("hlsl.shader") || EF.hasFnAttribute("hlsl.export"))
   continue;
-EntriesAndExports.insert(&EF);
+Funcs.insert(&EF);
   }
 
-  for (Function &F : M.functions()) {
-if (F.getLinkage() == GlobalValue::ExternalLinkage &&
-!EntriesAndExports.contains(&F)) {
-  F.setLinkage(GlobalValue::InternalLinkage);
+  for (Function *F : Funcs) {
+if (F->getLinkage() == GlobalValue::ExternalLinkage) {
+  F->setLinkage(GlobalValue::InternalLinkage);
+  if (F->isDefTriviallyDead())
+   M.getFunctionList().erase(F);
 }
   }
 



[clang] [llvm] Tentative fix for not removing newly internal functions (PR #106146)

2024-08-26 Thread Greg Roth via cfe-commits

https://github.com/pow2clk updated 
https://github.com/llvm/llvm-project/pull/106146

>From 6cf9e802a47860279fc793cb07ac3f4850826cb3 Mon Sep 17 00:00:00 2001
From: Greg Roth 
Date: Sun, 25 Aug 2024 12:00:03 -0600
Subject: [PATCH] Tentative fix for not removing newly internal functions

Functions are not removed even when made internal by DXILFinalizeLinkage
The removal code is called from alwaysinliner and globalopt, which are
invoked too early to remove functions made internal by this pass.

This adds a check similar to that in alwaysinliner that removes
trivially dead functions after being marked internal. It refactors
that code a bit to make it simpler including reversing what is
stored in the work queue.

Not sure how to test this. To test all the interactions between
alwaysinliner, DXILfinalizelinkage and any other optimization passes,
it kinda needs to be end-to-end.

Fixes #106139
---
 .../CodeGenHLSL/remove-internal-unused.hlsl | 17 +
 llvm/lib/Target/DirectX/DXILFinalizeLinkage.cpp | 15 ---
 2 files changed, 25 insertions(+), 7 deletions(-)
 create mode 100644 clang/test/CodeGenHLSL/remove-internal-unused.hlsl

diff --git a/clang/test/CodeGenHLSL/remove-internal-unused.hlsl 
b/clang/test/CodeGenHLSL/remove-internal-unused.hlsl
new file mode 100644
index 00..6ec08060e24dd2
--- /dev/null
+++ b/clang/test/CodeGenHLSL/remove-internal-unused.hlsl
@@ -0,0 +1,17 @@
+// RUN: %clang_dxc -T cs_6_0 %s | Filecheck %s
+
+// Verify that internal linkage unused functions are removed
+
+RWBuffer buf;
+
+// CHECK-NOT: define{{.*}}donothing
+void donothing() {
+ buf[1] = 1; // never called, does nothing!
+}
+
+
+[numthreads(1,1,1)]
+[shader("compute")]
+void main() {
+ buf[0] = 0;// I'm doing something!!! 
+}
\ No newline at end of file
diff --git a/llvm/lib/Target/DirectX/DXILFinalizeLinkage.cpp 
b/llvm/lib/Target/DirectX/DXILFinalizeLinkage.cpp
index c02eb768cdf49b..6508258cdd197a 100644
--- a/llvm/lib/Target/DirectX/DXILFinalizeLinkage.cpp
+++ b/llvm/lib/Target/DirectX/DXILFinalizeLinkage.cpp
@@ -18,19 +18,20 @@
 using namespace llvm;
 
 static bool finalizeLinkage(Module &M) {
-  SmallPtrSet EntriesAndExports;
+  SmallPtrSet Funcs;
 
   // Find all entry points and export functions
   for (Function &EF : M.functions()) {
-if (!EF.hasFnAttribute("hlsl.shader") && !EF.hasFnAttribute("hlsl.export"))
+if (EF.hasFnAttribute("hlsl.shader") || EF.hasFnAttribute("hlsl.export"))
   continue;
-EntriesAndExports.insert(&EF);
+Funcs.insert(&EF);
   }
 
-  for (Function &F : M.functions()) {
-if (F.getLinkage() == GlobalValue::ExternalLinkage &&
-!EntriesAndExports.contains(&F)) {
-  F.setLinkage(GlobalValue::InternalLinkage);
+  for (Function *F : Funcs) {
+if (F->getLinkage() == GlobalValue::ExternalLinkage) {
+  F->setLinkage(GlobalValue::InternalLinkage);
+  if (F->isDefTriviallyDead())
+   M.getFunctionList().erase(F);
 }
   }
 



[clang] [llvm] Tentative fix for not removing newly internal functions (PR #106146)

2024-08-26 Thread Greg Roth via cfe-commits

pow2clk wrote:

I'd like to add a test that verifies this removal of used functions that get 
alwaysinlined, but that requires the inlining fix for #89282 

https://github.com/llvm/llvm-project/pull/106146


[clang] [llvm] Tentative fix for not removing newly internal functions (PR #106146)

2024-08-26 Thread Greg Roth via cfe-commits


@@ -18,19 +18,20 @@
 using namespace llvm;
 
 static bool finalizeLinkage(Module &M) {
-  SmallPtrSet EntriesAndExports;
+  SmallPtrSet Funcs;
 
   // Find all entry points and export functions
   for (Function &EF : M.functions()) {
-if (!EF.hasFnAttribute("hlsl.shader") && !EF.hasFnAttribute("hlsl.export"))
+if (EF.hasFnAttribute("hlsl.shader") || EF.hasFnAttribute("hlsl.export"))
   continue;
-EntriesAndExports.insert(&EF);
+Funcs.insert(&EF);
   }
 
-  for (Function &F : M.functions()) {
-if (F.getLinkage() == GlobalValue::ExternalLinkage &&
-!EntriesAndExports.contains(&F)) {
-  F.setLinkage(GlobalValue::InternalLinkage);
+  for (Function *F : Funcs) {
+if (F->getLinkage() == GlobalValue::ExternalLinkage) {
+  F->setLinkage(GlobalValue::InternalLinkage);
+  if (F->isDefTriviallyDead())

pow2clk wrote:

Not sure if this should also check that the alwaysinline attribute is set like 
AlwaysInliner does: 
https://github.com/llvm/llvm-project/blob/2a5ac9d9aff91406b0c58629df3a4e4dce87738c/llvm/lib/Transforms/IPO/AlwaysInliner.cpp#L89

We should always inline functions that are marked internal, so I don't know of 
a case where it wouldn't be set. 

https://github.com/llvm/llvm-project/pull/106146


[clang] [llvm] Tentative fix for not removing newly internal functions (PR #106146)

2024-08-26 Thread Greg Roth via cfe-commits

https://github.com/pow2clk updated 
https://github.com/llvm/llvm-project/pull/106146

>From e0d9fa7a87ee18b23cda29381afadeb0b8d23ce8 Mon Sep 17 00:00:00 2001
From: Greg Roth 
Date: Sun, 25 Aug 2024 12:00:03 -0600
Subject: [PATCH] Tentative fix for not removing newly internal functions

Functions are not removed even when made internal by DXILFinalizeLinkage
The removal code is called from alwaysinliner and globalopt, which are
invoked too early to remove functions made internal by this pass.

This adds a check similar to that in alwaysinliner that removes
trivially dead functions after being marked internal. It refactors
that code a bit to make it simpler including reversing what is
stored in the work queue.

Not sure how to test this. To test all the interactions between
alwaysinliner, DXILfinalizelinkage and any other optimization passes,
it kinda needs to be end-to-end.

Fixes #106139
---
 .../CodeGenHLSL/remove-internal-unused.hlsl | 17 +
 llvm/lib/Target/DirectX/DXILFinalizeLinkage.cpp | 15 ---
 2 files changed, 25 insertions(+), 7 deletions(-)
 create mode 100644 clang/test/CodeGenHLSL/remove-internal-unused.hlsl

diff --git a/clang/test/CodeGenHLSL/remove-internal-unused.hlsl 
b/clang/test/CodeGenHLSL/remove-internal-unused.hlsl
new file mode 100644
index 00..6ec08060e24dd2
--- /dev/null
+++ b/clang/test/CodeGenHLSL/remove-internal-unused.hlsl
@@ -0,0 +1,17 @@
+// RUN: %clang_dxc -T cs_6_0 %s | Filecheck %s
+
+// Verify that internal linkage unused functions are removed
+
+RWBuffer buf;
+
+// CHECK-NOT: define{{.*}}donothing
+void donothing() {
+ buf[1] = 1; // never called, does nothing!
+}
+
+
+[numthreads(1,1,1)]
+[shader("compute")]
+void main() {
+ buf[0] = 0;// I'm doing something!!! 
+}
\ No newline at end of file
diff --git a/llvm/lib/Target/DirectX/DXILFinalizeLinkage.cpp 
b/llvm/lib/Target/DirectX/DXILFinalizeLinkage.cpp
index c02eb768cdf49b..0055ad3073c644 100644
--- a/llvm/lib/Target/DirectX/DXILFinalizeLinkage.cpp
+++ b/llvm/lib/Target/DirectX/DXILFinalizeLinkage.cpp
@@ -18,19 +18,20 @@
 using namespace llvm;
 
 static bool finalizeLinkage(Module &M) {
-  SmallPtrSet EntriesAndExports;
+  SmallPtrSet Funcs;
 
   // Find all entry points and export functions
   for (Function &EF : M.functions()) {
-if (!EF.hasFnAttribute("hlsl.shader") && !EF.hasFnAttribute("hlsl.export"))
+if (EF.hasFnAttribute("hlsl.shader") || EF.hasFnAttribute("hlsl.export"))
   continue;
-EntriesAndExports.insert(&EF);
+Funcs.insert(&EF);
   }
 
-  for (Function &F : M.functions()) {
-if (F.getLinkage() == GlobalValue::ExternalLinkage &&
-!EntriesAndExports.contains(&F)) {
-  F.setLinkage(GlobalValue::InternalLinkage);
+  for (Function *F : Funcs) {
+if (F->getLinkage() == GlobalValue::ExternalLinkage) {
+  F->setLinkage(GlobalValue::InternalLinkage);
+  if (F->isDefTriviallyDead())
+M.getFunctionList().erase(F);
 }
   }
 



[clang] [DXIL] Don't generate per-variable guards for DirectX (PR #106096)

2024-08-27 Thread Greg Roth via cfe-commits


@@ -0,0 +1,37 @@
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -emit-llvm -o - 
-disable-llvm-passes %s | FileCheck %s
+
+// Verify that no per variable _Init_thread instructions are emitted for 
non-trivial static locals
+// These would normally be emitted by the MicrosoftCXXABI, but the DirectX 
backend should exlude them
+// Instead, check for the guardvar oparations that should protect the 
constructor initialization should
+// only take place once.
+
+RWBuffer buf[10];
+
+void InitBuf(RWBuffer buf) {
+  for (unsigned i; i < 100; i++)

pow2clk wrote:

Yeah. That's something I've been considering how to address. There are dozens 
of existing tests that use this same construction though. I think we should 
make DXC accept it as part of 202x or just out of band rather than forbid it in 
clang. 

https://github.com/llvm/llvm-project/pull/106096


[clang] [DXIL] Don't generate per-variable guards for DirectX (PR #106096)

2024-08-27 Thread Greg Roth via cfe-commits


@@ -0,0 +1,37 @@
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -emit-llvm -o - 
-disable-llvm-passes %s | FileCheck %s
+
+// Verify that no per variable _Init_thread instructions are emitted for 
non-trivial static locals
+// These would normally be emitted by the MicrosoftCXXABI, but the DirectX 
backend should exlude them
+// Instead, check for the guardvar oparations that should protect the 
constructor initialization should
+// only take place once.
+
+RWBuffer buf[10];
+
+void InitBuf(RWBuffer buf) {
+  for (unsigned i; i < 100; i++)
+buf[i] = 0;
+}
+
+// CHECK-NOT: _Init_thread_epoch
+// CHECK: define internal void @"?main@@YAXXZ"
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[Tmp1:%.*]] = alloca %"class.hlsl::RWBuffer"
+// CHECK-NEXT: [[Tmp2:%.*]] = load i32, ptr
+// CHECK-NEXT: [[Tmp3:%.*]] = and i32 [[Tmp2]], 1
+// CHECK-NEXT: [[Tmp4:%.*]] = icmp eq i32 [[Tmp3]], 0
+// CHECK-NEXT: br i1 [[Tmp4]]
+// CHECK-NOT: _Init_thread_header
+// CHECK: init:
+// CHECK-NEXT: = or i32 [[Tmp2]], 1
+// CHECK-NOT: _Init_thread_footer
+
+
+[shader("compute")]
+[numthreads(1,1,1)]
+void main() {
+  // A non-trivially initialized static local will get checks to verify that 
it is generated just once
+  static RWBuffer mybuf;
+  mybuf = buf[0];

pow2clk wrote:

That's what I'm expecting and that's fine. RWBuffer has an implicit constructor 
that needs the guards. It would be little different if I initialized it on the same 
line as far as the guard variable protection goes. I'm using "initialization" a 
bit more broadly than in the C++ context because that's what the compiler code 
uses to refer to the construction execution. 

You can see the calls I'm eliminating here in this link: 
https://godbolt.org/z/d7djPTWP7
```llvm
tail call void @_Init_thread_header(ptr nonnull @"?$TSS0@?1??main@@YAXXZ@4HA") #1, !dbg !75
...
  br i1 %26, label %27, label %31, !dbg !75

27:   ; preds = %24
...
  tail call void @_Init_thread_footer(ptr nonnull @"?$TSS0@?1??main@@YAXXZ@4HA") #1, !dbg !75
```
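
For readers unfamiliar with the guard-variable scheme the test checks for, here is a rough C++ sketch of what a non-thread-safe static-local guard amounts to: test a bit, and if it is clear, set it and run the constructor. The names `GuardWord`, `ConstructCount`, `Resource`, and `getStaticLocal` are hypothetical, the constructor is modeled by a plain `construct()` method, and the code Clang actually emits may differ in detail.

```cpp
#include <cassert>

static unsigned GuardWord = 0; // one bit per guarded static local
static int ConstructCount = 0; // counts how often "construction" runs

struct Resource {
  int Value = 0;
  // Stands in for a non-trivial constructor.
  void construct() {
    Value = 42;
    ++ConstructCount;
  }
};

static Resource Storage; // backing storage for the modeled static local

// Hand-written model of `static Resource R;` inside a function, using a
// plain guard bit instead of _Init_thread_header/_Init_thread_footer calls.
Resource &getStaticLocal() {
  if ((GuardWord & 1u) == 0) { // corresponds to the `and` + `icmp eq` checks
    GuardWord |= 1u;           // corresponds to the `or ..., 1` in `init:`
    Storage.construct();       // runs only on the first call
  }
  return Storage;
}
```

Nothing here synchronizes between threads, which is the point: the guard only prevents repeated construction on the same thread of execution.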

https://github.com/llvm/llvm-project/pull/106096


[clang] [DXIL] Don't generate per-variable guards for DirectX (PR #106096)

2024-08-27 Thread Greg Roth via cfe-commits

https://github.com/pow2clk updated 
https://github.com/llvm/llvm-project/pull/106096

>From a7242d7183b9a65c7e205c80f3a2bfe3866fcfb7 Mon Sep 17 00:00:00 2001
From: Greg Roth 
Date: Fri, 23 Aug 2024 16:00:01 -0600
Subject: [PATCH 1/2] [DXIL] Don't generate per-variable guards for DirectX

Thread init guards are generated for local static variables when
using the Microsoft CXX ABI. This ABI is also used for HLSL generation,
but DXIL doesn't need the corresponding _Init_thread_header/footer
calls and doesn't really have a way to handle them in its output
targets.

This modifies the language ops when the target is DXIL to exclude this
so that they won't be generated and an alternate guardvar method is used.

Done to facilitate testing for #89806, but isn't really related
---
 clang/lib/Basic/Targets/DirectX.h |  7 
 clang/test/CodeGenHLSL/static-local-ctor.hlsl | 37 +++
 2 files changed, 44 insertions(+)
 create mode 100644 clang/test/CodeGenHLSL/static-local-ctor.hlsl

diff --git a/clang/lib/Basic/Targets/DirectX.h 
b/clang/lib/Basic/Targets/DirectX.h
index a084e2823453fc..cf7ea5e83503dc 100644
--- a/clang/lib/Basic/Targets/DirectX.h
+++ b/clang/lib/Basic/Targets/DirectX.h
@@ -94,6 +94,13 @@ class LLVM_LIBRARY_VISIBILITY DirectXTargetInfo : public 
TargetInfo {
   BuiltinVaListKind getBuiltinVaListKind() const override {
 return TargetInfo::VoidPtrBuiltinVaList;
   }
+
+  void adjust(DiagnosticsEngine &Diags, LangOptions &Opts) override {
+TargetInfo::adjust(Diags, Opts);
+// The static values this addresses do not apply outside of the same thread
+// This protection is neither available nor needed
+Opts.ThreadsafeStatics = false;
+  }
 };
 
 } // namespace targets
diff --git a/clang/test/CodeGenHLSL/static-local-ctor.hlsl 
b/clang/test/CodeGenHLSL/static-local-ctor.hlsl
new file mode 100644
index 00..d19f843b6f25c3
--- /dev/null
+++ b/clang/test/CodeGenHLSL/static-local-ctor.hlsl
@@ -0,0 +1,37 @@
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -emit-llvm -o - 
-disable-llvm-passes %s | FileCheck %s
+
+// Verify that no per variable _Init_thread instructions are emitted for 
non-trivial static locals
+// These would normally be emitted by the MicrosoftCXXABI, but the DirectX 
backend should exlude them
+// Instead, check for the guardvar oparations that should protect the 
constructor initialization should
+// only take place once.
+
+RWBuffer buf[10];
+
+void InitBuf(RWBuffer buf) {
+  for (unsigned i; i < 100; i++)
+buf[i] = 0;
+}
+
+// CHECK-NOT: _Init_thread_epoch
+// CHECK: define internal void @"?main@@YAXXZ"
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[Tmp1:%.*]] = alloca %"class.hlsl::RWBuffer"
+// CHECK-NEXT: [[Tmp2:%.*]] = load i32, ptr
+// CHECK-NEXT: [[Tmp3:%.*]] = and i32 [[Tmp2]], 1
+// CHECK-NEXT: [[Tmp4:%.*]] = icmp eq i32 [[Tmp3]], 0
+// CHECK-NEXT: br i1 [[Tmp4]]
+// CHECK-NOT: _Init_thread_header
+// CHECK: init:
+// CHECK-NEXT: = or i32 [[Tmp2]], 1
+// CHECK-NOT: _Init_thread_footer
+
+
+[shader("compute")]
+[numthreads(1,1,1)]
+void main() {
+  // A non-trivially initialized static local will get checks to verify that 
it is generated just once
+  static RWBuffer mybuf;
+  mybuf = buf[0];
+  InitBuf(mybuf);
+}
+

>From b9d72573d6502a754312974b0839274d9d76219e Mon Sep 17 00:00:00 2001
From: Greg Roth 
Date: Tue, 27 Aug 2024 11:15:39 -0600
Subject: [PATCH 2/2] minor improvements to constructor test

---
 clang/test/CodeGenHLSL/static-local-ctor.hlsl | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/clang/test/CodeGenHLSL/static-local-ctor.hlsl 
b/clang/test/CodeGenHLSL/static-local-ctor.hlsl
index d19f843b6f25c3..f55f6808672dea 100644
--- a/clang/test/CodeGenHLSL/static-local-ctor.hlsl
+++ b/clang/test/CodeGenHLSL/static-local-ctor.hlsl
@@ -8,7 +8,7 @@
 RWBuffer buf[10];
 
 void InitBuf(RWBuffer buf) {
-  for (unsigned i; i < 100; i++)
+  for (unsigned int i = 0; i < 100; i++)
 buf[i] = 0;
 }
 
@@ -29,7 +29,7 @@ void InitBuf(RWBuffer buf) {
 [shader("compute")]
 [numthreads(1,1,1)]
 void main() {
-  // A non-trivially initialized static local will get checks to verify that 
it is generated just once
+  // A non-trivially constructed static local will get checks to verify that 
it is generated just once
   static RWBuffer mybuf;
   mybuf = buf[0];
   InitBuf(mybuf);



[clang] [llvm] [HLSL] Implement '__builtin_hlsl_is_intangible' type trait (PR #104544)

2024-08-27 Thread Greg Roth via cfe-commits

https://github.com/pow3clk approved this pull request.

One more question, but it looks good to me!

https://github.com/llvm/llvm-project/pull/104544


[clang] [llvm] [HLSL] Implement '__builtin_hlsl_is_intangible' type trait (PR #104544)

2024-08-27 Thread Greg Roth via cfe-commits


@@ -5695,6 +5696,15 @@ static bool EvaluateUnaryTypeTrait(Sema &Self, TypeTrait 
UTT,
 return true;
 return false;
   }
+  case UTT_IsIntangibleType:
+assert(Self.getLangOpts().HLSL && "intangible types are HLSL-only 
feature");
+if (!T->isVoidType() && !T->isIncompleteArrayType())
+  if (Self.RequireCompleteType(TInfo->getTypeLoc().getBeginLoc(), T,
+   diag::err_incomplete_type))
+return false;
+DiagnoseVLAInCXXTypeTrait(Self, TInfo,

pow3clk wrote:

If this identifies a VLA and returns true, should we not return false? 

https://github.com/llvm/llvm-project/pull/104544


[clang] [llvm] [HLSL] Implement '__builtin_hlsl_is_intangible' type trait (PR #104544)

2024-08-27 Thread Greg Roth via cfe-commits

https://github.com/pow3clk edited 
https://github.com/llvm/llvm-project/pull/104544


[clang] [llvm] [HLSL] Implement '__builtin_hlsl_is_intangible' type trait (PR #104544)

2024-08-27 Thread Greg Roth via cfe-commits

https://github.com/pow2clk approved this pull request.

Oops. I approved on my underprivileged laptop account 😳

https://github.com/llvm/llvm-project/pull/104544


[clang] [HLSL] Apply NoRecurse attrib to all HLSL functions (PR #105907)

2024-08-28 Thread Greg Roth via cfe-commits


@@ -1064,13 +1064,17 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
   // OpenCL C 2.0 v2.2-11 s6.9.i:
   // Recursion is not supported.
   //
+  // HLSL
+  // Recursion is not supported.
+  //
   // SYCL v1.2.1 s3.10:
   // kernels cannot include RTTI information, exception classes,
   // recursive code, virtual functions or make use of C++ libraries that
   // are not compiled for the device.
-  if (FD && ((getLangOpts().CPlusPlus && FD->isMain()) ||
- getLangOpts().OpenCL || getLangOpts().SYCLIsDevice ||
- (getLangOpts().CUDA && FD->hasAttr(
+  if (FD &&
+  ((getLangOpts().CPlusPlus && FD->isMain()) || getLangOpts().OpenCL ||
+   getLangOpts().HLSL || getLangOpts().SYCLIsDevice ||

pow2clk wrote:

This attribute is descriptive and is used in a few places to determine whether 
recursion is expected. There is actually no place that checks for 
recursion and produces an error for either HLSL or OpenCL. There is a check for 
it when inlining takes place, but that's too late for diagnostics. #105244 is 
meant to address this problem.
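
To illustrate the kind of early check that is missing, here is a minimal, self-contained sketch of recursion detection over a call graph. It is not code from this PR or from #105244: `CallGraph`, `reachesCycle`, and `hasRecursion` are hypothetical names, and a real diagnostic would presumably walk Clang's AST rather than a map of strings.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical call graph: caller name -> names of the functions it calls.
using CallGraph = std::unordered_map<std::string, std::vector<std::string>>;

// Depth-first search. Returns true if, starting from `fn`, a call leads back
// to a function that is still on the current call path (a cycle).
static bool reachesCycle(const CallGraph &graph, const std::string &fn,
                         std::unordered_set<std::string> &onPath,
                         std::unordered_set<std::string> &done) {
  if (onPath.count(fn))
    return true; // fn is already being visited: recursion found
  if (done.count(fn))
    return false; // fn was fully explored earlier without finding a cycle
  onPath.insert(fn);
  auto it = graph.find(fn);
  if (it != graph.end())
    for (const std::string &callee : it->second)
      if (reachesCycle(graph, callee, onPath, done))
        return true;
  onPath.erase(fn);
  done.insert(fn);
  return false;
}

// Reports whether any direct or indirect recursion is reachable from `entry`.
bool hasRecursion(const CallGraph &graph, const std::string &entry) {
  std::unordered_set<std::string> onPath, done;
  return reachesCycle(graph, entry, onPath, done);
}
```

Running such a check from each entry function before code generation would allow a diagnostic to be attached to source locations, which is no longer practical once inlining has started.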

https://github.com/llvm/llvm-project/pull/105907


[clang] [HLSL] Apply NoRecurse attrib to all HLSL functions (PR #105907)

2024-08-28 Thread Greg Roth via cfe-commits


@@ -0,0 +1,93 @@
+// RUN: %clang_cc1 -x hlsl -triple dxil-pc-shadermodel6.3-library -finclude-default-header %s -emit-llvm -disable-llvm-passes -o - | FileCheck %s
+// RUN: %clang_cc1 -x hlsl -triple dxil-pc-shadermodel6.0-compute -finclude-default-header %s -emit-llvm -disable-llvm-passes -o - | FileCheck %s
+
+// Verify that a few different function types all get the NoRecurse attribute
+
+#define MAX 100
+
+struct Node {
+  uint value;
+  uint key;
+  uint left, right;
+};
+
+// CHECK: Function Attrs:{{.*}}norecurse
+// CHECK: define noundef i32 @"?Find@@YAIY0GE@UNode@@I@Z"(ptr noundef byval([100 x %struct.Node]) align 4 %SortedTree, i32 noundef %key) [[IntAttr:\#[0-9]+]]
+// CHECK: ret i32
+// Find and return value corresponding to key in the SortedTree
+uint Find(Node SortedTree[MAX], uint key) {

pow2clk wrote:

Probably not. I admit I got carried away, but in the process I discovered the 
constraints of the current implementation and potentially introduced some 
incidentals that might catch future issues by creating a more representative 
shader. I realize that philosophies might differ here, but I'm reluctant to 
change it. 

https://github.com/llvm/llvm-project/pull/105907


[clang] [DXIL] Don't generate per-variable guards for DirectX (PR #106096)

2024-08-28 Thread Greg Roth via cfe-commits


@@ -94,6 +94,13 @@ class LLVM_LIBRARY_VISIBILITY DirectXTargetInfo : public TargetInfo {
   BuiltinVaListKind getBuiltinVaListKind() const override {
 return TargetInfo::VoidPtrBuiltinVaList;
   }
+
+  void adjust(DiagnosticsEngine &Diags, LangOptions &Opts) override {

pow2clk wrote:

I'm glad you asked! That's a bug I forgot to file. I think SPIR-V should 
exclude these as well. However, it uses the Itanium ABI, which generates 
completely different protections. Supporting it would have been a non-trivial 
amount of additional work that wouldn't share anything with what I did here, 
and this was meant to be an incidental change to make testing easier, so I 
left it out.

https://github.com/llvm/llvm-project/pull/106096


[clang] [DXIL] Don't generate per-variable guards for DirectX (PR #106096)

2024-08-28 Thread Greg Roth via cfe-commits


@@ -94,6 +94,13 @@ class LLVM_LIBRARY_VISIBILITY DirectXTargetInfo : public TargetInfo {
   BuiltinVaListKind getBuiltinVaListKind() const override {
 return TargetInfo::VoidPtrBuiltinVaList;
   }
+
+  void adjust(DiagnosticsEngine &Diags, LangOptions &Opts) override {

pow2clk wrote:

I filed the bug #106455 

https://github.com/llvm/llvm-project/pull/106096


[clang] [DXIL] Don't generate per-variable guards for DirectX (PR #106096)

2024-08-28 Thread Greg Roth via cfe-commits

https://github.com/pow2clk closed 
https://github.com/llvm/llvm-project/pull/106096


[clang] [HLSL] Apply NoRecurse attrib to all HLSL functions (PR #105907)

2024-08-29 Thread Greg Roth via cfe-commits

https://github.com/pow2clk closed 
https://github.com/llvm/llvm-project/pull/105907


[clang] [HLSL] set alwaysinline on HLSL functions (PR #106588)

2024-08-29 Thread Greg Roth via cfe-commits

https://github.com/pow2clk created 
https://github.com/llvm/llvm-project/pull/106588

HLSL inlines all its functions by default. This uses the alwaysinline attribute 
to force that in the corresponding pass for user functions by default and 
overrides the default noinline of some implicit functions. This makes an 
instance of explicit inlining for buffer subscripts unnecessary.
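
The attribute policy described above can be modelled in isolation. The following is a simplified sketch, not the code in this PR: `Attr`, `ModelFunction`, and `applyHLSLInlinePolicy` are hypothetical stand-ins for LLVM's attribute API and for `setHLSLFunctionAttributes`.

```cpp
#include <cassert>
#include <set>

// Stand-ins for the two LLVM function attributes involved (not the real enum).
enum class Attr { NoInline, AlwaysInline };

// Hypothetical model of a generated function: its attribute set and whether
// it corresponds to a user-written declaration ("explicit") or was
// autogenerated by the compiler ("implicit").
struct ModelFunction {
  std::set<Attr> attrs;
  bool hasDecl;
};

void applyHLSLInlinePolicy(ModelFunction &fn) {
  if (fn.hasDecl) {
    // Explicit functions: respect an existing noinline, otherwise default
    // to alwaysinline.
    if (!fn.attrs.count(Attr::NoInline))
      fn.attrs.insert(Attr::AlwaysInline);
  } else {
    // Implicit functions: a default noinline is overridden for HLSL.
    fn.attrs.erase(Attr::NoInline);
    fn.attrs.insert(Attr::AlwaysInline);
  }
}
```

In this model the two attributes never end up together on the same function: a user-requested noinline survives, and everything else is marked alwaysinline.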

Adds tests for function and constructor inlining and augments some existing 
tests to verify correct inlining of implicitly created functions as well.

Incidentally restores a RUN line that I believe was mistakenly removed as part 
of #88918

fixes #89282


[clang] [HLSL] set alwaysinline on HLSL functions (PR #106588)

2024-08-29 Thread Greg Roth via cfe-commits

https://github.com/pow2clk updated 
https://github.com/llvm/llvm-project/pull/106588

From 12253818bd47aa8c324f6222586965f356b11c90 Mon Sep 17 00:00:00 2001
From: Greg Roth 
Date: Wed, 24 Jul 2024 16:49:19 -0600
Subject: [PATCH 1/2] [HLSL] set alwaysinline on HLSL functions

HLSL inlines all its functions by default. This uses the alwaysinline
attribute to force that in the corresponding pass for user functions
by default and overrides the default noinline of some implicit functions.
This makes an instance of explicit inlining for buffer subscripts unnecessary.

Adds tests for function and constructor inlining and augments some existing
tests to verify correct inlining of implicitly created functions as well.

Incidentally restores a RUN line that I believe was mistakenly removed as part
of #88918

fixes #89282
---
 clang/lib/CodeGen/CGHLSLRuntime.cpp   |  17 ++-
 clang/lib/CodeGen/CodeGenFunction.cpp |   4 +-
 clang/lib/Sema/HLSLExternalSemaSource.cpp |   2 -
 .../GlobalConstructorFunction.hlsl|  31 +++--
 .../CodeGenHLSL/GlobalConstructorLib.hlsl |  23 +++-
 clang/test/CodeGenHLSL/GlobalDestructors.hlsl |  51 +---
 .../builtins/RWBuffer-constructor.hlsl|   1 +
 .../builtins/RWBuffer-subscript.hlsl  |   5 +-
 .../test/CodeGenHLSL/inline-constructors.hlsl |  74 
 clang/test/CodeGenHLSL/inline-functions.hlsl  | 114 ++
 10 files changed, 279 insertions(+), 43 deletions(-)
 create mode 100644 clang/test/CodeGenHLSL/inline-constructors.hlsl
 create mode 100644 clang/test/CodeGenHLSL/inline-functions.hlsl

diff --git a/clang/lib/CodeGen/CGHLSLRuntime.cpp b/clang/lib/CodeGen/CGHLSLRuntime.cpp
index 4bd7b6ba58de0d..24d126ced0d9f7 100644
--- a/clang/lib/CodeGen/CGHLSLRuntime.cpp
+++ b/clang/lib/CodeGen/CGHLSLRuntime.cpp
@@ -414,9 +414,20 @@ void CGHLSLRuntime::emitEntryFunction(const FunctionDecl *FD,
 
 void CGHLSLRuntime::setHLSLFunctionAttributes(const FunctionDecl *FD,
   llvm::Function *Fn) {
-  if (FD->isInExportDeclContext()) {
-const StringRef ExportAttrKindStr = "hlsl.export";
-Fn->addFnAttr(ExportAttrKindStr);
+  if (FD) { // "explicit" functions with declarations
+if (FD->isInExportDeclContext()) {
+  const StringRef ExportAttrKindStr = "hlsl.export";
+  Fn->addFnAttr(ExportAttrKindStr);
+}
+// Respect noinline if the explicit functions use it
+// otherwise default to alwaysinline
+if (!Fn->hasFnAttribute(Attribute::NoInline))
+  Fn->addFnAttr(llvm::Attribute::AlwaysInline);
+  } else { // "implicit" autogenerated functions with no declaration
+// Implicit functions might get marked as noinline by default
+// but we override that for HLSL
+Fn->removeFnAttr(Attribute::NoInline);
+Fn->addFnAttr(Attribute::AlwaysInline);
   }
 }
 
diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp
index a5747283e98058..aceeed0e66d130 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -1239,9 +1239,9 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
   if (getLangOpts().OpenMP && CurCodeDecl)
 CGM.getOpenMPRuntime().emitFunctionProlog(*this, CurCodeDecl);
 
-  if (FD && getLangOpts().HLSL) {
+  if (getLangOpts().HLSL) {
 // Handle emitting HLSL entry functions.
-if (FD->hasAttr()) {
+if (FD && FD->hasAttr()) {
   CGM.getHLSLRuntime().emitEntryFunction(FD, Fn);
 }
 CGM.getHLSLRuntime().setHLSLFunctionAttributes(FD, Fn);
diff --git a/clang/lib/Sema/HLSLExternalSemaSource.cpp b/clang/lib/Sema/HLSLExternalSemaSource.cpp
index 9aacbe4ad9548e..0a534d94192560 100644
--- a/clang/lib/Sema/HLSLExternalSemaSource.cpp
+++ b/clang/lib/Sema/HLSLExternalSemaSource.cpp
@@ -290,8 +290,6 @@ struct BuiltinTypeDeclBuilder {
  SourceLocation()));
 MethodDecl->setLexicalDeclContext(Record);
 MethodDecl->setAccess(AccessSpecifier::AS_public);
-MethodDecl->addAttr(AlwaysInlineAttr::CreateImplicit(
-AST, SourceRange(), AlwaysInlineAttr::CXX11_clang_always_inline));
 Record->addDecl(MethodDecl);
 
 return *this;
diff --git a/clang/test/CodeGenHLSL/GlobalConstructorFunction.hlsl b/clang/test/CodeGenHLSL/GlobalConstructorFunction.hlsl
index f954c9d2f029f2..b39311ad67cd62 100644
--- a/clang/test/CodeGenHLSL/GlobalConstructorFunction.hlsl
+++ b/clang/test/CodeGenHLSL/GlobalConstructorFunction.hlsl
@@ -1,4 +1,5 @@
-// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -emit-llvm -disable-llvm-passes %s -o - | FileCheck %s
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -emit-llvm -disable-llvm-passes %s -o - | FileCheck %s --check-prefixes=CHECK,NOINLINE
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -emit-llvm -O0 %s -o - | FileCheck %s --check-prefixes=CHECK,INLINE
 
 int i;
 
@@ -7,7 +8,7 @@ __attribute__((constructor)) void call_me

[clang] [HLSL] set alwaysinline on HLSL functions (PR #106588)

2024-09-03 Thread Greg Roth via cfe-commits


@@ -414,9 +414,20 @@ void CGHLSLRuntime::emitEntryFunction(const FunctionDecl *FD,
 
 void CGHLSLRuntime::setHLSLFunctionAttributes(const FunctionDecl *FD,
   llvm::Function *Fn) {
-  if (FD->isInExportDeclContext()) {
-const StringRef ExportAttrKindStr = "hlsl.export";
-Fn->addFnAttr(ExportAttrKindStr);
+  if (FD) { // "explicit" functions with declarations
+if (FD->isInExportDeclContext()) {
+  const StringRef ExportAttrKindStr = "hlsl.export";
+  Fn->addFnAttr(ExportAttrKindStr);
+}
+// Respect noinline if the explicit functions use it
+// otherwise default to alwaysinline
+if (!Fn->hasFnAttribute(Attribute::NoInline))
+  Fn->addFnAttr(llvm::Attribute::AlwaysInline);
+  } else { // "implicit" autogenerated functions with no declaration
+// Implicit functions might get marked as noinline by default
+// but we override that for HLSL
+Fn->removeFnAttr(Attribute::NoInline);

pow2clk wrote:

It is applied here: 
https://github.com/llvm/llvm-project/blob/df159d3cf8e681f8d225bd0b4ed0cbd97b16c588/clang/lib/CodeGen/CodeGenModule.cpp#L2477-L2479
on functions not explicitly created by the user, but generated to construct or 
destruct global/static local variables as well as the global functions that 
call each of these at the top and bottom of the entry function.


https://github.com/llvm/llvm-project/pull/106588


[clang] [HLSL] set alwaysinline on HLSL functions (PR #106588)

2024-09-03 Thread Greg Roth via cfe-commits


@@ -290,8 +290,6 @@ struct BuiltinTypeDeclBuilder {
  SourceLocation()));
 MethodDecl->setLexicalDeclContext(Record);
 MethodDecl->setAccess(AccessSpecifier::AS_public);
-MethodDecl->addAttr(AlwaysInlineAttr::CreateImplicit(
-AST, SourceRange(), AlwaysInlineAttr::CXX11_clang_always_inline));

pow2clk wrote:

It wasn't necessary so much as redundant for the compilations I tested, and it 
actually gets applied to the LLVM function representation after the attributes 
set by `setHLSLFunctionAttributes`.

How would I test a relevant separate compilation case? 

https://github.com/llvm/llvm-project/pull/106588


[clang] [llvm] [HLSL][DXIL][SPIRV] Create llvm dot intrinsic and use for HLSL (PR #102872)

2024-08-16 Thread Greg Roth via cfe-commits

https://github.com/pow2clk closed 
https://github.com/llvm/llvm-project/pull/102872


[clang] [llvm] [HLSL][DXIL][SPIRV] Create llvm dot intrinsic and use for HLSL (PR #102872)

2024-08-16 Thread Greg Roth via cfe-commits

pow2clk wrote:

Closing in light of the above. A new PR will capture the DXIL- and 
SPIRV-specific work.

https://github.com/llvm/llvm-project/pull/102872


[clang] [llvm] Add SPIRV generation for HLSL dot (PR #104656)

2024-08-16 Thread Greg Roth via cfe-commits

https://github.com/pow2clk created 
https://github.com/llvm/llvm-project/pull/104656

This adds the SPIRV fdot, sdot, and udot intrinsics and allows them to be 
created at codegen depending on the target architecture. This required moving 
some of the DXIL-specific choices out of codegen and into DXIL instruction 
expansion and providing it with a more generic fdot intrinsic as well.
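
As a rough model of choosing the dot intrinsic by target at codegen, the sketch below maps a target and a scalar kind to a label. Everything in it is an illustrative assumption: the `Target` and `ScalarKind` enums, the `dotIntrinsicLabel` function, and the returned strings are not the actual LLVM intrinsic names or the `CGHLSLRuntime` interface.

```cpp
#include <cassert>
#include <string>

// Hypothetical enums: which backend is targeted and the vector element kind.
enum class Target { DXIL, SPIRV };
enum class ScalarKind { Float, SignedInt, UnsignedInt };

// Returns an illustrative label of the form "<target prefix>.<dot flavour>".
// The element count is deliberately not an input here: per-size choices are
// left to a later, target-specific expansion step.
std::string dotIntrinsicLabel(Target target, ScalarKind kind) {
  const std::string prefix = (target == Target::DXIL) ? "dx" : "spv";
  switch (kind) {
  case ScalarKind::Float:
    return prefix + ".fdot";
  case ScalarKind::SignedInt:
    return prefix + ".sdot";
  case ScalarKind::UnsignedInt:
    return prefix + ".udot";
  }
  return prefix + ".unknown"; // not reached for valid enum values
}
```

The point of the design is that codegen only needs the target and the element kind; anything that depends on vector width stays in the backend.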

Removed some stale comments that gave the obsolete impression that type 
conversions should be expected to match overloads.

The SPIRV intrinsic handling involves generating multiply and add operations 
for integers and the existing OpDot operation for floating point.
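
The integer path can be pictured as one multiply followed by a chain of multiply and add steps. This is a minimal sketch under that assumption, written as ordinary C++ rather than instruction selection code; `integerDot` is a hypothetical name and the element type is fixed to 32-bit signed integers for brevity.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Dot product of two N-element integer vectors, expanded the way an integer
// dot might be lowered: multiply the first pair of elements, then multiply
// and accumulate each remaining pair.
template <std::size_t N>
std::int32_t integerDot(const std::array<std::int32_t, N> &a,
                        const std::array<std::int32_t, N> &b) {
  static_assert(N >= 1, "need at least one element");
  std::int32_t result = a[0] * b[0]; // initial multiply
  for (std::size_t i = 1; i < N; ++i)
    result = result + a[i] * b[i]; // multiply, then add to the running sum
  return result;
}
```

For floating point no such expansion is needed in this scheme, since a single dot operation already exists for that case.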

New tests for generating SPIRV float and integer dot intrinsics are added as 
well.

Incidentally changed existing dot intrinsic definitions to use 
DefaultAttrsIntrinsic to match the newly added intrinsics

Fixes #88056


[clang] [llvm] [HLSL][SPIRV]Add SPIRV generation for HLSL dot (PR #104656)

2024-08-16 Thread Greg Roth via cfe-commits

https://github.com/pow2clk edited 
https://github.com/llvm/llvm-project/pull/104656


[clang] [llvm] [HLSL][DXIL][SPIRV] Create llvm dot intrinsic and use for HLSL (PR #102872)

2024-08-16 Thread Greg Roth via cfe-commits

pow2clk wrote:

Here's the new PR for anyone who wants to keep following along in its altered 
state: #104656

https://github.com/llvm/llvm-project/pull/102872


[clang] [llvm] [HLSL][SPIRV]Add SPIRV generation for HLSL dot (PR #104656)

2024-08-16 Thread Greg Roth via cfe-commits

https://github.com/pow2clk updated 
https://github.com/llvm/llvm-project/pull/104656

From 9aff63478b76f042c05b7ae3dd1a2c099dc615de Mon Sep 17 00:00:00 2001
From: Greg Roth 
Date: Fri, 2 Aug 2024 20:10:04 -0600
Subject: [PATCH 1/2] Add SPIRV generation for HLSL dot

This adds the SPIRV fdot, sdot, and udot intrinsics and allows
them to be created at codegen depending on the target architecture.
This required moving some of the DXIL-specific choices out of
codegen and into DXIL instruction expansion and providing it with a
more generic fdot intrinsic as well.

Removed some stale comments that gave the obsolete impression that
type conversions should be expected to match overloads.

The SPIRV intrinsic handling involves generating multiply and add
operations for integers and the existing OpDot operation for
floating point.

New tests for generating SPIRV float and integer dot intrinsics are
added as well.

Incidentally changed existing dot intrinsic definitions to use
DefaultAttrsIntrinsic to match the newly added intrinsics

Fixes #88056
---
 clang/lib/CodeGen/CGBuiltin.cpp   |  47 +++--
 clang/lib/CodeGen/CGHLSLRuntime.h |   3 +
 .../CodeGenHLSL/builtins/dot-builtin.hlsl |  12 +-
 clang/test/CodeGenHLSL/builtins/dot.hlsl  | 160 +-
 llvm/include/llvm/IR/IntrinsicsDirectX.td |  34 ++--
 llvm/include/llvm/IR/IntrinsicsSPIRV.td   |  12 ++
 llvm/lib/Target/DirectX/DXIL.td   |   6 +-
 .../Target/DirectX/DXILIntrinsicExpansion.cpp |  67 ++--
 .../Target/SPIRV/SPIRVInstructionSelector.cpp |  74 
 llvm/test/CodeGen/DirectX/fdot.ll | 117 +++--
 llvm/test/CodeGen/DirectX/idot.ll |  24 +--
 .../CodeGen/SPIRV/hlsl-intrinsics/fdot.ll |  75 
 .../CodeGen/SPIRV/hlsl-intrinsics/idot.ll |  88 ++
 13 files changed, 508 insertions(+), 211 deletions(-)
 create mode 100644 llvm/test/CodeGen/SPIRV/hlsl-intrinsics/fdot.ll
 create mode 100644 llvm/test/CodeGen/SPIRV/hlsl-intrinsics/idot.ll

diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index f424ddaa175400..5c49e71df3fcfa 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -18471,22 +18471,14 @@ llvm::Value *CodeGenFunction::EmitScalarOrConstFoldImmArg(unsigned ICEArguments,
   return Arg;
 }
 
-Intrinsic::ID getDotProductIntrinsic(QualType QT, int elementCount) {
-  if (QT->hasFloatingRepresentation()) {
-switch (elementCount) {
-case 2:
-  return Intrinsic::dx_dot2;
-case 3:
-  return Intrinsic::dx_dot3;
-case 4:
-  return Intrinsic::dx_dot4;
-}
-  }
-  if (QT->hasSignedIntegerRepresentation())
-return Intrinsic::dx_sdot;
-
-  assert(QT->hasUnsignedIntegerRepresentation());
-  return Intrinsic::dx_udot;
+// Return dot product intrinsic that corresponds to the QT scalar type
+Intrinsic::ID getDotProductIntrinsic(CGHLSLRuntime &RT, QualType QT) {
+  if (QT->isFloatingType())
+return RT.getFDotIntrinsic();
+  if (QT->isSignedIntegerType())
+return RT.getSDotIntrinsic();
+  assert(QT->isUnsignedIntegerType());
+  return RT.getUDotIntrinsic();
 }
 
 Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID,
@@ -18529,37 +18521,38 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID,
 Value *Op1 = EmitScalarExpr(E->getArg(1));
 llvm::Type *T0 = Op0->getType();
 llvm::Type *T1 = Op1->getType();
+
+// If the arguments are scalars, just emit a multiply
 if (!T0->isVectorTy() && !T1->isVectorTy()) {
   if (T0->isFloatingPointTy())
-return Builder.CreateFMul(Op0, Op1, "dx.dot");
+return Builder.CreateFMul(Op0, Op1, "hlsl.dot");
 
   if (T0->isIntegerTy())
-return Builder.CreateMul(Op0, Op1, "dx.dot");
+return Builder.CreateMul(Op0, Op1, "hlsl.dot");
 
-  // Bools should have been promoted
   llvm_unreachable(
   "Scalar dot product is only supported on ints and floats.");
 }
+// For vectors, validate types and emit the appropriate intrinsic
+
 // A VectorSplat should have happened
 assert(T0->isVectorTy() && T1->isVectorTy() &&
"Dot product of vector and scalar is not supported.");
 
-// A vector sext or sitofp should have happened
-assert(T0->getScalarType() == T1->getScalarType() &&
-   "Dot product of vectors need the same element types.");
-
 auto *VecTy0 = E->getArg(0)->getType()->getAs();
 [[maybe_unused]] auto *VecTy1 =
 E->getArg(1)->getType()->getAs();
-// A HLSLVectorTruncation should have happend
+
+assert(VecTy0->getElementType() == VecTy1->getElementType() &&
+   "Dot product of vectors need the same element types.");
+
 assert(VecTy0->getNumElements() == VecTy1->getNumElements() &&
"Dot product requires vectors to be of the same size.");
 
 return Builder.CreateIntrinsic(
 /*ReturnType=*/T0->getScalarType(),
-getDotProductIntrinsic

[clang] [llvm] [HLSL][SPIRV]Add SPIRV generation for HLSL dot (PR #104656)

2024-08-19 Thread Greg Roth via cfe-commits


@@ -2157,6 +2221,16 @@ bool SPIRVInstructionSelector::selectIntrinsic(Register ResVReg,
 break;
   case Intrinsic::spv_thread_id:
 return selectSpvThreadId(ResVReg, ResType, I);
+  case Intrinsic::spv_fdot:
+return BuildMI(BB, I, I.getDebugLoc(), TII.get(SPIRV::OpDot))

pow2clk wrote:

I don't mind creating a float function to expand the fdot. I don't think you're 
suggesting that I have one call for float and integer as that's the opposite of 
what you suggested 
[here](https://github.com/llvm/llvm-project/pull/102872#discussion_r1714053193).
Are the checks you're referring to the asserts? 

https://github.com/llvm/llvm-project/pull/104656


[clang] [llvm] [HLSL][SPIRV]Add SPIRV generation for HLSL dot (PR #104656)

2024-08-19 Thread Greg Roth via cfe-commits


@@ -68,28 +69,65 @@ static Value *expandAbs(CallInst *Orig) {
  "dx.max");
 }
 
-static Value *expandIntegerDot(CallInst *Orig, Intrinsic::ID DotIntrinsic) {
+// Create DXIL dot intrinsics for floating point dot operations
+static Value *expandFloatDotIntrinsic(CallInst *Orig) {
+  Value *A = Orig->getOperand(0);
+  Value *B = Orig->getOperand(1);
+  Type *ATy = A->getType();
+  [[maybe_unused]] Type *BTy = B->getType();
+  assert(ATy->isVectorTy() && BTy->isVectorTy());
+
+  IRBuilder<> Builder(Orig);
+
+  auto *AVec = dyn_cast(ATy);
+
+  assert(ATy->getScalarType()->isFloatingPointTy());
+
+  Intrinsic::ID DotIntrinsic = Intrinsic::dx_dot4;
+  switch (AVec->getNumElements()) {
+  case 2:
+DotIntrinsic = Intrinsic::dx_dot2;
+break;
+  case 3:
+DotIntrinsic = Intrinsic::dx_dot3;
+break;
+  case 4:
+DotIntrinsic = Intrinsic::dx_dot4;
+break;
+  default:
+llvm_unreachable("dot product with vector outside 2-4 range");
+  }
+  return Builder.CreateIntrinsic(ATy->getScalarType(), DotIntrinsic,
+ ArrayRef{A, B}, nullptr, "dot");
+}
+
+// Expand integer dot product to multiply and add ops
+static Value *expandIntegerDotIntrinsic(CallInst *Orig,
+Intrinsic::ID DotIntrinsic) {
   assert(DotIntrinsic == Intrinsic::dx_sdot ||
  DotIntrinsic == Intrinsic::dx_udot);
-  Intrinsic::ID MadIntrinsic = DotIntrinsic == Intrinsic::dx_sdot
-   ? Intrinsic::dx_imad
-   : Intrinsic::dx_umad;
   Value *A = Orig->getOperand(0);
   Value *B = Orig->getOperand(1);
-  [[maybe_unused]] Type *ATy = A->getType();
+  Type *ATy = A->getType();
   [[maybe_unused]] Type *BTy = B->getType();
   assert(ATy->isVectorTy() && BTy->isVectorTy());
 
-  IRBuilder<> Builder(Orig->getParent());
-  Builder.SetInsertPoint(Orig);
+  IRBuilder<> Builder(Orig);
+
+  auto *AVec = dyn_cast<FixedVectorType>(ATy);
 
-  auto *AVec = dyn_cast<FixedVectorType>(A->getType());
+  assert(ATy->getScalarType()->isIntegerTy());
+
+  Value *Result;
+  Intrinsic::ID MadIntrinsic = DotIntrinsic == Intrinsic::dx_sdot
+   ? Intrinsic::dx_imad
+   : Intrinsic::dx_umad;
   Value *Elt0 = Builder.CreateExtractElement(A, (uint64_t)0);
   Value *Elt1 = Builder.CreateExtractElement(B, (uint64_t)0);
-  Value *Result = Builder.CreateMul(Elt0, Elt1);
-  for (unsigned I = 1; I < AVec->getNumElements(); I++) {
-Elt0 = Builder.CreateExtractElement(A, I);
-Elt1 = Builder.CreateExtractElement(B, I);
+  Result = Builder.CreateMul(Elt0, Elt1);
+  for (unsigned i = 1; i < AVec->getNumElements(); i++) {

pow2clk wrote:

I try so hard to be agnostic about style guides, but stuff like this challenges 
my lack of creed. 😣

https://github.com/llvm/llvm-project/pull/104656
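
For reference, the integer expansion in the hunk quoted above can be modeled in
plain C++. This is a minimal sketch under stated assumptions, not the pass
itself: `mad` and `expandIntegerDotModel` are hypothetical stand-ins for the
dx.imad/dx.umad intrinsics and for expandIntegerDotIntrinsic, std::vector
replaces LLVM vector values, and the loop body (cut off in the quoted hunk) is
assumed to fold each remaining element pair through the multiply-add.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a multiply-add op such as dx.imad or dx.umad:
// returns A * B + C.
int64_t mad(int64_t A, int64_t B, int64_t C) { return A * B + C; }

// Hypothetical scalar model of the integer dot expansion: one multiply for
// element 0, then one multiply-add per remaining element.
int64_t expandIntegerDotModel(const std::vector<int64_t> &A,
                              const std::vector<int64_t> &B) {
  assert(!A.empty() && A.size() == B.size() && "vectors must match in size");
  int64_t Result = A[0] * B[0];
  for (std::size_t I = 1; I < A.size(); ++I)
    Result = mad(A[I], B[I], Result);
  return Result;
}
```

Seeding the accumulator with a plain multiply means an N-element vector needs
one multiply and N-1 multiply-adds, with no separate zero constant.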


[clang] [llvm] [HLSL][SPIRV]Add SPIRV generation for HLSL dot (PR #104656)

2024-08-19 Thread Greg Roth via cfe-commits

https://github.com/pow2clk updated 
https://github.com/llvm/llvm-project/pull/104656

>From 9aff63478b76f042c05b7ae3dd1a2c099dc615de Mon Sep 17 00:00:00 2001
From: Greg Roth 
Date: Fri, 2 Aug 2024 20:10:04 -0600
Subject: [PATCH 1/4] Add SPIRV generation for HLSL dot

This adds the SPIRV fdot, sdot, and udot intrinsics and allows
them to be created at codegen depending on the target architecture.
This required moving some of the DXIL-specific choices out of codegen
and into DXIL instruction expansion, and providing it with a
more generic fdot intrinsic as well.

Removed some stale comments that gave the obsolete impression that
type conversions should be expected to match overloads.

The SPIRV intrinsic handling involves generating multiply and add
operations for integers and the existing OpDot operation for
floating point.

New tests for generating SPIRV float and integer dot intrinsics are
added as well.

Incidentally changed existing dot intrinsic definitions to use
DefaultAttrsIntrinsic to match the newly added intrinsics.

Fixes #88056
---
 clang/lib/CodeGen/CGBuiltin.cpp   |  47 +++--
 clang/lib/CodeGen/CGHLSLRuntime.h |   3 +
 .../CodeGenHLSL/builtins/dot-builtin.hlsl |  12 +-
 clang/test/CodeGenHLSL/builtins/dot.hlsl  | 160 +-
 llvm/include/llvm/IR/IntrinsicsDirectX.td |  34 ++--
 llvm/include/llvm/IR/IntrinsicsSPIRV.td   |  12 ++
 llvm/lib/Target/DirectX/DXIL.td   |   6 +-
 .../Target/DirectX/DXILIntrinsicExpansion.cpp |  67 ++--
 .../Target/SPIRV/SPIRVInstructionSelector.cpp |  74 
 llvm/test/CodeGen/DirectX/fdot.ll | 117 +++--
 llvm/test/CodeGen/DirectX/idot.ll |  24 +--
 .../CodeGen/SPIRV/hlsl-intrinsics/fdot.ll |  75 
 .../CodeGen/SPIRV/hlsl-intrinsics/idot.ll |  88 ++
 13 files changed, 508 insertions(+), 211 deletions(-)
 create mode 100644 llvm/test/CodeGen/SPIRV/hlsl-intrinsics/fdot.ll
 create mode 100644 llvm/test/CodeGen/SPIRV/hlsl-intrinsics/idot.ll

diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index f424ddaa175400..5c49e71df3fcfa 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -18471,22 +18471,14 @@ llvm::Value 
*CodeGenFunction::EmitScalarOrConstFoldImmArg(unsigned ICEArguments,
   return Arg;
 }
 
-Intrinsic::ID getDotProductIntrinsic(QualType QT, int elementCount) {
-  if (QT->hasFloatingRepresentation()) {
-switch (elementCount) {
-case 2:
-  return Intrinsic::dx_dot2;
-case 3:
-  return Intrinsic::dx_dot3;
-case 4:
-  return Intrinsic::dx_dot4;
-}
-  }
-  if (QT->hasSignedIntegerRepresentation())
-return Intrinsic::dx_sdot;
-
-  assert(QT->hasUnsignedIntegerRepresentation());
-  return Intrinsic::dx_udot;
+// Return dot product intrinsic that corresponds to the QT scalar type
+Intrinsic::ID getDotProductIntrinsic(CGHLSLRuntime &RT, QualType QT) {
+  if (QT->isFloatingType())
+return RT.getFDotIntrinsic();
+  if (QT->isSignedIntegerType())
+return RT.getSDotIntrinsic();
+  assert(QT->isUnsignedIntegerType());
+  return RT.getUDotIntrinsic();
 }
 
 Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID,
@@ -18529,37 +18521,38 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned 
BuiltinID,
 Value *Op1 = EmitScalarExpr(E->getArg(1));
 llvm::Type *T0 = Op0->getType();
 llvm::Type *T1 = Op1->getType();
+
+// If the arguments are scalars, just emit a multiply
 if (!T0->isVectorTy() && !T1->isVectorTy()) {
   if (T0->isFloatingPointTy())
-return Builder.CreateFMul(Op0, Op1, "dx.dot");
+return Builder.CreateFMul(Op0, Op1, "hlsl.dot");
 
   if (T0->isIntegerTy())
-return Builder.CreateMul(Op0, Op1, "dx.dot");
+return Builder.CreateMul(Op0, Op1, "hlsl.dot");
 
-  // Bools should have been promoted
   llvm_unreachable(
   "Scalar dot product is only supported on ints and floats.");
 }
+// For vectors, validate types and emit the appropriate intrinsic
+
 // A VectorSplat should have happened
 assert(T0->isVectorTy() && T1->isVectorTy() &&
"Dot product of vector and scalar is not supported.");
 
-// A vector sext or sitofp should have happened
-assert(T0->getScalarType() == T1->getScalarType() &&
-   "Dot product of vectors need the same element types.");
-
 auto *VecTy0 = E->getArg(0)->getType()->getAs<VectorType>();
 [[maybe_unused]] auto *VecTy1 =
 E->getArg(1)->getType()->getAs<VectorType>();
-// A HLSLVectorTruncation should have happend
+
+assert(VecTy0->getElementType() == VecTy1->getElementType() &&
+   "Dot product of vectors need the same element types.");
+
 assert(VecTy0->getNumElements() == VecTy1->getNumElements() &&
"Dot product requires vectors to be of the same size.");
 
 return Builder.CreateIntrinsic(
 /*ReturnType=*/T0->getScalarType(),
-getDotProductIntrinsic

[clang] [llvm] [HLSL][SPIRV]Add SPIRV generation for HLSL dot (PR #104656)

2024-08-19 Thread Greg Roth via cfe-commits

pow2clk wrote:

> could you fixup normalize to use the fdot expansion?
> 
> https://github.com/llvm/llvm-project/blob/9cf27a4d8b1da0e7b51eacb9fb6096155c294d3f/llvm/lib/Target/DirectX/DXILIntrinsicExpansion.cpp#L278-L289

It makes the signature a bit unusual, but it works!

https://github.com/llvm/llvm-project/pull/104656
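
The reviewer's request to build normalize on the dot expansion can be pictured
with a small standalone model. This is only a sketch, assuming normalize is
computed as v * rsqrt(dot(v, v)); `dotModel` and `normalizeModel` are
hypothetical names, and doubles in a std::vector stand in for LLVM
floating-point vector values.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical scalar model of a floating-point dot product.
double dotModel(const std::vector<double> &A, const std::vector<double> &B) {
  assert(A.size() == B.size() && "vectors must match in size");
  double Sum = 0.0;
  for (std::size_t I = 0; I < A.size(); ++I)
    Sum += A[I] * B[I];
  return Sum;
}

// Hypothetical model of normalize built on the dot product: every element is
// scaled by 1 / sqrt(dot(V, V)). Assumes V is not the zero vector.
std::vector<double> normalizeModel(const std::vector<double> &V) {
  double RSqrt = 1.0 / std::sqrt(dotModel(V, V));
  std::vector<double> Out;
  Out.reserve(V.size());
  for (double E : V)
    Out.push_back(E * RSqrt);
  return Out;
}
```

Reusing one dot helper keeps the element-count handling in a single place
instead of duplicating it in each expansion that needs a dot product.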


[clang] [llvm] [HLSL][SPIRV]Add SPIRV generation for HLSL dot (PR #104656)

2024-08-19 Thread Greg Roth via cfe-commits


@@ -2157,6 +2221,16 @@ bool SPIRVInstructionSelector::selectIntrinsic(Register 
ResVReg,
 break;
   case Intrinsic::spv_thread_id:
 return selectSpvThreadId(ResVReg, ResType, I);
+  case Intrinsic::spv_fdot:
+return BuildMI(BB, I, I.getDebugLoc(), TII.get(SPIRV::OpDot))

pow2clk wrote:

Done, assuming I've understood the suggestion correctly.

https://github.com/llvm/llvm-project/pull/104656


[clang] [llvm] [HLSL][SPIRV]Add SPIRV generation for HLSL dot (PR #104656)

2024-08-19 Thread Greg Roth via cfe-commits


@@ -1446,6 +1449,67 @@ bool SPIRVInstructionSelector::selectRsqrt(Register 
ResVReg,
   .constrainAllUses(TII, TRI, RBI);
 }
 
+// Since there is no integer dot implementation, expand by piecewise 
multiplying

pow2clk wrote:

Done

https://github.com/llvm/llvm-project/pull/104656


[clang] [llvm] [HLSL][SPIRV]Add SPIRV generation for HLSL dot (PR #104656)

2024-08-19 Thread Greg Roth via cfe-commits


@@ -68,28 +69,65 @@ static Value *expandAbs(CallInst *Orig) {
  "dx.max");
 }
 
-static Value *expandIntegerDot(CallInst *Orig, Intrinsic::ID DotIntrinsic) {
+// Create DXIL dot intrinsics for floating point dot operations
+static Value *expandFloatDotIntrinsic(CallInst *Orig) {
+  Value *A = Orig->getOperand(0);
+  Value *B = Orig->getOperand(1);
+  Type *ATy = A->getType();
+  [[maybe_unused]] Type *BTy = B->getType();
+  assert(ATy->isVectorTy() && BTy->isVectorTy());
+
+  IRBuilder<> Builder(Orig);
+
+  auto *AVec = dyn_cast<FixedVectorType>(ATy);
+
+  assert(ATy->getScalarType()->isFloatingPointTy());
+
+  Intrinsic::ID DotIntrinsic = Intrinsic::dx_dot4;
+  switch (AVec->getNumElements()) {
+  case 2:
+DotIntrinsic = Intrinsic::dx_dot2;
+break;
+  case 3:
+DotIntrinsic = Intrinsic::dx_dot3;
+break;
+  case 4:
+DotIntrinsic = Intrinsic::dx_dot4;
+break;
+  default:
+llvm_unreachable("dot product with vector outside 2-4 range");
+  }
+  return Builder.CreateIntrinsic(ATy->getScalarType(), DotIntrinsic,
+ ArrayRef<Value *>{A, B}, nullptr, "dot");
+}
+
+// Expand integer dot product to multiply and add ops
+static Value *expandIntegerDotIntrinsic(CallInst *Orig,
+Intrinsic::ID DotIntrinsic) {
   assert(DotIntrinsic == Intrinsic::dx_sdot ||
  DotIntrinsic == Intrinsic::dx_udot);
-  Intrinsic::ID MadIntrinsic = DotIntrinsic == Intrinsic::dx_sdot
-   ? Intrinsic::dx_imad
-   : Intrinsic::dx_umad;
   Value *A = Orig->getOperand(0);
   Value *B = Orig->getOperand(1);
-  [[maybe_unused]] Type *ATy = A->getType();
+  Type *ATy = A->getType();
   [[maybe_unused]] Type *BTy = B->getType();
   assert(ATy->isVectorTy() && BTy->isVectorTy());
 
-  IRBuilder<> Builder(Orig->getParent());
-  Builder.SetInsertPoint(Orig);
+  IRBuilder<> Builder(Orig);
+
+  auto *AVec = dyn_cast<FixedVectorType>(ATy);
 
-  auto *AVec = dyn_cast<FixedVectorType>(A->getType());
+  assert(ATy->getScalarType()->isIntegerTy());
+
+  Value *Result;
+  Intrinsic::ID MadIntrinsic = DotIntrinsic == Intrinsic::dx_sdot
+   ? Intrinsic::dx_imad
+   : Intrinsic::dx_umad;
   Value *Elt0 = Builder.CreateExtractElement(A, (uint64_t)0);
   Value *Elt1 = Builder.CreateExtractElement(B, (uint64_t)0);
-  Value *Result = Builder.CreateMul(Elt0, Elt1);
-  for (unsigned I = 1; I < AVec->getNumElements(); I++) {
-Elt0 = Builder.CreateExtractElement(A, I);
-Elt1 = Builder.CreateExtractElement(B, I);
+  Result = Builder.CreateMul(Elt0, Elt1);
+  for (unsigned i = 1; i < AVec->getNumElements(); i++) {

pow2clk wrote:

Done

https://github.com/llvm/llvm-project/pull/104656


[clang] [llvm] [HLSL][SPIRV]Add SPIRV generation for HLSL dot (PR #104656)

2024-08-19 Thread Greg Roth via cfe-commits


@@ -7,155 +7,155 @@
 // RUN:   -o - | FileCheck %s --check-prefixes=CHECK,NO_HALF
 
 #ifdef __HLSL_ENABLE_16_BIT
-// NATIVE_HALF: %dx.dot = mul i16 %0, %1
-// NATIVE_HALF: ret i16 %dx.dot
+// NATIVE_HALF: %hlsl.dot = mul i16 %0, %1
+// NATIVE_HALF: ret i16 %hlsl.dot
 int16_t test_dot_short(int16_t p0, int16_t p1) { return dot(p0, p1); }
 
-// NATIVE_HALF: %dx.dot = call i16 @llvm.dx.sdot.v2i16(<2 x i16> %0, <2 x i16> 
%1)
-// NATIVE_HALF: ret i16 %dx.dot
+// NATIVE_HALF: %hlsl.dot = call i16 @llvm.dx.sdot.v2i16(<2 x i16> %0, <2 x 
i16> %1)
+// NATIVE_HALF: ret i16 %hlsl.dot
 int16_t test_dot_short2(int16_t2 p0, int16_t2 p1) { return dot(p0, p1); }
 
-// NATIVE_HALF: %dx.dot = call i16 @llvm.dx.sdot.v3i16(<3 x i16> %0, <3 x i16> 
%1)

pow2clk wrote:

Done

https://github.com/llvm/llvm-project/pull/104656


[clang] [llvm] [HLSL][SPIRV]Add SPIRV generation for HLSL dot (PR #104656)

2024-08-19 Thread Greg Roth via cfe-commits


@@ -1,161 +1,172 @@
 // RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \
 // RUN:   dxil-pc-shadermodel6.3-library %s -fnative-half-type \
 // RUN:   -emit-llvm -disable-llvm-passes -o - | FileCheck %s \
-// RUN:   --check-prefixes=CHECK,NATIVE_HALF
+// RUN:   --check-prefixes=CHECK,DXCHECK,NATIVE_HALF
 // RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \
 // RUN:   dxil-pc-shadermodel6.3-library %s -emit-llvm -disable-llvm-passes \
-// RUN:   -o - | FileCheck %s --check-prefixes=CHECK,NO_HALF
+// RUN:   -o - | FileCheck %s --check-prefixes=CHECK,DXCHECK,NO_HALF
 
-#ifdef __HLSL_ENABLE_16_BIT
-// NATIVE_HALF: %dx.dot = mul i16 %0, %1
-// NATIVE_HALF: ret i16 %dx.dot
-int16_t test_dot_short(int16_t p0, int16_t p1) { return dot(p0, p1); }
-
-// NATIVE_HALF: %dx.dot = call i16 @llvm.dx.sdot.v2i16(<2 x i16> %0, <2 x i16> 
%1)
-// NATIVE_HALF: ret i16 %dx.dot
-int16_t test_dot_short2(int16_t2 p0, int16_t2 p1) { return dot(p0, p1); }
-
-// NATIVE_HALF: %dx.dot = call i16 @llvm.dx.sdot.v3i16(<3 x i16> %0, <3 x i16> 
%1)
-// NATIVE_HALF: ret i16 %dx.dot
-int16_t test_dot_short3(int16_t3 p0, int16_t3 p1) { return dot(p0, p1); }
-
-// NATIVE_HALF: %dx.dot = call i16 @llvm.dx.sdot.v4i16(<4 x i16> %0, <4 x i16> 
%1)
-// NATIVE_HALF: ret i16 %dx.dot
-int16_t test_dot_short4(int16_t4 p0, int16_t4 p1) { return dot(p0, p1); }
-
-// NATIVE_HALF: %dx.dot = mul i16 %0, %1
-// NATIVE_HALF: ret i16 %dx.dot
-uint16_t test_dot_ushort(uint16_t p0, uint16_t p1) { return dot(p0, p1); }
-
-// NATIVE_HALF: %dx.dot = call i16 @llvm.dx.udot.v2i16(<2 x i16> %0, <2 x i16> 
%1)
-// NATIVE_HALF: ret i16 %dx.dot
-uint16_t test_dot_ushort2(uint16_t2 p0, uint16_t2 p1) { return dot(p0, p1); }
-
-// NATIVE_HALF: %dx.dot = call i16 @llvm.dx.udot.v3i16(<3 x i16> %0, <3 x i16> 
%1)
-// NATIVE_HALF: ret i16 %dx.dot
-uint16_t test_dot_ushort3(uint16_t3 p0, uint16_t3 p1) { return dot(p0, p1); }
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \
+// RUN:   spirv-unknown-vulkan-compute %s -fnative-half-type \
+// RUN:   -emit-llvm -disable-llvm-passes -o - | FileCheck %s \
+// RUN:   --check-prefixes=CHECK,SPVCHECK,NATIVE_HALF
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \
+// RUN:   spirv-unknown-vulkan-compute %s -emit-llvm -disable-llvm-passes \
+// RUN:   -o - | FileCheck %s --check-prefixes=CHECK,SPVCHECK,NO_HALF
 
-// NATIVE_HALF: %dx.dot = call i16 @llvm.dx.udot.v4i16(<4 x i16> %0, <4 x i16> 
%1)
-// NATIVE_HALF: ret i16 %dx.dot
-uint16_t test_dot_ushort4(uint16_t4 p0, uint16_t4 p1) { return dot(p0, p1); }
-#endif
 
-// CHECK: %dx.dot = mul i32 %0, %1
-// CHECK: ret i32 %dx.dot
+// CHECK: %hlsl.dot = mul i32
+// CHECK: ret i32 %hlsl.dot
 int test_dot_int(int p0, int p1) { return dot(p0, p1); }
 
-// CHECK: %dx.dot = call i32 @llvm.dx.sdot.v2i32(<2 x i32> %0, <2 x i32> %1)
-// CHECK: ret i32 %dx.dot
+// Capture the expected interchange format so not every check needs to be 
duplicated
+// DXCHECK: %hlsl.dot = call i32 @llvm.[[ICF:dx]].sdot.v2i32(<2 x i32>
+// SPVCHECK: %hlsl.dot = call i32 @llvm.[[ICF:spv]].sdot.v2i32(<2 x i32>
+// CHECK: ret i32 %hlsl.dot
 int test_dot_int2(int2 p0, int2 p1) { return dot(p0, p1); }
 
-// CHECK: %dx.dot = call i32 @llvm.dx.sdot.v3i32(<3 x i32> %0, <3 x i32> %1)
-// CHECK: ret i32 %dx.dot
+// CHECK: %hlsl.dot = call i32 @llvm.[[ICF]].sdot.v3i32(<3 x i32>
+// CHECK: ret i32 %hlsl.dot
 int test_dot_int3(int3 p0, int3 p1) { return dot(p0, p1); }
 
-// CHECK: %dx.dot = call i32 @llvm.dx.sdot.v4i32(<4 x i32> %0, <4 x i32> %1)
-// CHECK: ret i32 %dx.dot
+// CHECK: %hlsl.dot = call i32 @llvm.[[ICF]].sdot.v4i32(<4 x i32>
+// CHECK: ret i32 %hlsl.dot
 int test_dot_int4(int4 p0, int4 p1) { return dot(p0, p1); }
 
-// CHECK: %dx.dot = mul i32 %0, %1
-// CHECK: ret i32 %dx.dot
+// CHECK: %hlsl.dot = mul i32
+// CHECK: ret i32 %hlsl.dot
 uint test_dot_uint(uint p0, uint p1) { return dot(p0, p1); }
 
-// CHECK: %dx.dot = call i32 @llvm.dx.udot.v2i32(<2 x i32> %0, <2 x i32> %1)
-// CHECK: ret i32 %dx.dot
+// CHECK: %hlsl.dot = call i32 @llvm.[[ICF]].udot.v2i32(<2 x i32>
+// CHECK: ret i32 %hlsl.dot
 uint test_dot_uint2(uint2 p0, uint2 p1) { return dot(p0, p1); }
 
-// CHECK: %dx.dot = call i32 @llvm.dx.udot.v3i32(<3 x i32> %0, <3 x i32> %1)
-// CHECK: ret i32 %dx.dot
+// CHECK: %hlsl.dot = call i32 @llvm.[[ICF]].udot.v3i32(<3 x i32>
+// CHECK: ret i32 %hlsl.dot
 uint test_dot_uint3(uint3 p0, uint3 p1) { return dot(p0, p1); }
 
-// CHECK: %dx.dot = call i32 @llvm.dx.udot.v4i32(<4 x i32> %0, <4 x i32> %1)
-// CHECK: ret i32 %dx.dot
+// CHECK: %hlsl.dot = call i32 @llvm.[[ICF]].udot.v4i32(<4 x i32>
+// CHECK: ret i32 %hlsl.dot
 uint test_dot_uint4(uint4 p0, uint4 p1) { return dot(p0, p1); }
 
-// CHECK: %dx.dot = mul i64 %0, %1
-// CHECK: ret i64 %dx.dot
+// CHECK: %hlsl.dot = mul i64
+// CHECK: ret i64 %hlsl.dot
 int64_t test_dot_long(int64_t p0, int64_t p1) { return dot(p0, p1); }
 
-// CHECK: %dx.dot = call i64 @llvm.dx.sdot.v2i64(<2 x i64> %0, <2 x i64> %1)
-// CH

[clang] [llvm] [HLSL][SPIRV]Add SPIRV generation for HLSL dot (PR #104656)

2024-08-19 Thread Greg Roth via cfe-commits


@@ -1,161 +1,172 @@
 // RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \
 // RUN:   dxil-pc-shadermodel6.3-library %s -fnative-half-type \
 // RUN:   -emit-llvm -disable-llvm-passes -o - | FileCheck %s \
-// RUN:   --check-prefixes=CHECK,NATIVE_HALF
+// RUN:   --check-prefixes=CHECK,DXCHECK,NATIVE_HALF
 // RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \
 // RUN:   dxil-pc-shadermodel6.3-library %s -emit-llvm -disable-llvm-passes \
-// RUN:   -o - | FileCheck %s --check-prefixes=CHECK,NO_HALF
+// RUN:   -o - | FileCheck %s --check-prefixes=CHECK,DXCHECK,NO_HALF
 
-#ifdef __HLSL_ENABLE_16_BIT
-// NATIVE_HALF: %dx.dot = mul i16 %0, %1
-// NATIVE_HALF: ret i16 %dx.dot
-int16_t test_dot_short(int16_t p0, int16_t p1) { return dot(p0, p1); }
-
-// NATIVE_HALF: %dx.dot = call i16 @llvm.dx.sdot.v2i16(<2 x i16> %0, <2 x i16> 
%1)
-// NATIVE_HALF: ret i16 %dx.dot
-int16_t test_dot_short2(int16_t2 p0, int16_t2 p1) { return dot(p0, p1); }
-
-// NATIVE_HALF: %dx.dot = call i16 @llvm.dx.sdot.v3i16(<3 x i16> %0, <3 x i16> 
%1)
-// NATIVE_HALF: ret i16 %dx.dot
-int16_t test_dot_short3(int16_t3 p0, int16_t3 p1) { return dot(p0, p1); }
-
-// NATIVE_HALF: %dx.dot = call i16 @llvm.dx.sdot.v4i16(<4 x i16> %0, <4 x i16> 
%1)
-// NATIVE_HALF: ret i16 %dx.dot
-int16_t test_dot_short4(int16_t4 p0, int16_t4 p1) { return dot(p0, p1); }
-
-// NATIVE_HALF: %dx.dot = mul i16 %0, %1
-// NATIVE_HALF: ret i16 %dx.dot
-uint16_t test_dot_ushort(uint16_t p0, uint16_t p1) { return dot(p0, p1); }
-
-// NATIVE_HALF: %dx.dot = call i16 @llvm.dx.udot.v2i16(<2 x i16> %0, <2 x i16> 
%1)
-// NATIVE_HALF: ret i16 %dx.dot
-uint16_t test_dot_ushort2(uint16_t2 p0, uint16_t2 p1) { return dot(p0, p1); }
-
-// NATIVE_HALF: %dx.dot = call i16 @llvm.dx.udot.v3i16(<3 x i16> %0, <3 x i16> 
%1)
-// NATIVE_HALF: ret i16 %dx.dot
-uint16_t test_dot_ushort3(uint16_t3 p0, uint16_t3 p1) { return dot(p0, p1); }
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \
+// RUN:   spirv-unknown-vulkan-compute %s -fnative-half-type \
+// RUN:   -emit-llvm -disable-llvm-passes -o - | FileCheck %s \
+// RUN:   --check-prefixes=CHECK,SPVCHECK,NATIVE_HALF
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \
+// RUN:   spirv-unknown-vulkan-compute %s -emit-llvm -disable-llvm-passes \
+// RUN:   -o - | FileCheck %s --check-prefixes=CHECK,SPVCHECK,NO_HALF
 
-// NATIVE_HALF: %dx.dot = call i16 @llvm.dx.udot.v4i16(<4 x i16> %0, <4 x i16> 
%1)
-// NATIVE_HALF: ret i16 %dx.dot
-uint16_t test_dot_ushort4(uint16_t4 p0, uint16_t4 p1) { return dot(p0, p1); }
-#endif
 
-// CHECK: %dx.dot = mul i32 %0, %1
-// CHECK: ret i32 %dx.dot
+// CHECK: %hlsl.dot = mul i32
+// CHECK: ret i32 %hlsl.dot
 int test_dot_int(int p0, int p1) { return dot(p0, p1); }
 
-// CHECK: %dx.dot = call i32 @llvm.dx.sdot.v2i32(<2 x i32> %0, <2 x i32> %1)
-// CHECK: ret i32 %dx.dot
+// Capture the expected interchange format so not every check needs to be 
duplicated
+// DXCHECK: %hlsl.dot = call i32 @llvm.[[ICF:dx]].sdot.v2i32(<2 x i32>
+// SPVCHECK: %hlsl.dot = call i32 @llvm.[[ICF:spv]].sdot.v2i32(<2 x i32>
+// CHECK: ret i32 %hlsl.dot
 int test_dot_int2(int2 p0, int2 p1) { return dot(p0, p1); }
 
-// CHECK: %dx.dot = call i32 @llvm.dx.sdot.v3i32(<3 x i32> %0, <3 x i32> %1)
-// CHECK: ret i32 %dx.dot
+// CHECK: %hlsl.dot = call i32 @llvm.[[ICF]].sdot.v3i32(<3 x i32>

pow2clk wrote:

I opted to chop off the rest instead of delving into regexpamancy because that's 
what other tests in this directory have done.

https://github.com/llvm/llvm-project/pull/104656
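
The [[ICF]] capture in the quoted test stands for the one part of the integer
dot intrinsic name that differs between targets. The sketch below spells out
that naming pattern as it appears in the CHECK lines; `Target` and
`dotIntrinsicName` are hypothetical names used only for illustration and are
not part of clang or LLVM.

```cpp
#include <cassert>
#include <string>

// Hypothetical tag for the two targets exercised by the RUN lines.
enum class Target { DirectX, SPIRV };

// Hypothetical helper: builds the intrinsic name the CHECK lines expect,
// e.g. "llvm.dx.sdot.v2i32" or "llvm.spv.udot.v4i32". Only the "dx"/"spv"
// component depends on the target, which is what [[ICF]] captures.
std::string dotIntrinsicName(Target T, bool IsSigned, unsigned NumElts,
                             unsigned Bits) {
  assert(NumElts >= 2 && NumElts <= 4 && "HLSL vectors have 2 to 4 elements");
  std::string Prefix = (T == Target::DirectX) ? "dx" : "spv";
  std::string Op = IsSigned ? "sdot" : "udot";
  return "llvm." + Prefix + "." + Op + ".v" + std::to_string(NumElts) + "i" +
         std::to_string(Bits);
}
```

Because the prefix is fixed once per RUN line, capturing it on the first
vector check lets the later checks share a single CHECK prefix for both
targets.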


[clang] [llvm] [HLSL][SPIRV]Add SPIRV generation for HLSL dot (PR #104656)

2024-08-19 Thread Greg Roth via cfe-commits

https://github.com/pow2clk updated 
https://github.com/llvm/llvm-project/pull/104656

>From 9aff63478b76f042c05b7ae3dd1a2c099dc615de Mon Sep 17 00:00:00 2001
From: Greg Roth 
Date: Fri, 2 Aug 2024 20:10:04 -0600
Subject: [PATCH 1/5] Add SPIRV generation for HLSL dot


[clang] [llvm] [HLSL][SPIRV]Add SPIRV generation for HLSL dot (PR #104656)

2024-08-19 Thread Greg Roth via cfe-commits

https://github.com/pow2clk updated 
https://github.com/llvm/llvm-project/pull/104656

>From 9aff63478b76f042c05b7ae3dd1a2c099dc615de Mon Sep 17 00:00:00 2001
From: Greg Roth 
Date: Fri, 2 Aug 2024 20:10:04 -0600
Subject: [PATCH 1/6] Add SPIRV generation for HLSL dot

