[clang] clang/OpenCL: set sqrt fp accuracy on call to Z4sqrt (PR #66651)

2023-11-30 Thread Matt Arsenault via cfe-commits

arsenm wrote:

> lgtm, but a strict reading of the spec would filter out arbitrary other 
> ext_vector_types

Still think this would be a good follow up 

https://github.com/llvm/llvm-project/pull/66651
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] clang/OpenCL: set sqrt fp accuracy on call to Z4sqrt (PR #66651)

2023-11-30 Thread Matt Arsenault via cfe-commits

https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/66651


[clang] clang/OpenCL: set sqrt fp accuracy on call to Z4sqrt (PR #66651)

2023-11-15 Thread Romaric Jodin via cfe-commits

rjodinchr wrote:

@AnastasiaStulova Could you please take a look? thanks

https://github.com/llvm/llvm-project/pull/66651


[clang] clang/OpenCL: set sqrt fp accuracy on call to Z4sqrt (PR #66651)

2023-11-03 Thread Romaric Jodin via cfe-commits

rjodinchr wrote:

What is the next step to get this PR merged?

https://github.com/llvm/llvm-project/pull/66651


[clang] clang/OpenCL: set sqrt fp accuracy on call to Z4sqrt (PR #66651)

2023-10-27 Thread Matt Arsenault via cfe-commits

https://github.com/arsenm approved this pull request.

lgtm, but a strict reading of the spec would filter out arbitrary other 
ext_vector_types 

https://github.com/llvm/llvm-project/pull/66651


[clang] clang/OpenCL: set sqrt fp accuracy on call to Z4sqrt (PR #66651)

2023-10-27 Thread Matt Arsenault via cfe-commits


@@ -5612,6 +5612,10 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
                               BundleList);
     EmitBlock(Cont);
   }
+  if (CI->getCalledFunction() && CI->getCalledFunction()->hasName() &&
+      CI->getCalledFunction()->getName().startswith("_Z4sqrt")) {

arsenm wrote:

ext_vector_type is an implementation detail. The standardized vector types just 
happen to be typedefs of ext_vector_type.  It's not undefined behavior, it's 
just something out of spec you can do

https://github.com/llvm/llvm-project/pull/66651


[clang] clang/OpenCL: set sqrt fp accuracy on call to Z4sqrt (PR #66651)

2023-10-24 Thread Romaric Jodin via cfe-commits

rjodinchr wrote:

@arsenm?

https://github.com/llvm/llvm-project/pull/66651


[clang] clang/OpenCL: set sqrt fp accuracy on call to Z4sqrt (PR #66651)

2023-10-18 Thread via cfe-commits

alan-baker wrote:

This is preferable to defining the function in the OpenCL headers (as was noted 
in the original PR). To me, the important part is covering the OpenCL use case. 
So if we want to only check the overloads OpenCL generates I think that would 
be ok, but I expect this generic implementation is sufficient.

https://github.com/llvm/llvm-project/pull/66651


[clang] clang/OpenCL: set sqrt fp accuracy on call to Z4sqrt (PR #66651)

2023-10-18 Thread Romaric Jodin via cfe-commits

rjodinchr wrote:

@arsenm could you have another look at this PR?
Thank you

https://github.com/llvm/llvm-project/pull/66651


[clang] clang/OpenCL: set sqrt fp accuracy on call to Z4sqrt (PR #66651)

2023-10-02 Thread Romaric Jodin via cfe-commits


@@ -5612,6 +5612,10 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
                               BundleList);
     EmitBlock(Cont);
   }
+  if (CI->getCalledFunction() && CI->getCalledFunction()->hasName() &&
+      CI->getCalledFunction()->getName().startswith("_Z4sqrt")) {

rjodinchr wrote:

@arsenm ?

https://github.com/llvm/llvm-project/pull/66651


[clang] clang/OpenCL: set sqrt fp accuracy on call to Z4sqrt (PR #66651)

2023-09-25 Thread Romaric Jodin via cfe-commits


@@ -5612,6 +5612,10 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
                               BundleList);
     EmitBlock(Cont);
   }
+  if (CI->getCalledFunction() && CI->getCalledFunction()->hasName() &&
+      CI->getCalledFunction()->getName().startswith("_Z4sqrt")) {

rjodinchr wrote:

Using a custom `ext_vector_type` in OpenCL feels like undefined behaviour. I 
don't see anything about it in the specification.
Am I missing something?

https://github.com/llvm/llvm-project/pull/66651


[clang] clang/OpenCL: set sqrt fp accuracy on call to Z4sqrt (PR #66651)

2023-09-25 Thread Matt Arsenault via cfe-commits


@@ -5612,6 +5612,10 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
                               BundleList);
     EmitBlock(Cont);
   }
+  if (CI->getCalledFunction() && CI->getCalledFunction()->hasName() &&
+      CI->getCalledFunction()->getName().startswith("_Z4sqrt")) {

arsenm wrote:

Maybe it's good enough, I guess the Z4 filters out other prefixes. Not sure 
about arbitrary custom ext_vector_type

https://github.com/llvm/llvm-project/pull/66651
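The trade-off between the two filters discussed in this review can be sketched outside of clang. The helpers below are hypothetical stand-ins written with `std::string_view`, not the `llvm::StringRef` calls from the patch, and the symbol names are illustrative Itanium-style manglings chosen for the example. The point is only that the length prefix in `_Z4sqrt` rejects `rsqrt` and `native_sqrt`, while a 5-wide vector or an `int` overload still passes the prefix test.

```cpp
#include <cassert>
#include <string_view>

// Substring test, mirroring the first revision's `contains("Z4sqrt")`.
inline bool containsZ4sqrt(std::string_view Name) {
  return Name.find("Z4sqrt") != std::string_view::npos;
}

// Prefix test, mirroring the revised `startswith("_Z4sqrt")`.
inline bool startsWithZ4sqrt(std::string_view Name) {
  const std::string_view Prefix = "_Z4sqrt";
  return Name.substr(0, Prefix.size()) == Prefix;
}

// Small self-check over illustrative (assumed) mangled names.
inline void checkPrefixFilter() {
  assert(startsWithZ4sqrt("_Z4sqrtf"));           // sqrt(float)
  assert(startsWithZ4sqrt("_Z4sqrtDv4_f"));       // sqrt(float4)
  assert(!startsWithZ4sqrt("_Z5rsqrtf"));         // rsqrt: length prefix is 5
  assert(!startsWithZ4sqrt("_Z11native_sqrtf"));  // native_sqrt: length 11
  // A longer symbol that merely embeds the text: only `contains` matches.
  assert(containsZ4sqrt("_Z3fooIXadL_Z4sqrtfEEEvv"));
  assert(!startsWithZ4sqrt("_Z3fooIXadL_Z4sqrtfEEEvv"));
  // Still accepted by the prefix: a 5-wide vector and an int overload.
  assert(startsWithZ4sqrt("_Z4sqrtDv5_f"));
  assert(startsWithZ4sqrt("_Z4sqrti"));
}
```

Calling `checkPrefixFilter()` exercises both helpers. The last two assertions correspond to the remaining doubt about arbitrary custom `ext_vector_type` arguments.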


[clang] clang/OpenCL: set sqrt fp accuracy on call to Z4sqrt (PR #66651)

2023-09-25 Thread Matt Arsenault via cfe-commits


@@ -5612,6 +5612,10 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
                               BundleList);
     EmitBlock(Cont);
   }
+  if (CI->getCalledFunction() && CI->getCalledFunction()->hasName() &&
+      CI->getCalledFunction()->getName().startswith("_Z4sqrt")) {

arsenm wrote:

This isn't a specific enough filter. Exactly match the full name for all the 
types?

https://github.com/llvm/llvm-project/pull/66651
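For comparison, an exact-match filter along the lines suggested here might look like the sketch below. It is not code from the PR: `openclSqrtNames` and `isOpenCLSqrt` are hypothetical helpers, and the names assume Itanium-style mangling (`f`, `d`, `Dh` for float, double, half, and `Dv<N>_<elem>` for vectors), which may differ on other ABIs.

```cpp
#include <cassert>
#include <set>
#include <string>

// Builds the assumed mangled names of sqrt for the OpenCL overloads declared
// in opencl-c.h: float, double and half, scalar plus widths 2, 3, 4, 8, 16.
inline std::set<std::string> openclSqrtNames() {
  const char *Elems[] = {"f", "d", "Dh"};
  const int Widths[] = {2, 3, 4, 8, 16};
  std::set<std::string> Names;
  for (const char *E : Elems) {
    Names.insert(std::string("_Z4sqrt") + E);  // scalar overload
    for (int W : Widths)
      Names.insert("_Z4sqrtDv" + std::to_string(W) + "_" + E);  // vector
  }
  return Names;
}

// Exact match against the generated set (built once on first use).
inline bool isOpenCLSqrt(const std::string &Name) {
  static const std::set<std::string> Names = openclSqrtNames();
  return Names.count(Name) != 0;
}
```

With this filter `isOpenCLSqrt("_Z4sqrtDv4_f")` holds, while `_Z4sqrtDv5_f` and `_Z4sqrti` are rejected. The cost is a table that has to track the header's overload list.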


[clang] clang/OpenCL: set sqrt fp accuracy on call to Z4sqrt (PR #66651)

2023-09-25 Thread Romaric Jodin via cfe-commits

rjodinchr wrote:

@arsenm Could you review this PR again please?

https://github.com/llvm/llvm-project/pull/66651


[clang] clang/OpenCL: set sqrt fp accuracy on call to Z4sqrt (PR #66651)

2023-09-20 Thread Romaric Jodin via cfe-commits

https://github.com/rjodinchr resolved 
https://github.com/llvm/llvm-project/pull/66651


[clang] clang/OpenCL: set sqrt fp accuracy on call to Z4sqrt (PR #66651)

2023-09-18 Thread Romaric Jodin via cfe-commits

https://github.com/rjodinchr updated 
https://github.com/llvm/llvm-project/pull/66651

>From b6df142239256e979a70896f324f9ed3547c640c Mon Sep 17 00:00:00 2001
From: Romaric Jodin 
Date: Mon, 18 Sep 2023 09:34:56 +0200
Subject: [PATCH 1/2] Revert "clang/OpenCL: Add inline implementations of sqrt
 in builtin header"

This reverts commit 15e0fe0b6122e32657b98daf74a1fce028d2e5bf.
---
 clang/lib/Headers/opencl-c-base.h   |  58 ---
 clang/lib/Headers/opencl-c.h|  26 +++
 clang/lib/Sema/OpenCLBuiltins.td|   5 +-
 clang/test/CodeGenOpenCL/sqrt-fpmath.cl | 201 
 4 files changed, 27 insertions(+), 263 deletions(-)
 delete mode 100644 clang/test/CodeGenOpenCL/sqrt-fpmath.cl

diff --git a/clang/lib/Headers/opencl-c-base.h 
b/clang/lib/Headers/opencl-c-base.h
index d56e5ceae652ad5..2494f6213fc5695 100644
--- a/clang/lib/Headers/opencl-c-base.h
+++ b/clang/lib/Headers/opencl-c-base.h
@@ -819,64 +819,6 @@ int printf(__constant const char* st, ...) 
__attribute__((format(printf, 1, 2)))
 
 #endif // cl_intel_device_side_avc_motion_estimation
 
-/**
- * Compute square root.
- *
- * Provide inline implementations using the builtin so that we get appropriate
- * !fpmath based on -cl-fp32-correctly-rounded-divide-sqrt, attached to
- * llvm.sqrt. The implementation should still provide an external definition.
- */
-#define __ovld __attribute__((overloadable))
-#define __cnfn __attribute__((const))
-
-inline float __ovld __cnfn sqrt(float __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float2 __ovld __cnfn sqrt(float2 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float3 __ovld __cnfn sqrt(float3 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float4 __ovld __cnfn sqrt(float4 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float8 __ovld __cnfn sqrt(float8 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float16 __ovld __cnfn sqrt(float16 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-// We only really want to define the float variants here. However
-// -fdeclare-opencl-builtins will not work if some overloads are already
- // provided in the base header, so provide all overloads here.
-
-#ifdef cl_khr_fp64
-double __ovld __cnfn sqrt(double);
-double2 __ovld __cnfn sqrt(double2);
-double3 __ovld __cnfn sqrt(double3);
-double4 __ovld __cnfn sqrt(double4);
-double8 __ovld __cnfn sqrt(double8);
-double16 __ovld __cnfn sqrt(double16);
-#endif //cl_khr_fp64
-#ifdef cl_khr_fp16
-half __ovld __cnfn sqrt(half);
-half2 __ovld __cnfn sqrt(half2);
-half3 __ovld __cnfn sqrt(half3);
-half4 __ovld __cnfn sqrt(half4);
-half8 __ovld __cnfn sqrt(half8);
-half16 __ovld __cnfn sqrt(half16);
-#endif //cl_khr_fp16
-
-#undef __cnfn
-#undef __ovld
-
 // Disable any extensions we may have enabled previously.
 #pragma OPENCL EXTENSION all : disable
 
diff --git a/clang/lib/Headers/opencl-c.h b/clang/lib/Headers/opencl-c.h
index 1efbbf8f8ee6a01..288bb18bc654ebc 100644
--- a/clang/lib/Headers/opencl-c.h
+++ b/clang/lib/Headers/opencl-c.h
@@ -8496,6 +8496,32 @@ half8 __ovld __cnfn sinpi(half8);
 half16 __ovld __cnfn sinpi(half16);
 #endif //cl_khr_fp16
 
+/**
+ * Compute square root.
+ */
+float __ovld __cnfn sqrt(float);
+float2 __ovld __cnfn sqrt(float2);
+float3 __ovld __cnfn sqrt(float3);
+float4 __ovld __cnfn sqrt(float4);
+float8 __ovld __cnfn sqrt(float8);
+float16 __ovld __cnfn sqrt(float16);
+#ifdef cl_khr_fp64
+double __ovld __cnfn sqrt(double);
+double2 __ovld __cnfn sqrt(double2);
+double3 __ovld __cnfn sqrt(double3);
+double4 __ovld __cnfn sqrt(double4);
+double8 __ovld __cnfn sqrt(double8);
+double16 __ovld __cnfn sqrt(double16);
+#endif //cl_khr_fp64
+#ifdef cl_khr_fp16
+half __ovld __cnfn sqrt(half);
+half2 __ovld __cnfn sqrt(half2);
+half3 __ovld __cnfn sqrt(half3);
+half4 __ovld __cnfn sqrt(half4);
+half8 __ovld __cnfn sqrt(half8);
+half16 __ovld __cnfn sqrt(half16);
+#endif //cl_khr_fp16
+
 /**
  * Compute tangent.
  */
diff --git a/clang/lib/Sema/OpenCLBuiltins.td b/clang/lib/Sema/OpenCLBuiltins.td
index 9db450281912d2f..0cceba090bd8f26 100644
--- a/clang/lib/Sema/OpenCLBuiltins.td
+++ b/clang/lib/Sema/OpenCLBuiltins.td
@@ -563,15 +563,12 @@ foreach name = ["acos", "acosh", "acospi",
 "log", "log2", "log10", "log1p", "logb",
 "rint", "round", "rsqrt",
 "sin", "sinh", "sinpi",
+"sqrt",
 "tan", "tanh", "tanpi",
 "tgamma", "trunc",
 "lgamma"] in {
 def : Builtin;
 }
-
-// sqrt is handled in opencl-c-base.h to handle
-// -cl-fp32-correctly-rounded-divide-sqrt.
-
 foreach name = ["nan"] in {
   def : Builtin;
   def : Builtin;
diff --git a/clang/test/CodeGenOpenCL/sqrt-fpmath.cl 
b/clang/test/CodeGenOpenCL/sqrt-fpmath.cl
deleted file mode 100644
index df30085cba2e7d5..000
--- a/clang/test/CodeGenOpenCL/sqrt-fpmath.cl
+++ /dev/null
@@ -1,201 +0,0 @@
-// Test that float variants of 

[clang] clang/OpenCL: set sqrt fp accuracy on call to Z4sqrt (PR #66651)

2023-09-18 Thread Romaric Jodin via cfe-commits


@@ -5612,6 +5612,10 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
                               BundleList);
     EmitBlock(Cont);
   }
+  if (CI->getCalledFunction() && CI->getCalledFunction()->hasName() &&
+      CI->getCalledFunction()->getName().contains("Z4sqrt")) {

rjodinchr wrote:

The language is checked by `SetSqrtFPAccuracy`. I'll look for a better match 
for the name.

https://github.com/llvm/llvm-project/pull/66651
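As a rough illustration of why the call site may not need its own language check, the decision delegated to `SetSqrtFPAccuracy` can be modelled as below. This is a simplified stand-alone model, not clang's implementation: `Options`, `Elem` and `sqrtFPAccuracy` are hypothetical names, only the OpenCL case is modelled, and the 3 ULP bound is my understanding of the OpenCL single-precision `sqrt` requirement rather than something stated in this thread.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-ins for the compiler state consulted by the check.
struct Options {
  bool OpenCL = false;                   // compiling OpenCL C
  bool CorrectlyRoundedDivSqrt = false;  // -cl-fp32-correctly-rounded-divide-sqrt
};

enum class Elem { Half, Float, Double };

// Returns the ULP bound to record as !fpmath on the call, or nullopt when
// the call should be left undecorated.
inline std::optional<float> sqrtFPAccuracy(const Options &Opts, Elem Ty) {
  if (Ty != Elem::Float)
    return std::nullopt;  // only fp32 results are relaxed
  if (!Opts.OpenCL || Opts.CorrectlyRoundedDivSqrt)
    return std::nullopt;  // wrong language, or correct rounding requested
  return 3.0f;            // assumed OpenCL bound for single-precision sqrt
}
```

Under this model a non-OpenCL compilation never gets the metadata even if a function happens to be named `_Z4sqrtf`, which is the situation the reply describes.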


[clang] clang/OpenCL: set sqrt fp accuracy on call to Z4sqrt (PR #66651)

2023-09-18 Thread Matt Arsenault via cfe-commits


@@ -5612,6 +5612,10 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
                               BundleList);
     EmitBlock(Cont);
   }
+  if (CI->getCalledFunction() && CI->getCalledFunction()->hasName() &&
+      CI->getCalledFunction()->getName().contains("Z4sqrt")) {

arsenm wrote:

Probably should check the language, and exact match on the permissible names 

https://github.com/llvm/llvm-project/pull/66651


[clang] clang/OpenCL: set sqrt fp accuracy on call to Z4sqrt (PR #66651)

2023-09-18 Thread Romaric Jodin via cfe-commits

https://github.com/rjodinchr updated 
https://github.com/llvm/llvm-project/pull/66651


[clang] clang/OpenCL: set sqrt fp accuracy on call to Z4sqrt (PR #66651)

2023-09-18 Thread Romaric Jodin via cfe-commits

https://github.com/rjodinchr edited 
https://github.com/llvm/llvm-project/pull/66651


[clang] clang/OpenCL: set sqrt fp accuracy on call to Z4sqrt (PR #66651)

2023-09-18 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-x86


Changes

This reverts the previous implementation to avoid adding inline functions to 
the OpenCL headers.
The inline functions were breaking the clspv flow (google/clspv#1231), while 
https://reviews.llvm.org/D156743 mentioned that just decorating the call node 
with `!fpmath` was enough.
This PR implements that idea.
The test has been updated for this implementation.


---
Full diff: https://github.com/llvm/llvm-project/pull/66651.diff


5 Files Affected:

- (modified) clang/lib/CodeGen/CGCall.cpp (+3) 
- (modified) clang/lib/Headers/opencl-c-base.h (-58) 
- (modified) clang/lib/Headers/opencl-c.h (+26) 
- (modified) clang/lib/Sema/OpenCLBuiltins.td (+1-4) 
- (modified) clang/test/CodeGenOpenCL/sqrt-fpmath.cl (+51-73) 


``diff
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index e15a4634b1d041b..98138811de7f8df 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -5612,6 +5612,9 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
                               BundleList);
     EmitBlock(Cont);
   }
+  if (CI->getCalledFunction()->getName().contains("Z4sqrt")) {
+    SetSqrtFPAccuracy(CI);
+  }
   if (callOrInvoke)
     *callOrInvoke = CI;
 
diff --git a/clang/lib/Headers/opencl-c-base.h 
b/clang/lib/Headers/opencl-c-base.h
index d56e5ceae652ad5..2494f6213fc5695 100644
--- a/clang/lib/Headers/opencl-c-base.h
+++ b/clang/lib/Headers/opencl-c-base.h
@@ -819,64 +819,6 @@ int printf(__constant const char* st, ...) 
__attribute__((format(printf, 1, 2)))
 
 #endif // cl_intel_device_side_avc_motion_estimation
 
-/**
- * Compute square root.
- *
- * Provide inline implementations using the builtin so that we get appropriate
- * !fpmath based on -cl-fp32-correctly-rounded-divide-sqrt, attached to
- * llvm.sqrt. The implementation should still provide an external definition.
- */
-#define __ovld __attribute__((overloadable))
-#define __cnfn __attribute__((const))
-
-inline float __ovld __cnfn sqrt(float __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float2 __ovld __cnfn sqrt(float2 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float3 __ovld __cnfn sqrt(float3 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float4 __ovld __cnfn sqrt(float4 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float8 __ovld __cnfn sqrt(float8 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float16 __ovld __cnfn sqrt(float16 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-// We only really want to define the float variants here. However
-// -fdeclare-opencl-builtins will not work if some overloads are already
- // provided in the base header, so provide all overloads here.
-
-#ifdef cl_khr_fp64
-double __ovld __cnfn sqrt(double);
-double2 __ovld __cnfn sqrt(double2);
-double3 __ovld __cnfn sqrt(double3);
-double4 __ovld __cnfn sqrt(double4);
-double8 __ovld __cnfn sqrt(double8);
-double16 __ovld __cnfn sqrt(double16);
-#endif //cl_khr_fp64
-#ifdef cl_khr_fp16
-half __ovld __cnfn sqrt(half);
-half2 __ovld __cnfn sqrt(half2);
-half3 __ovld __cnfn sqrt(half3);
-half4 __ovld __cnfn sqrt(half4);
-half8 __ovld __cnfn sqrt(half8);
-half16 __ovld __cnfn sqrt(half16);
-#endif //cl_khr_fp16
-
-#undef __cnfn
-#undef __ovld
-
 // Disable any extensions we may have enabled previously.
 #pragma OPENCL EXTENSION all : disable
 
diff --git a/clang/lib/Headers/opencl-c.h b/clang/lib/Headers/opencl-c.h
index 1efbbf8f8ee6a01..288bb18bc654ebc 100644
--- a/clang/lib/Headers/opencl-c.h
+++ b/clang/lib/Headers/opencl-c.h
@@ -8496,6 +8496,32 @@ half8 __ovld __cnfn sinpi(half8);
 half16 __ovld __cnfn sinpi(half16);
 #endif //cl_khr_fp16
 
+/**
+ * Compute square root.
+ */
+float __ovld __cnfn sqrt(float);
+float2 __ovld __cnfn sqrt(float2);
+float3 __ovld __cnfn sqrt(float3);
+float4 __ovld __cnfn sqrt(float4);
+float8 __ovld __cnfn sqrt(float8);
+float16 __ovld __cnfn sqrt(float16);
+#ifdef cl_khr_fp64
+double __ovld __cnfn sqrt(double);
+double2 __ovld __cnfn sqrt(double2);
+double3 __ovld __cnfn sqrt(double3);
+double4 __ovld __cnfn sqrt(double4);
+double8 __ovld __cnfn sqrt(double8);
+double16 __ovld __cnfn sqrt(double16);
+#endif //cl_khr_fp64
+#ifdef cl_khr_fp16
+half __ovld __cnfn sqrt(half);
+half2 __ovld __cnfn sqrt(half2);
+half3 __ovld __cnfn sqrt(half3);
+half4 __ovld __cnfn sqrt(half4);
+half8 __ovld __cnfn sqrt(half8);
+half16 __ovld __cnfn sqrt(half16);
+#endif //cl_khr_fp16
+
 /**
  * Compute tangent.
  */
diff --git a/clang/lib/Sema/OpenCLBuiltins.td b/clang/lib/Sema/OpenCLBuiltins.td
index 9db450281912d2f..0cceba090bd8f26 100644
--- a/clang/lib/Sema/OpenCLBuiltins.td
+++ b/clang/lib/Sema/OpenCLBuiltins.td
@@ -563,15 +563,12 @@ foreach name = ["acos", "acosh", "acospi",
 "log", "log2", "log10", "log1p", "logb",
 "rint", "round", "rsqrt",
 "sin", "sinh", "sinpi",
+"sqrt",
 

[clang] clang/OpenCL: set sqrt fp accuracy on call to Z4sqrt (PR #66651)

2023-09-18 Thread Romaric Jodin via cfe-commits

rjodinchr wrote:

@alan-baker @arsenm @AnastasiaStulova 
Could you review this PR please?
Thank you

https://github.com/llvm/llvm-project/pull/66651


[clang] clang/OpenCL: set sqrt fp accuracy on call to Z4sqrt (PR #66651)

2023-09-18 Thread Romaric Jodin via cfe-commits

https://github.com/rjodinchr created 
https://github.com/llvm/llvm-project/pull/66651

This reverts the previous implementation to avoid adding inline functions to 
the OpenCL headers.
The inline functions were breaking the clspv flow (google/clspv#1231), while 
https://reviews.llvm.org/D156743 mentioned that just decorating the call node 
with `!fpmath` was enough.
This PR implements that idea.
The test has been updated for this implementation.


>From b6df142239256e979a70896f324f9ed3547c640c Mon Sep 17 00:00:00 2001
From: Romaric Jodin 
Date: Mon, 18 Sep 2023 09:34:56 +0200
Subject: [PATCH 1/2] Revert "clang/OpenCL: Add inline implementations of sqrt
 in builtin header"

This reverts commit 15e0fe0b6122e32657b98daf74a1fce028d2e5bf.
---
 clang/lib/Headers/opencl-c-base.h   |  58 ---
 clang/lib/Headers/opencl-c.h|  26 +++
 clang/lib/Sema/OpenCLBuiltins.td|   5 +-
 clang/test/CodeGenOpenCL/sqrt-fpmath.cl | 201 
 4 files changed, 27 insertions(+), 263 deletions(-)
 delete mode 100644 clang/test/CodeGenOpenCL/sqrt-fpmath.cl

diff --git a/clang/lib/Headers/opencl-c-base.h 
b/clang/lib/Headers/opencl-c-base.h
index d56e5ceae652ad5..2494f6213fc5695 100644
--- a/clang/lib/Headers/opencl-c-base.h
+++ b/clang/lib/Headers/opencl-c-base.h
@@ -819,64 +819,6 @@ int printf(__constant const char* st, ...) 
__attribute__((format(printf, 1, 2)))
 
 #endif // cl_intel_device_side_avc_motion_estimation
 
-/**
- * Compute square root.
- *
- * Provide inline implementations using the builtin so that we get appropriate
- * !fpmath based on -cl-fp32-correctly-rounded-divide-sqrt, attached to
- * llvm.sqrt. The implementation should still provide an external definition.
- */
-#define __ovld __attribute__((overloadable))
-#define __cnfn __attribute__((const))
-
-inline float __ovld __cnfn sqrt(float __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float2 __ovld __cnfn sqrt(float2 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float3 __ovld __cnfn sqrt(float3 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float4 __ovld __cnfn sqrt(float4 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float8 __ovld __cnfn sqrt(float8 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float16 __ovld __cnfn sqrt(float16 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-// We only really want to define the float variants here. However
-// -fdeclare-opencl-builtins will not work if some overloads are already
- // provided in the base header, so provide all overloads here.
-
-#ifdef cl_khr_fp64
-double __ovld __cnfn sqrt(double);
-double2 __ovld __cnfn sqrt(double2);
-double3 __ovld __cnfn sqrt(double3);
-double4 __ovld __cnfn sqrt(double4);
-double8 __ovld __cnfn sqrt(double8);
-double16 __ovld __cnfn sqrt(double16);
-#endif //cl_khr_fp64
-#ifdef cl_khr_fp16
-half __ovld __cnfn sqrt(half);
-half2 __ovld __cnfn sqrt(half2);
-half3 __ovld __cnfn sqrt(half3);
-half4 __ovld __cnfn sqrt(half4);
-half8 __ovld __cnfn sqrt(half8);
-half16 __ovld __cnfn sqrt(half16);
-#endif //cl_khr_fp16
-
-#undef __cnfn
-#undef __ovld
-
 // Disable any extensions we may have enabled previously.
 #pragma OPENCL EXTENSION all : disable
 
diff --git a/clang/lib/Headers/opencl-c.h b/clang/lib/Headers/opencl-c.h
index 1efbbf8f8ee6a01..288bb18bc654ebc 100644
--- a/clang/lib/Headers/opencl-c.h
+++ b/clang/lib/Headers/opencl-c.h
@@ -8496,6 +8496,32 @@ half8 __ovld __cnfn sinpi(half8);
 half16 __ovld __cnfn sinpi(half16);
 #endif //cl_khr_fp16
 
+/**
+ * Compute square root.
+ */
+float __ovld __cnfn sqrt(float);
+float2 __ovld __cnfn sqrt(float2);
+float3 __ovld __cnfn sqrt(float3);
+float4 __ovld __cnfn sqrt(float4);
+float8 __ovld __cnfn sqrt(float8);
+float16 __ovld __cnfn sqrt(float16);
+#ifdef cl_khr_fp64
+double __ovld __cnfn sqrt(double);
+double2 __ovld __cnfn sqrt(double2);
+double3 __ovld __cnfn sqrt(double3);
+double4 __ovld __cnfn sqrt(double4);
+double8 __ovld __cnfn sqrt(double8);
+double16 __ovld __cnfn sqrt(double16);
+#endif //cl_khr_fp64
+#ifdef cl_khr_fp16
+half __ovld __cnfn sqrt(half);
+half2 __ovld __cnfn sqrt(half2);
+half3 __ovld __cnfn sqrt(half3);
+half4 __ovld __cnfn sqrt(half4);
+half8 __ovld __cnfn sqrt(half8);
+half16 __ovld __cnfn sqrt(half16);
+#endif //cl_khr_fp16
+
 /**
  * Compute tangent.
  */
diff --git a/clang/lib/Sema/OpenCLBuiltins.td b/clang/lib/Sema/OpenCLBuiltins.td
index 9db450281912d2f..0cceba090bd8f26 100644
--- a/clang/lib/Sema/OpenCLBuiltins.td
+++ b/clang/lib/Sema/OpenCLBuiltins.td
@@ -563,15 +563,12 @@ foreach name = ["acos", "acosh", "acospi",
 "log", "log2", "log10", "log1p", "logb",
 "rint", "round", "rsqrt",
 "sin", "sinh", "sinpi",
+"sqrt",
 "tan", "tanh", "tanpi",
 "tgamma", "trunc",
 "lgamma"] in {
 def : Builtin;
 }
-
-// sqrt is handled in opencl-c-base.h to handle
-//