[clang] clang/OpenCL: set sqrt fp accuracy on call to Z4sqrt (PR #66651)
arsenm wrote:
> lgtm, but a strict reading of the spec would filter out arbitrary other
> ext_vector_types

Still think this would be a good follow-up.
https://github.com/llvm/llvm-project/pull/66651
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/66651
rjodinchr wrote: @AnastasiaStulova Could you please take a look? thanks
rjodinchr wrote: What is the next step to get this PR merged?
https://github.com/arsenm approved this pull request.
lgtm, but a strict reading of the spec would filter out arbitrary other ext_vector_types
@@ -5612,6 +5612,10 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
                               BundleList);
     EmitBlock(Cont);
   }
+  if (CI->getCalledFunction() && CI->getCalledFunction()->hasName() &&
+      CI->getCalledFunction()->getName().startswith("_Z4sqrt")) {

arsenm wrote: ext_vector_type is an implementation detail. The standardized vector types just happen to be typedefs of ext_vector_type. It's not undefined behavior; it's just something out of spec that you can do.
rjodinchr wrote: @arsenm?
alan-baker wrote: This is preferable to defining the function in the OpenCL headers (as was noted in the original PR). To me, the important part is covering the OpenCL use case. So if we want to only check the overloads OpenCL generates I think that would be ok, but I expect this generic implementation is sufficient.
rjodinchr wrote: @arsenm could you have another look at this PR? Thank you
@@ -5612,6 +5612,10 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
                               BundleList);
     EmitBlock(Cont);
   }
+  if (CI->getCalledFunction() && CI->getCalledFunction()->hasName() &&
+      CI->getCalledFunction()->getName().startswith("_Z4sqrt")) {

rjodinchr wrote: @arsenm ?
@@ -5612,6 +5612,10 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
                               BundleList);
     EmitBlock(Cont);
   }
+  if (CI->getCalledFunction() && CI->getCalledFunction()->hasName() &&
+      CI->getCalledFunction()->getName().startswith("_Z4sqrt")) {

rjodinchr wrote: Using a custom `ext_vector_type` in OpenCL feels like undefined behaviour. I don't see anything about it in the specification. Am I missing something?
@@ -5612,6 +5612,10 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
                               BundleList);
     EmitBlock(Cont);
   }
+  if (CI->getCalledFunction() && CI->getCalledFunction()->hasName() &&
+      CI->getCalledFunction()->getName().startswith("_Z4sqrt")) {

arsenm wrote: Maybe it's good enough; I guess the Z4 filters out other prefixes. Not sure about arbitrary custom ext_vector_type.
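For context on why the `Z4` part filters out other names: on targets that use Itanium C++ ABI mangling, a plain (non-nested) function symbol is `_Z`, then the decimal length of the identifier, then the identifier itself, then the parameter encoding. The standalone sketch below illustrates that rule only. The helper name `simpleMangledName` is hypothetical and not part of clang, and nested names such as `_ZN...E` are deliberately not handled.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper, for illustration only: extracts the identifier from a
// symbol of the form _Z<length><identifier><parameter-encoding>.
// Returns std::nullopt for anything else (including nested names, _ZN...).
constexpr std::optional<std::string_view>
simpleMangledName(std::string_view sym) {
  if (sym.substr(0, 2) != "_Z")
    return std::nullopt;
  sym.remove_prefix(2);

  // Parse the decimal length that precedes the identifier.
  std::size_t len = 0;
  std::size_t digits = 0;
  while (digits < sym.size() && sym[digits] >= '0' && sym[digits] <= '9') {
    len = len * 10 + static_cast<std::size_t>(sym[digits] - '0');
    ++digits;
  }
  if (digits == 0 || len == 0 || sym.size() - digits < len)
    return std::nullopt;
  return sym.substr(digits, len);
}

// sqrt(float) and sqrt(float4) share the prefix "_Z4sqrt".
static_assert(*simpleMangledName("_Z4sqrtf") == "sqrt");
static_assert(*simpleMangledName("_Z4sqrtDv4_f") == "sqrt");
// rsqrt(float) has a different length prefix, so "_Z4sqrt" does not match it.
static_assert(*simpleMangledName("_Z5rsqrtf") == "rsqrt");
```

Because the length is part of the symbol, a prefix test on `_Z4sqrt` cannot match `rsqrt` or any other identifier that merely ends in `sqrt`. It does still match every overload of a function named exactly `sqrt`, whatever the parameter types are.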
@@ -5612,6 +5612,10 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
                               BundleList);
     EmitBlock(Cont);
   }
+  if (CI->getCalledFunction() && CI->getCalledFunction()->hasName() &&
+      CI->getCalledFunction()->getName().startswith("_Z4sqrt")) {

arsenm wrote: This isn't a specific enough filter. Exactly match the full name for all the types?
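One way to read the "exact match" suggestion is an allowlist of the mangled names of the six single-precision overloads declared in opencl-c.h. The sketch below is standalone C++ and is not the code from this PR. The names `kFloatSqrtNames` and `isOpenCLFloatSqrt` are hypothetical, and restricting the list to `float` is an assumption based on the flag name `-cl-fp32-correctly-rounded-divide-sqrt`.

```cpp
#include <array>
#include <string_view>

// Itanium manglings of sqrt(float), sqrt(float2), sqrt(float3), sqrt(float4),
// sqrt(float8) and sqrt(float16). Vector types mangle as Dv<N>_<element>.
inline constexpr std::array<std::string_view, 6> kFloatSqrtNames = {
    "_Z4sqrtf",     "_Z4sqrtDv2_f", "_Z4sqrtDv3_f",
    "_Z4sqrtDv4_f", "_Z4sqrtDv8_f", "_Z4sqrtDv16_f"};

// Hypothetical predicate: true only for the six standard float overloads.
constexpr bool isOpenCLFloatSqrt(std::string_view name) {
  for (std::string_view candidate : kFloatSqrtNames)
    if (name == candidate)
      return true;
  return false;
}

static_assert(isOpenCLFloatSqrt("_Z4sqrtf"));
static_assert(isOpenCLFloatSqrt("_Z4sqrtDv16_f"));
// A custom 5-element ext_vector_type is rejected, unlike with a prefix test.
static_assert(!isOpenCLFloatSqrt("_Z4sqrtDv5_f"));
// The double overload is rejected as well.
static_assert(!isOpenCLFloatSqrt("_Z4sqrtd"));
```

The trade-off is that the list has to be kept in sync with the declared overloads, whereas a prefix test needs no maintenance but accepts overloads outside the specification.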
rjodinchr wrote: @arsenm Could you review this PR again please?
https://github.com/rjodinchr resolved https://github.com/llvm/llvm-project/pull/66651
https://github.com/rjodinchr updated https://github.com/llvm/llvm-project/pull/66651

>From b6df142239256e979a70896f324f9ed3547c640c Mon Sep 17 00:00:00 2001
From: Romaric Jodin
Date: Mon, 18 Sep 2023 09:34:56 +0200
Subject: [PATCH 1/2] Revert "clang/OpenCL: Add inline implementations of sqrt in builtin header"

This reverts commit 15e0fe0b6122e32657b98daf74a1fce028d2e5bf.
---
 clang/lib/Headers/opencl-c-base.h       |  58 ---
 clang/lib/Headers/opencl-c.h            |  26 +++
 clang/lib/Sema/OpenCLBuiltins.td        |   5 +-
 clang/test/CodeGenOpenCL/sqrt-fpmath.cl | 201
 4 files changed, 27 insertions(+), 263 deletions(-)
 delete mode 100644 clang/test/CodeGenOpenCL/sqrt-fpmath.cl

diff --git a/clang/lib/Headers/opencl-c-base.h b/clang/lib/Headers/opencl-c-base.h
index d56e5ceae652ad5..2494f6213fc5695 100644
--- a/clang/lib/Headers/opencl-c-base.h
+++ b/clang/lib/Headers/opencl-c-base.h
@@ -819,64 +819,6 @@ int printf(__constant const char* st, ...) __attribute__((format(printf, 1, 2)))

 #endif // cl_intel_device_side_avc_motion_estimation

-/**
- * Compute square root.
- *
- * Provide inline implementations using the builtin so that we get appropriate
- * !fpmath based on -cl-fp32-correctly-rounded-divide-sqrt, attached to
- * llvm.sqrt. The implementation should still provide an external definition.
- */
-#define __ovld __attribute__((overloadable))
-#define __cnfn __attribute__((const))
-
-inline float __ovld __cnfn sqrt(float __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float2 __ovld __cnfn sqrt(float2 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float3 __ovld __cnfn sqrt(float3 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float4 __ovld __cnfn sqrt(float4 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float8 __ovld __cnfn sqrt(float8 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float16 __ovld __cnfn sqrt(float16 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-// We only really want to define the float variants here. However
-// -fdeclare-opencl-builtins will not work if some overloads are already
-// provided in the base header, so provide all overloads here.
-
-#ifdef cl_khr_fp64
-double __ovld __cnfn sqrt(double);
-double2 __ovld __cnfn sqrt(double2);
-double3 __ovld __cnfn sqrt(double3);
-double4 __ovld __cnfn sqrt(double4);
-double8 __ovld __cnfn sqrt(double8);
-double16 __ovld __cnfn sqrt(double16);
-#endif //cl_khr_fp64
-#ifdef cl_khr_fp16
-half __ovld __cnfn sqrt(half);
-half2 __ovld __cnfn sqrt(half2);
-half3 __ovld __cnfn sqrt(half3);
-half4 __ovld __cnfn sqrt(half4);
-half8 __ovld __cnfn sqrt(half8);
-half16 __ovld __cnfn sqrt(half16);
-#endif //cl_khr_fp16
-
-#undef __cnfn
-#undef __ovld
-
 // Disable any extensions we may have enabled previously.
 #pragma OPENCL EXTENSION all : disable

diff --git a/clang/lib/Headers/opencl-c.h b/clang/lib/Headers/opencl-c.h
index 1efbbf8f8ee6a01..288bb18bc654ebc 100644
--- a/clang/lib/Headers/opencl-c.h
+++ b/clang/lib/Headers/opencl-c.h
@@ -8496,6 +8496,32 @@ half8 __ovld __cnfn sinpi(half8);
 half16 __ovld __cnfn sinpi(half16);
 #endif //cl_khr_fp16

+/**
+ * Compute square root.
+ */
+float __ovld __cnfn sqrt(float);
+float2 __ovld __cnfn sqrt(float2);
+float3 __ovld __cnfn sqrt(float3);
+float4 __ovld __cnfn sqrt(float4);
+float8 __ovld __cnfn sqrt(float8);
+float16 __ovld __cnfn sqrt(float16);
+#ifdef cl_khr_fp64
+double __ovld __cnfn sqrt(double);
+double2 __ovld __cnfn sqrt(double2);
+double3 __ovld __cnfn sqrt(double3);
+double4 __ovld __cnfn sqrt(double4);
+double8 __ovld __cnfn sqrt(double8);
+double16 __ovld __cnfn sqrt(double16);
+#endif //cl_khr_fp64
+#ifdef cl_khr_fp16
+half __ovld __cnfn sqrt(half);
+half2 __ovld __cnfn sqrt(half2);
+half3 __ovld __cnfn sqrt(half3);
+half4 __ovld __cnfn sqrt(half4);
+half8 __ovld __cnfn sqrt(half8);
+half16 __ovld __cnfn sqrt(half16);
+#endif //cl_khr_fp16
+
 /**
  * Compute tangent.
  */
diff --git a/clang/lib/Sema/OpenCLBuiltins.td b/clang/lib/Sema/OpenCLBuiltins.td
index 9db450281912d2f..0cceba090bd8f26 100644
--- a/clang/lib/Sema/OpenCLBuiltins.td
+++ b/clang/lib/Sema/OpenCLBuiltins.td
@@ -563,15 +563,12 @@ foreach name = ["acos", "acosh", "acospi",
                 "log", "log2", "log10", "log1p", "logb",
                 "rint", "round", "rsqrt",
                 "sin", "sinh", "sinpi",
+                "sqrt",
                 "tan", "tanh", "tanpi",
                 "tgamma", "trunc",
                 "lgamma"] in {
   def : Builtin;
 }
-
-// sqrt is handled in opencl-c-base.h to handle
-// -cl-fp32-correctly-rounded-divide-sqrt.
-
 foreach name = ["nan"] in {
   def : Builtin;
   def : Builtin;
diff --git a/clang/test/CodeGenOpenCL/sqrt-fpmath.cl b/clang/test/CodeGenOpenCL/sqrt-fpmath.cl
deleted file mode 100644
index df30085cba2e7d5..000
--- a/clang/test/CodeGenOpenCL/sqrt-fpmath.cl
+++ /dev/null
@@ -1,201 +0,0 @@
-// Test that float variants of
@@ -5612,6 +5612,10 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
                               BundleList);
     EmitBlock(Cont);
   }
+  if (CI->getCalledFunction() && CI->getCalledFunction()->hasName() &&
+      CI->getCalledFunction()->getName().contains("Z4sqrt")) {

rjodinchr wrote: The language is checked by `SetSqrtFPAccuracy`. I'll have a look at a better match for the name.
@@ -5612,6 +5612,10 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
                               BundleList);
     EmitBlock(Cont);
   }
+  if (CI->getCalledFunction() && CI->getCalledFunction()->hasName() &&
+      CI->getCalledFunction()->getName().contains("Z4sqrt")) {

arsenm wrote: Probably should check the language, and exact match on the permissible names.
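To make the concern about name matching concrete, the standalone sketch below contrasts the `contains("Z4sqrt")` test quoted in this comment with the `startswith("_Z4sqrt")` test quoted in other comments of this thread. It uses `std::string_view` rather than `llvm::StringRef`, and the helper names and the example symbol for `ns::myZ4sqrt` are hypothetical.

```cpp
#include <string_view>

// Substring test, mirroring getName().contains("Z4sqrt").
constexpr bool containsZ4sqrt(std::string_view name) {
  return name.find("Z4sqrt") != std::string_view::npos;
}

// Prefix test, mirroring getName().startswith("_Z4sqrt").
constexpr bool startsWithZ4sqrt(std::string_view name) {
  constexpr std::string_view prefix = "_Z4sqrt";
  return name.substr(0, prefix.size()) == prefix;
}

// The builtin sqrt(float) is accepted by both tests.
static_assert(containsZ4sqrt("_Z4sqrtf") && startsWithZ4sqrt("_Z4sqrtf"));
// A hypothetical function ns::myZ4sqrt(float) only trips the substring test.
static_assert(containsZ4sqrt("_ZN2ns8myZ4sqrtEf"));
static_assert(!startsWithZ4sqrt("_ZN2ns8myZ4sqrtEf"));
// The prefix test still accepts double and non-standard vector overloads.
static_assert(startsWithZ4sqrt("_Z4sqrtd"));
static_assert(startsWithZ4sqrt("_Z4sqrtDv5_f"));
```

The substring test can match unrelated symbols, and the prefix test narrows that to functions named exactly `sqrt` at global scope. Only an exact comparison against the permissible names excludes the remaining overloads.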
https://github.com/rjodinchr edited https://github.com/llvm/llvm-project/pull/66651
llvmbot wrote:

@llvm/pr-subscribers-backend-x86

Changes

This reverts the previous implementation to avoid adding inline functions to the OpenCL headers. Those inline functions were breaking the clspv flow (google/clspv#1231), while https://reviews.llvm.org/D156743 mentioned that just decorating the call node with `!fpmath` was enough. This PR implements that idea. The test has been updated for this implementation.

---
Full diff: https://github.com/llvm/llvm-project/pull/66651.diff

5 Files Affected:

- (modified) clang/lib/CodeGen/CGCall.cpp (+3)
- (modified) clang/lib/Headers/opencl-c-base.h (-58)
- (modified) clang/lib/Headers/opencl-c.h (+26)
- (modified) clang/lib/Sema/OpenCLBuiltins.td (+1-4)
- (modified) clang/test/CodeGenOpenCL/sqrt-fpmath.cl (+51-73)

```diff
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index e15a4634b1d041b..98138811de7f8df 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -5612,6 +5612,9 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
                               BundleList);
     EmitBlock(Cont);
   }
+  if (CI->getCalledFunction()->getName().contains("Z4sqrt")) {
+    SetSqrtFPAccuracy(CI);
+  }
   if (callOrInvoke)
     *callOrInvoke = CI;

diff --git a/clang/lib/Headers/opencl-c-base.h b/clang/lib/Headers/opencl-c-base.h
index d56e5ceae652ad5..2494f6213fc5695 100644
--- a/clang/lib/Headers/opencl-c-base.h
+++ b/clang/lib/Headers/opencl-c-base.h
@@ -819,64 +819,6 @@ int printf(__constant const char* st, ...) __attribute__((format(printf, 1, 2)))

 #endif // cl_intel_device_side_avc_motion_estimation

-/**
- * Compute square root.
- *
- * Provide inline implementations using the builtin so that we get appropriate
- * !fpmath based on -cl-fp32-correctly-rounded-divide-sqrt, attached to
- * llvm.sqrt. The implementation should still provide an external definition.
- */
-#define __ovld __attribute__((overloadable))
-#define __cnfn __attribute__((const))
-
-inline float __ovld __cnfn sqrt(float __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float2 __ovld __cnfn sqrt(float2 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float3 __ovld __cnfn sqrt(float3 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float4 __ovld __cnfn sqrt(float4 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float8 __ovld __cnfn sqrt(float8 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-inline float16 __ovld __cnfn sqrt(float16 __x) {
-  return __builtin_elementwise_sqrt(__x);
-}
-
-// We only really want to define the float variants here. However
-// -fdeclare-opencl-builtins will not work if some overloads are already
-// provided in the base header, so provide all overloads here.
-
-#ifdef cl_khr_fp64
-double __ovld __cnfn sqrt(double);
-double2 __ovld __cnfn sqrt(double2);
-double3 __ovld __cnfn sqrt(double3);
-double4 __ovld __cnfn sqrt(double4);
-double8 __ovld __cnfn sqrt(double8);
-double16 __ovld __cnfn sqrt(double16);
-#endif //cl_khr_fp64
-#ifdef cl_khr_fp16
-half __ovld __cnfn sqrt(half);
-half2 __ovld __cnfn sqrt(half2);
-half3 __ovld __cnfn sqrt(half3);
-half4 __ovld __cnfn sqrt(half4);
-half8 __ovld __cnfn sqrt(half8);
-half16 __ovld __cnfn sqrt(half16);
-#endif //cl_khr_fp16
-
-#undef __cnfn
-#undef __ovld
-
 // Disable any extensions we may have enabled previously.
 #pragma OPENCL EXTENSION all : disable

diff --git a/clang/lib/Headers/opencl-c.h b/clang/lib/Headers/opencl-c.h
index 1efbbf8f8ee6a01..288bb18bc654ebc 100644
--- a/clang/lib/Headers/opencl-c.h
+++ b/clang/lib/Headers/opencl-c.h
@@ -8496,6 +8496,32 @@ half8 __ovld __cnfn sinpi(half8);
 half16 __ovld __cnfn sinpi(half16);
 #endif //cl_khr_fp16

+/**
+ * Compute square root.
+ */
+float __ovld __cnfn sqrt(float);
+float2 __ovld __cnfn sqrt(float2);
+float3 __ovld __cnfn sqrt(float3);
+float4 __ovld __cnfn sqrt(float4);
+float8 __ovld __cnfn sqrt(float8);
+float16 __ovld __cnfn sqrt(float16);
+#ifdef cl_khr_fp64
+double __ovld __cnfn sqrt(double);
+double2 __ovld __cnfn sqrt(double2);
+double3 __ovld __cnfn sqrt(double3);
+double4 __ovld __cnfn sqrt(double4);
+double8 __ovld __cnfn sqrt(double8);
+double16 __ovld __cnfn sqrt(double16);
+#endif //cl_khr_fp64
+#ifdef cl_khr_fp16
+half __ovld __cnfn sqrt(half);
+half2 __ovld __cnfn sqrt(half2);
+half3 __ovld __cnfn sqrt(half3);
+half4 __ovld __cnfn sqrt(half4);
+half8 __ovld __cnfn sqrt(half8);
+half16 __ovld __cnfn sqrt(half16);
+#endif //cl_khr_fp16
+
 /**
  * Compute tangent.
  */
diff --git a/clang/lib/Sema/OpenCLBuiltins.td b/clang/lib/Sema/OpenCLBuiltins.td
index 9db450281912d2f..0cceba090bd8f26 100644
--- a/clang/lib/Sema/OpenCLBuiltins.td
+++ b/clang/lib/Sema/OpenCLBuiltins.td
@@ -563,15 +563,12 @@ foreach name = ["acos", "acosh", "acospi",
                 "log", "log2", "log10", "log1p", "logb",
                 "rint", "round", "rsqrt",
                 "sin", "sinh", "sinpi",
+                "sqrt",
```
rjodinchr wrote: @alan-baker @arsenm @AnastasiaStulova Could you review this PR please? Thank you
https://github.com/rjodinchr created https://github.com/llvm/llvm-project/pull/66651

This reverts the previous implementation to avoid adding inline functions to the OpenCL headers. Those inline functions were breaking the clspv flow (google/clspv#1231), while https://reviews.llvm.org/D156743 mentioned that just decorating the call node with `!fpmath` was enough. This PR implements that idea. The test has been updated for this implementation.