Re: [PATCH] D23627: [CUDA] Improve handling of math functions.

Justin Lebar via cfe-commits Wed, 17 Aug 2016 16:28:09 -0700

jlebar added inline comments.

================
Comment at: clang/lib/Headers/__clang_cuda_cmath.h:125-133
@@ -122,8 +124,11 @@
 __DEVICE__ float modf(float __x, float *__iptr) { return ::modff(__x, __iptr); }
-__DEVICE__ float nexttoward(float __from, float __to) {
+__DEVICE__ float nexttoward(float __from, double __to) {
   return __builtin_nexttowardf(__from, __to);
 }
 __DEVICE__ double nexttoward(double __from, double __to) {
   return __builtin_nexttoward(__from, __to);
 }
+__DEVICE__ float nexttowardf(float __from, double __to) {
+  return __builtin_nexttowardf(__from, __to);
+}
 __DEVICE__ float pow(float __base, float __exp) {
----------------
tra wrote:
> You've got two identical `nexttoward(float, double)` now.
> Perhaps first one was supposed to remain `nexttoward(float, float)` ?
> 
> 
It's hard to see, but the second one is `nexttowardf`, not `nexttoward`.
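
To make the distinction concrete, here is a minimal host-side sketch of the three declarations from the hunk above. It is an illustration only, not the header's code: the `sketch` namespace is hypothetical, and `std::nexttoward` from `<cmath>` stands in for the `__builtin_nexttoward*` calls used in the patch.

```cpp
#include <cmath>
#include <type_traits>

namespace sketch {  // hypothetical namespace, not part of the patch

// Two overloads of the same name; they differ in the first parameter type.
inline float nexttoward(float from, double to) {
  return std::nexttoward(from, static_cast<long double>(to));
}
inline double nexttoward(double from, double to) {
  return std::nexttoward(from, static_cast<long double>(to));
}

// Same parameter list as the float overload above, but a different name
// (note the trailing 'f'), so it is not a redefinition.
inline float nexttowardf(float from, double to) {
  return std::nexttoward(from, static_cast<long double>(to));
}

}  // namespace sketch

// Overload resolution picks the declaration matching the first argument.
static_assert(
    std::is_same<decltype(sketch::nexttoward(1.0f, 2.0)), float>::value,
    "float first argument selects the float overload");
static_assert(
    std::is_same<decltype(sketch::nexttoward(1.0, 2.0)), double>::value,
    "double first argument selects the double overload");
```

Since all three functions have distinct name/signature pairs, the translation unit compiles without a redefinition error.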


================
Comment at: clang/lib/Headers/__clang_cuda_cmath.h:184-197
@@ +183,16 @@
+
+// Defines an overload of __fn that accepts two arithmetic arguments, calls
+// __fn((double)x, (double)y), and returns a double.
+//
+// Note this is different from OVERLOAD_1, which generates an overload that
+// accepts only *integral* arguments.
+#define __CUDA_CLANG_FN_INTEGER_OVERLOAD_2(__retty, __fn)                      \
+  template <typename __T1, typename __T2>                                      \
+  __DEVICE__ typename __clang_cuda_enable_if<                                  \
+      std::numeric_limits<__T1>::is_specialized &&                             \
+          std::numeric_limits<__T2>::is_specialized,                           \
+      __retty>::type                                                           \
+  __fn(__T1 __x, __T2 __y) {                                                   \
+    return __fn((double)__x, (double)__y);                                     \
+  }
+
----------------
tra wrote:
> `is_specialized` will be true for `long double` args and we'll instantiate 
> the function. Can we/should we produce an error instead?
I think it's OK.  Or at least, long double is kind of screwed up at the
moment.  Sometimes we pick `__host__` overloads, sometimes we pick
`__device__` overloads; I made no effort to make it correct.  I'm much more
bullish on making use of long double a compile error as a way to solve these
problems.


https://reviews.llvm.org/D23627



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
