Re: [PATCH] D16870: [CUDA] Tweak attribute-based overload resolution to match nvcc behavior.

Justin Lebar via cfe-commits Thu, 04 Feb 2016 11:06:45 -0800

jlebar accepted this revision.
jlebar added a comment.
This revision is now accepted and ready to land.


Looks sane to me.  Just some suggestions on the comments.


================
Comment at: lib/Sema/SemaCUDA.cpp:71
@@ -70,3 +70,3 @@
 // H  - handled in (x)
-// Preferences: b-best, f-fallback, l-last resort, n-never.
+// Preferences: +:native, *:host-device, o:same side, .:wrong side, -:never.
 //
----------------
If we're going to use symbols rather than letters, could we use 4, 3, 2, 1, 0?  
I think that would be easier to follow.

================
Comment at: lib/Sema/SemaCUDA.cpp:127
@@ -132,9 +126,3 @@
   if (CallerTarget == CFT_HostDevice) {
-    // Calling a function that matches compilation mode is OK.
-    // Calling a function from the other side is frowned upon.
-    if (getLangOpts().CUDAIsDevice)
-      return CalleeTarget == CFT_Device ? CFP_Fallback : QuestionableResult;
-    else
-      return (CalleeTarget == CFT_Host || CalleeTarget == CFT_Global)
-                 ? CFP_Fallback
-                 : QuestionableResult;
+    // It's OK to call mode-matching function from HD one.
+    if ((getLangOpts().CUDAIsDevice && CalleeTarget == CFT_Device) ||
----------------
Nit: "It's OK to call a mode-matching function from an HD function."

================
Comment at: lib/Sema/SemaOverload.cpp:8536
@@ +8535,3 @@
+  // compatible with existing code that relies on this. If we see such
+  // a case, return better variant right away.
+  if (S.getLangOpts().CUDA && S.getLangOpts().CUDATargetOverloads &&
----------------
Since we have language lawyers on the team, I suggest adding articles to the comment:

If an HD function calls a function which has host-only and device-only 
overloads, nvcc sees only the host-side function during host compilation and 
only the device function during device-side compilation.  (This appears to be a 
side effect of its splitting of host and device code into separate TUs.)  Alas,
we need to be compatible with existing code that relies on this, so if we see 
such a case, return the better variant right away.

I actually might suggest rephrasing this a bit more, to something like:

When performing host-side compilation, nvcc doesn't see device functions, and 
similarly when performing device-side compilation, nvcc doesn't see host 
functions.  (This is a consequence of the fact that it splits host and device 
code into separate TUs.)  We see all functions in both compilation modes, so to 
match nvcc's behavior, we need to exclude some overload candidates from 
consideration based only on their host/device attributes.  Specifically, if one 
candidate call is WrongSide and the other is Native or SameSide, we ignore the 
WrongSide candidate.  If we don't return early here, we'll consider the CUDA 
target attributes again later in this function, as a tiebreaker between calls 
with otherwise identical priority according to the regular C++ overloading 
rules.
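
To make the rule concrete, here is a rough, standalone sketch of the early filter described above. It is not the code in SemaOverload.cpp: the enum `CallPreference` and the function `earlyCudaFilter` are hypothetical names invented for illustration, and the numeric values only mirror the 4..0 ordering suggested earlier in this review.

```cpp
#include <cassert>

// Hypothetical stand-in for the call-preference levels discussed in this
// review; the names and numeric values are assumptions, not clang's enum.
enum class CallPreference {
  Never = 0,
  WrongSide = 1,
  SameSide = 2,
  HostDevice = 3,
  Native = 4
};

// Hypothetical helper: returns +1 if candidate A should win outright,
// -1 if candidate B should win outright, and 0 if the early filter does not
// apply and the regular C++ overloading rules should decide (with the CUDA
// attributes used only as a later tiebreaker).
int earlyCudaFilter(CallPreference A, CallPreference B) {
  auto isWrongSide = [](CallPreference P) {
    return P == CallPreference::WrongSide;
  };
  // Only Native and SameSide trigger the early exit, as in the rephrased
  // comment; how HostDevice should be treated is left open in this sketch.
  auto isNativeOrSameSide = [](CallPreference P) {
    return P == CallPreference::Native || P == CallPreference::SameSide;
  };

  if (isNativeOrSameSide(A) && isWrongSide(B))
    return 1;  // Ignore the WrongSide candidate B.
  if (isWrongSide(A) && isNativeOrSameSide(B))
    return -1; // Ignore the WrongSide candidate A.
  return 0;    // No early decision.
}
```

With this shape, two WrongSide candidates (or a Native and a SameSide candidate) fall through to the ordinary ranking, which matches the "if we don't return early here" sentence above.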

================
Comment at: test/CodeGenCUDA/function-overload.cu:96
@@ +95,3 @@
+
+// In this case during host compilation we expect to cal function
+// template even if __device__ function may be available and allowed
----------------
Typo: "cal" should be "call".


http://reviews.llvm.org/D16870



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits