masahi commented on a change in pull request #10177:
URL: https://github.com/apache/tvm/pull/10177#discussion_r801075958



##########
File path: python/tvm/contrib/cutlass/gen_conv2d.py
##########
@@ -252,6 +252,8 @@ def select_op(
             lambda align: all([dim % align == 0 for dim in [IC, OC]]),
             use_3xtf32,
             profile_all_alignments,
+            # Use fp32 accumulation for wgrad to align with cuDNN
+            accumlator_dtype="float32" if conv_kind == ConvKind.Wgrad else out_dtype,

Review comment:
       Yeah, I agree that we can let `out_dtype` in conv2d essentially act as the accumulation data type, and add an explicit cast op if `accum_dtype != out_dtype`. That's fine as far as the `ToMixedPrecision` pass goes, but for the cutlass BYOC we additionally need to pattern match against the added `cast(fp32 -> fp16)` op to know that this conv2d is fp32 accum -> fp16 out. And for cuDNN, which is not implemented as BYOC, this doesn't work because all it sees is an fp32 accum -> fp32 out conv2d, and cuDNN wgrad doesn't support that dtype combination.
   
   
   
   > Hmm yeah, so if I'm understanding correctly for conv2d_winograd we want to 
accumulate to fp32 but if it's not winograd we are ok with accumulating to fp16.
   
   @AndrewZhaoLuo Here wgrad means "conv2d gradient with respect to weight", 
not winograd :)
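   The dtype selection in the diff above can be sketched in isolation as follows. This is a minimal standalone sketch, not TVM's actual code: the `ConvKind` enum values and the helper name `select_accumulator_dtype` are assumptions for illustration.

```python
from enum import Enum


class ConvKind(Enum):
    # Hypothetical stand-in for the ConvKind used in gen_conv2d.py;
    # the member values here are illustrative only.
    Fprop = 0  # forward convolution
    Dgrad = 1  # gradient w.r.t. the input (data)
    Wgrad = 2  # gradient w.r.t. the weight


def select_accumulator_dtype(conv_kind, out_dtype):
    """Pick the accumulator dtype for a conv2d kernel.

    Wgrad always accumulates in fp32 to align with cuDNN, which does not
    support fp16 accumulation for weight-gradient convolutions; other
    conv kinds accumulate in the requested output dtype.
    """
    return "float32" if conv_kind == ConvKind.Wgrad else out_dtype


# Wgrad forces fp32 accumulation even when the output is fp16.
print(select_accumulator_dtype(ConvKind.Wgrad, "float16"))  # float32
# Fprop keeps the requested output dtype.
print(select_accumulator_dtype(ConvKind.Fprop, "float16"))  # float16
```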




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
