================
@@ -228,10 +228,11 @@ void 
AMDGPUAtomicOptimizerImpl::visitAtomicRMWInst(AtomicRMWInst &I) {
 
   // If the value operand is divergent, each lane is contributing a different
   // value to the atomic calculation. We can only optimize divergent values if
-  // we have DPP available on our subtarget, and the atomic operation is 32
-  // bits.
+  // we have DPP available on our subtarget, and the atomic operation is either
+  // 32 or 64 bits.
   if (ValDivergent &&
-      (!ST->hasDPP() || DL->getTypeSizeInBits(I.getType()) != 32)) {
+      (!ST->hasDPP() || (DL->getTypeSizeInBits(I.getType()) != 32 &&
+      DL->getTypeSizeInBits(I.getType()) != 64))) {
----------------
arsenm wrote:

You should move the type logic into a separate predicate function. But also, 
you should base this on the actual type and not just the bitwidth. Checking the 
bitwidth will break in the future when atomics support more vector operations

https://github.com/llvm/llvm-project/pull/96473
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to