yaxunl marked 2 inline comments as done.
yaxunl added a comment.

In D71726#2039319 <https://reviews.llvm.org/D71726#2039319>, @arsenm wrote:

> In D71726#1801346 <https://reviews.llvm.org/D71726#1801346>, @__simt__ wrote:
>
> > In D71726#1792852 <https://reviews.llvm.org/D71726#1792852>, @yaxunl wrote:
> >
> > > In D71726#1791904 <https://reviews.llvm.org/D71726#1791904>, @jfb wrote:
> > >
> > > > This generally seems fine. Does it work on most backends? I want to 
> > > > make sure it doesn't fail in backends :)
> > >
> > >
> > > For x86_64, amdgcn, aarch64, armv7, mips64, it is translated to cmpxchg 
> > > by AtomicExpandPass and backends did codegen successfully.
> > >
> > > For hexagon, riscv32, it is translated to call of `__atomic_fetch_add_4` 
> > > for fadd float. This is concerning. Probably we need to add 
> > > `__atomic_fetch_{add|sub}_{f16|f32|f64}` ?
> >
> >
> > For systems that have load-link/store-conditional architectures, like ARM / 
> > PPC / base RISC-V without extension, I would imagine that using a cmpxchg 
> > loop is much worse than simply doing the floating-point add/sub in the 
> > middle of the atomic mini-transaction. I'm sure that we want back-ends to 
> > be capable of implementing this better than what this pass is doing, even 
> > when they don't have "native" fp atomics.
> >
> > You listed amdgcn... what does this do on nvptx?
>
>
> Targets can implement shouldExpandAtomicRMWInIR for the desired behavior, 
> which NVPTX currently does not implement. Looking at AtomicExpandPass, it 
> looks like either cmpxchg or LLSC expansions should work for the FP atomics 
> already


nvptx is similar to hexagon and riscv32, where fp atomics is translated to call 
of __atomic_fetch_add_4.

Since currently only amdgcn supports fp atomics, I am going to add a TargetInfo 
hook about whether fp atomics is supported and only emit fp atomics for targets 
supporting it.



================
Comment at: clang/lib/CodeGen/CGAtomic.cpp:597-598
   case AtomicExpr::AO__atomic_add_fetch:
-    PostOp = llvm::Instruction::Add;
+    PostOp = E->getValueType()->isFloatingType() ? llvm::Instruction::FAdd
+                                                 : llvm::Instruction::Add;
     LLVM_FALLTHROUGH;
----------------
arsenm wrote:
> Should this really be based on the type, or should the builtin name be 
> different for FP?
I think the original name is better. They are exactly what they are intended to 
be. They were not able to handle fp types therefore they used to emit 
diagnostics when fp types were passed to them. However now they are able to 
handle fp types.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71726/new/

https://reviews.llvm.org/D71726



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to