hliao added inline comments.

================
Comment at: clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu:19
+// COMMON-LABEL: define amdgpu_kernel void @_Z7kernel1Pi(i32*{{.*}} %x)
+// OPT: [[VAL:%.*]] = load i32, i32* %x, align 4
 // OPT: [[INC:%.*]] = add nsw i32 [[VAL]], 1
----------------
arsenm wrote:
> hliao wrote:
> > arsenm wrote:
> > > hliao wrote:
> > > > arsenm wrote:
> > > > > This is still a regression. Fixing up AA does not solve the problem
> > > > > this promotion is intended to solve. Generic accesses are worse
> > > > > independently of the aliasing properties.
> > > > Do you mean FLAT load/store has a worse addressing mode than GLOBAL ones?
> > > Yes. The flat offsets have a smaller range, and do not have the saddr
> > > mode. Flat accesses also won't avoid the extra lgkmcnt wait.
> > I plan to add support to select GLOBAL ones once we can confirm that the
> > pointer can only point to GLOBAL/CONSTANT address spaces. Do you think
> > that's a reasonable solution?
> I would much rather have the IR express the address space rather than fixing
> it up later. IR passes are aware of the addressing mode differences. Relying
> on AA for basic selection would also be worse for compile time.

LLVM IR is agnostic to the underlying addressing mode. During code selection, we won't use AA; we only need to check the IR.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89980/new/

https://reviews.llvm.org/D89980

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits