hliao added a comment. Even GLOBAL may have a better addressing mode, the unpromotable `alloca` resolved in this change has an even significant performance issue. We could favor GLOBAL LOAD/STORE for kernel function as I proposed in other threads but, considering that an aggregate argument may be accessed indirectly, we need to pass it indirectly.
Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D89980/new/ https://reviews.llvm.org/D89980 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits