yubing marked 15 inline comments as done. yubing added inline comments.
================ Comment at: llvm/lib/Target/X86/X86LowerAMXIntrinsics.cpp:311 + Value *ResElt = B.CreateAdd(EltC, SubVecR); + Value *NewVecC = B.CreateInsertElement(VecCPhi, ResElt, IdxC); + Value *NewVecD = B.CreateInsertElement(VecDPhi, ResElt, IdxC); ---------------- pengfei wrote: > yubing wrote: > > pengfei wrote: > > > Is it necessary to insert the ResElt to VecC? > > Yes, it is necessary since you should use updated eltC(aka, Cij) when you > > are doing matrix dotproduct: > > Cij =Cij+Ai1.*B1j > > Cij =Cij+Ai2.*B2j > > .... > > Cij =Cij+AiK.*BKj > But you don't need to update both C and D. Something like the psudo code > should enough: > ``` > for (k : K) > Dij += Aik * Bkj; > Dij += Cij > ``` I change code into the following style, and it can also reduce inner loop's size: ``` for (k : K) Cij += Aik * Bkj; Dij = Cij ``` Besides, I hoist the procedure of calculating (i,j)'s linear index above inner loops. ================ Comment at: llvm/test/CodeGen/X86/AMX/amx-low-intrinsics-no-bitcast.ll:13 +; CHECK-NEXT: br label [[TILELOAD_SCALARIZE_ROWS_BODY:%.*]] +; CHECK: tileload.scalarize.rows.body: +; CHECK-NEXT: br label [[TILELOAD_SCALARIZE_COLS_HEADER:%.*]] ---------------- pengfei wrote: > It seems the body block is not necessary In fact, ISEL PASS can merge basicblocks together. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D93594/new/ https://reviews.llvm.org/D93594 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits