tqchen commented on issue #5307: [TIR] Make lower_warp_memory support extent(threadIdx.x) < warp_size
URL: https://github.com/apache/incubator-tvm/pull/5307#issuecomment-612459110

Thanks @roastduck. I wonder if we can also discuss alternative abstractions. Right now, the abstraction seems to suggest that conceptually the size of the warp is reduced to half (as the shuffle size). However, another way to view it would be to keep the warp size fixed (32) but support the index access pattern of subgroups. For example, the canonical form below describes a shuffle within groups of 4:

```c++
A[wi] = B[(wi / 4) * 4 + ((wi % 4) + 1) % 4]
```
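
To make the subgroup view concrete, here is a minimal host-side sketch of that index pattern. It is only an illustration, not code from the PR: the names `kWarpSize`, `SubgroupRotateSrc`, and `EmulateSubgroupShuffle` are hypothetical, and the `group_size` and `delta` parameters are an assumed generalization of the constants 4 and 1 in the canonical form.

```c++
#include <array>
#include <cassert>

// The warp size stays fixed at 32; only the index pattern changes.
constexpr int kWarpSize = 32;

// Hypothetical helper: the source lane read by lane `wi` when values are
// rotated by `delta` inside subgroups of `group_size` lanes.
// With group_size = 4 and delta = 1 this is exactly
//   (wi / 4) * 4 + ((wi % 4) + 1) % 4
constexpr int SubgroupRotateSrc(int wi, int group_size, int delta) {
  return (wi / group_size) * group_size +
         ((wi % group_size) + delta) % group_size;
}

// Host-side emulation of A[wi] = B[src(wi)] over one full warp.
// Assumes group_size divides the warp size and delta is non-negative.
std::array<int, kWarpSize> EmulateSubgroupShuffle(
    const std::array<int, kWarpSize>& B, int group_size, int delta) {
  assert(group_size > 0 && kWarpSize % group_size == 0);
  assert(delta >= 0);
  std::array<int, kWarpSize> A{};
  for (int wi = 0; wi < kWarpSize; ++wi) {
    A[wi] = B[SubgroupRotateSrc(wi, group_size, delta)];
  }
  return A;
}
```

With `group_size = 4` and `delta = 1`, lane 3 reads from lane 0 and lane 7 reads from lane 4, so every access stays inside its own group of four while all 32 lanes of the warp remain active.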
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
