roastduck commented on issue #5307: [TIR] Make lower_warp_memory support extent(threadIdx.x) < warp_size
URL: https://github.com/apache/incubator-tvm/pull/5307#issuecomment-612697828

I agree with the description of A0 and A1, and now I think there may be an A2 in the design space.

- A2: Improve the shuffle detection. We can analyze all three thread axes instead of only `threadIdx.x`. If the detection is strong enough, we can perform whatever shuffle we want without requiring `threadIdx.x` to be the warp axis.

I think A2 may be less complex than A0, which requires altering the thread indices.

As for documentation, am I supposed to refine the comments in the code, or start a new documentation page on the topic of warp memory?
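To make the idea behind A2 concrete, here is a minimal Python sketch of how a lane index could be derived from all three thread axes. It relies on the CUDA rule that threads in a block are linearized as `x + y * Dx + z * Dx * Dy` and grouped into warps in that order. The helper names `lane_and_warp` and `same_warp` are hypothetical, a warp size of 32 is assumed, and this is not the actual `lower_warp_memory` implementation.

```python
WARP_SIZE = 32  # assumption: NVIDIA-style warp size


def lane_and_warp(tx, ty, tz, ext_x, ext_y, warp_size=WARP_SIZE):
    """Return (lane id, warp id) of a thread from all three thread axes.

    ext_x and ext_y are the extents of threadIdx.x and threadIdx.y.
    Hypothetical helper for illustration only.
    """
    # CUDA linearizes threads in a block as x + y * Dx + z * Dx * Dy.
    linear = tx + ty * ext_x + tz * ext_x * ext_y
    return linear % warp_size, linear // warp_size


def same_warp(a, b, ext_x, ext_y, warp_size=WARP_SIZE):
    """Check whether two (tx, ty, tz) threads fall into the same warp."""
    warp_a = lane_and_warp(*a, ext_x, ext_y, warp_size)[1]
    warp_b = lane_and_warp(*b, ext_x, ext_y, warp_size)[1]
    return warp_a == warp_b


# With extent(threadIdx.x) = 16 < warp size, one warp spans two values of threadIdx.y.
print(lane_and_warp(0, 1, 0, ext_x=16, ext_y=4))
print(same_warp((15, 0, 0), (0, 1, 0), ext_x=16, ext_y=4))
```

With `extent(threadIdx.x) = 16`, threads with `threadIdx.y` equal to 0 and 1 share a warp, so a detection that looks only at `threadIdx.x` would miss shuffles that are legal across that boundary.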
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
