tqchen commented on issue #5307: [TIR] Make lower_warp_memory support 
extent(threadIdx.x) < warp_size
URL: https://github.com/apache/incubator-tvm/pull/5307#issuecomment-612638955
 
 
   Thanks for the discussions so far. It would be great to have a 
discussion (perhaps in the forum) about the possible conventions for defining the 
warp, just to clarify further.
   
   - A0: Enforce extent(threadIdx.x) == warp_size, and make it the warp index
      - Add sub-warp shuffle detection directly on top of the warp shuffle to 
recognize index patterns like ```B[(wi/4)*4+ ((wi % 4) +1) %4]```, which 
correspond to a sub-warp shuffle.
   - A1: Warp virtualization -- Allow threadIdx.x to be at subwarp level. (this 
PR)
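
To make the A0 index pattern concrete, here is a small Python sketch (not code from this PR). It evaluates the example expression for every lane of an assumed 32-lane warp and checks that each lane only reads from its own aligned group of 4 lanes, which is the property a sub-warp shuffle detector would look for. The helper names `subwarp_source_lane` and `is_subwarp_shuffle` are hypothetical.

```python
WARP_SIZE = 32  # assumed physical warp size
GROUP = 4       # sub-warp width taken from the example index expression


def subwarp_source_lane(wi, group=GROUP):
    """Lane read by lane `wi` for the pattern B[(wi/4)*4 + ((wi % 4) + 1) % 4]."""
    base = (wi // group) * group          # first lane of wi's aligned group
    offset = ((wi % group) + 1) % group   # rotate by one inside the group
    return base + offset


def is_subwarp_shuffle(src_lanes, group):
    """True if every lane reads from inside its own aligned group of `group` lanes."""
    return all(src // group == wi // group for wi, src in enumerate(src_lanes))


# Source lane for each lane of the warp under the example pattern.
src = [subwarp_source_lane(wi) for wi in range(WARP_SIZE)]

# A full-warp rotation, for contrast: lane 3 reads lane 4, crossing a group boundary.
full_rotation = [(wi + 1) % WARP_SIZE for wi in range(WARP_SIZE)]

print(src[:8])
print(is_subwarp_shuffle(src, GROUP), is_subwarp_shuffle(full_rotation, GROUP))
```

The first print shows the source lanes for the first two groups; the second contrasts the sub-warp pattern with a rotation across the whole warp.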
   
   From what I see, A0 might introduce a bit more complexity, but allows a 
mixture of subwarp and full warp shuffle patterns.  
   
   A1 can be viewed as "virtualizing the warps": we have more virtual warps 
that act at the sub-warp level and use a single physical warp to simulate them, 
but we cannot mix that with full warp shuffle.
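
As one possible reading of A1, the Python model below treats a 32-lane physical warp as several independent virtual warps whose size equals the extent of threadIdx.x. It is only a sketch of the concept, not this PR's lowering. The `shfl` helper mimics the documented behaviour of a width-limited CUDA shuffle (each aligned group of `width` lanes is independent and the source lane is taken modulo `width`); `virtual_warp_shuffle`, `tx_extent` and `src_tx_of` are hypothetical names.

```python
WARP_SIZE = 32  # assumed physical warp size


def shfl(values, src_lane_of, width):
    """Model of a width-limited warp shuffle over one physical warp.

    Each aligned group of `width` lanes is independent; the requested
    source lane is interpreted relative to the group, modulo `width`.
    """
    out = []
    for lane in range(len(values)):
        base = (lane // width) * width
        out.append(values[base + src_lane_of(lane % width) % width])
    return out


def virtual_warp_shuffle(values, tx_extent, src_tx_of):
    """Simulate WARP_SIZE // tx_extent virtual warps on one physical warp."""
    assert WARP_SIZE % tx_extent == 0, "virtual warp size must divide the warp size"
    return shfl(values, src_tx_of, width=tx_extent)


values = list(range(WARP_SIZE))

# Each virtual warp of 8 lanes rotates its values by one lane.
rotated = virtual_warp_shuffle(values, 8, lambda tx: tx + 1)

print(rotated[:8])
print(rotated[8:16])
```

Because every shuffle is confined to one virtual warp, no lane can read a value held by another virtual warp, which is why this scheme cannot be mixed with a full-warp shuffle.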
   
   It would be great if we can think a bit about how to describe these 
differences clearly and document them in the code comments, and possibly in 
the future developer docs.
   
   If my understanding is correct, and we can document A1's concept clearly, I 
can go ahead and merge this PR first, then we can do follow-up discussions.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
