roastduck commented on issue #5307: [TIR] Make lower_warp_memory support extent(threadIdx.x) < warp_size
URL: https://github.com/apache/incubator-tvm/pull/5307#issuecomment-612561375
 
 
   I see, so you are talking about the programming model instead of the code 
generation. In the alternative model, a user doesn't cache a buffer to "warp" 
scope. Instead, a user accesses variables from other threads directly through 
indices. Have I correctly understood your idea?
   
   What are the advantages of the new model over the current one? You said the 
current model conceptually reduces a warp to half its size. Do you mean that it 
is hard to mix full-warp and half-warp shuffling on a single buffer? I think 
this problem can be solved by improving the detection algorithms used in the 
`lower_warp_memory` pass, without switching to a new programming model.
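
   To make the full-warp versus half-warp distinction concrete, here is a rough 
Python sketch that emulates which lane each thread reads from in a shuffle. It 
follows the documented `width` semantics of CUDA's `__shfl_sync` and assumes a 
warp size of 32. The names `shfl` and `WARP_SIZE` are made up for illustration; 
this is not how `lower_warp_memory` is implemented.

```python
WARP_SIZE = 32  # assumption: warp size of current NVIDIA GPUs


def shfl(values, src_lane, width=WARP_SIZE):
    """Emulate a warp shuffle: return the value each lane would read.

    `values[i]` is the register held by lane i. With width < WARP_SIZE the
    warp is split into independent groups of `width` lanes, and `src_lane`
    is interpreted relative to the start of each lane's own group.
    """
    assert len(values) == WARP_SIZE
    # width must be a power of two no larger than the warp size
    assert 1 <= width <= WARP_SIZE and width & (width - 1) == 0
    result = []
    for lane in range(WARP_SIZE):
        group_start = (lane // width) * width
        result.append(values[group_start + src_lane % width])
    return result


values = list(range(WARP_SIZE))            # lane i holds the value i
full = shfl(values, src_lane=3)            # full-warp: every lane reads lane 3
half = shfl(values, src_lane=3, width=16)  # half-warp: lanes 16..31 read lane 19
```

   In this emulation the same buffer is read with two different extents, which 
is the mixed case I am asking about.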
   
   If the expressive power of the two models is equivalent, I prefer the 
current "bind to warp scope" model, because it is more general and there is no 
need to introduce a new language feature.
