Lunderberg commented on PR #16782: URL: https://github.com/apache/tvm/pull/16782#issuecomment-2020540959
The implementation looks reasonable, though I have one main question: what is the behavior of the updated pass for a target that doesn't support SVE? Prior SVE commits enabled the functionality, but didn't produce SVE in any of the default lowering passes. From [this line](https://github.com/apache/tvm/pull/16696/files#diff-f61b04b100f5145f2681340c81d3f2af221239594ed01e2e24896522329ce92cR598-R600), LLVM versions before 11.0 do not support SVE, nor, from my brief reading of the CUDA codegen [here](https://github.com/apache/tvm/blob/main/src/target/source/codegen_cuda.cc#L253), does CUDA.

Since `VectorizeLoop` runs after the `BindTarget` pass, we can check the function attribute to know which target will be executing each function. I think loop vectorization should apply only to fixed-extent loops by default, with scalable vectorization enabled only for targets that support it.
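To sketch what I mean, here is a minimal (hypothetical, not the actual TVM API) predicate for that dispatch. The helper name and its parameters are illustrative; the facts it encodes are only the two mentioned above: LLVM before 11.0 lacks SVE support, and the CUDA codegen does not handle scalable vectors.

```python
def supports_scalable_vectors(target_kind: str, llvm_version: tuple) -> bool:
    """Hypothetical helper: may VectorizeLoop emit scalable (SVE-style)
    vectors for a function bound to this target?

    Assumptions from this discussion:
    - only the LLVM backend can produce SVE, and only with LLVM >= 11.0
    - the CUDA codegen does not support scalable vectors
    """
    if target_kind != "llvm":
        # e.g. "cuda": fall back to fixed-extent vectorization only
        return False
    return llvm_version >= (11, 0)
```

`VectorizeLoop` would consult something like this per function, using the target attached by `BindTarget`, and lower scalable loops to fixed-extent ones when it returns `False`.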