merrymercy opened a new pull request #6041:
URL: https://github.com/apache/incubator-tvm/pull/6041
- Fix a bug when generating unrolled and vectorized CUDA code
This is the same bug that was addressed in
https://github.com/apache/incubator-tvm/pull/711. In that PR, I only added a
new SSA scope for the "else" branch, but the "then" branch has the same
problem. So I moved the creation of the new SSA scope to the top level, where
it covers both branches.
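
To illustrate the kind of failure involved, here is a minimal toy model. It is not TVM's actual code generator: the `ToyEmitter` class, its methods, and the `scoped_if` flag are all hypothetical names made up for this sketch. It assumes the emitter caches each expression in a temporary and reuses that temporary whenever the same expression shows up again, which goes wrong once the temporary was declared inside the braces of an earlier `if`.

```python
class ToyEmitter:
    """Toy C emitter that caches each expression in an SSA-style temporary."""

    def __init__(self, scoped_if):
        self.scoped_if = scoped_if  # True: open a fresh SSA scope per `if`
        self.lines = []
        self.scopes = [{}]          # stack of {expression: temporary name}
        self.counter = 0

    def ssa_id(self, expr):
        # Reuse a temporary only if a live SSA scope already holds it.
        for scope in reversed(self.scopes):
            if expr in scope:
                return scope[expr]
        self.counter += 1
        name = f"_{self.counter}"
        self.lines.append(f"int {name} = {expr};")
        self.scopes[-1][expr] = name
        return name

    def emit_if(self, cond, expr):
        # The SSA scope is opened at the top of the whole `if` statement,
        # so nothing cached inside its braces leaks out afterwards.
        if self.scoped_if:
            self.scopes.append({})
        self.lines.append(f"if ({cond}) {{")
        name = self.ssa_id(expr)
        self.lines.append(f"out = {name};")
        self.lines.append("}")
        if self.scoped_if:
            self.scopes.pop()


def emit_two_ifs(scoped_if):
    """Emit two consecutive `if` statements that use the same expression."""
    emitter = ToyEmitter(scoped_if)
    emitter.emit_if("i < n", "a + b")
    emitter.emit_if("j < n", "a + b")
    return emitter.lines
```

With `emit_two_ifs(False)`, the second `if` refers to `_1`, which was declared inside the braces of the first `if` and is no longer visible in C. With `emit_two_ifs(True)`, the second `if` declares its own temporary.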
- Fix a bug when generating CUDA code for `tir.reinterpret`
If we call `tir.reinterpret` on an rvalue, the existing strategy generates
wrong code, because we cannot take the address of an rvalue. To fix this, we
store the rvalue in a temporary variable and take the address of that
temporary variable instead.
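
The difference between the two strategies can be sketched with a pair of string-building helpers. This is only an illustration under stated assumptions: the function names, the temporary name, and the exact text of the emitted C expression are hypothetical and may differ from what TVM prints.

```python
def naive_reinterpret(target_type, value_expr):
    """Take the address of the expression directly (invalid for rvalues)."""
    return f"(*({target_type} *)(&({value_expr})))"


def emit_reinterpret(target_type, source_type, value_expr, temp_name):
    """Store the value in a temporary first, then reinterpret the temporary.

    Returns (statements to emit beforehand, expression to use in place).
    """
    statements = [f"{source_type} {temp_name} = {value_expr};"]
    expression = f"(*({target_type} *)(&({temp_name})))"
    return statements, expression
```

For the operand `a + b`, the naive helper yields `(*(int *)(&(a + b)))`, which a C or CUDA compiler rejects because `a + b` is not an lvalue. The second helper first emits `float _tmp0 = a + b;` and then takes the address of `_tmp0`.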
- Improve the VerifyGPUCode pass
Besides checking LoadNode, the pass should also check StoreNode.
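
As a rough sketch of why stores need the same check as loads, consider a toy verifier. It assumes the check in question is a limit on the byte width of a vectorized memory access. The `Access` record, `find_violations`, and the `checked_kinds` parameter are hypothetical names for this example and are not part of TVM.

```python
from dataclasses import dataclass


@dataclass
class Access:
    kind: str        # "load" or "store"
    lanes: int       # number of vector lanes
    elem_bytes: int  # size of one element in bytes


def find_violations(accesses, max_vector_bytes, checked_kinds=("load", "store")):
    """Return every checked vector access wider than max_vector_bytes."""
    return [
        access
        for access in accesses
        if access.kind in checked_kinds
        and access.lanes > 1
        and access.lanes * access.elem_bytes > max_vector_bytes
    ]
```

Given a 16-byte load and a 32-byte store with a 16-byte limit, a verifier that looks only at loads (`checked_kinds=("load",)`) reports nothing, while one that also looks at stores flags the 32-byte store.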
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]