patschmidt2 commented on issue #16566: URL: https://github.com/apache/tvm/issues/16566#issuecomment-1943962693
The reason I want an automated solution is that my hardware also supports intrinsics in which one dimension can be a unit iterator. This gets messy quickly: if any dimension can be a unit iterator, I would have to write a separate intrinsic for every possible combination. Not all of these combinations are a good idea, but they can show up during tuning, since I don't think there is a way to tell the sampling instructions not to produce unit iterators.

I'm wondering whether it would be a good idea to follow the approach that is also used in the blockize function. It analyzes the IterVars and splits them into inner and outer iters. From there, a new block is constructed in which the unit IterVars are simply not included again. This approach would, however, impose one specific structure on the intrinsics that are defined.
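To make the idea concrete, here is a minimal, TVM-independent sketch of dropping unit iterators before matching an intrinsic. It is not the blockize implementation; `Iter`, `split_unit_iters`, and `count_unit_variants` are hypothetical names invented for illustration, and extents are assumed to be static integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Iter:
    """Hypothetical stand-in for a block iterator: a name and a static extent."""
    name: str
    extent: int


def split_unit_iters(iters):
    """Partition iterators into (kept, unit); unit iterators have extent 1."""
    kept = [it for it in iters if it.extent != 1]
    unit = [it for it in iters if it.extent == 1]
    return kept, unit


def count_unit_variants(num_dims):
    """Count the unit/non-unit patterns over num_dims dimensions.

    Each dimension is either unit or not, so there are 2**num_dims patterns.
    This is the number of hand-written intrinsics needed without normalization.
    """
    return 2 ** num_dims


if __name__ == "__main__":
    # A 3-D block in which tuning happened to produce a unit middle dimension.
    block_iters = [Iter("i", 16), Iter("j", 1), Iter("k", 16)]
    kept, unit = split_unit_iters(block_iters)
    print("kept:", [it.name for it in kept])
    print("dropped unit iters:", [it.name for it in unit])
    print("patterns for 3 dims:", count_unit_variants(3))
```

With normalization of this kind, the intrinsic would only need to be defined over the non-unit iterators, at the cost of fixing one structure for how intrinsics are written.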