zhuwenxi commented on issue #7246: URL: https://github.com/apache/tvm/issues/7246#issuecomment-759957921
> Thank you @zhuwenxi! this is indeed an issue that we need to work to resolve. The main problem was the stack used for parallel packed call being raised into outside of the parallel for block during PackedCall lowering.
>
> We will need to think about ways to improve the packed call handling to avoid lifting such allocation to outside of the parallel for block

@tqchen, thanks for the reply! Setting aside the race condition in a parallel schedule, I think allocating the stack outside of the (parallel) loops does have a performance advantage: the stack is shared across multiple packed function calls, which could save a significant amount of re-allocation time.
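To make the trade-off concrete, here is a minimal Python sketch. It is not TVM code and does not reflect how PackedCall lowering is actually implemented; `packed_add_one`, `run_parallel`, and the one-slot list standing in for the argument stack are all hypothetical names chosen for illustration. A barrier is used only to make the interleaving deterministic, so the overwrite that a hoisted, shared stack allows shows up on every run.

```python
import threading


def packed_add_one(stack):
    # Hypothetical stand-in for a packed function: it reads its single
    # argument from slot 0 of the argument stack.
    return stack[0] + 1


def run_parallel(n, shared_stack):
    """Run n parallel iterations, each calling packed_add_one(i).

    shared_stack=True  -> one stack allocated outside the parallel loop
    shared_stack=False -> a fresh stack allocated inside each iteration
    """
    results = [None] * n
    barrier = threading.Barrier(n)
    hoisted = [0]  # the allocation lifted outside the parallel loop

    def worker(i):
        stack = hoisted if shared_stack else [0]
        stack[0] = i      # write this iteration's argument
        barrier.wait()    # force every write to land before any call
        results[i] = packed_add_one(stack)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


private = run_parallel(4, shared_stack=False)
shared = run_parallel(4, shared_stack=True)
print("per-iteration stack:", private)
print("hoisted shared stack:", shared)
```

With a per-iteration stack every call sees its own argument. With the hoisted stack all iterations read whichever argument was written last, so every result is the same value. The hoisted version performs one allocation instead of `n`, which is the saving described above, but that saving is only safe when the loop body runs serially.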