[GitHub] [tvm] Lunderberg commented on pull request #8196: [Vulkan][Refactor] Move ownership of per-CPU-thread objects to VulkanDeviceAPI

2021-06-14 Thread GitBox
Lunderberg commented on pull request #8196: URL: https://github.com/apache/tvm/pull/8196#issuecomment-859635982 Ooh, that's a nice side-effect to get! I guess that confirms that the segfault was caused by out-of-order destruction, though I'm still not quite sure how that happened, given

2021-06-10 Thread GitBox
Lunderberg commented on pull request #8196: URL: https://github.com/apache/tvm/pull/8196#issuecomment-858700548 The performance tests are matched pre/post change for cases that access the thread-specific resources, so I think it's good to go. @tqchen Any follow-up concerns?

2021-06-08 Thread GitBox
Lunderberg commented on pull request #8196: URL: https://github.com/apache/tvm/pull/8196#issuecomment-857142889 On second thought, it looks like that difference was because `pytest-benchmark` does not include warmup runs by default. That run just happened to be the first that required
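To illustrate the warmup point, here is a minimal sketch using only the Python standard library rather than `pytest-benchmark` itself. The names `benchmark` and `LazyResource` are hypothetical and are not taken from the PR or from TVM; `LazyResource` stands in for any resource that is expensive to create on first use. Without warmup calls, the first timed round would absorb the one-time setup cost and skew the results.

```python
import statistics
import time


def benchmark(func, rounds=20, warmup_rounds=3):
    """Time `func` over `rounds` calls, after `warmup_rounds` untimed calls."""
    for _ in range(warmup_rounds):
        func()  # untimed: absorbs any one-time initialization cost

    timings = []
    for _ in range(rounds):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)

    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(timings, n=4)
    return {"q1": q1, "median": median, "q3": q3, "timings": timings}


class LazyResource:
    """Hypothetical resource that is slow to set up on first use (assumption)."""

    def __init__(self):
        self.initialized = False
        self.init_calls = 0

    def use(self):
        if not self.initialized:
            time.sleep(0.05)  # simulated one-time setup cost
            self.initialized = True
            self.init_calls += 1
        return sum(range(1000))


if __name__ == "__main__":
    stats = benchmark(LazyResource().use)
    print({k: stats[k] for k in ("q1", "median", "q3")})
```

Setting `warmup_rounds=0` in this sketch puts the simulated setup delay into the first timed round, which is the kind of outlier described above.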

2021-06-08 Thread GitBox
Lunderberg commented on pull request #8196: URL: https://github.com/apache/tvm/pull/8196#issuecomment-857140287 Running some performance tests, it looks like the refactor has very little impact on the overall runtime. The plots below show the Q1/median/Q3 runtimes for different low-level

2021-06-04 Thread GitBox
Lunderberg commented on pull request #8196: URL: https://github.com/apache/tvm/pull/8196#issuecomment-855005511 Thank you @tqchen, and that makes sense given the potential overhead issues. I had been assuming that the main performance cost would be in the `std::unordered_map` lookup.

2021-06-04 Thread GitBox
Lunderberg commented on pull request #8196: URL: https://github.com/apache/tvm/pull/8196#issuecomment-854898061 Potential reviewers: @masahi @tmoreau89