[GitHub] [tvm] mbrookhart commented on pull request #7195: [THRUST] Faster multi dimensional argsort by segmented sort
mbrookhart commented on pull request #7195: URL: https://github.com/apache/tvm/pull/7195#issuecomment-754096681 Also, I think you and I are using different versions of CUDA on the same GPU, which might explain the difference between the numbers I posted in #7099 and the ones you posted here. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [tvm] mbrookhart commented on pull request #7195: [THRUST] Faster multi dimensional argsort by segmented sort
mbrookhart commented on pull request #7195: URL: https://github.com/apache/tvm/pull/7195#issuecomment-754095011 This looks great. My only concern is that some object detection models (I'm thinking of gluon SSD) sort a very large number of boxes before NMS. Could you add shapes (1, 1e5) and (1, 1e6) to your test? I expect my mergesort will fail badly, but I wonder how your implementation will compare with the current thrust implementation.
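
For illustration only, here is a minimal sketch of what timing an argsort on the two requested shapes could look like. It is not the PR's test script: `SHAPES`, `argsort_rows`, and `time_argsort` are hypothetical names, and the pure-Python `argsort_rows` is only a CPU stand-in for whichever implementation (mergesort, thrust, or segmented sort) is being measured. A GPU implementation would also need device synchronization around the timed call.

```python
import random
import time

# Shapes requested in the comment: a single row of 1e5 and 1e6 elements.
SHAPES = [(1, 100_000), (1, 1_000_000)]


def argsort_rows(data):
    """Stand-in argsort along the last axis of a list of rows (hypothetical helper)."""
    return [sorted(range(len(row)), key=row.__getitem__) for row in data]


def time_argsort(argsort_fn, shape, repeats=3, seed=0):
    """Return the best wall-clock time in seconds of argsort_fn on random data."""
    rng = random.Random(seed)
    rows, cols = shape
    data = [[rng.random() for _ in range(cols)] for _ in range(rows)]
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        result = argsort_fn(data)
        best = min(best, time.perf_counter() - start)
    # Sanity check: the returned indices must order each row non-decreasingly.
    for row, idx in zip(data, result):
        assert all(row[idx[i]] <= row[idx[i + 1]] for i in range(len(idx) - 1))
    return best


if __name__ == "__main__":
    for shape in SHAPES:
        print(shape, f"{time_argsort(argsort_rows, shape):.4f} s")
```

Passing a different `argsort_fn` with the same input and output convention would let the implementations under discussion be compared on identical data.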