Laurawly commented on pull request #6839: URL: https://github.com/apache/tvm/pull/6839#issuecomment-742845654
> I'm happy to spend the next few days trying to improve parallelism in get_valid_counts, but given that this is a performance improvement over main and it's more correct, I think we shouldn't block merging over the current performance. We could always do a second PR to improve performance once we have the unit tests in place.

Why don't you put the new implementation in a separate file, say `nms_onnx.py`, or in a separate function? We can then see whether to merge it back once enough tests have passed to check the kernel for flakiness, and once it has better parallelism than using only `batch_size` threads.