Laurawly edited a comment on pull request #6839:
URL: https://github.com/apache/tvm/pull/6839#issuecomment-742808115


   > So, we seem to be at a bit of an impass. The implementation that is in 
master is not 100% correct, and it was even less correct before #7005. This PR 
is the best combination of correctness and performance we have a the moment. I 
would like to get it merged to have a correct implementation in place, but I 
would also like to find a way to speed up get valid counts, what I have here is 
not great.
   > 
   > I'm happy to spend the next few days trying to improve parallelism in 
get_valid_counts, but given that this is a performance improvement over main 
and it's more correct, I think we shouldn't block merging over the current 
performance, we could always do a second PR to improve performance once we have 
the unit tests in place
   
   We have tested the correctness of mxnet ssd related detection models before 
PR #7005 and they are in the correctness margin for customers to ship. I do 
suggest a perf comparison of your implementation and the one in main block by 
block. And an overall performance comparison with the implementation before PR 
#7005's change to `nms.py`.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to