LeiWang1999 commented on PR #15462:
URL: https://github.com/apache/tvm/pull/15462#issuecomment-1663336808

   > Thank you, @LeiWang1999, for bringing the cuDNN backend :) Regarding 
find_cudnn_best_algo, there are existing APIs on both the Python and C++ 
sides that you can reuse. 
https://github.com/apache/tvm/blob/unity/python/tvm/contrib/cudnn.py#L367 
https://github.com/apache/tvm/blob/unity/src/runtime/contrib/cudnn/conv_forward.cc#L212
   > 
   > By the way, do you also plan to work on attention operators by chance? 
https://docs.nvidia.com/deeplearning/cudnn/api/index.html#cudnnMultiHeadAttnForward
   
   Thanks, I understand that there are existing APIs for finding the best 
algorithm, but my confusion lies in determining when to invoke the 
find_best_algo function. Some inference frameworks use a static flag to 
enable algorithm discovery during the warmup phase, but I don't know 
whether TVM can enable this at runtime.
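   To make the question concrete, here is a minimal sketch (plain Python, 
not TVM code) of the "static flag + warmup" pattern described above: an 
exhaustive algorithm search runs only while a warmup flag is unset, and 
results are cached per workload. The names `find_best_algo`, `conv2d`, and 
`finish_warmup` are hypothetical stand-ins, not real TVM or cuDNN APIs.

   ```python
   # Illustrative sketch of warmup-phase algorithm discovery.
   # All names here are hypothetical; the real search would call something
   # like cuDNN's exhaustive find-algorithm routine.

   _algo_cache = {}      # workload key -> chosen algorithm id
   _warmup_done = False  # static flag: search only during warmup

   def find_best_algo(key):
       # Stand-in for an exhaustive (and expensive) algorithm benchmark.
       return 7  # pretend algorithm 7 won the benchmark

   def conv2d(key):
       # Dispatch a convolution, picking the algorithm for this workload.
       if key not in _algo_cache:
           if not _warmup_done:
               _algo_cache[key] = find_best_algo(key)  # search once, in warmup
           else:
               _algo_cache[key] = 0                    # fall back to a default
       return _algo_cache[key]

   def finish_warmup():
       global _warmup_done
       _warmup_done = True
   ```

   The open question is where such a flag would live in TVM: whether the 
runtime can toggle it after compilation, or whether the search has to 
happen at build time.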
   
   

