mjdenkowski commented on issue #17559: [MXNET-1446] Quantization: intgemm matrix multiply wrappers
URL: https://github.com/apache/incubator-mxnet/pull/17559#issuecomment-586425627

Hi @TaoLv, thanks for taking a look at this!

We understand your comments about MXNet already having multiple GEMM libraries. We're particularly interested in Kenneth's (@kpuatamazon) intgemm because it provides functionality we weren't able to find in the existing libraries. As he mentioned in the PR, we're seeing a roughly 3X inference speedup on an already significantly optimized transformer implementation. Like many other Gluon users, our inference model is not currently expressible as a static graph.

We would like to work out the best way to make this functionality available to the larger community. Are there particular concerns we can address about adding intgemm as a third party library? Is there another path to using intgemm with MXNet that you recommend?