ZihengJiang opened a new pull request #6980:
URL: https://github.com/apache/tvm/pull/6980


   This PR implements HAGO, the automated quantization framework described in the 
[RFC](https://discuss.tvm.apache.org/t/rfc-search-based-automated-quantization/5483).
More detail can be found in the [paper 
draft](https://www.ziheng.org/files/hago.pdf). We observe that HAGO 
achieves speedups of 2.09x, 1.97x, and 2.48x over full precision on Intel 
Xeon Cascade Lake CPUs, NVIDIA Tesla T4 GPUs, and ARM Cortex-A CPUs on 
Raspberry Pi 4, respectively, while maintaining high post-training 
quantization accuracy in each case.
   
   Still WIP.
   
   ## Highlights
   - Hardware Awareness 
   - Model-Agnostic Graph Transformation
   - Search-Based Optimization
   
   ## Results
   
   ### Accuracy for the MXNet Models
   
![image](https://user-images.githubusercontent.com/17693755/100298796-df917480-2f46-11eb-9f0f-9211e8f201cb.png)
   
   ### Accuracy for the TensorFlow Models
   
![image](https://user-images.githubusercontent.com/17693755/100298862-0c458c00-2f47-11eb-92c7-1d671013604b.png)
   
   ### Accuracy for the PyTorch Models
   
![image](https://user-images.githubusercontent.com/17693755/100298835-fdf77000-2f46-11eb-829d-92efe96d7820.png)
   
   ### Performance on x86 CPU
   
![image](https://user-images.githubusercontent.com/17693755/100298887-1d8e9880-2f47-11eb-92eb-b4b9fcfbe323.png)
   
   ### Performance on NVIDIA GPU
   
![image](https://user-images.githubusercontent.com/17693755/100298903-27b09700-2f47-11eb-959e-c2a2a588ebfb.png)
   
   ### Performance on ARM CPU
   
![image](https://user-images.githubusercontent.com/17693755/100298927-36974980-2f47-11eb-8b2b-6adbffb4c08a.png)
   



