jinhuang415 commented on a change in pull request #10433: [MXNET-290] MKLDNN 
support for model quantization
URL: https://github.com/apache/incubator-mxnet/pull/10433#discussion_r188569030
 
 

 ##########
 File path: include/mxnet/c_api.h
 ##########
 @@ -1423,13 +1423,15 @@ MXNET_DLL int MXSymbolInferType(SymbolHandle sym,
  * \param excluded_symbols array of symbols to be excluded from being quantized
  * \param num_offline number of parameters that are quantized offline
  * \param offline_params array of c strings representing the names of params quantized offline
+ * \param dev_type device type 
  */
 MXNET_DLL int MXQuantizeSymbol(SymbolHandle sym_handle,
                                SymbolHandle *ret_sym_handle,
                                const mx_uint num_excluded_symbols,
                                const SymbolHandle *excluded_symbols,
                                const mx_uint num_offline,
-                               const char **offline_params);
+                               const char **offline_params,
+                               int dev_type);
 
 Review comment:
   @reminisce 
   I understand that other kinds of CPU may need requantize. How about the approach below:
   (1) Add a few options to ```imagenet_gen_qsym.py``` to configure features, such as ```use_uint8``` (default off; indicates whether to quantize data to uint8), ```use_requantize``` (default on; indicates whether the requantize OP is needed; to be added later), and ```calib_input_data``` (default off; a feature we plan to add that enables input data calibration for calib layers to improve performance; to be added later). Users can then pass different options to the quantization script based on their needs, and in most cases the default values will just work.
   (2) Instead of passing dev_type to ```MXQuantizeSymbol()```, add parameters that switch specific features on or off, such as ```use_uint8```, ```use_requantize```, and ```calib_input_data```, which map to the quantization script options in (1).
   (3) In the C++ handler for ```MXQuantizeSymbol()```, define separate graph attributes for these feature configurations, and have ```QuantizeGraph()``` perform the quantize-graph logic based on the attribute values. This allows code reuse, so we don't need to define a separate quantize graph function for the CPU path.
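
   To make the idea concrete, here is a rough sketch of how the script options in (1) could flow into per-feature attributes as in (2) and (3). This is only an illustration: the command-line flag spellings, the string-valued attribute encoding, and the helper names ```build_parser```, ```to_graph_attrs``` and ```plan_quantize``` are all assumptions and not existing MXNet code, and ```plan_quantize``` is just a stand-in for the branching that ```QuantizeGraph()``` would do.

```python
import argparse


def build_parser():
    """Build a parser exposing the three proposed feature switches (flag names assumed)."""
    parser = argparse.ArgumentParser(description="Quantization options (sketch)")
    # Default off: quantize data to uint8.
    parser.add_argument("--use-uint8", dest="use_uint8", action="store_true")
    # Default on: the requantize OP is used unless this flag turns it off.
    parser.add_argument("--no-requantize", dest="use_requantize", action="store_false")
    # Default off: input data calibration for calib layers.
    parser.add_argument("--calib-input-data", dest="calib_input_data", action="store_true")
    return parser


def to_graph_attrs(args):
    """Map parsed options to string-valued attributes (hypothetical encoding)."""
    return {
        "use_uint8": str(int(args.use_uint8)),
        "use_requantize": str(int(args.use_requantize)),
        "calib_input_data": str(int(args.calib_input_data)),
    }


def plan_quantize(attrs):
    """Stand-in for the graph pass: branch on attributes rather than on a device type."""
    steps = []
    if attrs["use_uint8"] == "1":
        steps.append("quantize data to uint8")
    else:
        steps.append("quantize data to default type")
    if attrs["use_requantize"] == "1":
        steps.append("insert requantize")
    if attrs["calib_input_data"] == "1":
        steps.append("calibrate input data")
    return steps
```

   With no flags, ```to_graph_attrs(build_parser().parse_args([]))``` yields the defaults described in (1), so a single graph function can serve every backend and each backend only needs to pass different options.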
   
   Please let me know your comments/suggestions, thanks.
