[ https://issues.apache.org/jira/browse/MXNET-677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ray Zhang updated MXNET-677:
----------------------------
    Description: 
The reproducible repository is [linked here|https://github.com/OneRaynyDay/mxnet-quantization-bug].

 

Currently, Airbnb is using the quantization extensions of MXNet to speed up 
inference on several convolutional neural network models. However, this has 
been difficult to achieve. The most complicated bugs lie at the intersection 
of the Python and C++ interfaces, such as those that crash Jupyter kernels 
and are hard to debug with pdb.

Airbnb currently uses Gluon models extensively and is not planning to move to 
Module models for training any time soon, but creating a quantized Module 
model solely for inference seems worthwhile. Please refer to the repository 
for a minimal reproducible example; the general flow is sketched below.
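
For context, the conversion being attempted looks roughly like the following. This is a minimal sketch assuming MXNet 1.2+ and the mxnet.contrib.quantization.quantize_model API; the toy network, the 'toy' checkpoint prefix, the MNIST-sized input shape, and the calibration settings are illustrative placeholders, not the exact code from the linked repository.

{code:python}
import mxnet as mx
from mxnet.contrib.quantization import quantize_model

# Older MXNet builds may need mx.gpu() here to execute int8 operators.
ctx = mx.cpu()

# 1. Build a small Gluon CNN, hybridize it, run one forward pass so the
#    graph is traced, and export it to a symbol + params checkpoint
#    ("toy-symbol.json" / "toy-0000.params").
net = mx.gluon.nn.HybridSequential()
with net.name_scope():
    net.add(mx.gluon.nn.Conv2D(channels=16, kernel_size=3, activation='relu'))
    net.add(mx.gluon.nn.MaxPool2D())
    net.add(mx.gluon.nn.Flatten())
    net.add(mx.gluon.nn.Dense(10))
net.initialize()
net.hybridize()
net(mx.nd.zeros((1, 1, 28, 28)))
net.export('toy', epoch=0)

# 2. Reload the exported network as a Module-style checkpoint.
sym, arg_params, aux_params = mx.model.load_checkpoint('toy', 0)

# 3. Quantize the symbol and parameters (calib_mode='none' skips
#    calibration so the sketch stays self-contained).
qsym, qarg_params, aux_params = quantize_model(
    sym=sym, arg_params=arg_params, aux_params=aux_params,
    ctx=ctx, calib_mode='none')

# 4. Bind the quantized graph to a Module used solely for inference.
mod = mx.mod.Module(symbol=qsym, context=ctx,
                    data_names=['data'], label_names=None)
mod.bind(for_training=False, data_shapes=[('data', (1, 1, 28, 28))])
mod.set_params(qarg_params, aux_params)

mod.forward(mx.io.DataBatch([mx.nd.random.uniform(shape=(1, 1, 28, 28))]))
print(mod.get_outputs()[0].asnumpy())
{code}

The linked repository contains the full, runnable reproduction of this flow on the toy MNIST model.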

  was:
The reproducible repository is linked here: 

 

Currently, Airbnb is using the quantization extensions of MXNet to speed up 
inference on several convolutional neural network models. However, this has 
been difficult to achieve. The most complicated bugs lie at the intersection 
of the Python and C++ interfaces, such as those that crash Jupyter kernels 
and are hard to debug with pdb.

Airbnb currently uses Gluon models extensively and is not planning to move to 
Module models for training any time soon, but creating a quantized Module 
model solely for inference seems worthwhile. Please refer to the repository 
for a minimal reproducible example.


> int8 quantization does not work on toy mnist dataset
> ----------------------------------------------------
>
>                 Key: MXNET-677
>                 URL: https://issues.apache.org/jira/browse/MXNET-677
>             Project: Apache MXNet
>          Issue Type: Bug
>            Reporter: Ray Zhang
>            Priority: Blocker
>



