Thanks for your attention, Seb. We are inclined to be cautious about what we can claim for this project. TensorRT already supports converting TensorFlow and Caffe models to a compatible format for fast inference, but it does not yet support MXNet models. In this sense, it may not be fair to claim that MXNet is the first framework to support Nvidia Volta.
What we are working on is more experimental and research-oriented. We want to gain first-hand experience by building an INT-8 inference prototype and to develop a thorough understanding of its strengths and limitations, rather than handing the work off completely to TensorRT, whose internals are not visible to us. Because the project is experimental, it's still too early to draw conclusions, as there are plenty of known and unknown issues and unfinished work.

On the other hand, we are glad to hear that Nvidia is working on supporting model conversion from MXNet to TensorRT (Dom, please correct me if I'm mistaken). It would be very beneficial to INT-8 support in MXNet if they could open-source their work, as we would then be able to maintain it and add new features on our side.

On Mon, Oct 2, 2017 at 8:04 PM, Dominic Divakaruni <dominic.divakar...@gmail.com> wrote:

> 👏
>
> On Mon, Oct 2, 2017 at 8:02 PM Seb Kiureghian <sebou...@gmail.com> wrote:
>
> > It would be awesome if MXNet were the first DL framework to support
> > Nvidia Volta. What do you all think about cutting a v0.12 release once
> > that integration is ready?
> >
> > On Wed, Sep 27, 2017 at 10:38 PM, Jun Wu <wujun....@gmail.com> wrote:
> >
> > > I had been working on the sparse tensor project with Haibin. After it
> > > was wrapped up for the first stage, I started my work on the
> > > quantization project (INT-8 inference). The benefits of using quantized
> > > models for inference include much higher inference throughput than FP32
> > > model with acceptable accuracy loss and compact models saved on small
> > > devices. The work currently aims at quantizing ConvNets, and we will
> > > consider expanding it to RNN networks after getting good results for
> > > images. Meanwhile, it's expected to support quantization on CPU, GPU,
> > > and mobile devices.
>
> --
> Dominic Divakaruni
> 206.475.9200 Cell