Thanks for your attention, Seb. We are inclined to be cautious about what
we can claim for this project. TensorRT already supports converting
TensorFlow and Caffe models to its own format for fast inference, but it
does not yet support MXNet. In that sense, it may not be fair to claim that
MXNet would be the first framework to support Nvidia Volta.

What we are working on is more experimental and research oriented. We want
first-hand knowledge: by building an INT-8 inference prototype ourselves,
we can develop a thorough understanding of its strengths and limitations,
rather than handing everything off to TensorRT, which is a black box to us.
Since the project is experimental, it's still too early to draw conclusions
here, as there are plenty of known and unknown issues and unfinished work.
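
To make the idea concrete, here is a minimal sketch of symmetric linear
quantization in Python/NumPy. This is illustrative only, not our actual
prototype; the function names and the simple max-abs scaling are
assumptions for the example:

    import numpy as np

    def quantize_int8(x):
        # Choose a scale so the largest magnitude in x maps to 127.
        amax = np.abs(x).max()
        scale = 127.0 / max(amax, 1e-8)  # guard against an all-zero tensor
        q = np.clip(np.round(x * scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize_int8(q, scale):
        # Recover an FP32 approximation of the original tensor.
        return q.astype(np.float32) / scale

    x = np.random.randn(4, 4).astype(np.float32)
    q, scale = quantize_int8(x)
    print(np.abs(x - dequantize_int8(q, scale)).max())  # quantization error

The strengths and limitations we want to understand first-hand are exactly
where this simple picture breaks down, e.g. how to calibrate scales per
layer, which operators to keep in FP32, and how much accuracy is lost.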

On the other hand, we are glad to hear that Nvidia is working on supporting
model conversion from MXNet to TensorRT (Dom, please correct me if I'm
mistaken). It would be hugely beneficial to MXNet's INT-8 support if they
could open-source that work, as we would then be able to maintain it and
add new features on our side.


On Mon, Oct 2, 2017 at 8:04 PM, Dominic Divakaruni <
dominic.divakar...@gmail.com> wrote:

> 👏
>
> On Mon, Oct 2, 2017 at 8:02 PM Seb Kiureghian <sebou...@gmail.com> wrote:
>
> > It would be awesome if MXNet were the first DL framework to support
> Nvidia
> > Volta. What do you all think about cutting a v0.12 release once that
> > integration is ready?
> >
> > On Wed, Sep 27, 2017 at 10:38 PM, Jun Wu <wujun....@gmail.com> wrote:
> >
> > > I had been working on the sparse tensor project with Haibin. After it
> > > was wrapped up for the first stage, I started work on the quantization
> > > project (INT-8 inference). The benefits of using quantized models for
> > > inference include much higher throughput than FP32 models, with
> > > acceptable accuracy loss, and compact models that fit on small devices.
> > > The work currently targets ConvNets, and we will consider expanding it
> > > to RNNs after getting good results on images. Meanwhile, it's expected
> > > to support quantization on CPU, GPU, and mobile devices.
> > >
> >
> --
>
>
> Dominic Divakaruni
> 206.475.9200 Cell
>