Seb is talking about support for CUDA 9 and cuDNN 7. Pull requests below.
@ptrendx and Dick Carter are working through some performance issues but
should be done in a week (hopefully).

Jun, Bhavin,
The TensorRT runtime is a different subject. Nvidia is helping build a
converter for MXNet models; not sure of the ETA. TensorRT helps accelerate
vision models on the V100, TX2, P4/P40, etc.


   - Enabling persistent batch norm with cuDNN 7:
   https://github.com/apache/incubator-mxnet/pull/7876
   - Making mixed precision work with all optimizers (sketch after this list):
   https://github.com/apache/incubator-mxnet/pull/7654
   - Faster IO pipeline needed for Volta:
   https://github.com/apache/incubator-mxnet/pull/7152
   - Expose Tell in RecordIO reader:
   https://github.com/dmlc/dmlc-core/pull/301
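
For anyone who wants to try the mixed-precision piece once that PR lands, here
is a rough, untested sketch of float16 training in Gluon where the optimizer
keeps FP32 master weights. The multi_precision flag is my assumption of how
the option is exposed; exact names may differ.

import mxnet as mx
from mxnet import gluon, nd

ctx = mx.gpu(0)  # float16 math is most useful on Volta-class GPUs

# Small network; cast parameters (and inputs) to float16.
net = gluon.nn.Dense(10)
net.initialize(mx.init.Xavier(), ctx=ctx)
net.cast('float16')

# multi_precision keeps an FP32 copy of the weights inside the optimizer so
# tiny float16 gradient updates are not lost to rounding.
trainer = gluon.Trainer(net.collect_params(), 'sgd',
                        {'learning_rate': 0.1, 'multi_precision': True})

data = nd.ones((32, 128), ctx=ctx).astype('float16')
label = nd.zeros((32, 10), ctx=ctx).astype('float16')
loss_fn = gluon.loss.L2Loss()

with mx.autograd.record():
    loss = loss_fn(net(data), label)
loss.backward()
trainer.step(batch_size=32)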


On Mon, Oct 2, 2017 at 8:44 PM, Bhavin Thaker <bhavintha...@gmail.com>
wrote:

> Hi Seb: please use a different email thread for new topics of discussion.
>
> Hi Jun: I think Seb may be referring to Volta V100 support in MXNet and NOT
> P4/P40 inference accelerators.
>
> Corrections/clarifications welcome.
>
> Bhavin Thaker.
>
> On Mon, Oct 2, 2017 at 8:22 PM Jun Wu <wujun....@gmail.com> wrote:
>
> > Thanks for your attention, Seb. We are inclined to be cautious about what
> > we can claim for this project. TensorRT already supports converting
> > TensorFlow and Caffe models to its compatible format for fast inference,
> > but not MXNet models. In that sense, it may not be fair to claim MXNet as
> > the first framework supporting Nvidia Volta.
> >
> > What we are working on is more experimental and research oriented. We want
> > to get first-hand knowledge by building an INT-8 inference prototype
> > ourselves and developing a thorough understanding of its strengths and
> > limitations, rather than handing it off completely to TensorRT, which is
> > opaque to us. Considering that the project is experimental, it's still too
> > early to draw conclusions here, as there are plenty of known and unknown
> > issues and unfinished work.
> >
> > On the other hand, we are glad to hear that Nvidia is working on
> > supporting model conversion from MXNet to TensorRT (Dom, please correct me
> > if I'm mistaken). It would be super beneficial to MXNet's INT-8 effort if
> > they could open-source their work, as we would then be able to maintain it
> > and add new features on our side.
> >
> >
> > > On Mon, Oct 2, 2017 at 8:04 PM, Dominic Divakaruni <
> > > dominic.divakar...@gmail.com> wrote:
> >
> > > 👏
> > >
> > > On Mon, Oct 2, 2017 at 8:02 PM Seb Kiureghian <sebou...@gmail.com>
> > > wrote:
> > >
> > > > It would be awesome if MXNet were the first DL framework to support
> > > > Nvidia Volta. What do you all think about cutting a v0.12 release once
> > > > that integration is ready?
> > > >
> > > > On Wed, Sep 27, 2017 at 10:38 PM, Jun Wu <wujun....@gmail.com>
> > > > wrote:
> > > >
> > > > > I had been working on the sparse tensor project with Haibin. After
> > > > > it was wrapped up for the first stage, I started my work on the
> > > > > quantization project (INT-8 inference). The benefits of using
> > > > > quantized models for inference include much higher throughput than
> > > > > FP32 models, with acceptable accuracy loss, and more compact saved
> > > > > models for small devices. The work currently aims at quantizing
> > > > > ConvNets, and we will consider expanding it to RNNs after getting
> > > > > good results for images. Meanwhile, it's expected to support
> > > > > quantization on CPU, GPU, and mobile devices.
> > > > >
> > > >
> > > --
> > >
> > >
> > > Dominic Divakaruni
> > > 206.475.9200 Cell
> > >
> >
>



-- 


Dominic Divakaruni
206.475.9200 Cell
