Seb is talking about support for CUDA 9 and cuDNN 7. Pull requests are listed below. @ptrendx and Dick Carter are working through some performance issues but should hopefully be done in a week.
Jun, Bhavin, the TensorRT runtime is a different subject. Nvidia is helping build a converter for MXNet models. Not sure on the ETA. TensorRT helps accelerate vision models on the V100, TX2, P4/P40, etc.

- Enabling persistent batch norm with cuDNN 7: https://github.com/apache/incubator-mxnet/pull/7876
- Making mixed precision work with all optimizers: https://github.com/apache/incubator-mxnet/pull/7654
- Faster IO pipeline needed for Volta: https://github.com/apache/incubator-mxnet/pull/7152
- Expose Tell in RecordIO reader: https://github.com/dmlc/dmlc-core/pull/301

On Mon, Oct 2, 2017 at 8:44 PM, Bhavin Thaker <bhavintha...@gmail.com> wrote:

> Hi Seb: please use a different email thread for new topics of discussion.
>
> Hi Jun: I think Seb may be referring to Volta V100 support in MXNet and NOT
> P4/P40 inference accelerators.
>
> Corrections/clarifications welcome.
>
> Bhavin Thaker.
>
> On Mon, Oct 2, 2017 at 8:22 PM Jun Wu <wujun....@gmail.com> wrote:
>
> > Thanks for your attention, Seb. We are inclined to be cautious on what we
> > can claim for this project. TensorRT has already supported converting
> > TensorFlow and Caffe models to its compatible format for fast inference,
> > but not MXNet. In this sense, it may not be fair to claim MXNet as the
> > first one supporting Nvidia Volta.
> >
> > What we are working on is more experimental and research oriented. We want
> > to get the first-hand materials in our own hands by building an INT-8
> > inference prototype and have a thorough understanding of its strengths and
> > limitations, rather than handing it off completely to TensorRT, which is
> > transparent to us. Considering that the project is experimental, it's still
> > too early to make a conclusion here as there are plenty of known/unknown
> > issues and unfinished work.
> >
> > On the other hand, we are glad to hear that Nvidia is working on supporting
> > model conversion from MXNet to TensorRT (Dom please correct me if I'm
> > mistaken). It would be super beneficial to MXNet on INT-8 if they could
> > open-source their work as we would be able to maintain and add new features
> > on our side.
> >
> > On Mon, Oct 2, 2017 at 8:04 PM, Dominic Divakaruni <dominic.divakar...@gmail.com> wrote:
> >
> > > 👏
> > >
> > > On Mon, Oct 2, 2017 at 8:02 PM Seb Kiureghian <sebou...@gmail.com> wrote:
> > >
> > > > It would be awesome if MXNet were the first DL framework to support
> > > > Nvidia Volta. What do you all think about cutting a v0.12 release once
> > > > that integration is ready?
> > > >
> > > > On Wed, Sep 27, 2017 at 10:38 PM, Jun Wu <wujun....@gmail.com> wrote:
> > > >
> > > > > I had been working on the sparse tensor project with Haibin. After it
> > > > > was wrapped up for the first stage, I started my work on the
> > > > > quantization project (INT-8 inference). The benefits of using
> > > > > quantized models for inference include much higher inference
> > > > > throughput than FP32 models with acceptable accuracy loss, and compact
> > > > > models saved on small devices. The work currently aims at quantizing
> > > > > ConvNets, and we will consider expanding it to RNN networks after
> > > > > getting good results for images. Meanwhile, it's expected to support
> > > > > quantization on CPU, GPU, and mobile devices.
> > >
> > > --
> > > Dominic Divakaruni
> > > 206.475.9200 Cell

--
Dominic Divakaruni
206.475.9200 Cell