[GitHub] [incubator-mxnet] hgt312 commented on a change in pull request #15292: Numpy-compatible GCD operation

2019-07-02 Thread GitBox
hgt312 commented on a change in pull request #15292: Numpy-compatible GCD operation
URL: https://github.com/apache/incubator-mxnet/pull/15292#discussion_r299801302
 
 

 ##
 File path: python/mxnet/symbol/numpy/_symbol.py
 ##
 @@ -992,6 +993,9 @@ def minimum(x1, x2, out=None):
 def add(x1, x2, out=None):
    return _ufunc_helper(x1, x2, _npi.add, _np.add, _npi.add_scalar, None, out)
 
+@set_module('mxnet.symbol.numpy')
 
 Review comment:
   More blank lines before and after `gcd`
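
   For reference, PEP 8 separates top-level Python definitions with two blank lines. A hypothetical sketch of the requested spacing (the `gcd` body and the `_npi.gcd`/`_npi.gcd_scalar` names are illustrative assumptions that mirror the `add` helper above):

```python
# _ufunc_helper, _npi, _np and set_module come from the surrounding module in the PR.
def add(x1, x2, out=None):
    return _ufunc_helper(x1, x2, _npi.add, _np.add, _npi.add_scalar, None, out)


@set_module('mxnet.symbol.numpy')
def gcd(x1, x2, out=None):
    return _ufunc_helper(x1, x2, _npi.gcd, _np.gcd, _npi.gcd_scalar, None, out)
```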




[GitHub] [incubator-mxnet] ckt624 commented on a change in pull request #15381: [Numpy] Add Documentations

2019-07-02 Thread GitBox
ckt624 commented on a change in pull request #15381: [Numpy] Add Documentations
URL: https://github.com/apache/incubator-mxnet/pull/15381#discussion_r299800327
 
 

 ##
 File path: python/mxnet/_numpy_op_doc.py
 ##
 @@ -19,7 +19,6 @@
 
 """Doc placeholder for numpy ops with prefix _np."""
 
-
 
 Review comment:
   Fixed.




[GitHub] [incubator-mxnet] mikemwx commented on a change in pull request #15382: [numpy][doc-fix] sum, copy, tile, argmax, sign, log, degrees

2019-07-02 Thread GitBox
mikemwx commented on a change in pull request #15382: [numpy][doc-fix] sum, 
copy, tile, argmax, sign, log, degrees
URL: https://github.com/apache/incubator-mxnet/pull/15382#discussion_r299791680
 
 

 ##
 File path: python/mxnet/numpy/multiarray.py
 ##
 @@ -2188,3 +2270,175 @@ def arctan(x, out=None, **kwargs):
    0.7853981633974483
    """
    return _mx_nd_np.arctan(x, out=out, **kwargs)
+
+@set_module('mxnet.numpy')
+def sign(x, out=None):
+    """
+    sign(x, out=None)
+
+    Returns an element-wise indication of the sign of a number.
+
+    The `sign` function returns ``-1 if x < 0, 0 if x==0, 1 if x > 0``.
+    Only supports real numbers.
+
+    Parameters
+    ----------
+    x : ndarray or a scalar
+        Input values.
+    out : ndarray or None, optional
+        A location into which the result is stored.
+        If provided, it must have the same shape and dtype as the input ndarray.
+        If not provided or `None`, a freshly-allocated array is returned.
+
+    Returns
+    -------
+    y : ndarray
+        The sign of `x`.
+        This is a scalar if `x` is a scalar.
+
+    Note
+    ----
+    - Only supports real numbers as input elements.
+    - Input type does not support Python native iterables (list, tuple, ...).
+    - ``out`` param: cannot perform auto broadcasting. The ``out`` ndarray's shape must be the same as the expected output.
+    - ``out`` param: cannot perform auto type casting. The ``out`` ndarray's dtype must be the same as the expected output.
+    - ``out`` param does not support the scalar input case.
+
+    Examples
+    --------
+    >>> a = np.array([-5., 4.5])
+    >>> np.sign(a)
+    array([-1.,  1.])
+
+    Scalars as input:
+
+    >>> np.sign(4.0)
+    1.0
+    >>> np.sign(0)
+    0
+
+    Use the ``out`` parameter:
+
+    >>> b = np.zeros((2, ))
+    >>> np.sign(a, out=b)
+    array([-1.,  1.])
+    >>> b
+    array([-1.,  1.])
+
+    """
+    return _mx_nd_np.sign(x, out=out)
+
+
+@set_module('mxnet.symbol.numpy')
 
 Review comment:
   Is there anything wrong with the namespace?




[GitHub] [incubator-mxnet] haojin2 commented on a change in pull request #15381: [Numpy] Add Documentations

2019-07-02 Thread GitBox
haojin2 commented on a change in pull request #15381: [Numpy] Add Documentations
URL: https://github.com/apache/incubator-mxnet/pull/15381#discussion_r299786639
 
 

 ##
 File path: python/mxnet/_numpy_op_doc.py
 ##
 @@ -19,7 +19,6 @@
 
 """Doc placeholder for numpy ops with prefix _np."""
 
-
 
 Review comment:
  Do not remove such lines; we require 2 blank lines between Python functions.




[incubator-mxnet] branch master updated: [MXNET-978] Higher order gradient for sigmoid (#15288)

2019-07-02 Thread apeforest
This is an automated email from the ASF dual-hosted git repository.

apeforest pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new 6a8d9eb  [MXNET-978] Higher order gradient for sigmoid (#15288)
6a8d9eb is described below

commit 6a8d9eb5fd4f7133c094149dc80a3a236534f223
Author: Lin Yuan 
AuthorDate: Tue Jul 2 22:53:39 2019 -0700

[MXNET-978] Higher order gradient for sigmoid (#15288)

* try to add support some ops

* add unit test for second order grad

* implement grad for relu and add unit test

* fix lint

* register FGradient attribute for backward relu

* resolve conflict

* remove unused imports

* change gradient using set_attr

* remove higher order grad test for negative(x)

* fix lint

* reverse indent

* remove unused backward operator

* refactor backward for sin(x) and cos(x)

* change value init to list init

* change to list initialization

* generate random shape in test

* fix a bug in second order backward

* fix lint

* fix lint

* address reviewer comment and renaming

* test 2nd order gradient for sigmoid

* higher order grads for sigmoid

* add unit test

* remove blank lines

* update test

* fix lint

* fix third order gradient for sigmoid
---
 src/common/exec_utils.h                         |  5 ++---
 src/imperative/imperative.cc                    |  4 ++++
 src/operator/tensor/elemwise_unary_op_basic.cc  | 30 +++++++++++++++++++++++++++++++++-
 src/operator/tensor/elemwise_unary_op_trig.cc   |  4 ++--
 tests/python/unittest/test_higher_order_grad.py | 17 +++++++++++++++++
 5 files changed, 54 insertions(+), 6 deletions(-)

diff --git a/src/common/exec_utils.h b/src/common/exec_utils.h
index 0551b42..d8b7a33 100644
--- a/src/common/exec_utils.h
+++ b/src/common/exec_utils.h
@@ -286,7 +286,6 @@ inline void LogMemoryPlan(const nnvm::Graph& g) {
   const auto &idx = g.indexed_graph();
   const auto& vshape = g.GetAttr<mxnet::ShapeVector>("shape");
   const auto& vtype = g.GetAttr<nnvm::DTypeVector>("dtype");
-  const auto& vstorage = g.GetAttr<StorageVector>("storage_id");
   // find node range
   uint32_t node_start = 0, node_end = idx.num_nodes();
   if (g.attrs.count("node_range")) {
@@ -304,13 +303,13 @@ inline void LogMemoryPlan(const nnvm::Graph& g) {
     auto eid = idx.entry_id(e);
     size_t kilo_bytes = vshape[eid].Size() * mshadow::mshadow_sizeof(vtype[eid]) / 1024;
     LOG(INFO) << "\t\tinput " << eid << ": " << vshape[eid] << " ("
-          << kilo_bytes << " KB) -> " << storage_str(vstorage[eid]);
+          << kilo_bytes << " KB)";
   }
   for (uint32_t index = 0; index < inode.source->num_outputs(); ++index) {
     uint32_t eid = idx.entry_id(nid, index);
     size_t kilo_bytes = vshape[eid].Size() * mshadow::mshadow_sizeof(vtype[eid]) / 1024;
     LOG(INFO) << "\t\toutput " << eid << ": " << vshape[eid] << " ("
-          << kilo_bytes << " KB) -> " << storage_str(vstorage[eid]);
+          << kilo_bytes << " KB)";
     }
   }
 }
diff --git a/src/imperative/imperative.cc b/src/imperative/imperative.cc
index d8fba1c..e2c0c9d 100644
--- a/src/imperative/imperative.cc
+++ b/src/imperative/imperative.cc
 @@ -501,6 +501,10 @@ std::vector<NDArray*> Imperative::Backward(
 }
   }
 
+  if (dmlc::GetEnv("MXNET_MEM_PLAN_VERBOSE_LOGGING", false)) {
+common::LogMemoryPlan(graph);
+  }
+
   // Execution
 
   bool prev_recording = set_is_recording(create_graph);
diff --git a/src/operator/tensor/elemwise_unary_op_basic.cc b/src/operator/tensor/elemwise_unary_op_basic.cc
index 98dc8da..26c7408 100644
--- a/src/operator/tensor/elemwise_unary_op_basic.cc
+++ b/src/operator/tensor/elemwise_unary_op_basic.cc
@@ -121,7 +121,35 @@ The storage type of ``sigmoid`` output is always dense
 .set_attr<nnvm::FGradient>("FGradient", ElemwiseGradUseOut{"_backward_sigmoid"});
 
 MXNET_OPERATOR_REGISTER_BINARY_WITH_SPARSE_CPU(_backward_sigmoid,
-                                               unary_bwd<mshadow_op::sigmoid_grad>);
+                                               unary_bwd<mshadow_op::sigmoid_grad>)
+.set_attr<nnvm::FGradient>("FGradient",
+    [](const nnvm::NodePtr& n, const std::vector<nnvm::NodeEntry>& ograds) {
+      // n->inputs[0] : y_grad
+      // n->inputs[1] : f(x) = sigmoid(x)
+      // ograds[0] : head_grads
+      // f''(x) = f'(x) * (1 - 2*f(x))
+      // NodeEntry{n} : y_grad * f'(x)
+      auto ones = MakeNode("ones_like", n->attrs.name + "_grad_ones", {n->inputs[1]}, nullptr, &n);
+      const std::unordered_map<std::string, std::string> args = {{"scalar", "2.0"}};
+      auto two_y = MakeNode("_mul_scalar", n->attrs.name + "_mul_two", {n->inputs[1]}, &args, &n);
+      auto one_minus_two_y = MakeNode("elemwise_sub", n->attrs.name + "_grad_sub",
+                                      {nnvm::NodeEntry{ones}, nnvm::NodeEntry{two_y}}, null
[GitHub] [incubator-mxnet] apeforest merged pull request #15288: [MXNET-978] Higher order gradient for sigmoid

2019-07-02 Thread GitBox
apeforest merged pull request #15288: [MXNET-978] Higher order gradient for 
sigmoid
URL: https://github.com/apache/incubator-mxnet/pull/15288
 
 
   




[GitHub] [incubator-mxnet] thomelane commented on issue #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
thomelane commented on issue #15427: [TUTORIAL] Gluon performance tips and 
tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#issuecomment-507934941
 
 
   @pengzhao-intel thanks for the follow-up. made the changes.




[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299773533
 
 

 ##
 File path: docs/tutorials/gluon/performance.md
 ##
 @@ -0,0 +1,483 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+# Gluon Performance Tips & Tricks
+
+Compared to traditional machine learning methods, the field of deep-learning 
has increased model accuracy across a wide range of tasks, but it has also 
increased the amount of computation required for model training and inference. 
Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution 
of networks, but it can sometimes be hard to write code that uses the hardware 
to its full potential. We will be looking at a few simple tips and tricks in 
this tutorial that you can use to speed up training and ultimately save on 
training costs.
+
+We'll start by writing some code to train an image classification network for 
the CIFAR-10 dataset, and then benchmark the throughput of the network in terms 
of samples processed per second. After some performance analysis, we'll 
identify the bottlenecks (i.e. the components limiting throughput) and improve 
the training speed step-by-step. We'll bring together all the tips and tricks 
at the end and evaluate our performance gains.
+
+
+```python
+from __future__ import print_function
+import multiprocessing
+import time
+import mxnet as mx
+import numpy as np
+```
+
+An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this 
tutorial. You are likely to get different results and find different 
bottlenecks on other hardware, but these tips and tricks should still help 
improve training speed for bottleneck components. A GPU is recommended for this 
example.
+
+
+```python
+ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu()
+print("Using {} context.".format(ctx))
+```
+
+Using gpu(0) context.
+
+
+We'll use the `CIFAR10` dataset provided out-of-the-box with Gluon.
+
+
+```python
+dataset = mx.gluon.data.vision.CIFAR10(train=True)
+print('{} samples'.format(len(dataset)))
+```
+
50000 samples
+
+
+So we can learn how to identify training bottlenecks, let's intentionally 
introduce a short `sleep` into the data loading pipeline. We transform each 
32x32 CIFAR-10 image to 224x224 so we can use it with the ResNet-50 network 
designed for ImageNet. [CIFAR-10 specific ResNet 
networks](https://gluon-cv.mxnet.io/api/model_zoo.html#gluoncv.model_zoo.get_cifar_resnet)
 exist but we use the more standard ImageNet variants in this example.
 
 Review comment:
   It was added to make the data pipeline a bottleneck. Added another comment 
to clarify.
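
   For readers following the thread, a minimal sketch of this kind of intentional data-pipeline bottleneck (illustrative only; `slow_transform` and the 1 ms delay are assumptions, not the tutorial's exact code):

```python
import time
import mxnet as mx

def slow_transform(data, label):
    """Resize 32x32 CIFAR-10 images to 224x224 and add an artificial
    per-sample delay so the data pipeline becomes the bottleneck."""
    time.sleep(0.001)  # simulated preprocessing cost
    data = mx.image.imresize(data, 224, 224)
    return data, label

dataset = mx.gluon.data.vision.CIFAR10(train=True).transform(slow_transform)
```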




[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299773279
 
 

 ##
 File path: docs/tutorials/gluon/performance.md
 ##
 @@ -0,0 +1,483 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+# Gluon Performance Tips & Tricks
+
+Compared to traditional machine learning methods, the field of deep-learning 
has increased model accuracy across a wide range of tasks, but it has also 
increased the amount of computation required for model training and inference. 
Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution 
of networks, but it can sometimes be hard to write code that uses the hardware 
to its full potential. We will be looking at a few simple tips and tricks in 
this tutorial that you can use to speed up training and ultimately save on 
training costs.
 
 Review comment:
   Most of these tricks can be used for inference too; added a comment for this.




[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299772839
 
 

 ##
 File path: docs/tutorials/gluon/performance.md
 ##
 @@ -0,0 +1,483 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+# Gluon Performance Tips & Tricks
+
+Compared to traditional machine learning methods, the field of deep-learning 
has increased model accuracy across a wide range of tasks, but it has also 
increased the amount of computation required for model training and inference. 
Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution 
of networks, but it can sometimes be hard to write code that uses the hardware 
to its full potential. We will be looking at a few simple tips and tricks in 
this tutorial that you can use to speed up training and ultimately save on 
training costs.
+
+We'll start by writing some code to train an image classification network for 
the CIFAR-10 dataset, and then benchmark the throughput of the network in terms 
of samples processed per second. After some performance analysis, we'll 
identify the bottlenecks (i.e. the components limiting throughput) and improve 
the training speed step-by-step. We'll bring together all the tips and tricks 
at the end and evaluate our performance gains.
+
+
+```python
+from __future__ import print_function
+import multiprocessing
+import time
+import mxnet as mx
+import numpy as np
+```
+
+An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this 
tutorial. You are likely to get different results and find different 
bottlenecks on other hardware, but these tips and tricks should still help 
improve training speed for bottleneck components. A GPU is recommended for this 
example.
+
 
 Review comment:
   Changed to 'tutorial' to be clearer about where the GPU is recommended. There are many references to GPUs throughout, so one is recommended to follow along.




[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299772496
 
 

 ##
 File path: docs/tutorials/index.md
 ##
 @@ -91,6 +91,7 @@ Select API: 
* [Image similiarity search with InfoGAN](/tutorials/gluon/info_gan.html)
 * Practitioner Guides
 * [Gotchas using NumPy](/tutorials/gluon/gotchas_numpy_in_mxnet.html)
+* [Performance Tips & Tricks](/tutorials/gluon/performance.html)
 
 Review comment:
   Although we're already in the Gluon section, so it adds a bit of noise.




[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299772230
 
 

 ##
 File path: docs/tutorials/index.md
 ##
 @@ -91,6 +91,7 @@ Select API: 
* [Image similiarity search with InfoGAN](/tutorials/gluon/info_gan.html)
 * Practitioner Guides
 * [Gotchas using NumPy](/tutorials/gluon/gotchas_numpy_in_mxnet.html)
+* [Performance Tips & Tricks](/tutorials/gluon/performance.html)
 
 Review comment:
   Updated, thanks.




[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299771879
 
 

 ##
 File path: tests/tutorials/test_tutorials.py
 ##
 @@ -114,6 +114,9 @@ def test_gluon_save_load_params():
 
 def test_gluon_hybrid():
    assert _test_tutorial_nb('gluon/hybrid')
+
+def test_gluon_hybrid():
 
 Review comment:
   Updated.
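
   For context: the duplicated name means the second `def test_gluon_hybrid()` shadows the first, so the original test silently never runs. The fix is presumably a rename along these lines (`test_gluon_performance` is an assumed, illustrative name):

```python
def test_gluon_hybrid():
    assert _test_tutorial_nb('gluon/hybrid')


# Renamed so it no longer shadows test_gluon_hybrid above.
def test_gluon_performance():
    assert _test_tutorial_nb('gluon/performance')
```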




[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299771708
 
 

 ##
 File path: tests/tutorials/test_tutorials.py
 ##
 @@ -114,6 +114,9 @@ def test_gluon_save_load_params():
 
 def test_gluon_hybrid():
    assert _test_tutorial_nb('gluon/hybrid')
+
+def test_gluon_hybrid():
 
 Review comment:
   Great catch @wkcn!




[incubator-mxnet] branch master updated: Remove mhard-float option. This is already deprecated by Google. (#15435)

2019-07-02 Thread zhasheng
This is an automated email from the ASF dual-hosted git repository.

zhasheng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new 1547578  Remove mhard-float option. This is already deprecated by 
Google. (#15435)
1547578 is described below

commit 15475788cee87eb6c6b08ddd0af245af7c05536f
Author: Disi A 
AuthorDate: Wed Jul 3 00:05:24 2019 -0400

Remove mhard-float option. This is already deprecated by Google. (#15435)
---
 amalgamation/Makefile | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/amalgamation/Makefile b/amalgamation/Makefile
index d4b2ee0..701c1f1 100644
--- a/amalgamation/Makefile
+++ b/amalgamation/Makefile
@@ -114,8 +114,8 @@ jni_libmxnet_predict.so: jni_libmxnet_predict.o
 ifneq ($(ANDROID), 1)
 android:
 else
-CFLAGS+=  -mhard-float -D_NDK_MATH_NO_SOFTFP=1 -O3
-LDFLAGS+=  -Wl,--no-warn-mismatch -lm_hard
+CFLAGS+= -O3
+LDFLAGS+= -Wl,--no-warn-mismatch -lm_hard
 android: jni_libmxnet_predict.so
 endif
 



[GitHub] [incubator-mxnet] szha merged pull request #15435: Remove mhard-float option when building Amalgamation for Android.

2019-07-02 Thread GitBox
szha merged pull request #15435: Remove mhard-float option when building 
Amalgamation for Android.
URL: https://github.com/apache/incubator-mxnet/pull/15435
 
 
   




[incubator-mxnet] branch master updated (c6bb2ce -> 512a491)

2019-07-02 Thread zhasheng
This is an automated email from the ASF dual-hosted git repository.

zhasheng pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git.


from c6bb2ce  Use omp threads for cpu data loader (#15379)
 add 512a491  Temporarily Commenting out Flaky Test (#15436)

No new revisions were added by this update.

Summary of changes:
 tests/python/unittest/test_profiler.py | 3 +++
 1 file changed, 3 insertions(+)



[GitHub] [incubator-mxnet] szha merged pull request #15436: Temporarily Commenting out Flaky Test

2019-07-02 Thread GitBox
szha merged pull request #15436: Temporarily Commenting out Flaky Test
URL: https://github.com/apache/incubator-mxnet/pull/15436
 
 
   




[GitHub] [incubator-mxnet] DickJC123 commented on issue #15449: cuda/cuDNN lib version checking. Force cuDNN v7 usage.

2019-07-02 Thread GitBox
DickJC123 commented on issue #15449: cuda/cuDNN lib version checking.  Force 
cuDNN v7 usage.
URL: https://github.com/apache/incubator-mxnet/pull/15449#issuecomment-507915947
 
 
   Versioning issues were recently discussed in the dev forum: 
https://lists.apache.org/thread.html/96d4a46a0a3c98ea1f3a3237de713ef5f40967fcb0817d661c18e950@%3Cdev.mxnet.apache.org%3E
   
   Although the PR as it stands does not preclude CUDA8, I propose to add 
STATIC_ASSERT_CUDA_VERSION_GE(9000) given enough consensus.
   
   Tagging @ptrendx @KellenSunderland @marcoabreu @larroy 




[GitHub] [incubator-mxnet] iblis17 commented on issue #14720: [Flaky] Flaky Test test_clamp on windows/cpu

2019-07-02 Thread GitBox
iblis17 commented on issue #14720: [Flaky] Flaky Test test_clamp on windows/cpu
URL: 
https://github.com/apache/incubator-mxnet/issues/14720#issuecomment-507914739
 
 
   Close via https://github.com/apache/incubator-mxnet/pull/14776.




[GitHub] [incubator-mxnet] iblis17 closed issue #14720: [Flaky] Flaky Test test_clamp on windows/cpu

2019-07-02 Thread GitBox
iblis17 closed issue #14720: [Flaky] Flaky Test test_clamp on windows/cpu
URL: https://github.com/apache/incubator-mxnet/issues/14720
 
 
   




[GitHub] [incubator-mxnet] iblis17 commented on issue #15415: documentation link is broken, goes to spam site

2019-07-02 Thread GitBox
iblis17 commented on issue #15415: documentation link is broken, goes to spam 
site
URL: 
https://github.com/apache/incubator-mxnet/issues/15415#issuecomment-507914528
 
 
   @aaronmarkham If you can get access to a CI worker, there is a dir 
`/work/mxnet/julia/docs/build`.
   
   >  the julia docs are now down, I think it would be a good idea to go ahead 
and host these locally.
   
   Actually, it's hosted via GitHub Pages; I guess the problem is that the domain CNAME setting has been pointed somewhere else.




[GitHub] [incubator-mxnet] DickJC123 opened a new pull request #15449: cuda/cuDNN lib version checking. Force cuDNN v7 usage.

2019-07-02 Thread GitBox
DickJC123 opened a new pull request #15449: cuda/cuDNN lib version checking.  
Force cuDNN v7 usage.
URL: https://github.com/apache/incubator-mxnet/pull/15449
 
 
   This PR addresses two issues:
   - rnn.cc of MXNet v1.5 does not compile against cuDNN v6. This PR requires systems that rebuild MXNet to have cuDNN v7, and improves the error message when compiling against v6.
   - We are accumulating stale code that references no-longer-supported CUDA/cuDNN versions. This PR provides a means for cleaning out this code.
   
   This PR introduces both runtime and compile-time cuda and cuDNN version 
checking.  The compile time checks are based on new macros: 
STATIC_ASSERT_CUDNN_VERSION_GE(min_version) and 
STATIC_ASSERT_CUDA_VERSION_GE(min_version).  Example usage:
   Before PR:
   ```
   #if MXNET_USE_CUDNN
   #if CUDNN_VERSION >= 7000
   
   #elif CUDNN_VERSION >= 6000
   
   #else
   LOG(FATAL) << "cuDNN too old.";
   #endif
   #endif  // MXNET_USE_CUDNN
   ```
   After PR (given the assumption that we're now requiring cuDNN v7):
   ```
   #if MXNET_USE_CUDNN
   STATIC_ASSERT_CUDNN_VERSION_GE(7000);
   
   #endif  // MXNET_USE_CUDNN
   ```
   
   Discussion continues in the comments section.
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to 
the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) 
created (except PRs with tiny changes)
   - [x] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage:
   - Unit tests are added for small changes to verify correctness (e.g. adding 
a new operator)
   - Nightly tests are added for complicated/long-running ones (e.g. changing 
distributed kvstore)
   - Build tests will be added for build configuration changes (e.g. adding a 
new build option with NCCL)
   - [x] Code is well-documented: 
   - For user-facing API changes, API doc string has been updated. 
   - For new C++ functions in header files, their functionalities and arguments 
are documented. 
   - For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
   - Check the API doc at 
http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
   - [x] To my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [ ] Feature1, tests, (and when applicable, API doc)
   - [ ] Feature2, tests, (and when applicable, API doc)
   
   ## Comments ##
   This PR improves the compile-time message to a user trying to build MXNet 
1.5 against cuDNN v6.
   Before PR, only the missing library entrypoint is mentioned:
   ```
   g++ ... -c src/operator/operator.cc -o build/src/operator/operator.o
   src/operator/rnn.cc: In function 'std::vector<mxnet::ResourceRequest> mxnet::op::RNNResourceEx(const nnvm::NodeAttrs&, int, mxnet::DispatchMode)':
   src/operator/rnn.cc:179:28: error: 'kCuDNNDropoutDesc' is not a member of 'mxnet::ResourceRequest'
       request.emplace_back(ResourceRequest::kCuDNNDropoutDesc);
   ```
   After the PR, the error mentions the library version issue directly:
   ```
   g++ ... -c src/operator/optimizer_op.cc -o build/src/operator/optimizer_op.o
   In file included from src/operator/././operator_common.h:42:0,
                    from src/operator/./rnn-inl.h:45,
                    from src/operator/rnn.cc:29:
   src/operator/rnn.cc: In function 'std::vector<mxnet::ResourceRequest> mxnet::op::RNNResourceEx(const nnvm::NodeAttrs&, int, mxnet::DispatchMode)':
   src/operator/././../common/cuda_utils.h:467:3: error: static assertion failed: Compiled-against cuDNN version 6021 is too old, please upgrade system to version 7000 or later.
      static_assert(CUDNN_VERSION >= min_version, "Compiled-against cuDNN version " \
      ^
   src/operator/rnn.cc:175:5: note: in expansion of macro 'STATIC_ASSERT_CUDNN_VERSION_GE'
      STATIC_ASSERT_CUDNN_VERSION_GE(7000);
      ^
   src/operator/rnn.cc:180:28: error: 'kCuDNNDropoutDesc' is not a member of 'mxnet::ResourceRequest'
      request.emplace_back(ResourceRequest::kCuDNNDropoutDesc);
      ^
   ```
   This PR provides 2 runtime checks and issues a warning:
   - when the compiled-against cuda or cuDNN library version does not match 
the linked-against version, and
   - when the library versions are old w.r.t. the versions tested against 
by the MXNet CI.
   
   I built the PR against cuda 9 and cuDNN v7.1.4.  Running any model will emit 
the warning:
   ```
   [01:05:03] src/common/cuda_utils.cc:50: Upgrade advisory: this mxnet has 
been built against cuda library version 9000, which is older than the oldest 
version tested by CI (1).  Set M

[GitHub] [incubator-mxnet] cyrusbehr commented on issue #15393: Unable to build mxnet with OpenCV4 on Raspberry Pi 3B

2019-07-02 Thread GitBox
cyrusbehr commented on issue #15393: Unable to build mxnet with OpenCV4 on 
Raspberry Pi 3B
URL: 
https://github.com/apache/incubator-mxnet/issues/15393#issuecomment-507912741
 
 
   @larroy do the docker files build MXNet with C++ support (`USE_CPP_PACKAGE`)? Or do they only support Python right now? 




[GitHub] [incubator-mxnet] zoeygxy commented on a change in pull request #15390: [Numpy fix-doc]modify numpy doc

2019-07-02 Thread GitBox
zoeygxy commented on a change in pull request #15390: [Numpy  fix-doc]modify 
numpy doc
URL: https://github.com/apache/incubator-mxnet/pull/15390#discussion_r299754141
 
 

 ##
 File path: python/mxnet/symbol/numpy/_symbol.py
 ##
 @@ -1555,4 +1613,156 @@ def sqrt(x, out=None, **kwargs):
    return _unary_func_helper(x, _npi.sqrt, _np.sqrt, out=out, **kwargs)
 
 
+@set_module('mxnet.symbol.numpy')
+def ceil(x, out=None, **kwargs):
+    r"""
+    Return the ceiling of the input, element-wise.
+
+    The ceil of the ndarray `x` is the smallest integer `i`, such that
+    `i >= x`.  It is often denoted as :math:`\lceil x \rceil`.
+
+    Parameters
+    ----------
+    x : _Symbol or scalar
+        Input array.
+    out : _Symbol or None
+        A location into which the result is stored. If provided, it
+        must have a shape that the inputs broadcast to. If not provided
+        or None, a freshly-allocated array is returned. The dtype of the
+        output is the same as that of the input if the input is an ndarray.
+
+    Returns
+    -------
+    y : _Symbol or scalar
+        The ceiling of each element in `x`, with `float` dtype.
+        This is a scalar if `x` is a scalar.
+
+    Examples
+    --------
+    >>> a = np.array([-1.7, -1.5, -0.2, 0.2, 1.5, 1.7, 2.0])
+    >>> np.ceil(a)
+    array([-1., -1., -0.,  1.,  2.,  2.,  2.])
+
+    >>> # if you use the `out` parameter, both `x` and `out` must be ndarrays; otherwise you will get an error
+    >>> a = np.array(1)
+    >>> np.ceil(np.array(3.5), a)
+    array(4.)
+    >>> a
+    array(4.)
+
+    """
+    return _unary_func_helper(x, _npi.ceil, _np.ceil, out=out, **kwargs)
+
+
+@set_module('mxnet.symbol.numpy')
+def log1p(x, out=None, **kwargs):
+    """
+    Return the natural logarithm of one plus the input array, element-wise.
+
+    Calculates ``log(1 + x)``.
+
+    Parameters
+    ----------
+    x : _Symbol or scalar
+        Input array.
+    out : _Symbol or None
+        A location into which the result is stored. If provided, it
+        must have a shape that the inputs broadcast to. If not provided
+        or None, a freshly-allocated array is returned. The dtype of the
+        output is the same as that of the input if the input is an ndarray.
+
+    Returns
+    -------
+    y : _Symbol or scalar
+        Natural logarithm of 1 + x, element-wise. This is a scalar
+        if x is a scalar.
+
+    Notes
+    -----
+    For real-valued input, `log1p` is accurate also for `x` so small
+    that `1 + x == 1` in floating-point accuracy.
+
+    Logarithm is a multivalued function: for each `x` there is an infinite
+    number of `z` such that `exp(z) = 1 + x`. The convention is to return
+    the `z` whose imaginary part lies in `[-pi, pi]`.
+
+    For real-valued input data types, `log1p` always returns real output.
+    For each value that cannot be expressed as a real number or infinity,
+    it yields ``nan`` and sets the `invalid` floating point error flag.
+
+    For complex-valued input, `log1p` is a complex analytical function that
+    has a branch cut `[-inf, -1]` and is continuous from above on it.
+    `log1p` handles the floating-point negative zero as an infinitesimal
+    negative number, conforming to the C99 standard.
+
+    Examples
+    --------
+    >>> np.log1p(1e-99)
+    1e-99
+
+    """
+    return _unary_func_helper(x, _npi.log1p, _np.log1p, out=out, **kwargs)
+
+
+@set_module('mxnet.symbol.numpy')
+def tanh(x, out=None, **kwargs):
+    """
+    Compute hyperbolic tangent element-wise.
+
+    Equivalent to ``np.sinh(x)/np.cosh(x)``.
+
+    Parameters
+    ----------
+    x : _Symbol
+        Input array.
+    out : _Symbol or None
+        A location into which the result is stored. If provided, it
+        must have a shape that the inputs broadcast to. If not provided
+        or None, a freshly-allocated array is returned. The dtype of the
+        output is the same as that of the input if the input is an ndarray.
+
+    Returns
+    -------
+    y : _Symbol
+        The corresponding hyperbolic tangent values.
+
+    Notes
+    -----
+    If `out` is provided, the function writes the result into it,
+    and returns a reference to `out`.  (See Examples)
+
+    - Does not support complex computation (like imaginary numbers)
+
+    >>> np.tanh(np.pi*1j)
+    TypeError: type <class 'complex'> not supported
+
+    Examples
+    --------
+    >>> np.tanh(np.array[0, np.pi]))
 
 Review comment:
   @gyshi  
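
   As an aside on the `log1p` accuracy note quoted in this hunk: for tiny `x`, forming `1 + x` rounds to exactly 1.0 in double precision, so `log(1 + x)` loses all information while `log1p(x)` does not. A quick plain-NumPy illustration:

```python
import numpy as np

x = 1e-99
print(np.log(1 + x))   # 0.0 -- the sum 1 + x rounds to exactly 1.0
print(np.log1p(x))     # 1e-99 -- computed without ever forming 1 + x
```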




[GitHub] [incubator-mxnet] zoeygxy commented on a change in pull request #15390: [Numpy fix-doc]modify numpy doc

2019-07-02 Thread GitBox
zoeygxy commented on a change in pull request #15390: [Numpy  fix-doc]modify 
numpy doc
URL: https://github.com/apache/incubator-mxnet/pull/15390#discussion_r299754108
 
 

 ##
 File path: python/mxnet/_numpy_op_doc.py
 ##
 @@ -173,3 +212,73 @@ def _np_cumsum(a, axis=None, dtype=None, out=None):
     `axis` is not None or `a` is a 1-d array.
     """
     pass
+
+
+def _np_max(axis=None, keepdims=False, initial=None, out=None):
+    """
+    Return the maximum of an array or maximum along an axis.
+
+    Parameters
+    ----------
+    a : ndarray
+        Input data.
+    axis : None or int or tuple of ints, optional
+        Axis or axes along which to operate.  By default, flattened input is
+        used.
+
+        If this is a tuple of ints, the maximum is selected over multiple axes,
+        instead of a single axis or all the axes as before.
+
+    keepdims : bool, optional
+        If this is set to True, the axes which are reduced are left
+        in the result as dimensions with size one. With this option,
+        the result will broadcast correctly against the input array.
+
+        If the default value is passed, then `keepdims` will not be
+        passed through to the `amax` method of sub-classes of
+        `ndarray`; however, any non-default value will be.  If the
+        sub-class' method does not implement `keepdims`, any
+        exceptions will be raised.
+
+    initial :
+        Parameter `initial` is not supported yet; we will support it in
+        the future. Currently it must be None.
+
+    out : ndarray, optional
 
 Review comment:
   @gyshi Please add this in the `note` area. This is different from native numpy. I would also suggest giving a note here in the `out` parameter description. 
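
   (A short plain-NumPy illustration of the `keepdims` behaviour described in the quoted docstring, for readers of this thread:)

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
m = a.max(axis=1, keepdims=True)  # shape (2, 1) instead of (2,)
print(m)       # [[2]
               #  [5]]
print(a - m)   # broadcasts cleanly against the original array
```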




[GitHub] [incubator-mxnet] pengzhao-intel commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
pengzhao-intel commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299751199
 
 

 ##
 File path: docs/tutorials/gluon/performance.md
 ##
 @@ -0,0 +1,483 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+# Gluon Performance Tips & Tricks
+
+Compared to traditional machine learning methods, the field of deep-learning 
has increased model accuracy across a wide range of tasks, but it has also 
increased the amount of computation required for model training and inference. 
Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution 
of networks, but it can sometimes be hard to write code that uses the hardware 
to its full potential. We will be looking at a few simple tips and tricks in 
this tutorial that you can use to speed up training and ultimately save on 
training costs.
 
 Review comment:
   "We will be looking at a few simple tips and trick in this tutorial that you 
can use to speed up training and ultimately save on training costs."
   Does this document only cover for "training"? If so, how about change title 
to "Gluon Performance Tips ... for training"?




[GitHub] [incubator-mxnet] pengzhao-intel commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
pengzhao-intel commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299751380
 
 

 ##
 File path: docs/tutorials/gluon/performance.md
 ##
 @@ -0,0 +1,483 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+# Gluon Performance Tips & Tricks
+
+Compared to traditional machine learning methods, the field of deep-learning 
has increased model accuracy across a wide range of tasks, but it has also 
increased the amount of computation required for model training and inference. 
Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution 
of networks, but it can sometimes be hard to write code that uses the hardware 
to its full potential. We will be looking at a few simple tips and tricks in 
this tutorial that you can use to speed up training and ultimately save on 
training costs.
+
+We'll start by writing some code to train an image classification network for 
the CIFAR-10 dataset, and then benchmark the throughput of the network in terms 
of samples processed per second. After some performance analysis, we'll 
identify the bottlenecks (i.e. the components limiting throughput) and improve 
the training speed step-by-step. We'll bring together all the tips and tricks 
at the end and evaluate our performance gains.
+
+
+```python
+from __future__ import print_function
+import multiprocessing
+import time
+import mxnet as mx
+import numpy as np
+```
+
+An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this 
tutorial. You are likely to get different results and find different 
bottlenecks on other hardware, but these tips and tricks should still help 
improve training speed for bottleneck components. A GPU is recommended for this 
example.
+
+
+```python
+ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu()
+print("Using {} context.".format(ctx))
+```
+
+Using gpu(0) context.
+
+
+We'll use the `CIFAR10` dataset provided out-of-the-box with Gluon.
+
+
+```python
+dataset = mx.gluon.data.vision.CIFAR10(train=True)
+print('{} samples'.format(len(dataset)))
+```
+
50000 samples
+
+
+So we can learn how to identify training bottlenecks, let's intentionally 
introduce a short `sleep` into the data loading pipeline. We transform each 
32x32 CIFAR-10 image to 224x224 so we can use it with the ResNet-50 network 
designed for ImageNet. [CIFAR-10 specific ResNet 
networks](https://gluon-cv.mxnet.io/api/model_zoo.html#gluoncv.model_zoo.get_cifar_resnet)
 exist but we use the more standard ImageNet variants in this example.
 
 Review comment:
   Why add a "sleep"?




[GitHub] [incubator-mxnet] pengzhao-intel commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
pengzhao-intel commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299751165
 
 

 ##
 File path: docs/tutorials/gluon/performance.md
 ##
 @@ -0,0 +1,483 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+# Gluon Performance Tips & Tricks
+
+Compared to traditional machine learning methods, the field of deep-learning 
has increased model accuracy across a wide range of tasks, but it has also 
increased the amount of computation required for model training and inference. 
Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution 
of networks, but it can sometimes be hard to write code that uses the hardware 
to its full potential. We will be looking at a few simple tips and tricks in 
this tutorial that you can use to speed up training and ultimately save on 
training costs.
+
+We'll start by writing some code to train an image classification network for 
the CIFAR-10 dataset, and then benchmark the throughput of the network in terms 
of samples processed per second. After some performance analysis, we'll 
identify the bottlenecks (i.e. the components limiting throughput) and improve 
the training speed step-by-step. We'll bring together all the tips and tricks 
at the end and evaluate our performance gains.
+
+
+```python
+from __future__ import print_function
+import multiprocessing
+import time
+import mxnet as mx
+import numpy as np
+```
+
+An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this 
tutorial. You are likely to get different results and find different 
bottlenecks on other hardware, but these tips and tricks should still help 
improve training speed for bottleneck components. A GPU is recommended for this 
example.
+
 
 Review comment:
   I think both CPU and GPU are fine for this small case.




[GitHub] [incubator-mxnet] KellenSunderland commented on a change in pull request #15335: enable TensorRT integration with cpp api

2019-07-02 Thread GitBox
KellenSunderland commented on a change in pull request #15335: enable TensorRT 
integration with cpp api
URL: https://github.com/apache/incubator-mxnet/pull/15335#discussion_r299753358
 
 

 ##
 File path: cpp-package/include/mxnet-cpp/contrib.h
 ##
 @@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+/*!
+*  Copyright (c) 2019 by Contributors
+* \file contrib.h
+* \brief utility function to enable some contrib features
+* \author Haohuan Wang
+*/
+#ifndef MXNET_CPP_CONTRIB_H_
+#define MXNET_CPP_CONTRIB_H_
+
+#include 
+#include 
+#include 
+#include 
+#include "mxnet-cpp/symbol.h"
+
+namespace mxnet {
+namespace cpp {
+namespace details {
+
+  /*!
+   * split a string with the given delimiter
+   * @param str string to be parsed
+   * @param delimiter delimiter
+   * @return delimited list of string
+   */
+  inline std::vector<std::string> split(const std::string& str, const std::string& delimiter) {
+    std::vector<std::string> splitted;
+    size_t last = 0;
+    size_t next = 0;
+    while ((next = str.find(delimiter, last)) != std::string::npos) {
+      splitted.push_back(str.substr(last, next - last));
+      last = next + 1;
+    }
+    splitted.push_back(str.substr(last));
+    return splitted;
+  }
+
+}  // namespace details
+
+namespace contrib {
 
 Review comment:
   @szha Hey Sheng, have we had contrib namespaces in C++ before? The intent of this code basically aligns with the intent of having a contrib-level Python API. Does namespacing it out like this make sense to you?




[GitHub] [incubator-mxnet] pengzhao-intel commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
pengzhao-intel commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299751199
 
 

 ##
 File path: docs/tutorials/gluon/performance.md
 ##
 @@ -0,0 +1,483 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+# Gluon Performance Tips & Tricks
+
+Compared to traditional machine learning methods, the field of deep-learning 
has increased model accuracy across a wide range of tasks, but it has also 
increased the amount of computation required for model training and inference. 
Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution 
of networks, but it can sometimes be hard to write code that uses the hardware 
to its full potential. We will be looking at a few simple tips and tricks in 
this tutorial that you can use to speed up training and ultimately save on 
training costs.
 
 Review comment:
   " but it can sometimes be hard to write code that uses the hardware to its 
full potential."
   The framework is the bridge between HW and SW so it supposes it will be easy 
to write the code. The performance is another thing.
   
   "We will be looking at a few simple tips and trick in this tutorial that you 
can use to speed up training and ultimately save on training costs."
   Does this document only cover for "training"? If so, how about change tilte 
to "Gluon Performance Tips ... for training"?




[GitHub] [incubator-mxnet] zoeygxy commented on a change in pull request #15390: [Numpy fix-doc]modify numpy doc

2019-07-02 Thread GitBox
zoeygxy commented on a change in pull request #15390: [Numpy  fix-doc]modify 
numpy doc
URL: https://github.com/apache/incubator-mxnet/pull/15390#discussion_r299753290
 
 

 ##
 File path: python/mxnet/_numpy_op_doc.py
 ##
 @@ -32,24 +33,70 @@ def _np_reshape(a, newshape, order='C'):
         an integer, then the result will be a 1-D array of that length.
         One shape dimension can be -1. In this case, the value is
         inferred from the length of the array and remaining dimensions.
-    order : {'C'}, optional
+    order : {'C', 'F', 'A'}, optional
         Read the elements of `a` using this index order, and place the
         elements into the reshaped array using this index order.  'C'
         means to read / write the elements using C-like index order,
         with the last axis index changing fastest, back to the first
-        axis index changing slowest. Other order types such as 'F'/'A'
-        may be added in the future.
+        axis index changing slowest. 'F' means to read / write the
+        elements using Fortran-like index order, with the first index
+        changing fastest, and the last index changing slowest. Note that
+        the 'C' and 'F' options take no account of the memory layout of
+        the underlying array, and only refer to the order of indexing.
+        'A' means to read / write the elements in Fortran-like index
+        order if `a` is Fortran *contiguous* in memory, C-like order
+        otherwise.
 
     Returns
     -------
     reshaped_array : ndarray
-        It will always be a copy of the original array. This behavior is different
-        from the official NumPy package where views of the original array may be
-        generated.
+        This will be a new view object if possible; otherwise, it will
+        be a copy.  Note there is no guarantee of the *memory layout* (C- or
+        Fortran- contiguous) of the returned array.
 
-    See Also
+
+    Notes
+    -----
+    It is not always possible to change the shape of an array without
+    copying the data. If you want an error to be raised when the data is copied,
+    you should assign the new shape to the shape attribute of the array::
+
+        >>> a = np.zeros((10, 2))
+        # A transpose makes the array non-contiguous
+        >>> b = a.T
+        # Taking a view makes it possible to modify the shape without modifying
+        # the initial object.
+
+    >>> a = np.arange(6).reshape((3, 2))
+    >>> a
+    array([[0., 1.],
+           [2., 3.],
+           [4., 5.]])
+
+    You can think of reshaping as first raveling the array (using the given
+    index order), then inserting the elements from the raveled array into the
+    new array using the same kind of index ordering as was used for the
+    raveling.
+
+    >>> np.reshape(a, (2, 3)) # C-like index ordering
+    array([[0., 1., 2.],
+           [3., 4., 5.]])
+
+    - order only supports C-order
+    - input does not support scalar
+    - does not support zero-size shape
+
+    Examples
+    --------
-    ndarray.reshape : Equivalent method.
+    >>> a = np.array([[1,2,3], [4,5,6]])
+    >>> np.reshape(a, 6)
+    array([1., 2., 3., 4., 5., 6.])
+
+    >>> np.reshape(a, (3,-1))   # the unspecified value is inferred to be 2
 
 Review comment:
   Have you tested this?




[GitHub] [incubator-mxnet] pengzhao-intel commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
pengzhao-intel commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299749824
 
 

 ##
 File path: docs/tutorials/index.md
 ##
 @@ -91,6 +91,7 @@ Select API: 
* [Image similiarity search with InfoGAN](/tutorials/gluon/info_gan.html)
 * Practitioner Guides
 * [Gotchas using NumPy](/tutorials/gluon/gotchas_numpy_in_mxnet.html)
+* [Performance Tips & Tricks](/tutorials/gluon/performance.html)
 
 Review comment:
   Better to align with the name in the page "Gluon Performance Tips & Tricks"




[GitHub] [incubator-mxnet] xinyu-intel opened a new pull request #15448: [MKLDNN]Enhance Quantization APIs and Tutorial

2019-07-02 Thread GitBox
xinyu-intel opened a new pull request #15448: [MKLDNN]Enhance Quantization APIs 
and Tutorial
URL: https://github.com/apache/incubator-mxnet/pull/15448
 
 
   ## Description ##
   
   - Create a MKL-DNN specific user-level api `quantize_model_mkldnn` which 
combines fusion and quantization.
   
   - Enable `resnet50_v1b` quantized model.
   
   - Split `quantize_model` API into three parts to make it flexible for users 
to integrate the quantization flow into their project (see the sketch at the 
end of this description):
   1) `quantize_graph`: quantize the fp32 model to an int8 model w/o calibration and 
return a collector for gathering calibration information in the next step.
   2) [outside api]: users need only add a few lines together with mod.forward 
for collecting calibration information.
   3) `calib_graph`: generate the calibrated model based on the filled collector.
   
   - Draft a tutorial to introduce **How to quantize custom models for 
production-level inference with MKL-DNN backend**.
   
   @pengzhao-intel @TaoLv @ZhennanQin @ciyongch 
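
   A minimal sketch of how the three-step flow above might look from user code.
   The import path and the exact signatures of `quantize_graph` and `calib_graph`
   are assumptions for illustration and may differ from the final API:

```python
import mxnet as mx
# NOTE: import path and signatures below are assumptions, not the final API
from mxnet.contrib.quantization import quantize_graph, calib_graph

sym, arg_params, aux_params = mx.model.load_checkpoint('resnet50_v1b', 0)

# 1) quantize the fp32 graph w/o calibration; also get a collector
qsym, qarg_params, qaux_params, collector = quantize_graph(
    sym=sym, arg_params=arg_params, aux_params=aux_params)  # assumed signature

# 2) [outside api] run a few batches with mod.forward so the collector
#    can record calibration information
mod = mx.mod.Module(symbol=qsym, label_names=None)
mod.bind(for_training=False, data_shapes=[('data', (1, 3, 224, 224))])
mod.set_params(qarg_params, qaux_params)
for batch in calib_data:  # calib_data: any iterator of DataBatch (user-supplied)
    mod.forward(batch, is_train=False)

# 3) generate the calibrated model from the filled collector
cqsym, cqarg_params, cqaux_params = calib_graph(
    qsym, qarg_params, qaux_params, collector)  # assumed signature
```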
   
   ## Checklist ##
   ### Essentials ###
   Please feel free to remove inapplicable items for your PR.
   - [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to 
the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) 
created (except PRs with tiny changes)
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage:
   - Unit tests are added for small changes to verify correctness (e.g. adding 
a new operator)
   - Nightly tests are added for complicated/long-running ones (e.g. changing 
distributed kvstore)
   - Build tests will be added for build configuration changes (e.g. adding a 
new build option with NCCL)
   - [ ] Code is well-documented: 
   - For user-facing API changes, API doc string has been updated. 
   - For new C++ functions in header files, their functionalities and arguments 
are documented. 
   - For new examples, README.md is added to explain what the example does, 
the source of the dataset, expected performance on test set and reference to 
the original paper if applicable
   - Check the API doc at 
http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
   - [ ] To the best of my knowledge, examples are either not affected by this 
change, or have been fixed to be compatible with this change
   
   ### Changes ###
   - [ ] Feature1, tests, (and when applicable, API doc)
   - [ ] Feature2, tests, (and when applicable, API doc)
   
   ## Comments ##
   - If this change is a backward incompatible change, why must this change be 
made.
   - Interesting edge cases to note here
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] larroy commented on issue #15369: Fix build with system's openmp

2019-07-02 Thread GitBox
larroy commented on issue #15369: Fix build with system's openmp
URL: https://github.com/apache/incubator-mxnet/pull/15369#issuecomment-507901448
 
 
   @szha understood. it would be beneficial to be able to replicate that with 
CMake by having a selector for the openmp lib until then.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] larroy commented on issue #15424: fixed config.mk and Makefile bugs for installing mkl

2019-07-02 Thread GitBox
larroy commented on issue #15424: fixed config.mk and Makefile bugs for 
installing mkl
URL: https://github.com/apache/incubator-mxnet/pull/15424#issuecomment-507901172
 
 
   Shouldn't USE_STATIC_MKL appear in the Makefile directly then?  


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.

2019-07-02 Thread marcoabreu
This is an automated email from the ASF dual-hosted git repository.

marcoabreu pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 3884f6a  Bump the publish timestamp.
3884f6a is described below

commit 3884f6abde723dfdf2b43299fd4baa9ff5038f7e
Author: mxnet-ci 
AuthorDate: Wed Jul 3 01:17:26 2019 +

Bump the publish timestamp.
---
 date.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/date.txt b/date.txt
new file mode 100644
index 000..e835a92
--- /dev/null
+++ b/date.txt
@@ -0,0 +1 @@
+Wed Jul  3 01:17:26 UTC 2019



[GitHub] [incubator-mxnet] Kangzf1996 removed a comment on issue #15392: Fails to make -j4

2019-07-02 Thread GitBox
Kangzf1996 removed a comment on issue #15392: Fails to make -j4
URL: 
https://github.com/apache/incubator-mxnet/issues/15392#issuecomment-506820105
 
 
   > @Kangzf1996 , Wow. This seems to be using MXNet 0.10.0 which was released 
in May 2017. Isn't this project compatible with the latest MXNet?
   
@vdantu Hi, there are also some errors when I use MXNet 1.4.0.
   When I run make -j4, the error is the following:
   compilation terminated.
   make: *** [Makefile:461: build/src/operator/nn/mkldnn/mkldnn_copy.o] Error 1
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] Kangzf1996 commented on issue #15392: Fails to make -j4

2019-07-02 Thread GitBox
Kangzf1996 commented on issue #15392: Fails to make -j4
URL: 
https://github.com/apache/incubator-mxnet/issues/15392#issuecomment-507890195
 
 
   > @Kangzf1996 , Wow. This seems to be using MXNet 0.10.0 which was released 
in May 2017. Isn't this project compatible with the latest MXNet?
   
   @vdantu Hi, there are also some errors when I use MXNet 1.4.0.
   When I run make -j4, the error is the following:
   compilation terminated.
   make: *** [Makefile:461: build/src/operator/nn/mkldnn/mkldnn_copy.o] Error 1


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] TaoLv commented on issue #15424: fixed config.mk and Makefile bugs for installing mkl

2019-07-02 Thread GitBox
TaoLv commented on issue #15424: fixed config.mk and Makefile bugs for 
installing mkl
URL: https://github.com/apache/incubator-mxnet/pull/15424#issuecomment-507889995
 
 
   I think the lower-case BLAS check in the Makefile was added for runtime feature 
detection. It's not really used for BLAS linkage. Please correct me if I'm 
wrong @larroy .
   I tried USE_BLAS=mkl on the make command line and can find MKL `.a` files in the 
link line, and there are no MKL `.so` files in the ldd output. Can you double check? 
@nuslq 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] larroy commented on issue #15405: Fix memory leak in NaiveEngine

2019-07-02 Thread GitBox
larroy commented on issue #15405: Fix memory leak in NaiveEngine
URL: https://github.com/apache/incubator-mxnet/pull/15405#issuecomment-507887002
 
 
   @mxnet-label-bot add [pr-awaiting-merge]


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] wkcn commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
wkcn commented on a change in pull request #15427: [TUTORIAL] Gluon performance 
tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299727333
 
 

 ##
 File path: tests/tutorials/test_tutorials.py
 ##
 @@ -114,6 +114,9 @@ def test_gluon_save_load_params():
 
 def test_gluon_hybrid():
 assert _test_tutorial_nb('gluon/hybrid')
+
+def test_gluon_hybrid():
 
 Review comment:
   Thanks for your contribution! It seems this should be `test_gluon_performance`.
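
   A minimal sketch of the likely intended fix; the notebook path
   `gluon/performance` is an assumption based on the tutorial page linked
   elsewhere in this thread:

```python
def test_gluon_performance():
    # renamed from the duplicated test_gluon_hybrid above
    assert _test_tutorial_nb('gluon/performance')
```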


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] IvyBazan removed a comment on issue #15445: MXNet export broken link

2019-07-02 Thread GitBox
IvyBazan removed a comment on issue #15445: MXNet export broken link
URL: 
https://github.com/apache/incubator-mxnet/issues/15445#issuecomment-507884878
 
 
   @mxnet-label-bot Website


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] IvyBazan edited a comment on issue #15445: MXNet export broken link

2019-07-02 Thread GitBox
IvyBazan edited a comment on issue #15445: MXNet export broken link
URL: 
https://github.com/apache/incubator-mxnet/issues/15445#issuecomment-507884878
 
 
   @mxnet-label-bot Website


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] IvyBazan commented on issue #15445: MXNet export broken link

2019-07-02 Thread GitBox
IvyBazan commented on issue #15445: MXNet export broken link
URL: 
https://github.com/apache/incubator-mxnet/issues/15445#issuecomment-507884878
 
 
   @mxnet-label-bot add Website


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] Caenorst commented on a change in pull request #15335: enable TensorRT integration with cpp api

2019-07-02 Thread GitBox
Caenorst commented on a change in pull request #15335: enable TensorRT 
integration with cpp api
URL: https://github.com/apache/incubator-mxnet/pull/15335#discussion_r299727739
 
 

 ##
 File path: cpp-package/include/mxnet-cpp/contrib.h
 ##
 @@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+/*!
+*  Copyright (c) 2019 by Contributors
+* \file contrib.h
+* \brief utility function to enable some contrib features
+* \author Haohuan Wang
+*/
+#ifndef MXNET_CPP_CONTRIB_H_
+#define MXNET_CPP_CONTRIB_H_
+
+#include <iostream>
+#include <map>
+#include <string>
+#include <vector>
+#include "mxnet-cpp/symbol.h"
+
+namespace mxnet {
+namespace cpp {
+namespace details {
+
+  /*!
+   * split a string with the given delimiter
+   * @param str string to be parsed
+   * @param delimiter delimiter
+   * @return delimited list of string
+   */
+  inline std::vector<std::string> split(const std::string& str,
+                                        const std::string& delimiter) {
+    std::vector<std::string> splitted;
+    size_t last = 0;
+    size_t next = 0;
+    while ((next = str.find(delimiter, last)) != std::string::npos) {
+      splitted.push_back(str.substr(last, next - last));
+      last = next + delimiter.length();  // skip the whole delimiter, not just one char
+    }
+    splitted.push_back(str.substr(last));
+    return splitted;
+  }
+
+}  // namespace details
+
+namespace contrib {
+
+  // needs to be same with
+  //   https://github.com/apache/incubator-mxnet/blob/1c874cfc807cee755c38f6486e8e0f4d94416cd8/src/operator/subgraph/tensorrt/tensorrt-inl.h#L190
+  static const std::string TENSORRT_SUBGRAPH_PARAM_IDENTIFIER = "subgraph_params_names";
+  // needs to be same with
+  //   https://github.com/apache/incubator-mxnet/blob/master/src/operator/subgraph/tensorrt/tensorrt.cc#L244
+  static const std::string TENSORRT_SUBGRAPH_PARAM_PREFIX = "subgraph_param_";
 
 Review comment:
   I see, ignore my original comment then


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] thomelane commented on issue #15396: [TUTORIAL] Gluon and Sparse NDArray

2019-07-02 Thread GitBox
thomelane commented on issue #15396: [TUTORIAL] Gluon and Sparse NDArray
URL: https://github.com/apache/incubator-mxnet/pull/15396#issuecomment-507884716
 
 
   Thanks for the reviews @eric-haibin-lin @aaronmarkham. Updates made.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API

2019-07-02 Thread GitBox
anirudh2290 commented on a change in pull request #15245: FP16 Support for C 
Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299717110
 
 

 ##
 File path: amalgamation/python/mxnet_predict.py
 ##
 @@ -133,15 +199,38 @@ def __init__(self, symbol_file,
 handle = PredictorHandle()
 param_raw_bytes = bytearray(param_raw_bytes)
ptr = (ctypes.c_char * len(param_raw_bytes)).from_buffer(param_raw_bytes)
-_check_call(_LIB.MXPredCreate(
+
+# data types
+num_provided_arg_types = 0
+# provided type argument names
+provided_arg_type_names = ctypes.POINTER(ctypes.c_char_p)()
+# provided types
+provided_arg_type_data = ctypes.POINTER(mx_uint)()
+if type_dict is not None:
+provided_arg_type_names = []
+provided_arg_type_data = []
+for k, v in type_dict.items():
+v = np.dtype(v).type
+if v in _DTYPE_NP_TO_MX:
+provided_arg_type_names.append(k)
 
 Review comment:
   I don't think there is an easy way to pass a map to the C API, and this is 
how we pass a map of name/value pairs to the C API today. This is also how it is done 
in other places in MXNet. Would be interested to see the issue.
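
   For reference, a self-contained sketch of the parallel-arrays pattern
   discussed here (a Python dict flattened into a keys array plus a values
   array before crossing the C boundary); the dtype codes are illustrative:

```python
import ctypes

type_dict = {'data': 0, 'softmax_label': 0}  # name -> dtype code (illustrative)

num = len(type_dict)
keys = [k.encode('utf-8') for k in type_dict]
# two parallel C arrays: one of char* names, one of unsigned int dtype codes
c_keys = (ctypes.c_char_p * num)(*keys)
c_vals = (ctypes.c_uint * num)(*type_dict.values())
# num, c_keys and c_vals would then be passed to the C API together
print(num, [c_keys[i] for i in range(num)], [c_vals[i] for i in range(num)])
```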


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API

2019-07-02 Thread GitBox
anirudh2290 commented on a change in pull request #15245: FP16 Support for C 
Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299719073
 
 

 ##
 File path: amalgamation/python/mxnet_predict.py
 ##
 @@ -160,10 +249,18 @@ def forward(self, **kwargs):
 >>> predictor.forward(data=mydata)
 >>> out = predictor.get_output(0)
 """
+if self.type_dict and len(self.type_dict) != len(kwargs.items()):
+raise ValueError("number of kwargs should be same as len of type_dict. " \
+ "Please check your forward pass inputs " \
+ "or type_dict passed to Predictor instantiation")
+
 for k, v in kwargs.items():
 if not isinstance(v, np.ndarray):
 raise ValueError("Expect numpy ndarray as input")
-v = np.asarray(v, dtype=np.float32, order='C')
+if self.type_dict and k in self.type_dict:
+v = np.asarray(v, dtype=self.type_dict[k], order='C')
+else:
+v = np.asarray(v, dtype=np.float32, order='C')
 
 Review comment:
   If an unsupported type is used it will go into the if clause and fail. It will 
silently convert to FP32 if not provided.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API

2019-07-02 Thread GitBox
anirudh2290 commented on a change in pull request #15245: FP16 Support for C 
Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299722242
 
 

 ##
 File path: include/mxnet/c_predict_api.h
 ##
 @@ -85,6 +85,44 @@ MXNET_DLL int MXPredCreate(const char* symbol_json_str,
const mx_uint* input_shape_data,
PredictorHandle* out);
 
+/*!
+ * \brief create a predictor
+ * \param symbol_json_str The JSON string of the symbol.
+ * \param param_bytes The in-memory raw bytes of parameter ndarray file.
+ * \param param_size The size of parameter ndarray file.
+ * \param dev_type The device type, 1: cpu, 2: gpu
+ * \param dev_id The device id of the predictor.
+ * \param num_input_nodes Number of input nodes to the net.
+ *For feedforward net, this is 1.
+ * \param input_keys The name of the input argument.
+ *For feedforward net, this is {"data"}
+ * \param input_shape_indptr Index pointer of shapes of each input node.
+ *The length of this array = num_input_nodes + 1.
+ *For feedforward net that takes 4 dimensional input, this is {0, 4}.
+ * \param input_shape_data A flattened data of shapes of each input node.
+ *For feedforward net that takes 4 dimensional input, this is the shape 
data.
+ * \param num_provided_arg_dtypes
+ *The length of provided_arg_dtypes.
+ * \param provided_arg_dtype_names
+ *The provided_arg_dtype_names the names of args for which dtypes are 
provided.
+ * \param provided_arg_dtypes
+ *The provided_arg_dtypes the dtype provided
+ * \param out The created predictor handle.
+ * \return 0 when success, -1 when failure.
+ */
+MXNET_DLL int MXPredCreateEx(const char* symbol_json_str,
+ const void* param_bytes,
+ int param_size,
+ int dev_type, int dev_id,
+ mx_uint num_input_nodes,
 
 Review comment:
   This is just to keep the same interface as MXPredCreate. 
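
   As an aside, a minimal sketch of the flattened shape encoding documented in
   the header above (pure Python, no MXNet required; the input name and shape
   are made up for illustration):

```python
shapes = {'data': (1, 3, 224, 224)}  # one input node, 4-d shape
input_keys = list(shapes)                                     # ['data']
input_shape_data = [d for s in shapes.values() for d in s]    # [1, 3, 224, 224]
input_shape_indptr = [0]
for s in shapes.values():
    # each entry marks where the next node's dims start in input_shape_data
    input_shape_indptr.append(input_shape_indptr[-1] + len(s))
print(input_shape_indptr)  # [0, 4], matching the doc comment's example
```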


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API

2019-07-02 Thread GitBox
anirudh2290 commented on a change in pull request #15245: FP16 Support for C 
Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299725608
 
 

 ##
 File path: src/c_api/c_predict_api.cc
 ##
 @@ -210,19 +249,31 @@ int _CreatePartialOut(const char* symbol_json_str,
 
  std::vector<NDArray> arg_arrays, aux_arrays;
   for (size_t i = 0; i < arg_shapes.size(); ++i) {
-NDArray nd = NDArray(arg_shapes[i], ctx);
+NDArray nd;
+if (result_arg_types[i] != -1) {
+  nd = NDArray(arg_shapes[i], ctx, false, result_arg_types[i]);
+} else {
+  nd = NDArray(arg_shapes[i], ctx);
+}
 if (arg_params.count(arg_names[i]) != 0) {
   CopyFromTo(arg_params[arg_names[i]], &nd);
 }
 arg_arrays.push_back(nd);
   }
+
   for (size_t i = 0; i < aux_shapes.size(); ++i) {
-NDArray nd = NDArray(aux_shapes[i], ctx);
+NDArray nd;
+if (result_aux_types[i] != -1) {
 
 Review comment:
   This was added as I was foreseeing the AMP change, and didn't even have the
   ```
   CHECK(infer_type_complete)
   << "The type information is not enough, please provide input arg_types "
      "with provided_arg_dtype_names and provided_arg_dtypes";
   ```
   Will add a test for AMP for the C Predict API and then get back here.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API

2019-07-02 Thread GitBox
anirudh2290 commented on a change in pull request #15245: FP16 Support for C 
Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299718649
 
 

 ##
 File path: amalgamation/python/mxnet_predict.py
 ##
 @@ -160,10 +249,18 @@ def forward(self, **kwargs):
 >>> predictor.forward(data=mydata)
 >>> out = predictor.get_output(0)
 """
+if self.type_dict and len(self.type_dict) != len(kwargs.items()):
+raise ValueError("number of kwargs should be same as len of type_dict. " \
+ "Please check your forward pass inputs " \
+ "or type_dict passed to Predictor instantiation")
+
 for k, v in kwargs.items():
 if not isinstance(v, np.ndarray):
 raise ValueError("Expect numpy ndarray as input")
-v = np.asarray(v, dtype=np.float32, order='C')
+if self.type_dict and k in self.type_dict:
+v = np.asarray(v, dtype=self.type_dict[k], order='C')
 
 Review comment:
   I think this is row-major only. It converts all inputs to row-major format, 
which MXNet expects.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API

2019-07-02 Thread GitBox
anirudh2290 commented on a change in pull request #15245: FP16 Support for C 
Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299725759
 
 

 ##
 File path: src/c_api/c_predict_api.cc
 ##
 @@ -444,6 +538,20 @@ int MXPredGetOutputShape(PredictorHandle handle,
   API_END();
 }
 
+int MXPredGetOutputType(PredictorHandle handle,
+mx_uint out_index,
+int* out_dtype) {
  MXAPIPredictor* p = static_cast<MXAPIPredictor*>(handle);
+  API_BEGIN();
+  CHECK_LT(out_index, p->out_arrays.size())
+<< "Index exceed number of outputs";
 
 Review comment:
   okay sure.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API

2019-07-02 Thread GitBox
anirudh2290 commented on a change in pull request #15245: FP16 Support for C 
Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299719401
 
 

 ##
 File path: amalgamation/python/mxnet_predict.py
 ##
 @@ -160,10 +249,18 @@ def forward(self, **kwargs):
 >>> predictor.forward(data=mydata)
 >>> out = predictor.get_output(0)
 """
+if self.type_dict and len(self.type_dict) != len(kwargs.items()):
+raise ValueError("number of kwargs should be same as len of type_dict. " \
+ "Please check your forward pass inputs " \
+ "or type_dict passed to Predictor instantiation")
+
 for k, v in kwargs.items():
 if not isinstance(v, np.ndarray):
 raise ValueError("Expect numpy ndarray as input")
-v = np.asarray(v, dtype=np.float32, order='C')
+if self.type_dict and k in self.type_dict:
+v = np.asarray(v, dtype=self.type_dict[k], order='C')
+else:
+v = np.asarray(v, dtype=np.float32, order='C')
 
 Review comment:
   Not sure I understand. Can you elaborate?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API

2019-07-02 Thread GitBox
anirudh2290 commented on a change in pull request #15245: FP16 Support for C 
Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299716688
 
 

 ##
 File path: amalgamation/python/mxnet_predict.py
 ##
 @@ -25,17 +25,74 @@
 
 import os
 import sys
+from array import array
 import ctypes
 import logging
 import numpy as np
 
+# pylint: disable= no-member
+_DTYPE_NP_TO_MX = {
+None: -1,
 
 Review comment:
   This is similar to the Symbol API and NDArray API, where None is mapped to -1.
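
   For context, a minimal sketch of the convention being discussed; the
   concrete dtype codes follow the usual MXNet/mshadow numbering and are shown
   here for illustration:

```python
import numpy as np

# name -> MXNet dtype code; None maps to -1, meaning "dtype not specified",
# mirroring the Symbol and NDArray APIs
_DTYPE_NP_TO_MX = {
    None: -1,
    np.float32: 0,
    np.float64: 1,
    np.float16: 2,
    np.uint8: 3,
    np.int32: 4,
    np.int8: 5,
    np.int64: 6,
}

assert _DTYPE_NP_TO_MX[None] == -1  # sentinel for "no dtype provided"
```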


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API

2019-07-02 Thread GitBox
anirudh2290 commented on a change in pull request #15245: FP16 Support for C 
Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299722802
 
 

 ##
 File path: src/c_api/c_predict_api.cc
 ##
 @@ -187,21 +206,41 @@ int _CreatePartialOut(const char* symbol_json_str,
 
   try {
 mxnet::ShapeVector in_shapes;
+nnvm::DTypeVector in_types;
 for (std::string key : sym.ListInputNames(Symbol::kAll)) {
   if (known_shape.count(key) != 0) {
 in_shapes.push_back(known_shape[key]);
   } else {
 in_shapes.emplace_back();
   }
 }
+
+for (std::string key : sym.ListInputNames(Symbol::kAll)) {
+  if (arg_types.count(key) != 0) {
+in_types.push_back(arg_types[key]);
+  } else if (aux_types.count(key) != 0) {
+in_types.push_back(aux_types[key]);
+  }
+}
 nnvm::Graph g; g.outputs = sym.outputs;
g = mxnet::exec::InferShape(std::move(g), std::move(in_shapes), "__shape__");
+g = mxnet::exec::InferType(std::move(g), std::move(in_types), "__dtype__");
bool infer_complete = (g.GetAttr<size_t>("shape_num_unknown_nodes") == 0);
+// This is tricky for the AMP use case: for example, with only weights, input types
+// cannot be inferred in AMP. Thus for an AMP-converted model, type_dict will be
+// required
+bool infer_type_complete = (g.GetAttr<size_t>("dtype_num_unknown_nodes") == 0);
 CHECK(infer_complete)
   << "The shape information of is not enough to get the shapes";
+CHECK(infer_type_complete)
+<< "The type information is not enough, please provide input arg_types "
+   "with provided_arg_dtype_names and provided_arg_dtypes";
 
 Review comment:
   Hmm, I thought about this, but this is not limited to the Python user: the C 
Predict API can be used from Python, C++, etc., calling _CreatePartialOut, and 
therefore I used provided_arg_dtype_names and provided_arg_dtypes, which are the 
API param names.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] thomelane commented on issue #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
thomelane commented on issue #15427: [TUTORIAL] Gluon performance tips and 
tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#issuecomment-507882866
 
 
   Thanks for all the reviews @aaronmarkham @pengzhao-intel @ptrendx. Updates 
have been made.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299724871
 
 

 ##
 File path: docs/tutorials/gluon/performance.md
 ##
 @@ -0,0 +1,483 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+# Gluon Performance Tips & Tricks
+
+Compared to traditional machine learning methods, the field of deep-learning 
has increased model accuracy across a wide range of tasks, but it has also 
increased the amount of computation required for model training and inference. 
Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution 
of networks, but it can sometimes be hard to write code that uses the hardware 
to its full potential. We will be looking at a few simple tips and tricks in 
this tutorial that you can use to speed up training and ultimately save on 
training costs.
+
+We'll start by writing some code to train an image classification network for 
the CIFAR-10 dataset, and then benchmark the throughput of the network in terms 
of samples processed per second. After some performance analysis, we'll 
identify the bottlenecks (i.e. the components limiting throughput) and improve 
the training speed step-by-step. We'll bring together all the tips and tricks 
at the end and evaluate our performance gains.
+
+
+```python
+from __future__ import print_function
+import multiprocessing
+import time
+import mxnet as mx
+import numpy as np
+```
+
+An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this 
tutorial. You are likely to get different results and find different 
bottlenecks on other hardware, but these tips and tricks should still help 
improve training speed for bottleneck components. A GPU is recommended for this 
example.
+
+
+```python
+ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu()
+print("Using {} context.".format(ctx))
+```
+
+Using gpu(0) context.
+
+
+We'll use the `CIFAR10` dataset provided out-of-the-box with Gluon.
+
+
+```python
+dataset = mx.gluon.data.vision.CIFAR10(train=True)
+print('{} samples'.format(len(dataset)))
+```
+
+50000 samples
+
+
+So we can learn how to identify training bottlenecks, let's intentionally 
introduce a short `sleep` into the data loading pipeline. We transform each 
32x32 CIFAR-10 image to 244x244 so we can use it with the ResNet-50 network 
designed for ImageNet. [CIFAR-10 specific ResNet 
networks](https://gluon-cv.mxnet.io/api/model_zoo.html#gluoncv.model_zoo.get_cifar_resnet)
 exist but we use the more standard ImageNet variants in this example.
+
+
+```python
+def transform_fn(x):
+time.sleep(0.01)  # artificial slow-down
+image = mx.image.imresize(x, w=244, h=244)
+return image.astype('float32').transpose((2, 0, 1))
+
+dataset = dataset.transform_first(transform_fn)
+```
+
+Setting our batch size to 16, we can create the `DataLoader`.
+
+
+```python
+batch_size = 16
+dataloader = mx.gluon.data.DataLoader(dataset,
+  batch_size=batch_size,
+  shuffle=True,
+  last_batch="discard")
+print('{} batches'.format(len(dataloader)))
+```
+
+3125 batches
+
+
+Up next, we create all of the other components required for training, such as 
the network, the loss function, the evaluation metric and parameter trainer.
+
+
+```python
+net = mx.gluon.model_zoo.vision.resnet50_v2(pretrained=False, ctx=ctx)
+net.initialize(mx.init.Xavier(magnitude=2.3), ctx=ctx)
+loss_fn = mx.gluon.loss.SoftmaxCrossEntropyLoss()
+metric = mx.metric.Accuracy()
+learning_rate = 0.001
+trainer = mx.gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 
learning_rate})
+```
+
+## Initial Benchmark
+
+As a starting point, let's benchmark the throughput of our training loop: 
calculating the average samples per second across 25 iterations, where each 
iteration is a batch of 16 samples. We'll run a single forward pass through the 
network before starting our benchmark timer to avoid including shape inference 
and lazy initialization in the throughput calculations.
+
+
+```python
+def single_forward(net, dataloader, dtype='float32'):
+data, label = next(iter(dataloader))
+data = data.astype(dtype)
+data = data.as_in_context(ctx)
+pred = net(data)
+pred.wait_to_read()
+```
+
+
+```python
+single_forward(net, dataloader)
+iters = 25
+num_samples = 0
+num_iters = 0
+start_time = time.time()
+for iter_idx, (data, label) in enumerate(dataloader):
+num_samples += data.shape[0]
+num_iters += 1
+data = data.as_in_context(ctx)
+label = label.as_in_context(ctx)
+with mx.autograd.record():
+pred = net(data)
+loss = loss_fn(pred, label)
+loss.backward()
+trainer.step(data.shape[0])
+metric.update(label, pred)
+print('.', end='')
+if num_iters >= iters:
+break
+mx.nd.waitall()
+end_time = time.time()
+total_time = end_time - start_time

[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299724835
 
 

 ##
 File path: docs/tutorials/gluon/performance.md
 ##
 @@ -0,0 +1,483 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+# Gluon Performance Tips & Tricks
+
+Compared to traditional machine learning methods, the field of deep-learning 
has increased model accuracy across a wide range of tasks, but it has also 
increased the amount of computation required for model training and inference. 
Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution 
of networks, but it can sometimes be hard to write code that uses the hardware 
to its full potential. We will be looking at a few simple tips and tricks in 
this tutorial that you can use to speed up training and ultimately save on 
training costs.
+
+We'll start by writing some code to train an image classification network for 
the CIFAR-10 dataset, and then benchmark the throughput of the network in terms 
of samples processed per second. After some performance analysis, we'll 
identify the bottlenecks (i.e. the components limiting throughput) and improve 
the training speed step-by-step. We'll bring together all the tips and tricks 
at the end and evaluate our performance gains.
+
+
+```python
+from __future__ import print_function
+import multiprocessing
+import time
+import mxnet as mx
+import numpy as np
+```
+
+An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this 
tutorial. You are likely to get different results and find different 
bottlenecks on other hardware, but these tips and tricks should still help 
improve training speed for bottleneck components. A GPU is recommended for this 
example.
+
+
+```python
+ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu()
+print("Using {} context.".format(ctx))
+```
+
+Using gpu(0) context.
+
+
+We'll use the `CIFAR10` dataset provided out-of-the-box with Gluon.
+
+
+```python
+dataset = mx.gluon.data.vision.CIFAR10(train=True)
+print('{} samples'.format(len(dataset)))
+```
+
+50000 samples
+
+
+So we can learn how to identify training bottlenecks, let's intentionally 
introduce a short `sleep` into the data loading pipeline. We transform each 
32x32 CIFAR-10 image to 244x244 so we can use it with the ResNet-50 network 
designed for ImageNet. [CIFAR-10 specific ResNet 
networks](https://gluon-cv.mxnet.io/api/model_zoo.html#gluoncv.model_zoo.get_cifar_resnet)
 exist but we use the more standard ImageNet variants in this example.
+
+
+```python
+def transform_fn(x):
+time.sleep(0.01)  # artificial slow-down
+image = mx.image.imresize(x, w=244, h=244)
+return image.astype('float32').transpose((2, 0, 1))
+
+dataset = dataset.transform_first(transform_fn)
+```
+
+Setting our batch size to 16, we can create the `DataLoader`.
+
+
+```python
+batch_size = 16
+dataloader = mx.gluon.data.DataLoader(dataset,
+  batch_size=batch_size,
+  shuffle=True,
+  last_batch="discard")
+print('{} batches'.format(len(dataloader)))
+```
+
+3125 batches
+
+
+Up next, we create all of the other components required for training, such as 
the network, the loss function, the evaluation metric and parameter trainer.
+
+
+```python
+net = mx.gluon.model_zoo.vision.resnet50_v2(pretrained=False, ctx=ctx)
+net.initialize(mx.init.Xavier(magnitude=2.3), ctx=ctx)
+loss_fn = mx.gluon.loss.SoftmaxCrossEntropyLoss()
+metric = mx.metric.Accuracy()
+learning_rate = 0.001
+trainer = mx.gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 
learning_rate})
+```
+
+## Initial Benchmark
+
+As a starting point, let's benchmark the throughput of our training loop: 
calculating the average samples per second across 25 iterations, where each 
iteration is a batch of 16 samples. We'll run a single forward pass through the 
network before starting our benchmark timer to avoid including shape inference 
and lazy initialization in the throughput calculations.
+
+
+```python
+def single_forward(net, dataloader, dtype='float32'):
+data, label = next(iter(dataloader))
+data = data.astype(dtype)
+data = data.as_in_context(ctx)
+pred = net(data)
+pred.wait_to_read()
+```
+
+
+```python
+single_forward(net, dataloader)
+iters = 25
+num_samples = 0
+num_iters = 0
+start_time = time.time()
+for iter_idx, (data, label) in enumerate(dataloader):
+num_samples += data.shape[0]
+num_iters += 1
+data = data.as_in_context(ctx)
+label = label.as_in_context(ctx)
+with mx.autograd.record():
+pred = net(data)
+loss = loss_fn(pred, label)
+loss.backward()
+trainer.step(data.shape[0])
+metric.update(label, pred)
+print('.', end='')
+if num_iters >= iters:
+break
+mx.nd.waitall()
+end_time = time.time()
+total_time = end_time - start_time

[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299723669
 
 

 ##
 File path: docs/tutorials/gluon/performance.md
 ##
 @@ -0,0 +1,483 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+# Gluon Performance Tips & Tricks
+
+Compared to traditional machine learning methods, the field of deep-learning 
has increased model accuracy across a wide range of tasks, but it has also 
increased the amount of computation required for model training and inference. 
Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution 
of networks, but it can sometimes be hard to write code that uses the hardware 
to its full potential. We will be looking at a few simple tips and tricks in 
this tutorial that you can use to speed up training and ultimately save on 
training costs.
+
+We'll start by writing some code to train an image classification network for 
the CIFAR-10 dataset, and then benchmark the throughput of the network in terms 
of samples processed per second. After some performance analysis, we'll 
identify the bottlenecks (i.e. the components limiting throughput) and improve 
the training speed step-by-step. We'll bring together all the tips and tricks 
at the end and evaluate our performance gains.
+
+
+```python
+from __future__ import print_function
+import multiprocessing
+import time
+import mxnet as mx
+import numpy as np
+```
+
+An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this 
tutorial. You are likely to get different results and find different 
bottlenecks on other hardware, but these tips and tricks should still help 
improve training speed for bottleneck components. A GPU is recommended for this 
example.
+
+
+```python
+ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu()
+print("Using {} context.".format(ctx))
+```
+
+Using gpu(0) context.
+
+
+We'll use the `CIFAR10` dataset provided out-of-the-box with Gluon.
+
+
+```python
+dataset = mx.gluon.data.vision.CIFAR10(train=True)
+print('{} samples'.format(len(dataset)))
+```
+
+50000 samples
+
+
+So we can learn how to identify training bottlenecks, let's intentionally 
introduce a short `sleep` into the data loading pipeline. We transform each 
32x32 CIFAR-10 image to 244x244 so we can use it with the ResNet-50 network 
designed for ImageNet. [CIFAR-10 specific ResNet 
networks](https://gluon-cv.mxnet.io/api/model_zoo.html#gluoncv.model_zoo.get_cifar_resnet)
 exist but we use the more standard ImageNet variants in this example.
+
+
+```python
+def transform_fn(x):
+time.sleep(0.01)  # artificial slow-down
+image = mx.image.imresize(x, w=244, h=244)
+return image.astype('float32').transpose((2, 0, 1))
+
+dataset = dataset.transform_first(transform_fn)
+```
+
+Setting our batch size to 16, we can create the `DataLoader`.
+
+
+```python
+batch_size = 16
+dataloader = mx.gluon.data.DataLoader(dataset,
+  batch_size=batch_size,
+  shuffle=True,
+  last_batch="discard")
+print('{} batches'.format(len(dataloader)))
+```
+
+3125 batches
+
+
+Up next, we create all of the other components required for training, such as 
the network, the loss function, the evaluation metric and parameter trainer.
+
+
+```python
+net = mx.gluon.model_zoo.vision.resnet50_v2(pretrained=False, ctx=ctx)
+net.initialize(mx.init.Xavier(magnitude=2.3), ctx=ctx)
+loss_fn = mx.gluon.loss.SoftmaxCrossEntropyLoss()
+metric = mx.metric.Accuracy()
+learning_rate = 0.001
+trainer = mx.gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 
learning_rate})
+```
+
+## Initial Benchmark
+
+As a starting point, let's benchmark the throughput of our training loop: 
calculating the average samples per second across 25 iterations, where each 
iteration is a batch of 16 samples. We'll run a single forward pass through the 
network before starting our benchmark timer to avoid including shape inference 
and lazy initialization in the throughput calculations.
+
+
+```python
+def single_forward(net, dataloader, dtype='float32'):
+data, label = next(iter(dataloader))
+data = data.astype(dtype)
+data = data.as_in_context(ctx)
+pred = net(data)
+pred.wait_to_read()
+```
+
+
+```python
+single_forward(net, dataloader)
+iters = 25
+num_samples = 0
+num_iters = 0
+start_time = time.time()
+for iter_idx, (data, label) in enumerate(dataloader):
+num_samples += data.shape[0]
+num_iters += 1
+data = data.as_in_context(ctx)
+label = label.as_in_context(ctx)
+with mx.autograd.record():
+pred = net(data)
+loss = loss_fn(pred, label)
+loss.backward()
+trainer.step(data.shape[0])
+metric.update(label, pred)
+print('.', end='')
+if num_iters >= iters:
+break
+mx.nd.waitall()
+end_time = time.time()
+total_time = end_time - start_time

[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299722331
 
 

 ##
 File path: docs/tutorials/gluon/performance.md
 ##
 @@ -0,0 +1,483 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+# Gluon Performance Tips & Tricks
+
+Compared to traditional machine learning methods, the field of deep-learning 
has increased model accuracy across a wide range of tasks, but it has also 
increased the amount of computation required for model training and inference. 
Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution 
of networks, but it can sometimes be hard to write code that uses the hardware 
to its full potential. We will be looking at a few simple tips and tricks in 
this tutorial that you can use to speed up training and ultimately save on 
training costs.
+
+We'll start by writing some code to train an image classification network for 
the CIFAR-10 dataset, and then benchmark the throughput of the network in terms 
of samples processed per second. After some performance analysis, we'll 
identify the bottlenecks (i.e. the components limiting throughput) and improve 
the training speed step-by-step. We'll bring together all the tips and tricks 
at the end and evaluate our performance gains.
+
+
+```python
+from __future__ import print_function
+import multiprocessing
+import time
+import mxnet as mx
+import numpy as np
+```
+
+An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this 
tutorial. You are likely to get different results and find different 
bottlenecks on other hardware, but these tips and tricks should still help 
improve training speed for bottleneck components. A GPU is recommended for this 
example.
 
 Review comment:
   Good catch.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299722411
 
 

 ##
 File path: docs/tutorials/gluon/performance.md
 ##
 @@ -0,0 +1,483 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+# Gluon Performance Tips & Tricks
+
+Compared to traditional machine learning methods, the field of deep-learning 
has increased model accuracy across a wide range of tasks, but it has also 
increased the amount of computation required for model training and inference. 
Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution 
of networks, but it can sometimes be hard to write code that uses the hardware 
to its full potential. We will be looking at a few simple tips and tricks in 
this tutorial that you can use to speed up training and ultimately save on 
training costs.
+
+We'll start by writing some code to train an image classification network for 
the CIFAR-10 dataset, and then benchmark the throughput of the network in terms 
of samples processed per second. After some performance analysis, we'll 
identify the bottlenecks (i.e. the components limiting throughput) and improve 
the training speed step-by-step. We'll bring together all the tips and tricks 
at the end and evaluate our performance gains.
+
+
+```python
+from __future__ import print_function
+import multiprocessing
+import time
+import mxnet as mx
+import numpy as np
+```
+
+An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this 
tutorial. You are likely to get different results and find different 
bottlenecks on other hardware, but these tips and tricks should still help 
improve training speed for bottleneck components. A GPU is recommended for this 
example.
+
+
+```python
+ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu()
+print("Using {} context.".format(ctx))
+```
+
+Using gpu(0) context.
+
+
+We'll use the `CIFAR10` dataset provided out-of-the-box with Gluon.
+
+
+```python
+dataset = mx.gluon.data.vision.CIFAR10(train=True)
+print('{} samples'.format(len(dataset)))
+```
+
+50000 samples
+
+
+So we can learn how to identify training bottlenecks, let's intentionally 
introduce a short `sleep` into the data loading pipeline. We transform each 
32x32 CIFAR-10 image to 244x244 so we can use it with the ResNet-50 network 
designed for ImageNet. [CIFAR-10 specific ResNet 
networks](https://gluon-cv.mxnet.io/api/model_zoo.html#gluoncv.model_zoo.get_cifar_resnet)
 exist but we use the more standard ImageNet variants in this example.
 
 Review comment:
   Good catch. Changed, re-ran the notebook and updated stats.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] aaronmarkham commented on issue #15433: remove comments from nano instructions

2019-07-02 Thread GitBox
aaronmarkham commented on issue #15433: remove comments from nano instructions
URL: https://github.com/apache/incubator-mxnet/pull/15433#issuecomment-507879412
 
 
   > Shouldn't you also remove the Scala&Java instructions from the building 
from source part if you do not include prerequisites for Java install and 
remove the `These instructions also cover how to setup MXNet's Java Inference 
API.` remark in the introduction?
   
   Yep! Good catch. Thanks.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] roywei commented on issue #15429: Operator Performance Regression on CPU

2019-07-02 Thread GitBox
roywei commented on issue #15429: Operator Performance Regression on CPU
URL: 
https://github.com/apache/incubator-mxnet/issues/15429#issuecomment-507878408
 
 
   Thanks @ciyongch , setting the environment variables did reduce the 
variance.
   I have updated the document (1st sheet): 
https://docs.google.com/spreadsheets/d/1_eezNWbrBAm3s3i6G1m0Rd3YYdTEnmKlYtn4klqdyN0/edit#gid=196553607
   
   With the current data, Dot and Dropout are not a big concern now. ReLU's 
regression is something we have to accept, as otherwise it could lead to NaNs 
and bugs.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray

2019-07-02 Thread GitBox
thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and 
Sparse NDArray
URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r299717760
 
 

 ##
 File path: docs/tutorials/sparse/train_gluon.md
 ##
 @@ -0,0 +1,469 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+# Sparse NDArrays with Gluon
+
+When working on machine learning problems, you may encounter situations where 
the input data is sparse (i.e. the majority of values are zero). One example of 
this is in recommendation systems. You could have millions of user and product 
features, but only a few of these features are present for each sample. Without 
special treatment, the sheer magnitude of the feature space can lead to 
out-of-memory situations and cause significant slowdowns when training and 
making predictions.
+
+MXNet supports a number of sparse storage types (often called 'stype' for 
short) for these situations. In this tutorial, we'll start by generating some 
sparse data, write it to disk in the LibSVM format and then read back using the 
[`LibSVMIter`](https://mxnet.incubator.apache.org/api/python/io/io.html) for 
training. We use the Gluon API to train the model and leverage sparse storage 
types such as 
[`CSRNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=csrndarray#mxnet.ndarray.sparse.CSRNDArray)
 and 
[`RowSparseNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=rowsparsendarray#mxnet.ndarray.sparse.RowSparseNDArray)
 to maximise performance and memory efficiency.
+
+
+```python
+import mxnet as mx
+import numpy as np
+import time
+```
+
+### Generating Sparse Data
+
+You will most likely have a sparse dataset in mind already if you're reading 
this tutorial, but let's create a dummy dataset to use in the examples that 
follow. Using `rand_ndarray` we will generate 1000 samples, each with 1,000,000 
features of which 99.999% of values will be zero (i.e. 10 non-zero features for 
each sample). We take this as our input data for training and calculate a label 
based on an arbitrary rule: whether the feature sum is higher than average.
+
+
+```python
+num_samples = 1000
+num_features = 1000000
+data = mx.test_utils.rand_ndarray((num_samples, num_features), stype='csr', 
density=0.00001)
+# generate label: 1 if row sum above average, 0 otherwise.
+label = data.sum(axis=1) > data.sum(axis=1).mean()
+```
+
+
+```python
+print(type(data))
+print(data[:10].asnumpy())
+print('{:,.0f} elements'.format(np.product(data.shape)))
+print('{:,.0f} non-zero elements'.format(data.data.size))
+```
+
+
+[[0. 0. 0. ... 0. 0. 0.]
+ [0. 0. 0. ... 0. 0. 0.]
+ [0. 0. 0. ... 0. 0. 0.]
+ ...
+ [0. 0. 0. ... 0. 0. 0.]
+ [0. 0. 0. ... 0. 0. 0.]
+ [0. 0. 0. ... 0. 0. 0.]]
+1,000,000,000 elements
+10,000 non-zero elements
+
+
+Our storage type is CSR (Compressed Sparse Row) which is the ideal type for 
sparse data along multiple axes. See [this in-depth 
tutorial](https://mxnet.incubator.apache.org/versions/master/tutorials/sparse/csr.html)
 for more information. Just to confirm the generation process ran correctly, we 
can see that the vast majority of values are indeed zero. One of the first 
questions to ask would be how much memory is saved by storing this data in a 
[`CSRNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=csrndarray#mxnet.ndarray.sparse.CSRNDArray)
 versus a standard 
[`NDArray`](https://mxnet.incubator.apache.org/versions/master/api/python/ndarray/sparse.html?highlight=ndarray#module-mxnet.ndarray).
 Since sparse arrays are constructed from many components (e.g. `data`, 
`indices` and `indptr`) we define a function called `get_nbytes` to calculate 
the number of bytes taken in memory to store an array. We compare the same data 
stored in a standard 
[`NDArray`](https://mxnet.incubator.apache.org/versions/master/api/python/ndarray/sparse.html?highlight=ndarray#module-mxnet.ndarray)
 (with `data.tostype('default')`) to the 
[`CSRNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=csrndarray#mxnet.ndarray.sparse.CSRNDArray).
+
+
+```python
+def get_nbytes(array):
+fn = lambda a: a.size * np.dtype(a).itemsize
+if isinstance(array, mx.ndarray.sparse.CSRNDArray):
+return fn(array.data) + fn(array.indices) + fn(array.indptr)
+elif isinstance(array, mx.ndarray.sparse.RowSparseNDArray):
+return fn(array.data) + fn(array.indices)
+elif isinstance(array, mx.ndarray.NDArray):
+return fn(array)
+else:
+TypeError('{} not supported'.format(type(array)))
+```
+
+
+```python
+print('NDarray:', get_nbytes(data.tostype('default'))/1000000, 'MBs')
+print('CSRNDArray', get_nbytes(data)/1000000, 'MBs')
+```
+
+NDarray: 4000.0 MBs
+CSRNDArray 0.128008 MBs
+
+
+Given the extremely high sparsity of the data, we observe a huge memory saving 
here! 0.13 MBs versus 4 GBs: ~30,000 times 
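
As a quick sanity check of the figures quoted above, a sketch independent of 
MXNet (the int64 dtype for `indices`/`indptr` is an assumption consistent with 
the 0.128008 MB figure):

```python
num_samples, num_features, nnz = 1000, 1000000, 10000
FLOAT32, INT64 = 4, 8  # bytes per element

dense_bytes = num_samples * num_features * FLOAT32             # 4,000,000,000
csr_bytes = (nnz * FLOAT32                                     # data
             + nnz * INT64                                     # indices
             + (num_samples + 1) * INT64)                      # indptr

print(dense_bytes / 1e6, 'MB')   # 4000.0 MB
print(csr_bytes / 1e6, 'MB')     # 0.128008 MB
print(round(dense_bytes / csr_bytes))  # 31248, i.e. roughly 30,000x smaller
```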

[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray

2019-07-02 Thread GitBox
thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and 
Sparse NDArray
URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r299716855
 
 

 ##
 File path: docs/tutorials/sparse/train_gluon.md
 ##
 @@ -0,0 +1,469 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+# Sparse NDArrays with Gluon
+
+When working on machine learning problems, you may encounter situations where 
the input data is sparse (i.e. the majority of values are zero). One example of 
this is in recommendation systems. You could have millions of user and product 
features, but only a few of these features are present for each sample. Without 
special treatment, the sheer magnitude of the feature space can lead to 
out-of-memory situations and cause significant slowdowns when training and 
making predictions.
+
+MXNet supports a number of sparse storage types (often called 'stype' for 
short) for these situations. In this tutorial, we'll start by generating some 
sparse data, writing it to disk in the LibSVM format, and then reading it back using the 
[`LibSVMIter`](https://mxnet.incubator.apache.org/api/python/io/io.html) for 
training. We use the Gluon API to train the model and leverage sparse storage 
types such as 
[`CSRNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=csrndarray#mxnet.ndarray.sparse.CSRNDArray)
 and 
[`RowSparseNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=rowsparsendarray#mxnet.ndarray.sparse.RowSparseNDArray)
 to maximise performance and memory efficiency.
+
+
+```python
+import mxnet as mx
+import numpy as np
+import time
+```
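+
+As a preview of the LibSVM format mentioned above: each line stores a label followed by `index:value` pairs for the non-zero features only, and `LibSVMIter` streams such files straight into CSR batches. A minimal sketch (the file name and shapes here are illustrative, not from this PR):
+
+```python
+# A LibSVM record keeps only the non-zero entries, e.g.:
+#   1 5:0.382 1023:0.719 99871:0.204
+# (label, then feature_index:feature_value pairs)
+data_iter = mx.io.LibSVMIter(data_libsvm='data.libsvm',  # hypothetical file
+                             data_shape=(1000000,),
+                             batch_size=32)
+for batch in data_iter:
+    print(batch.data[0].stype)  # 'csr': batches arrive in sparse storage
+    break
+```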
+
+### Generating Sparse Data
+
+You will most likely have a sparse dataset in mind already if you're reading 
this tutorial, but let's create a dummy dataset to use in the examples that 
follow. Using `rand_ndarray` we will generate 1000 samples, each with 1,000,000 
features of which 99.999% of values will be zero (i.e. 10 non-zero features for 
each sample). We take this as our input data for training and calculate a label 
based on an arbitrary rule: whether the feature sum is higher than average.

[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray

2019-07-02 Thread GitBox
thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and 
Sparse NDArray
URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r299716886
 
 

 ##
 File path: docs/tutorials/sparse/train_gluon.md
 ##
 @@ -0,0 +1,469 @@

[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray

2019-07-02 Thread GitBox
thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and 
Sparse NDArray
URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r299716928
 
 

 ##
 File path: docs/tutorials/sparse/train_gluon.md
 ##
 @@ -0,0 +1,469 @@

[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray

2019-07-02 Thread GitBox
thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and 
Sparse NDArray
URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r299716683
 
 

 ##
 File path: docs/tutorials/sparse/train_gluon.md
 ##
 @@ -0,0 +1,469 @@

[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray

2019-07-02 Thread GitBox
thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and 
Sparse NDArray
URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r299716459
 
 

 ##
 File path: docs/tutorials/sparse/train_gluon.md
 ##
 @@ -0,0 +1,469 @@

[GitHub] [incubator-mxnet] pengzhao-intel commented on issue #15303: Fix amalgamation failure.

2019-07-02 Thread GitBox
pengzhao-intel commented on issue #15303: Fix amalgamation failure.
URL: https://github.com/apache/incubator-mxnet/pull/15303#issuecomment-507873807
 
 
   @ZhennanQin @TaoLv  to help review this change :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray

2019-07-02 Thread GitBox
thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and 
Sparse NDArray
URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r299716247
 
 

 ##
 File path: docs/tutorials/sparse/train_gluon.md
 ##
 @@ -0,0 +1,469 @@

[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
ptrendx commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299716110
 
 

 ##
 File path: docs/tutorials/gluon/performance.md
 ##
 @@ -0,0 +1,483 @@
+# Gluon Performance Tips & Tricks
+
+Compared to traditional machine learning methods, the field of deep learning 
has increased model accuracy across a wide range of tasks, but it has also 
increased the amount of computation required for model training and inference. 
Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution 
of networks, but it can sometimes be hard to write code that uses the hardware 
to its full potential. We will be looking at a few simple tips and tricks in 
this tutorial that you can use to speed up training and ultimately save on 
training costs.
+
+We'll start by writing some code to train an image classification network for 
the CIFAR-10 dataset, and then benchmark the throughput of the network in terms 
of samples processed per second. After some performance analysis, we'll 
identify the bottlenecks (i.e. the components limiting throughput) and improve 
the training speed step-by-step. We'll bring together all the tips and tricks 
at the end and evaluate our performance gains.
+
+
+```python
+from __future__ import print_function
+import multiprocessing
+import time
+import mxnet as mx
+import numpy as np
+```
+
+An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this 
tutorial. You are likely to get different results and find different 
bottlenecks on other hardware, but these tips and tricks should still help 
improve training speed for bottleneck components. A GPU is recommended for this 
example.
+
+
+```python
+ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu()
+print("Using {} context.".format(ctx))
+```
+
+Using gpu(0) context.
+
+
+We'll use the `CIFAR10` dataset provided out-of-the-box with Gluon.
+
+
+```python
+dataset = mx.gluon.data.vision.CIFAR10(train=True)
+print('{} samples'.format(len(dataset)))
+```
+
+50000 samples
+
+
+So we can learn how to identify training bottlenecks, let's intentionally 
introduce a short `sleep` into the data loading pipeline. We transform each 
32x32 CIFAR-10 image to 224x224 so we can use it with the ResNet-50 network 
designed for ImageNet. [CIFAR-10 specific ResNet 
networks](https://gluon-cv.mxnet.io/api/model_zoo.html#gluoncv.model_zoo.get_cifar_resnet)
 exist but we use the more standard ImageNet variants in this example.
+
+
+```python
+def transform_fn(x):
+    time.sleep(0.01)  # artificial slow-down
+    image = mx.image.imresize(x, w=224, h=224)
+    return image.astype('float32').transpose((2, 0, 1))
+
+dataset = dataset.transform_first(transform_fn)
+```
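+
+Since `transform_first` applies the transform lazily, the cost is paid each time a sample is fetched. A quick way to see the per-sample overhead (a sketch, timing a single item):
+
+```python
+tic = time.time()
+img, lbl = dataset[0]  # fetching an item triggers transform_fn
+print('{:.3f}s per sample'.format(time.time() - tic))
+```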
+
+Setting our batch size to 16, we can create the `DataLoader`.
+
+
+```python
+batch_size = 16
+dataloader = mx.gluon.data.DataLoader(dataset,
+                                      batch_size=batch_size,
+                                      shuffle=True,
+                                      last_batch="discard")
+print('{} batches'.format(len(dataloader)))
+```
+
+3125 batches
+
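+A common way to hide a slow per-sample transform like this is to load batches in parallel worker processes. A minimal sketch of the same loader with `num_workers` set (named `fast_dataloader` here so the benchmark below is unchanged; the worker count is illustrative):
+
+```python
+fast_dataloader = mx.gluon.data.DataLoader(dataset,
+                                           batch_size=batch_size,
+                                           shuffle=True,
+                                           last_batch="discard",
+                                           num_workers=multiprocessing.cpu_count())
+```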
+
+Up next, we create all of the other components required for training, such as 
the network, the loss function, the evaluation metric and parameter trainer.
+
+
+```python
+net = mx.gluon.model_zoo.vision.resnet50_v2(pretrained=False, ctx=ctx)
+net.initialize(mx.init.Xavier(magnitude=2.3), ctx=ctx)
+loss_fn = mx.gluon.loss.SoftmaxCrossEntropyLoss()
+metric = mx.metric.Accuracy()
+learning_rate = 0.001
+trainer = mx.gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': learning_rate})
+```
+
+## Initial Benchmark
+
+As a starting point, let's benchmark the throughput of our training loop: 
calculating the average samples per second across 25 iterations, where each 
iteration is a batch of 16 samples. We'll run a single forward pass through the 
network before starting our benchmark timer to avoid including shape inference 
and lazy initialization in the throughput calculations.
+
+
+```python
+def single_forward(net, dataloader, dtype='float32'):
+    data, label = next(iter(dataloader))
+    data = data.astype(dtype)
+    data = data.as_in_context(ctx)
+    pred = net(data)
+    pred.wait_to_read()
+```
+
+
+```python
+single_forward(net, dataloader)
+iters = 25
+num_samples = 0
+num_iters = 0
+start_time = time.time()
+for iter_idx, (data, label) in enumerate(dataloader):
+    num_samples += data.shape[0]
+    num_iters += 1
+    data = data.as_in_context(ctx)
+    label = label.as_in_context(ctx)
+    with mx.autograd.record():
+        pred = net(data)
+        loss = loss_fn(pred, label)
+    loss.backward()
+    trainer.step(data.shape[0])
+    metric.update(label, pred)
+    print('.', end='')
+    if num_iters >= iters:
+        break
+mx.nd.waitall()
+end_time = time.time()
+total_time = end_time - start_time
+```
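+
+The excerpt is cut off at this point; the throughput figure it builds up to would be computed along these lines (a sketch using `num_samples` and `total_time` as defined above, not the PR's exact code):
+
+```python
+print('{:.2f} samples/second on average'.format(num_samples / total_time))
+```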

[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
ptrendx commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299715652
 
 

 ##
 File path: docs/tutorials/gluon/performance.md
 ##
 @@ -0,0 +1,483 @@

[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray

2019-07-02 Thread GitBox
thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and 
Sparse NDArray
URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r299715639
 
 

 ##
 File path: docs/tutorials/sparse/train_gluon.md
 ##
 @@ -0,0 +1,469 @@

[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray

2019-07-02 Thread GitBox
thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and 
Sparse NDArray
URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r299715527
 
 

 ##
 File path: docs/tutorials/sparse/train_gluon.md
 ##
 @@ -0,0 +1,469 @@

[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
ptrendx commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299715062
 
 

 ##
 File path: docs/tutorials/gluon/performance.md
 ##
 @@ -0,0 +1,483 @@

[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
ptrendx commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299714081
 
 

 ##
 File path: docs/tutorials/gluon/performance.md
 ##
 @@ -0,0 +1,483 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+# Gluon Performance Tips & Tricks
+
+Compared to traditional machine learning methods, the field of deep-learning 
has increased model accuracy across a wide range of tasks, but it has also 
increased the amount of computation required for model training and inference. 
Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution 
of networks, but it can sometimes be hard to write code that uses the hardware 
to its full potential. We will be looking at a few simple tips and trick in 
this tutorial that you can use to speed up training and ultimately save on 
training costs.
+
+We'll start by writing some code to train an image classification network for 
the CIFAR-10 dataset, and then benchmark the throughput of the network in terms 
of samples processed per second. After some performance analysis, we'll 
identify the bottlenecks (i.e. the components limiting throughput) and improve 
the training speed step-by-step. We'll bring together all the tips and tricks 
at the end and evaluate our performance gains.
+
+
+```python
+from __future__ import print_function
+import multiprocessing
+import time
+import mxnet as mx
+import numpy as np
+```
+
+An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this 
tutorial. You are likely to get difference results and find different 
bottlenecks on other hardware, but these tips and tricks should still help 
improve training speed for bottleneck components. A GPU is recommended for this 
example.
+
+
+```python
+ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu()
+print("Using {} context.".format(ctx))
+```
+
+Using gpu(0) context.
+
+
+We'll use the `CIFAR10` dataset provided out-of-the-box with Gluon.
+
+
+```python
+dataset = mx.gluon.data.vision.CIFAR10(train=True)
+print('{} samples'.format(len(dataset)))
+```
+
+50000 samples
+
+
+So we can learn how to identify training bottlenecks, let's intentionally 
introduce a short `sleep` into the data loading pipeline. We transform each 
32x32 CIFAR-10 image to 244x244 so we can use it with the ResNet-50 network 
designed for ImageNet. [CIFAR-10 specific ResNet 
networks](https://gluon-cv.mxnet.io/api/model_zoo.html#gluoncv.model_zoo.get_cifar_resnet)
 exist but we use the more standard ImageNet variants in this example.
 
 Review comment:
  The original ImageNet ResNet actually works on 224x224 images, not 244x244.
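
For reference, a minimal sketch of the transform with the corrected size; only the resize target changes, and the other names follow the tutorial code quoted above:

```python
import time
import numpy as np
from PIL import Image

def transform_fn(x):
    image = Image.fromarray(x.asnumpy())
    time.sleep(0.01)  # artificial slow-down
    # 224x224 is the input size the original ImageNet ResNet was designed for
    image = image.resize(size=(224, 224), resample=Image.BICUBIC)
    return np.array(image).astype('float32').transpose((2, 0, 1))
```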




[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
ptrendx commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299713848
 
 

 ##
 File path: docs/tutorials/gluon/performance.md
 ##
 @@ -0,0 +1,483 @@
+
+# Gluon Performance Tips & Tricks
+
+Compared to traditional machine learning methods, the field of deep-learning 
has increased model accuracy across a wide range of tasks, but it has also 
increased the amount of computation required for model training and inference. 
Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution 
of networks, but it can sometimes be hard to write code that uses the hardware 
to its full potential. We will be looking at a few simple tips and tricks in 
this tutorial that you can use to speed up training and ultimately save on 
training costs.
+
+We'll start by writing some code to train an image classification network for 
the CIFAR-10 dataset, and then benchmark the throughput of the network in terms 
of samples processed per second. After some performance analysis, we'll 
identify the bottlenecks (i.e. the components limiting throughput) and improve 
the training speed step-by-step. We'll bring together all the tips and tricks 
at the end and evaluate our performance gains.
+
+
+```python
+from __future__ import print_function
+import multiprocessing
+import time
+import mxnet as mx
+import numpy as np
+```
+
+An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this 
tutorial. You are likely to get difference results and find different 
bottlenecks on other hardware, but these tips and tricks should still help 
improve training speed for bottleneck components. A GPU is recommended for this 
example.
 
 Review comment:
   ```suggestion
   An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this 
tutorial. You are likely to get different results and find different 
bottlenecks on other hardware, but these tips and tricks should still help 
improve training speed for bottleneck components. A GPU is recommended for this 
example.
   ```




[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299712376
 
 

 ##
 File path: docs/tutorials/gluon/performance.md
 ##
 @@ -0,0 +1,485 @@
+
+# Gluon Performance Tips & Tricks
+
+Compared to traditional machine learning methods, the field of deep-learning 
has increased model accuracy across a wide range of tasks, but it has also 
increased the amount of computation required for model training and inference. 
Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution 
of networks, but it can sometimes be hard to write code that uses the hardware 
to its full potential. We will be looking at a few simple tips and tricks in 
this tutorial that you can use to speed up training and ultimately save on 
training costs.
+
+We'll start by writing some code to train an image classification network for 
the CIFAR-10 dataset, and then benchmark the throughput of the network in terms 
of samples processed per second. After some performance analysis, we'll 
identify the bottlenecks (i.e. the components limiting throughput) and improve 
the training speed step-by-step. We'll bring together all the tips and tricks 
at the end and evaluate our performance gains.
+
+
+```python
+from __future__ import print_function
+import multiprocessing
+import time
+import mxnet as mx
+import numpy as np
+from PIL import Image
 
 Review comment:
  Great catch! I had added that to perform a slow rotation augmentation (which isn't an MXNet transform), but changed it to a `sleep` instead. Switched the resize out for an MXNet function, removed the PIL dependency, and re-ran the notebook.
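
A minimal sketch of what the PIL-free transform might look like, assuming `mx.image.imresize` is the MXNet function that was swapped in (the actual revision isn't shown in this thread):

```python
import time
import mxnet as mx

def transform_fn(x):
    time.sleep(0.01)  # keep the artificial slow-down
    # x is an HWC uint8 NDArray straight from CIFAR10; interp=2 selects bicubic
    x = mx.image.imresize(x, 244, 244, interp=2)
    return mx.nd.transpose(x, axes=(2, 0, 1)).astype('float32')
```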




[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299711755
 
 

 ##
 File path: docs/tutorials/gluon/performance.md
 ##
 @@ -0,0 +1,485 @@
+
+# Gluon Performance Tips & Tricks
+
+Compared to traditional machine learning methods, the field of deep-learning 
has increased model accuracy across a wide range of tasks, but it has also 
increased the amount of computation required for model training and inference. 
Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution 
of networks, but it can sometimes be hard to write code that uses the hardware 
to its full potential. We will be looking at a few simple tips and tricks in 
this tutorial that you can use to speed up training and ultimately save on 
training costs.
+
+We'll start by writing some code to train an image classification network for 
the CIFAR-10 dataset, and then benchmark the throughput of the network in terms 
of samples processed per second. After some performance analysis, we'll 
identify the bottlenecks (i.e. the components limiting throughput) and improve 
the training speed step-by-step. We'll bring together all the tips and tricks 
at the end and evaluate our performance gains.
+
+
+```python
+from __future__ import print_function
+import multiprocessing
+import time
+import mxnet as mx
+import numpy as np
+from PIL import Image
+```
+
+An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this 
tutorial. You are likely to get difference results and find different 
bottlenecks on other hardware, but these tips and tricks should still help 
improve training speed for bottleneck components. A GPU is recommended for this 
example.
+
+
+```python
+ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu()
+print("Using {} context.".format(ctx))
+```
+
+Using gpu(0) context.
+
+
+We'll use the `CIFAR10` dataset provided out-of-the-box with Gluon.
+
+
+```python
+dataset = mx.gluon.data.vision.CIFAR10(train=True)
+print('{} samples'.format(len(dataset)))
+```
+
+50000 samples
+
+
+So we can learn how to identify training bottlenecks, let's intentionally 
introduce a short `sleep` into the data loading pipeline. We transform each 
32x32 CIFAR-10 image to 244x244 so we can use it with the ResNet-50 network 
designed for ImageNet. [CIFAR-10 specific ResNet 
networks](https://gluon-cv.mxnet.io/api/model_zoo.html#gluoncv.model_zoo.get_cifar_resnet)
 exist but we use the more standard ImageNet variants in this example.
+
+
+```python
+def transform_fn(x):
+    image = Image.fromarray(x.asnumpy())
+    time.sleep(0.01)  # artificial slow-down
+    image = image.resize(size=(244, 244), resample=Image.BICUBIC)
+    return np.array(image).astype('float32').transpose((2, 0, 1))
+
+dataset = dataset.transform_first(transform_fn)
+```
+
+Setting our batch size to 16, we can create the `DataLoader`.
+
+
+```python
+batch_size = 16
+dataloader = mx.gluon.data.DataLoader(dataset,
+                                      batch_size=batch_size,
+                                      shuffle=True,
+                                      last_batch="discard")
+print('{} batches'.format(len(dataloader)))
+```
+
+3125 batches
+
+
+Up next, we create all of the other components required for training, such as 
the network, the loss function, the evaluation metric and parameter trainer.
+
+
+```python
+net = mx.gluon.model_zoo.vision.resnet50_v2(pretrained=False, ctx=ctx)
+net.initialize(mx.init.Xavier(magnitude=2.3), ctx=ctx)
+loss_fn = mx.gluon.loss.SoftmaxCrossEntropyLoss()
+metric = mx.metric.Accuracy()
+learning_rate = 0.001
+trainer = mx.gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': learning_rate})
+```
+
+## Initial Benchmark
+
+As a starting point, let's benchmark the throughput of our training loop: 
calculating the average samples per second across 25 iterations, where each 
iteration is a batch of 16 samples. We'll run a single forward pass through the 
network before starting our benchmark timer to avoid including shape inference 
and lazy initialization in the throughput calculations.
+
+
+```python
+def single_forward(net, dataloader, dtype='float32'):
+    data, label = next(iter(dataloader))
+    data = data.astype(dtype)
+    data = data.as_in_context(ctx)
+    pred = net(data)
+    pred.wait_to_read()
+```
+
+
+```python
+single_forward(net, dataloader)
+iters = 25
+num_samples = 0
+num_iters = 0
+start_time = time.time()
+for iter_idx, (data, label) in enumerate(dataloader):
+    num_samples += data.shape[0]
+    num_iters += 1
+    data = data.as_in_context(ctx)
+    label = label.as_in_context(ctx)
+    with mx.autograd.record():
+        pred = net(data)
+        loss = loss_fn(pred, label)
+    loss.backward()
+    trainer.step(data.shape[0])
+    metric.update(label, pred)
+    print('.', end='')
+    if num_iters >= iters:
+        break

[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks

2019-07-02 Thread GitBox
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon 
performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299711812
 
 

 ##
 File path: docs/tutorials/gluon/performance.md
 ##
 @@ -0,0 +1,485 @@
+
+# Gluon Performance Tips & Tricks
+
+Compared to traditional machine learning methods, the field of deep-learning 
has increased model accuracy across a wide range of tasks, but it has also 
increased the amount of computation required for model training and inference. 
Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution 
of networks, but it can sometimes be hard to write code that uses the hardware 
to its full potential. We will be looking at a few simple tips and tricks in 
this tutorial that you can use to speed up training and ultimately save on 
training costs.
+
+We'll start by writing some code to train an image classification network for 
the CIFAR-10 dataset, and then benchmark the throughput of the network in terms 
of samples processed per second. After some performance analysis, we'll 
identify the bottlenecks (i.e. the components limiting throughput) and improve 
the training speed step-by-step. We'll bring together all the tips and tricks 
at the end and evaluate our performance gains.
+
+
+```python
+from __future__ import print_function
+import multiprocessing
+import time
+import mxnet as mx
+import numpy as np
+from PIL import Image
+```
+
+An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this 
tutorial. You are likely to get difference results and find different 
bottlenecks on other hardware, but these tips and tricks should still help 
improve training speed for bottleneck components. A GPU is recommended for this 
example.
+
+
+```python
+ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu()
+print("Using {} context.".format(ctx))
+```
+
+Using gpu(0) context.
+
+
+We'll use the `CIFAR10` dataset provided out-of-the-box with Gluon.
+
+
+```python
+dataset = mx.gluon.data.vision.CIFAR10(train=True)
+print('{} samples'.format(len(dataset)))
+```
+
+50000 samples
+
+
+So we can learn how to identify training bottlenecks, let's intentionally 
introduce a short `sleep` into the data loading pipeline. We transform each 
32x32 CIFAR-10 image to 244x244 so we can use it with the ResNet-50 network 
designed for ImageNet. [CIFAR-10 specific ResNet 
networks](https://gluon-cv.mxnet.io/api/model_zoo.html#gluoncv.model_zoo.get_cifar_resnet)
 exist but we use the more standard ImageNet variants in this example.
+
+
+```python
+def transform_fn(x):
+    image = Image.fromarray(x.asnumpy())
+    time.sleep(0.01)  # artificial slow-down
+    image = image.resize(size=(244, 244), resample=Image.BICUBIC)
+    return np.array(image).astype('float32').transpose((2, 0, 1))
+
+dataset = dataset.transform_first(transform_fn)
+```
+
+Setting our batch size to 16, we can create the `DataLoader`.
+
+
+```python
+batch_size = 16
+dataloader = mx.gluon.data.DataLoader(dataset,
+                                      batch_size=batch_size,
+                                      shuffle=True,
+                                      last_batch="discard")
+print('{} batches'.format(len(dataloader)))
+```
+
+3125 batches
+
+
+Up next, we create all of the other components required for training, such as 
the network, the loss function, the evaluation metric and parameter trainer.
+
+
+```python
+net = mx.gluon.model_zoo.vision.resnet50_v2(pretrained=False, ctx=ctx)
+net.initialize(mx.init.Xavier(magnitude=2.3), ctx=ctx)
+loss_fn = mx.gluon.loss.SoftmaxCrossEntropyLoss()
+metric = mx.metric.Accuracy()
+learning_rate = 0.001
+trainer = mx.gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': learning_rate})
+```
+
+## Initial Benchmark
+
+As a starting point, let's benchmark the throughput of our training loop: 
calculating the average samples per second across 25 iterations, where each 
iteration is a batch of 16 samples. We'll run a single forward pass through the 
network before starting our benchmark timer to avoid including shape inference 
and lazy initialization in the throughput calculations.
+
+
+```python
+def single_forward(net, dataloader, dtype='float32'):
+    data, label = next(iter(dataloader))
+    data = data.astype(dtype)
+    data = data.as_in_context(ctx)
+    pred = net(data)
+    pred.wait_to_read()
+```
+
+
+```python
+single_forward(net, dataloader)
+iters = 25
+num_samples = 0
+num_iters = 0
+start_time = time.time()
+for iter_idx, (data, label) in enumerate(dataloader):
+    num_samples += data.shape[0]
+    num_iters += 1
+    data = data.as_in_context(ctx)
+    label = label.as_in_context(ctx)
+    with mx.autograd.record():
+        pred = net(data)
+        loss = loss_fn(pred, label)
+    loss.backward()
+    trainer.step(data.shape[0])
+    metric.update(label, pred)
+    print('.', end='')
+    if num_iters >= iters:
+        break

[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API

2019-07-02 Thread GitBox
sandeep-krishnamurthy commented on a change in pull request #15245: FP16 
Support for C Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299706065
 
 

 ##
 File path: amalgamation/python/mxnet_predict.py
 ##
 @@ -160,10 +249,18 @@ def forward(self, **kwargs):
 >>> predictor.forward(data=mydata)
 >>> out = predictor.get_output(0)
 """
+        if self.type_dict and len(self.type_dict) != len(kwargs.items()):
+            raise ValueError("number of kwargs should be same as len of type_dict" \
+                             "Please check your forward pass inputs" \
+                             "or type_dict passed to Predictor instantiation")
+
         for k, v in kwargs.items():
             if not isinstance(v, np.ndarray):
                 raise ValueError("Expect numpy ndarray as input")
-            v = np.asarray(v, dtype=np.float32, order='C')
+            if self.type_dict and k in self.type_dict:
+                v = np.asarray(v, dtype=self.type_dict[k], order='C')
+            else:
+                v = np.asarray(v, dtype=np.float32, order='C')
 
 Review comment:
  If the user-provided type is not supported by MXNet, it silently converts to FP32. Is this expected?
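
A sketch of how the fallback could fail fast instead; `coerce_input` is a made-up helper name, and the map is a minimal stand-in for the patch's `_DTYPE_NP_TO_MX`:

```python
import numpy as np

# minimal stand-in for the patch's _DTYPE_NP_TO_MX map
_DTYPE_NP_TO_MX = {np.float32: 0, np.float64: 1, np.float16: 2, np.uint8: 3, np.int32: 4}

def coerce_input(name, value, type_dict):
    # raise on dtypes MXNet cannot represent rather than silently using float32
    if type_dict and name in type_dict:
        dtype = np.dtype(type_dict[name]).type
        if dtype not in _DTYPE_NP_TO_MX:
            raise ValueError("dtype %s for input '%s' is not supported by MXNet"
                             % (dtype, name))
        return np.asarray(value, dtype=dtype, order='C')
    return np.asarray(value, dtype=np.float32, order='C')
```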




[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API

2019-07-02 Thread GitBox
sandeep-krishnamurthy commented on a change in pull request #15245: FP16 
Support for C Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299709898
 
 

 ##
 File path: src/c_api/c_predict_api.cc
 ##
 @@ -444,6 +538,20 @@ int MXPredGetOutputShape(PredictorHandle handle,
   API_END();
 }
 
+int MXPredGetOutputType(PredictorHandle handle,
+                        mx_uint out_index,
+                        int* out_dtype) {
+  MXAPIPredictor* p = static_cast<MXAPIPredictor*>(handle);
+  API_BEGIN();
+  CHECK_LT(out_index, p->out_arrays.size())
+    << "Index exceed number of outputs";
 
 Review comment:
  nit: Can we make the message easier to comprehend?




[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API

2019-07-02 Thread GitBox
sandeep-krishnamurthy commented on a change in pull request #15245: FP16 
Support for C Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299709638
 
 

 ##
 File path: src/c_api/c_predict_api.cc
 ##
 @@ -210,19 +249,31 @@ int _CreatePartialOut(const char* symbol_json_str,
 
   std::vector<NDArray> arg_arrays, aux_arrays;
   for (size_t i = 0; i < arg_shapes.size(); ++i) {
-    NDArray nd = NDArray(arg_shapes[i], ctx);
+    NDArray nd;
+    if (result_arg_types[i] != -1) {
+      nd = NDArray(arg_shapes[i], ctx, false, result_arg_types[i]);
+    } else {
+      nd = NDArray(arg_shapes[i], ctx);
+    }
     if (arg_params.count(arg_names[i]) != 0) {
       CopyFromTo(arg_params[arg_names[i]], &nd);
     }
     arg_arrays.push_back(nd);
   }
+
   for (size_t i = 0; i < aux_shapes.size(); ++i) {
-    NDArray nd = NDArray(aux_shapes[i], ctx);
+    NDArray nd;
+    if (result_aux_types[i] != -1) {
 
 Review comment:
  Since we are doing such checks on types throughout, could we instead set a default DType for all params when users do not provide arg type params? Then we could get rid of all these checks, and types would always be available. Thoughts?
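
In Python terms the suggestion amounts to resolving every parameter to a concrete dtype up front, so downstream code never branches on a `-1` sentinel. A rough sketch with made-up names:

```python
MX_FLOAT32 = 0  # MXNet's type code for float32

def resolve_arg_types(arg_names, provided_types):
    # every argument gets a concrete dtype (default float32) instead of -1
    return [provided_types.get(name, MX_FLOAT32) for name in arg_names]

result_arg_types = resolve_arg_types(['data', 'fc1_weight'], {'data': 2})  # 2 == float16
```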




[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API

2019-07-02 Thread GitBox
sandeep-krishnamurthy commented on a change in pull request #15245: FP16 
Support for C Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299707221
 
 

 ##
 File path: include/mxnet/c_predict_api.h
 ##
 @@ -85,6 +85,44 @@ MXNET_DLL int MXPredCreate(const char* symbol_json_str,
const mx_uint* input_shape_data,
PredictorHandle* out);
 
+/*!
+ * \brief create a predictor
+ * \param symbol_json_str The JSON string of the symbol.
+ * \param param_bytes The in-memory raw bytes of parameter ndarray file.
+ * \param param_size The size of parameter ndarray file.
+ * \param dev_type The device type, 1: cpu, 2: gpu
+ * \param dev_id The device id of the predictor.
+ * \param num_input_nodes Number of input nodes to the net.
+ *For feedforward net, this is 1.
+ * \param input_keys The name of the input argument.
+ *For feedforward net, this is {"data"}
+ * \param input_shape_indptr Index pointer of shapes of each input node.
+ *The length of this array = num_input_nodes + 1.
+ *For feedforward net that takes 4 dimensional input, this is {0, 4}.
+ * \param input_shape_data A flattened data of shapes of each input node.
+ *For feedforward net that takes 4 dimensional input, this is the shape 
data.
+ * \param num_provided_arg_dtypes
+ *The length of provided_arg_dtypes.
+ * \param provided_arg_dtype_names
+ *The provided_arg_dtype_names the names of args for which dtypes are 
provided.
+ * \param provided_arg_dtypes
+ *The provided_arg_dtypes the dtype provided
+ * \param out The created predictor handle.
+ * \return 0 when success, -1 when failure.
+ */
+MXNET_DLL int MXPredCreateEx(const char* symbol_json_str,
+                             const void* param_bytes,
+                             int param_size,
+                             int dev_type, int dev_id,
+                             mx_uint num_input_nodes,
 
 Review comment:
  Why are these params non-constant?
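
For illustration, a rough ctypes sketch of a call matching the documented parameter list. The ordering is taken from the doc comment above, the file names are placeholders, and the final merged signature may differ:

```python
import ctypes

lib = ctypes.CDLL('libmxnet.so')
symbol_json = open('model-symbol.json', 'rb').read()
param_bytes = open('model-0000.params', 'rb').read()

keys = (ctypes.c_char_p * 1)(b'data')
indptr = (ctypes.c_uint * 2)(0, 4)            # one 4-dimensional input
shape = (ctypes.c_uint * 4)(1, 3, 224, 224)
dtype_names = (ctypes.c_char_p * 1)(b'data')
dtypes = (ctypes.c_int * 1)(2)                # 2 == MXNet's float16 type code

handle = ctypes.c_void_p()
ret = lib.MXPredCreateEx(
    ctypes.c_char_p(symbol_json), param_bytes, len(param_bytes),
    1, 0,                                     # dev_type=1 (cpu), dev_id=0
    1, keys, indptr, shape,                   # a single input named 'data'
    1, dtype_names, dtypes,                   # run 'data' as float16
    ctypes.byref(handle))
assert ret == 0
```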




[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API

2019-07-02 Thread GitBox
sandeep-krishnamurthy commented on a change in pull request #15245: FP16 
Support for C Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299706532
 
 

 ##
 File path: amalgamation/python/mxnet_predict.py
 ##
 @@ -160,10 +249,18 @@ def forward(self, **kwargs):
 >>> predictor.forward(data=mydata)
 >>> out = predictor.get_output(0)
 """
+        if self.type_dict and len(self.type_dict) != len(kwargs.items()):
+            raise ValueError("number of kwargs should be same as len of type_dict" \
+                             "Please check your forward pass inputs" \
+                             "or type_dict passed to Predictor instantiation")
+
         for k, v in kwargs.items():
             if not isinstance(v, np.ndarray):
                 raise ValueError("Expect numpy ndarray as input")
-            v = np.asarray(v, dtype=np.float32, order='C')
+            if self.type_dict and k in self.type_dict:
+                v = np.asarray(v, dtype=self.type_dict[k], order='C')
+            else:
+                v = np.asarray(v, dtype=np.float32, order='C')
 
 Review comment:
   nit: Will be better to keep all the dtype, including default ie., np.float32 
in the map you are maintaing and remove all explicit np.dtype ?




[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API

2019-07-02 Thread GitBox
sandeep-krishnamurthy commented on a change in pull request #15245: FP16 
Support for C Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299708932
 
 

 ##
 File path: src/c_api/c_predict_api.cc
 ##
 @@ -187,21 +206,41 @@ int _CreatePartialOut(const char* symbol_json_str,
 
   try {
     mxnet::ShapeVector in_shapes;
+    nnvm::DTypeVector in_types;
     for (std::string key : sym.ListInputNames(Symbol::kAll)) {
       if (known_shape.count(key) != 0) {
         in_shapes.push_back(known_shape[key]);
       } else {
         in_shapes.emplace_back();
       }
     }
+
+    for (std::string key : sym.ListInputNames(Symbol::kAll)) {
+      if (arg_types.count(key) != 0) {
+        in_types.push_back(arg_types[key]);
+      } else if (aux_types.count(key) != 0) {
+        in_types.push_back(aux_types[key]);
+      }
+    }
     nnvm::Graph g; g.outputs = sym.outputs;
     g = mxnet::exec::InferShape(std::move(g), std::move(in_shapes), "__shape__");
+    g = mxnet::exec::InferType(std::move(g), std::move(in_types), "__dtype__");
     bool infer_complete = (g.GetAttr<size_t>("shape_num_unknown_nodes") == 0);
+    // This is tricky for the AMP use case: for example, with only weights, input
+    // types cannot be inferred in AMP. Thus for an AMP-converted model type_dict
+    // will be required.
+    bool infer_type_complete = (g.GetAttr<size_t>("dtype_num_unknown_nodes") == 0);
     CHECK(infer_complete)
       << "The shape information of is not enough to get the shapes";
+    CHECK(infer_type_complete)
+      << "The type information is not enough, please provide input arg_types "
+         "with provided_arg_dtype_names and provided_arg_dtypes";
 
 Review comment:
  I think this will not be clear to an MXNet user who is not setting any provided_arg_dtype_names and provided_arg_dtypes parameters: if something fails, how would they debug it?
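
For context, from the Python side the requirement would look roughly like this sketch; the constructor signature and module name are assumed from the amalgamation patch quoted elsewhere in this thread:

```python
import numpy as np
from mxnet_predict import Predictor  # amalgamation module; name assumed

# for an AMP-converted model, input dtypes cannot be inferred from the weights
# alone, so the caller must pass them explicitly via type_dict
predictor = Predictor(open('model-symbol.json').read(),
                      open('model-0000.params', 'rb').read(),
                      {'data': (1, 3, 224, 224)},
                      type_dict={'data': np.float16})
```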




[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API

2019-07-02 Thread GitBox
sandeep-krishnamurthy commented on a change in pull request #15245: FP16 
Support for C Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299699493
 
 

 ##
 File path: amalgamation/python/mxnet_predict.py
 ##
 @@ -25,17 +25,74 @@
 
 import os
 import sys
+from array import array
 import ctypes
 import logging
 import numpy as np
 
+# pylint: disable= no-member
+_DTYPE_NP_TO_MX = {
+    None: -1,
 
 Review comment:
   Should None be Float32 by default?
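
A sketch of that alternative, where the lookup itself applies the float32 default instead of carrying a `-1` sentinel for `None` (helper name is made up):

```python
import numpy as np

_DTYPE_NP_TO_MX = {np.float32: 0, np.float64: 1, np.float16: 2, np.uint8: 3, np.int32: 4}

def np_dtype_to_mx(dtype):
    # treat None as the float32 default instead of mapping it to -1
    if dtype is None:
        return _DTYPE_NP_TO_MX[np.float32]
    return _DTYPE_NP_TO_MX[np.dtype(dtype).type]
```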




[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API

2019-07-02 Thread GitBox
sandeep-krishnamurthy commented on a change in pull request #15245: FP16 
Support for C Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299705745
 
 

 ##
 File path: amalgamation/python/mxnet_predict.py
 ##
 @@ -160,10 +249,18 @@ def forward(self, **kwargs):
 >>> predictor.forward(data=mydata)
 >>> out = predictor.get_output(0)
 """
+        if self.type_dict and len(self.type_dict) != len(kwargs.items()):
+            raise ValueError("number of kwargs should be same as len of type_dict" \
+                             "Please check your forward pass inputs" \
+                             "or type_dict passed to Predictor instantiation")
+
         for k, v in kwargs.items():
             if not isinstance(v, np.ndarray):
                 raise ValueError("Expect numpy ndarray as input")
-            v = np.asarray(v, dtype=np.float32, order='C')
+            if self.type_dict and k in self.type_dict:
+                v = np.asarray(v, dtype=self.type_dict[k], order='C')
 
 Review comment:
  Can you help me understand the importance of the `order='C'` memory layout here?
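
For context, `order='C'` requests C-contiguous (row-major) memory, which is presumably what the C predict API expects for the flattened input buffer. A quick check:

```python
import numpy as np

a = np.arange(6, dtype=np.float32).reshape(2, 3).T  # transposed view: not C-contiguous
b = np.asarray(a, order='C')                        # copies into row-major layout
print(a.flags['C_CONTIGUOUS'], b.flags['C_CONTIGUOUS'])  # False True
```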




[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API

2019-07-02 Thread GitBox
sandeep-krishnamurthy commented on a change in pull request #15245: FP16 
Support for C Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299704452
 
 

 ##
 File path: amalgamation/python/mxnet_predict.py
 ##
 @@ -133,15 +199,38 @@ def __init__(self, symbol_file,
         handle = PredictorHandle()
         param_raw_bytes = bytearray(param_raw_bytes)
         ptr = (ctypes.c_char * len(param_raw_bytes)).from_buffer(param_raw_bytes)
-        _check_call(_LIB.MXPredCreate(
+
+        # data types
+        num_provided_arg_types = 0
+        # provided type argument names
+        provided_arg_type_names = ctypes.POINTER(ctypes.c_char_p)()
+        # provided types
+        provided_arg_type_data = ctypes.POINTER(mx_uint)()
+        if type_dict is not None:
+            provided_arg_type_names = []
+            provided_arg_type_data = []
+            for k, v in type_dict.items():
+                v = np.dtype(v).type
+                if v in _DTYPE_NP_TO_MX:
+                    provided_arg_type_names.append(k)
 
 Review comment:
  Here we are depending on the index of the element? I remember we had issues in the past from depending on position, because of differences in how Python 2 / 3 maintain list elements. Should we use a map here instead (name -> type)?
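
A sketch of building the two parallel ctypes arrays deterministically from a dict, so the name/dtype pairing never depends on insertion order (the map is a minimal stand-in for the patch's):

```python
import ctypes
import numpy as np

_DTYPE_NP_TO_MX = {np.float32: 0, np.float64: 1, np.float16: 2}

type_dict = {'data': np.float16, 'softmax_label': np.float32}
names = sorted(type_dict)  # deterministic ordering across Python 2 / 3
arg_names = (ctypes.c_char_p * len(names))(*[n.encode('utf-8') for n in names])
arg_types = (ctypes.c_uint * len(names))(
    *[_DTYPE_NP_TO_MX[np.dtype(type_dict[n]).type] for n in names])
```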




[GitHub] [incubator-mxnet] apeforest commented on issue #15288: [MXNET-978] Higher order gradient for sigmoid

2019-07-02 Thread GitBox
apeforest commented on issue #15288: [MXNET-978] Higher order gradient for 
sigmoid
URL: https://github.com/apache/incubator-mxnet/pull/15288#issuecomment-507862358
 
 
   @kshitij12345 could you approve the PR if everything looks good to you now? 
thx




[GitHub] [incubator-mxnet] IvyBazan opened a new issue #15447: C API doxygen broken links

2019-07-02 Thread GitBox
IvyBazan opened a new issue #15447: C API doxygen broken links
URL: https://github.com/apache/incubator-mxnet/issues/15447
 
 
   URL -  
https://mxnet.incubator.apache.org/versions/master/doxygen/group__c__api__attributes.html
   Broken Links
   ─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_0.png 
(HTTP_404)
   
   URL -  
https://mxnet.incubator.apache.org/versions/master/doxygen/group__c__api__convolution.html
   Broken Links
   ─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_1.png 
(HTTP_404)
   ─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_2.png 
(HTTP_404)
   ─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_3.png 
(HTTP_404)
   ─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_4.png 
(HTTP_404)
   ─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_5.png 
(HTTP_404)
   ─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_6.png 
(HTTP_404)
   ─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_7.png 
(HTTP_404)
   
   URL -  
https://mxnet.incubator.apache.org/versions/master/doxygen/group__c__api__softmax.html
   Broken Links
   ─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_8.png 
(HTTP_404)
   ─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_9.png 
(HTTP_404)
   
   URL -  
https://mxnet.incubator.apache.org/versions/master/doxygen/group__c__api__pooling.html
   Broken Links
   ─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_10.png 
(HTTP_404)
   ─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_11.png 
(HTTP_404)
   ─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_7.png 
(HTTP_404)
   
   URL -  
https://mxnet.incubator.apache.org/versions/master/doxygen/group__c__api__lrn.html
   Broken Links
   ─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_12.png 
(HTTP_404)
   ─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_13.png 
(HTTP_404)
   ─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_14.png 
(HTTP_404)
   
   URL -  
https://mxnet.incubator.apache.org/versions/master/doxygen/group__c__api__batch__normalization.html
   Broken Links
   ─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_15.png 
(HTTP_404)
   ─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_16.png 
(HTTP_404)
   ─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_17.png 
(HTTP_404)
   ─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_18.png 
(HTTP_404)
   
   URL -  
https://mxnet.incubator.apache.org/versions/master/doxygen/group__c__api__inner__product.html
   Broken Links
   ─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_19.png 
(HTTP_404)
   
   URL -  
https://mxnet.incubator.apache.org/versions/master/doxygen/group__c__api__rnn.html
   Broken Links
   ─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_0.png 
(HTTP_404)
   
   URL -  
https://mxnet.incubator.apache.org/versions/master/doxygen/group__c__api__types__generic.html
   Broken Links
   ─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_20.png 
(HTTP_404)
   ─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_21.png 
(HTTP_404)




[GitHub] [incubator-mxnet] mxnet-label-bot commented on issue #15447: C API doxygen broken links

2019-07-02 Thread GitBox
mxnet-label-bot commented on issue #15447: C API doxygen broken links
URL: 
https://github.com/apache/incubator-mxnet/issues/15447#issuecomment-507857022
 
 
   Hey, this is the MXNet Label Bot. 
Thank you for submitting the issue! I will try and suggest some labels so 
that the appropriate MXNet community members can help resolve it. 
Here are my recommended labels: Doc




[GitHub] [incubator-mxnet] IvyBazan commented on issue #15446: Clojure NDArray broken link

2019-07-02 Thread GitBox
IvyBazan commented on issue #15446: Clojure NDArray broken link
URL: 
https://github.com/apache/incubator-mxnet/issues/15446#issuecomment-507856166
 
 
   URL -  
https://mxnet.incubator.apache.org/versions/master/api/clojure/docs/org.apache.clojure-mxnet.symbol-api.html
   Broken Links
   ─ https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html%3E 
(HTTP_404)
   ─ https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html%3E 
(HTTP_404)




[GitHub] [incubator-mxnet] mxnet-label-bot commented on issue #15446: Clojure NDArray broken link

2019-07-02 Thread GitBox
mxnet-label-bot commented on issue #15446: Clojure NDArray broken link
URL: 
https://github.com/apache/incubator-mxnet/issues/15446#issuecomment-507856108
 
 
   Hey, this is the MXNet Label Bot. 
Thank you for submitting the issue! I will try and suggest some labels so 
that the appropriate MXNet community members can help resolve it. 
Here are my recommended labels: Doc




[GitHub] [incubator-mxnet] IvyBazan opened a new issue #15446: Clojure NDArray broken link

2019-07-02 Thread GitBox
IvyBazan opened a new issue #15446: Clojure NDArray broken link
URL: https://github.com/apache/incubator-mxnet/issues/15446
 
 
   URL -  
https://mxnet.incubator.apache.org/versions/master/api/clojure/docs/org.apache.clojure-mxnet.ndarray-api.html
   Broken Links
   ─ https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html%3E 
(HTTP_404)
   ─ https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html%3E 
(HTTP_404)
   
   ~~
   
   URL should be:
   - https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html




[GitHub] [incubator-mxnet] mxnet-label-bot commented on issue #15445: MXNet export broken link

2019-07-02 Thread GitBox
mxnet-label-bot commented on issue #15445: MXNet export broken link
URL: 
https://github.com/apache/incubator-mxnet/issues/15445#issuecomment-507854516
 
 
   Hey, this is the MXNet Label Bot. 
Thank you for submitting the issue! I will try and suggest some labels so 
that the appropriate MXNet community members can help resolve it. 
Here are my recommended labels: Doc




[GitHub] [incubator-mxnet] IvyBazan opened a new issue #15445: MXNet export broken link

2019-07-02 Thread GitBox
IvyBazan opened a new issue #15445: MXNet export broken link
URL: https://github.com/apache/incubator-mxnet/issues/15445
 
 
   URL -  
https://mxnet.incubator.apache.org/tutorials/onnx/export_mxnet_to_onnx.html
   Broken Links
   ─ http://data.mxnet.io/models/imagenet/ (HTTP_404)



