[GitHub] [incubator-mxnet] hgt312 commented on a change in pull request #15292: Numpy-compatibe GCD operation
hgt312 commented on a change in pull request #15292: Numpy-compatibe GCD operation URL: https://github.com/apache/incubator-mxnet/pull/15292#discussion_r299801302 ## File path: python/mxnet/symbol/numpy/_symbol.py ## @@ -992,6 +993,9 @@ def minimum(x1, x2, out=None): def add(x1, x2, out=None): return _ufunc_helper(x1, x2, _npi.add, _np.add, _npi.add_scalar, None, out) +@set_module('mxnet.symbol.numpy') Review comment: More blank lines before and after `gcd` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] ckt624 commented on a change in pull request #15381: [Numpy] Add Documentations
ckt624 commented on a change in pull request #15381: [Numpy] Add Documentations URL: https://github.com/apache/incubator-mxnet/pull/15381#discussion_r299800327 ## File path: python/mxnet/_numpy_op_doc.py ## @@ -19,7 +19,6 @@ """Doc placeholder for numpy ops with prefix _np.""" - Review comment: Fixed.
[GitHub] [incubator-mxnet] mikemwx commented on a change in pull request #15382: [numpy][doc-fix] sum, copy, tile, argmax, sign, log, degrees
mikemwx commented on a change in pull request #15382: [numpy][doc-fix] sum, copy, tile, argmax, sign, log, degrees
URL: https://github.com/apache/incubator-mxnet/pull/15382#discussion_r299791680

## File path: python/mxnet/numpy/multiarray.py ##
@@ -2188,3 +2270,175 @@ def arctan(x, out=None, **kwargs):
     0.7853981633974483
     """
     return _mx_nd_np.arctan(x, out=out, **kwargs)
+
+@set_module('mxnet.numpy')
+def sign(x, out=None):
+    """
+    sign(x, out=None)
+
+    Returns an element-wise indication of the sign of a number.
+
+    The `sign` function returns ``-1 if x < 0, 0 if x==0, 1 if x > 0``. Only supports real number.
+
+    Parameters
+    ----------
+    x : ndarray or a scalar
+        Input values.
+    out : ndarray or None, optional
+        A location into which the result is stored.
+        If provided, it must have the same shape and dtype as input ndarray.
+        If not provided or `None`, a freshly-allocated array is returned.
+
+    Returns
+    -------
+    y : ndarray
+        The sign of `x`.
+        This is a scalar if `x` is a scalar.
+
+    Note
+    ----
+    - Only supports real number as input elements.
+    - Input type does not support Python native iterables(list, tuple, ...).
+    - ``out`` param: cannot perform auto broadcasting. ``out`` ndarray's shape must be the same as the expected output.
+    - ``out`` param: cannot perform auto type cast. ``out`` ndarray's dtype must be the same as the expected output.
+    - ``out`` param does not support scalar input case.
+
+    Examples
+    --------
+    >>> a = np.array([-5., 4.5])
+    >>> np.sign(a)
+    array([-1., 1.])
+
+    Scalars as input:
+
+    >>> np.sign(4.0)
+    1.0
+    >>> np.sign(0)
+    0
+
+    Use ``out`` parameter:
+
+    >>> b = np.zeros((2, ))
+    >>> np.sign(a, out=b)
+    array([-1., 1.])
+    >>> b
+    array([-1., 1.])
+
+    """
+    return _mx_nd_np.sign(x, out=out)
+
+
+@set_module('mxnet.symbol.numpy')

Review comment:
Is there anything wrong with the namespace?
[GitHub] [incubator-mxnet] haojin2 commented on a change in pull request #15381: [Numpy] Add Documentations
haojin2 commented on a change in pull request #15381: [Numpy] Add Documentations URL: https://github.com/apache/incubator-mxnet/pull/15381#discussion_r299786639 ## File path: python/mxnet/_numpy_op_doc.py ## @@ -19,7 +19,6 @@ """Doc placeholder for numpy ops with prefix _np.""" - Review comment: do not remove such lines, we require 2 lines between Python functions.
[incubator-mxnet] branch master updated: [MXNET-978] Higher order gradient for sigmoid (#15288)
This is an automated email from the ASF dual-hosted git repository. apeforest pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git The following commit(s) were added to refs/heads/master by this push: new 6a8d9eb [MXNET-978] Higher order gradient for sigmoid (#15288) 6a8d9eb is described below commit 6a8d9eb5fd4f7133c094149dc80a3a236534f223 Author: Lin Yuan AuthorDate: Tue Jul 2 22:53:39 2019 -0700 [MXNET-978] Higher order gradient for sigmoid (#15288) * try to add support some ops * add unit test for second order grad * implement grad for relu and add unit test * fix lint * register FGradient attribute for backward relu * resolve conflict * remove unused imports * change gradient using set_attr * remove higher order grad test for negative(x) * fix lint * reverse indent * remove unused backward operator * refactor backward for sin(x) and cos(x) * change value init to list init * change to list initialization * generate random shape in test * fix a bug in second order backward * fix lint * fix lint * address reviewer comment and renaming * test 2nd order gradient for sigmoid * higher order grads for sigmoid * add unit test * remove blank lines * update test * fix lint * fix third order gradient for sigmoid --- src/common/exec_utils.h | 5 ++--- src/imperative/imperative.cc| 4 src/operator/tensor/elemwise_unary_op_basic.cc | 30 - src/operator/tensor/elemwise_unary_op_trig.cc | 4 ++-- tests/python/unittest/test_higher_order_grad.py | 17 ++ 5 files changed, 54 insertions(+), 6 deletions(-) diff --git a/src/common/exec_utils.h b/src/common/exec_utils.h index 0551b42..d8b7a33 100644 --- a/src/common/exec_utils.h +++ b/src/common/exec_utils.h @@ -286,7 +286,6 @@ inline void LogMemoryPlan(const nnvm::Graph& g) { const auto &idx = g.indexed_graph(); const auto& vshape = g.GetAttr("shape"); const auto& vtype = g.GetAttr("dtype"); - const auto& vstorage = g.GetAttr("storage_id"); // find node range uint32_t node_start = 0, 
node_end = idx.num_nodes(); if (g.attrs.count("node_range")) { @@ -304,13 +303,13 @@ inline void LogMemoryPlan(const nnvm::Graph& g) { auto eid = idx.entry_id(e); size_t kilo_bytes = vshape[eid].Size() * mshadow::mshadow_sizeof(vtype[eid]) / 1024; LOG(INFO) << "\t\tinput " << eid << ": " << vshape[eid] << " (" - << kilo_bytes << " KB) -> " << storage_str(vstorage[eid]); + << kilo_bytes << " KB)"; } for (uint32_t index = 0; index < inode.source->num_outputs(); ++index) { uint32_t eid = idx.entry_id(nid, index); size_t kilo_bytes = vshape[eid].Size() * mshadow::mshadow_sizeof(vtype[eid]) / 1024; LOG(INFO) << "\t\toutput " << eid << ": " << vshape[eid] << " (" - << kilo_bytes << " KB) -> " << storage_str(vstorage[eid]); + << kilo_bytes << " KB)"; } } } diff --git a/src/imperative/imperative.cc b/src/imperative/imperative.cc index d8fba1c..e2c0c9d 100644 --- a/src/imperative/imperative.cc +++ b/src/imperative/imperative.cc @@ -501,6 +501,10 @@ std::vector Imperative::Backward( } } + if (dmlc::GetEnv("MXNET_MEM_PLAN_VERBOSE_LOGGING", false)) { +common::LogMemoryPlan(graph); + } + // Execution bool prev_recording = set_is_recording(create_graph); diff --git a/src/operator/tensor/elemwise_unary_op_basic.cc b/src/operator/tensor/elemwise_unary_op_basic.cc index 98dc8da..26c7408 100644 --- a/src/operator/tensor/elemwise_unary_op_basic.cc +++ b/src/operator/tensor/elemwise_unary_op_basic.cc @@ -121,7 +121,35 @@ The storage type of ``sigmoid`` output is always dense .set_attr("FGradient", ElemwiseGradUseOut{"_backward_sigmoid"}); MXNET_OPERATOR_REGISTER_BINARY_WITH_SPARSE_CPU(_backward_sigmoid, - unary_bwd); + unary_bwd) +.set_attr("FGradient", +[](const nnvm::NodePtr& n, const std::vector& ograds) { + // n->inputs[0] : y_grad + // n->inputs[1] : f(x) = sigmoid(x) + // ograds[0] : head_grads + // f''(x) = f'(x) * (1 - 2*f(x)) + // NodeEntry{n} : y_grad * f'(x) + auto ones = MakeNode("ones_like", n->attrs.name + "_grad_ones", {n->inputs[1]}, nullptr, &n); + const 
std::unordered_map args = {{"scalar", "2.0"}}; + auto two_y = MakeNode("_mul_scalar", n->attrs.name + "_mul_two", {n->inputs[1]}, &args, &n); + auto one_minus_two_y = MakeNode("elemwise_sub", n->attrs.name + "_grad_sub", +{nnvm::NodeEntry{ones}, nnvm::NodeEntry{two_y}}, null
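The comments in the diff above rely on the identity f''(x) = f'(x) * (1 - 2*f(x)) for the sigmoid. As a rough sanity check, here is a minimal standard-library Python sketch (not MXNet code; all function names are hypothetical and chosen for illustration) that compares this closed form against a central finite difference of the first derivative.

```python
import math


def sigmoid(x):
    """Logistic sigmoid f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """First derivative: f'(x) = f(x) * (1 - f(x))."""
    y = sigmoid(x)
    return y * (1.0 - y)


def sigmoid_grad2(x):
    """Second derivative using the identity f''(x) = f'(x) * (1 - 2*f(x))."""
    return sigmoid_grad(x) * (1.0 - 2.0 * sigmoid(x))


def finite_diff(f, x, eps=1e-5):
    """Central finite-difference approximation of the derivative of f at x."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


def max_identity_error(points):
    """Largest gap between the closed form and the numerical estimate."""
    return max(abs(sigmoid_grad2(x) - finite_diff(sigmoid_grad, x)) for x in points)


if __name__ == "__main__":
    print(max_identity_error([-3.0, -1.0, 0.0, 0.5, 2.0]))
```

The printed gap should be tiny compared with the derivative values themselves, which is the same kind of check a second-order gradient unit test performs, only here without any framework.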
[GitHub] [incubator-mxnet] apeforest merged pull request #15288: [MXNET-978] Higher order gradient for sigmoid
apeforest merged pull request #15288: [MXNET-978] Higher order gradient for sigmoid URL: https://github.com/apache/incubator-mxnet/pull/15288
[GitHub] [incubator-mxnet] thomelane commented on issue #15427: [TUTORIAL] Gluon performance tips and tricks
thomelane commented on issue #15427: [TUTORIAL] Gluon performance tips and tricks URL: https://github.com/apache/incubator-mxnet/pull/15427#issuecomment-507934941 @pengzhao-intel thanks for the follow-up. made the changes.
[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299773533 ## File path: docs/tutorials/gluon/performance.md ## @@ -0,0 +1,483 @@ + + + + + + + + + + + + + + + + + +# Gluon Performance Tips & Tricks + +Compared to traditional machine learning methods, the field of deep-learning has increased model accuracy across a wide range of tasks, but it has also increased the amount of computation required for model training and inference. Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution of networks, but it can sometimes be hard to write code that uses the hardware to its full potential. We will be looking at a few simple tips and trick in this tutorial that you can use to speed up training and ultimately save on training costs. + +We'll start by writing some code to train an image classification network for the CIFAR-10 dataset, and then benchmark the throughput of the network in terms of samples processed per second. After some performance analysis, we'll identify the bottlenecks (i.e. the components limiting throughput) and improve the training speed step-by-step. We'll bring together all the tips and tricks at the end and evaluate our performance gains. + + +```python +from __future__ import print_function +import multiprocessing +import time +import mxnet as mx +import numpy as np +``` + +An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this tutorial. You are likely to get different results and find different bottlenecks on other hardware, but these tips and tricks should still help improve training speed for bottleneck components. A GPU is recommended for this example. + + +```python +ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu() +print("Using {} context.".format(ctx)) +``` + +Using gpu(0) context. + + +We'll use the `CIFAR10` dataset provided out-of-the-box with Gluon. 
+ + +```python +dataset = mx.gluon.data.vision.CIFAR10(train=True) +print('{} samples'.format(len(dataset))) +``` + +50000 samples + + +So we can learn how to identify training bottlenecks, let's intentionally introduce a short `sleep` into the data loading pipeline. We transform each 32x32 CIFAR-10 image to 224x224 so we can use it with the ResNet-50 network designed for ImageNet. [CIFAR-10 specific ResNet networks](https://gluon-cv.mxnet.io/api/model_zoo.html#gluoncv.model_zoo.get_cifar_resnet) exist but we use the more standard ImageNet variants in this example. Review comment: It was added to make the data pipeline a bottleneck. Added another comment to clarify.
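The quoted tutorial text benchmarks throughput in samples per second and deliberately adds a `sleep` to the data pipeline so that it becomes the bottleneck. The following is a framework-free sketch of that idea using only the Python standard library; `slow_batches` and `measure_throughput` are hypothetical names, and the dummy batches stand in for a real Gluon `DataLoader`, which is not used here.

```python
import time


def slow_batches(num_batches, batch_size, delay_s):
    """Yield dummy batches, sleeping to mimic a slow data pipeline."""
    for _ in range(num_batches):
        time.sleep(delay_s)  # stand-in for expensive loading or augmentation
        yield [0] * batch_size


def measure_throughput(batches):
    """Return samples processed per second over one pass of `batches`."""
    start = time.perf_counter()
    samples = 0
    for batch in batches:
        samples += len(batch)  # a real loop would run the network here
    elapsed = time.perf_counter() - start
    return samples / elapsed


if __name__ == "__main__":
    fast = measure_throughput(slow_batches(20, 32, 0.0))
    slow = measure_throughput(slow_batches(20, 32, 0.005))
    print("no delay:   {:.0f} samples/s".format(fast))
    print("5 ms delay: {:.0f} samples/s".format(slow))
```

Because the loop body does almost no work, the measured throughput with a delay is bounded by the batch size divided by the per-batch delay, which is what "the data pipeline is the bottleneck" means in practice.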
[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299773279 ## File path: docs/tutorials/gluon/performance.md ## @@ -0,0 +1,483 @@ + + + + + + + + + + + + + + + + + +# Gluon Performance Tips & Tricks + +Compared to traditional machine learning methods, the field of deep-learning has increased model accuracy across a wide range of tasks, but it has also increased the amount of computation required for model training and inference. Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution of networks, but it can sometimes be hard to write code that uses the hardware to its full potential. We will be looking at a few simple tips and trick in this tutorial that you can use to speed up training and ultimately save on training costs. Review comment: Can use most of these tricks for inference too, added a comment for this.
[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299772839 ## File path: docs/tutorials/gluon/performance.md ## @@ -0,0 +1,483 @@ + + + + + + + + + + + + + + + + + +# Gluon Performance Tips & Tricks + +Compared to traditional machine learning methods, the field of deep-learning has increased model accuracy across a wide range of tasks, but it has also increased the amount of computation required for model training and inference. Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution of networks, but it can sometimes be hard to write code that uses the hardware to its full potential. We will be looking at a few simple tips and trick in this tutorial that you can use to speed up training and ultimately save on training costs. + +We'll start by writing some code to train an image classification network for the CIFAR-10 dataset, and then benchmark the throughput of the network in terms of samples processed per second. After some performance analysis, we'll identify the bottlenecks (i.e. the components limiting throughput) and improve the training speed step-by-step. We'll bring together all the tips and tricks at the end and evaluate our performance gains. + + +```python +from __future__ import print_function +import multiprocessing +import time +import mxnet as mx +import numpy as np +``` + +An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this tutorial. You are likely to get different results and find different bottlenecks on other hardware, but these tips and tricks should still help improve training speed for bottleneck components. A GPU is recommended for this example. + Review comment: Changed to 'tutorial' to be more clear on where the GPU is recommended. Many references to GPU throughout so it's recommended to be able to follow along.
[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299772496 ## File path: docs/tutorials/index.md ## @@ -91,6 +91,7 @@ Select API: * [Image similiarity search with InfoGAN](/tutorials/gluon/info_gan.html) * Practitioner Guides * [Gotchas using NumPy](/tutorials/gluon/gotchas_numpy_in_mxnet.html) +* [Performance Tips & Tricks](/tutorials/gluon/performance.html) Review comment: Although we're in the Gluon section so adds a bit of noise.
[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299772230 ## File path: docs/tutorials/index.md ## @@ -91,6 +91,7 @@ Select API: * [Image similiarity search with InfoGAN](/tutorials/gluon/info_gan.html) * Practitioner Guides * [Gotchas using NumPy](/tutorials/gluon/gotchas_numpy_in_mxnet.html) +* [Performance Tips & Tricks](/tutorials/gluon/performance.html) Review comment: Updated, thanks.
[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299771879 ## File path: tests/tutorials/test_tutorials.py ## @@ -114,6 +114,9 @@ def test_gluon_save_load_params(): def test_gluon_hybrid(): assert _test_tutorial_nb('gluon/hybrid') + +def test_gluon_hybrid(): Review comment: Updated.
[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299771708 ## File path: tests/tutorials/test_tutorials.py ## @@ -114,6 +114,9 @@ def test_gluon_save_load_params(): def test_gluon_hybrid(): assert _test_tutorial_nb('gluon/hybrid') + +def test_gluon_hybrid(): Review comment: Great catch @wkcn!
[incubator-mxnet] branch master updated: Remove mhard-float option. This is already deprecated by Google. (#15435)
This is an automated email from the ASF dual-hosted git repository. zhasheng pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git The following commit(s) were added to refs/heads/master by this push: new 1547578 Remove mhard-float option. This is already deprecated by Google. (#15435) 1547578 is described below commit 15475788cee87eb6c6b08ddd0af245af7c05536f Author: Disi A AuthorDate: Wed Jul 3 00:05:24 2019 -0400 Remove mhard-float option. This is already deprecated by Google. (#15435) --- amalgamation/Makefile | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/amalgamation/Makefile b/amalgamation/Makefile index d4b2ee0..701c1f1 100644 --- a/amalgamation/Makefile +++ b/amalgamation/Makefile @@ -114,8 +114,8 @@ jni_libmxnet_predict.so: jni_libmxnet_predict.o ifneq ($(ANDROID), 1) android: else -CFLAGS+= -mhard-float -D_NDK_MATH_NO_SOFTFP=1 -O3 -LDFLAGS+= -Wl,--no-warn-mismatch -lm_hard +CFLAGS+= -O3 +LDFLAGS+= -Wl,--no-warn-mismatch -lm_hard android: jni_libmxnet_predict.so endif
[GitHub] [incubator-mxnet] szha merged pull request #15435: Remove mhard-float option when building Amalgamation for Android.
szha merged pull request #15435: Remove mhard-float option when building Amalgamation for Android. URL: https://github.com/apache/incubator-mxnet/pull/15435
[incubator-mxnet] branch master updated (c6bb2ce -> 512a491)
This is an automated email from the ASF dual-hosted git repository. zhasheng pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from c6bb2ce Use omp threads for cpu data loader (#15379) add 512a491 Temporarily Commenting out Flaky Test (#15436) No new revisions were added by this update. Summary of changes: tests/python/unittest/test_profiler.py | 3 +++ 1 file changed, 3 insertions(+)
[GitHub] [incubator-mxnet] szha merged pull request #15436: Temporarily Commenting out Flaky Test
szha merged pull request #15436: Temporarily Commenting out Flaky Test URL: https://github.com/apache/incubator-mxnet/pull/15436
[GitHub] [incubator-mxnet] DickJC123 commented on issue #15449: cuda/cuDNN lib version checking. Force cuDNN v7 usage.
DickJC123 commented on issue #15449: cuda/cuDNN lib version checking. Force cuDNN v7 usage. URL: https://github.com/apache/incubator-mxnet/pull/15449#issuecomment-507915947 Versioning issues were recently discussed in the dev forum: https://lists.apache.org/thread.html/96d4a46a0a3c98ea1f3a3237de713ef5f40967fcb0817d661c18e950@%3Cdev.mxnet.apache.org%3E Although the PR as it stands does not preclude CUDA8, I propose to add STATIC_ASSERT_CUDA_VERSION_GE(9000) given enough consensus. Tagging @ptrendx @KellenSunderland @marcoabreu @larroy
[GitHub] [incubator-mxnet] iblis17 commented on issue #14720: [Flaky] Flaky Test test_clamp on windows/cpu
iblis17 commented on issue #14720: [Flaky] Flaky Test test_clamp on windows/cpu URL: https://github.com/apache/incubator-mxnet/issues/14720#issuecomment-507914739 Closed via https://github.com/apache/incubator-mxnet/pull/14776.
[GitHub] [incubator-mxnet] iblis17 closed issue #14720: [Flaky] Flaky Test test_clamp on windows/cpu
iblis17 closed issue #14720: [Flaky] Flaky Test test_clamp on windows/cpu URL: https://github.com/apache/incubator-mxnet/issues/14720
[GitHub] [incubator-mxnet] iblis17 commented on issue #15415: documentation link is broken, goes to spam site
iblis17 commented on issue #15415: documentation link is broken, goes to spam site URL: https://github.com/apache/incubator-mxnet/issues/15415#issuecomment-507914528 @aaronmarkham If you can get access to a CI worker, there is a dir `/work/mxnet/julia/docs/build`. > the julia docs are now down, I think it would be a good idea to go ahead and host these locally. Actually, it's hosted via GitHub page, I guess the problem is the domain CNAME setting has been pointed to somewhere.
[GitHub] [incubator-mxnet] DickJC123 opened a new pull request #15449: cuda/cuDNN lib version checking. Force cuDNN v7 usage.
DickJC123 opened a new pull request #15449: cuda/cuDNN lib version checking. Force cuDNN v7 usage.
URL: https://github.com/apache/incubator-mxnet/pull/15449

This PR addresses two issues:
- rnn.cc of mxnet v1.5 does not compile against cudnn v6. This PR enforces systems that rebuild mxnet to have cudnn v7, and improves the error message for compiling against v6.
- We are accumulating stale code that references no-longer-supported cuda/cudnn versions. This PR provides a means for cleaning out this code.

This PR introduces both runtime and compile-time cuda and cuDNN version checking. The compile time checks are based on new macros: STATIC_ASSERT_CUDNN_VERSION_GE(min_version) and STATIC_ASSERT_CUDA_VERSION_GE(min_version).

Example usage:

Before PR:
```
#if MXNET_USE_CUDNN
#if CUDNN_VERSION >= 7000
#elif CUDNN_VERSION >= 6000
#else
LOG(FATAL) << "cuDNN too old.";
#endif
#endif // MXNET_USE_CUDNN
```
After PR (given the assumption that we're now requiring cuDNN v7):
```
#if MXNET_USE_CUDNN
STATIC_ASSERT_CUDNN_VERSION_GE(7000);
#endif // MXNET_USE_CUDNN
```
Discussion continues in the comments section.

## Checklist ##
### Essentials ###
Please feel free to remove inapplicable items for your PR.
- [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) created (except PRs with tiny changes)
- [X ] Changes are complete (i.e. I finished coding on this PR)
- [ ] All changes have test coverage:
  - Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  - Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  - Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
- [X ] Code is well-documented:
  - For user-facing API changes, API doc string has been updated.
  - For new C++ functions in header files, their functionalities and arguments are documented.
- For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable - Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html - [X ] To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change ### Changes ### - [ ] Feature1, tests, (and when applicable, API doc) - [ ] Feature2, tests, (and when applicable, API doc) ## Comments ## This PR improves the compile-time message to a user trying to build MXNet 1.5 against cuDNN v6. Before PR, only the missing library entrypoint is mentioned: ``` g++ ... -c src/operator/operator.cc -o build/src/operator/operator.o src/operator/rnn.cc: In function 'std::vector mxnet::op::RNNResourceEx(const nnvm::NodeAttrs&, int, mxnet::DispatchMode)': src/operator/rnn.cc:179:28: error: 'kCuDNNDropoutDesc' is not a member of 'mxnet::ResourceRequest' request.emplace_back(ResourceRequest::kCuDNNDropoutDesc); ``` After the PR, the error mentions the library version issue directly: ``` g++ ... -c src/operator/optimizer_op.cc -o build/src/operator/optimizer_op.o In file included from src/operator/././operator_common.h:42:0, from src/operator/./rnn-inl.h:45, from src/operator/rnn.cc:29: src/operator/rnn.cc: In function 'std::vector mxnet::op::RNNResourceEx(const nnvm::NodeAttrs&, int, mxnet::DispatchMode)': src/operator/././../common/cuda_utils.h:467:3: error: static assertion failed: Compiled-against cuDNN version 6021 is too old, please upgrade system to version 7000 or later. 
static_assert(CUDNN_VERSION >= min_version, "Compiled-against cuDNN version " \ ^ src/operator/rnn.cc:175:5: note: in expansion of macro 'STATIC_ASSERT_CUDNN_VERSION_GE' STATIC_ASSERT_CUDNN_VERSION_GE(7000); ^ src/operator/rnn.cc:180:28: error: 'kCuDNNDropoutDesc' is not a member of 'mxnet::ResourceRequest' request.emplace_back(ResourceRequest::kCuDNNDropoutDesc); ^ ``` This PR provides 2 runtime checks and issues a warning: - when the compiled-against cuda or cuDNN library version does not match the linked-against version, and - when the library versions are old w.r.t. the versions tested against by the MXNet CI. I built the PR against cuda 9 and cuDNN v7.1.4. Running any model will emit the warning: ``` [01:05:03] src/common/cuda_utils.cc:50: Upgrade advisory: this mxnet has been built against cuda library version 9000, which is older than the oldest version tested by CI (1). Set M
[GitHub] [incubator-mxnet] cyrusbehr commented on issue #15393: Unable to build mxnet with OpenCV4 on Raspberry Pi 3B
cyrusbehr commented on issue #15393: Unable to build mxnet with OpenCV4 on Raspberry Pi 3B URL: https://github.com/apache/incubator-mxnet/issues/15393#issuecomment-507912741 @larroy do the Docker files build MXNet with C++ support (`USE_CPP_PACKAGE`), or do they only support Python right now?
[GitHub] [incubator-mxnet] zoeygxy commented on a change in pull request #15390: [Numpy fix-doc]modify numpy doc
zoeygxy commented on a change in pull request #15390: [Numpy fix-doc]modify numpy doc URL: https://github.com/apache/incubator-mxnet/pull/15390#discussion_r299754141 ## File path: python/mxnet/symbol/numpy/_symbol.py ## @@ -1555,4 +1613,156 @@ def sqrt(x, out=None, **kwargs): return _unary_func_helper(x, _npi.sqrt, _np.sqrt, out=out, **kwargs) +@set_module('mxnet.symbol.numpy') +def ceil(x, out=None, **kwargs): +r""" +Return the ceiling of the input, element-wise. + +The ceil of the ndarray `x` is the smallest integer `i`, such that +`i >= x`. It is often denoted as :math:`\lceil x \rceil`. + +Parameters +-- +x : _Symbol or scalar +Input array. +out : _Symbol or None +A location into which the result is stored. If provided, it +must have a shape that the inputs broadcast to. If not provided +or None, a freshly-allocated array is returned. The dtype of the +output is the same as that of the input if the input is an ndarray. + +Returns +--- +y : +_Symbol or scalar +The ceiling of each element in `x`, with `float` dtype. +This is a scalar if `x` is a scalar. + +Examples + +>>> a = np.array([-1.7, -1.5, -0.2, 0.2, 1.5, 1.7, 2.0]) +>>> np.ceil(a) +array([-1., -1., -0., 1., 2., 2., 2.]) + +>>> #if you use parameter out, x and out must be ndarray. if not, you will get an error! +>>> a = np.array(1) +>>> np.ceil(np.array(3.5), a) +array(4.) +>>> a +array(4.) + +""" +return _unary_func_helper(x, _npi.ceil, _np.ceil, out=out, **kwargs) + + +@set_module('mxnet.symbol.numpy') +def log1p(x, out=None, **kwargs): +""" +Return the natural logarithm of one plus the input array, element-wise. + +Calculates ``log(1 + x)``. + +Parameters +-- +x : +_Symbol or scalar +Input array. +out : _Symbol or None +A location into which the result is stored. If provided, it +must have a shape that the inputs broadcast to. If not provided +or None, a freshly-allocated array is returned. The dtype of the +output is the same as that of the input if the input is an ndarray. 
+ +Returns +--- +y : _Symbol or scalar +Natural logarithm of 1 + x, element-wise. This is a scalar +if x is a scalar. + +Notes +- + +For real-valued input, `log1p` is accurate also for `x` so small +that `1 + x == 1` in floating-point accuracy. + +Logarithm is a multivalued function: for each `x` there is an infinite +number of `z` such that `exp(z) = 1 + x`. The convention is to return +the `z` whose imaginary part lies in `[-pi, pi]`. + +For real-valued input data types, `log1p` always returns real output. +For each value that cannot be expressed as a real number or infinity, +it yields ``nan`` and sets the `invalid` floating point error flag. + +For complex-valued input, `log1p` is a complex analytical function that +has a branch cut `[-inf, -1]` and is continuous from above on it. +`log1p` handles the floating-point negative zero as an infinitesimal +negative number, conforming to the C99 standard. + +Examples + +>>> np.log1p(1e-99) +1e-99 + +""" +return _unary_func_helper(x, _npi.log1p, _np.log1p, out=out, **kwargs) + + +@set_module('mxnet.symbol.numpy') +def tanh(x, out=None, **kwargs): +""" +Compute hyperbolic tangent element-wise. + +Equivalent to ``np.sinh(x)/np.cosh(x)``. + +Parameters +-- +x : +_Symbol +Input array. +out : _Symbol or None +A location into which the result is stored. If provided, it +must have a shape that the inputs broadcast to. If not provided +or None, a freshly-allocated array is returned. The dtype of the +output is the same as that of the input if the input is an ndarray. +Returns +--- +y : _Symbol +The corresponding hyperbolic tangent values. + +Notes +- +If `out` is provided, the function writes the result into it, +and returns a reference to `out`. (See Examples) + +- Not support complex computation (like imaginary number) + +>>> np.tanh(np.pi*1j) +TypeError: type not supported + +Examples + +>>> np.tanh(np.array[0, np.pi])) Review comment: @gyshi This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
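Two claims in the docstrings quoted above can be checked numerically: that `log1p` stays accurate when `1 + x == 1` in floating point, and that `tanh` is equivalent to `sinh(x)/cosh(x)`. The sketch below uses Python's standard `math` module as a stand-in, which is an assumption made for illustration; it is not `mxnet.numpy`, whose behaviour may differ. Note also that the quoted doctest `np.tanh(np.array[0, np.pi]))` has unbalanced brackets; the intended call is presumably `np.tanh(np.array([0, np.pi]))`.

```python
import math

x = 1e-99

# 1 + x rounds to exactly 1.0 in double precision, so the naive form loses x.
naive = math.log(1 + x)
# log1p evaluates log(1 + x) without forming 1 + x first, so x survives.
accurate = math.log1p(x)

# tanh should agree with sinh/cosh up to rounding error.
t = math.pi
tanh_direct = math.tanh(t)
tanh_ratio = math.sinh(t) / math.cosh(t)

print(naive, accurate)
print(math.isclose(tanh_direct, tanh_ratio, rel_tol=1e-12))
```

A doctest for the symbolic operator would need to be run against the actual MXNet build before being committed.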
[GitHub] [incubator-mxnet] zoeygxy commented on a change in pull request #15390: [Numpy fix-doc]modify numpy doc
zoeygxy commented on a change in pull request #15390: [Numpy fix-doc]modify numpy doc URL: https://github.com/apache/incubator-mxnet/pull/15390#discussion_r299754108 ## File path: python/mxnet/_numpy_op_doc.py ## @@ -173,3 +212,73 @@ def _np_cumsum(a, axis=None, dtype=None, out=None): `axis` is not None or `a` is a 1-d array. """ pass + + +def _np_max(axis=None, keepdims=False, initial=None, out=None): +""" +Return the maximum of an array or maximum along an axis. + +Parameters +-- +a : ndarray +Input data. +axis : None or int or tuple of ints, optional +Axis or axes along which to operate. By default, flattened input is +used. + +If this is a tuple of ints, the maximum is selected over multiple axes, +instead of a single axis or all the axes as before. + +keepdims : bool, optional +If this is set to True, the axes which are reduced are left +in the result as dimensions with size one. With this option, +the result will broadcast correctly against the input array. + +If the default value is passed, then `keepdims` will not be +passed through to the `amax` method of sub-classes of +`ndarray`, however any non-default value will be. If the +sub-class' method does not implement `keepdims` any +exceptions will be raised. + +initial : +Parameter initial is not supported yet, we will support it in the future. +now it must be None. + +out : ndarray, optional Review comment: @gyshi Please add this in the `note` area. This is different from native numpy. I would also suggest giving a note here in `out` parameter description. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] pengzhao-intel commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
pengzhao-intel commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299751199 ## File path: docs/tutorials/gluon/performance.md ## @@ -0,0 +1,483 @@ + + + + + + + + + + + + + + + + + +# Gluon Performance Tips & Tricks + +Compared to traditional machine learning methods, the field of deep-learning has increased model accuracy across a wide range of tasks, but it has also increased the amount of computation required for model training and inference. Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution of networks, but it can sometimes be hard to write code that uses the hardware to its full potential. We will be looking at a few simple tips and trick in this tutorial that you can use to speed up training and ultimately save on training costs. Review comment: "We will be looking at a few simple tips and trick in this tutorial that you can use to speed up training and ultimately save on training costs." Does this document only cover training? If so, how about changing the title to "Gluon Performance Tips ... for training"?
[GitHub] [incubator-mxnet] pengzhao-intel commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
pengzhao-intel commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299751380 ## File path: docs/tutorials/gluon/performance.md ## @@ -0,0 +1,483 @@ + + + + + + + + + + + + + + + + + +# Gluon Performance Tips & Tricks + +Compared to traditional machine learning methods, the field of deep-learning has increased model accuracy across a wide range of tasks, but it has also increased the amount of computation required for model training and inference. Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution of networks, but it can sometimes be hard to write code that uses the hardware to its full potential. We will be looking at a few simple tips and trick in this tutorial that you can use to speed up training and ultimately save on training costs. + +We'll start by writing some code to train an image classification network for the CIFAR-10 dataset, and then benchmark the throughput of the network in terms of samples processed per second. After some performance analysis, we'll identify the bottlenecks (i.e. the components limiting throughput) and improve the training speed step-by-step. We'll bring together all the tips and tricks at the end and evaluate our performance gains. + + +```python +from __future__ import print_function +import multiprocessing +import time +import mxnet as mx +import numpy as np +``` + +An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this tutorial. You are likely to get different results and find different bottlenecks on other hardware, but these tips and tricks should still help improve training speed for bottleneck components. A GPU is recommended for this example. + + +```python +ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu() +print("Using {} context.".format(ctx)) +``` + +Using gpu(0) context. + + +We'll use the `CIFAR10` dataset provided out-of-the-box with Gluon. 
+ + +```python +dataset = mx.gluon.data.vision.CIFAR10(train=True) +print('{} samples'.format(len(dataset))) +``` + +5 samples + + +So we can learn how to identify training bottlenecks, let's intentionally introduce a short `sleep` into the data loading pipeline. We transform each 32x32 CIFAR-10 image to 224x224 so we can use it with the ResNet-50 network designed for ImageNet. [CIFAR-10 specific ResNet networks](https://gluon-cv.mxnet.io/api/model_zoo.html#gluoncv.model_zoo.get_cifar_resnet) exist but we use the more standard ImageNet variants in this example. Review comment: Why add a "sleep"? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
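The reviewer's question about the `sleep` can be illustrated with a small, framework-free sketch. It assumes nothing about Gluon's `DataLoader`: `load_samples` and `samples_per_second` are hypothetical helpers that only show how an artificial delay in the loading step caps throughput measured in samples per second, which is the kind of bottleneck the tutorial sets out to identify.

```python
import time


def load_samples(n, delay):
    """Yield n dummy samples, sleeping `delay` seconds before each one."""
    for i in range(n):
        time.sleep(delay)  # stands in for slow decoding or augmentation
        yield i


def samples_per_second(n, delay):
    """Time one pass over the loader and return throughput in samples/sec."""
    start = time.perf_counter()
    count = sum(1 for _ in load_samples(n, delay))
    elapsed = time.perf_counter() - start
    return count / elapsed


fast = samples_per_second(50, 0.0)
slow = samples_per_second(50, 0.002)
print("no sleep:   {:.0f} samples/sec".format(fast))
print("2 ms sleep: {:.0f} samples/sec".format(slow))
```

With a 2 ms delay per sample the loader cannot exceed roughly 500 samples per second regardless of how fast the network is, so the delay makes the data pipeline the limiting component by construction.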
[GitHub] [incubator-mxnet] pengzhao-intel commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
pengzhao-intel commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299751165 ## File path: docs/tutorials/gluon/performance.md ## @@ -0,0 +1,483 @@ + + + + + + + + + + + + + + + + + +# Gluon Performance Tips & Tricks + +Compared to traditional machine learning methods, the field of deep-learning has increased model accuracy across a wide range of tasks, but it has also increased the amount of computation required for model training and inference. Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution of networks, but it can sometimes be hard to write code that uses the hardware to its full potential. We will be looking at a few simple tips and trick in this tutorial that you can use to speed up training and ultimately save on training costs. + +We'll start by writing some code to train an image classification network for the CIFAR-10 dataset, and then benchmark the throughput of the network in terms of samples processed per second. After some performance analysis, we'll identify the bottlenecks (i.e. the components limiting throughput) and improve the training speed step-by-step. We'll bring together all the tips and tricks at the end and evaluate our performance gains. + + +```python +from __future__ import print_function +import multiprocessing +import time +import mxnet as mx +import numpy as np +``` + +An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this tutorial. You are likely to get different results and find different bottlenecks on other hardware, but these tips and tricks should still help improve training speed for bottleneck components. A GPU is recommended for this example. + Review comment: I think both CPU and GPU are fine for this small case.
[GitHub] [incubator-mxnet] KellenSunderland commented on a change in pull request #15335: enable TensorRT integration with cpp api
KellenSunderland commented on a change in pull request #15335: enable TensorRT integration with cpp api URL: https://github.com/apache/incubator-mxnet/pull/15335#discussion_r299753358 ## File path: cpp-package/include/mxnet-cpp/contrib.h ## @@ -0,0 +1,116 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +/*! +* Copyright (c) 2019 by Contributors +* \file contrib.h +* \brief utility function to enable some contrib features +* \author Haohuan Wang +*/ +#ifndef MXNET_CPP_CONTRIB_H_ +#define MXNET_CPP_CONTRIB_H_ + +#include +#include +#include +#include +#include "mxnet-cpp/symbol.h" + +namespace mxnet { +namespace cpp { +namespace details { + + /*! 
+ * split a string with the given delimiter + * @param str string to be parsed + * @param delimiter delimiter + * @return delimited list of string + */ + inline std::vector split(const std::string& str, const std::string& delimiter) { +std::vector splitted; +size_t last = 0; +size_t next = 0; +while ((next = str.find(delimiter, last)) != std::string::npos) { + splitted.push_back(str.substr(last, next - last)); + last = next + 1; +} +splitted.push_back(str.substr(last)); +return splitted; + } + +} // namespace details + +namespace contrib { Review comment: @szha Hey Sheng, have we had contrib namespaces in C++ before? The intent of this code basically aligns with the intent of having a contrib level python API. Does namespacing it out like this make sense to you? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
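One detail of the `split` helper quoted above may be worth checking: the loop advances with `last = next + 1`, which skips exactly one character after each match. The following is a hypothetical Python port written only to illustrate that loop (the `advance` parameter is an addition for comparison and is not part of the C++ code). If callers only ever pass single-character delimiters the two variants agree; they differ for longer delimiters.

```python
def split(text, delimiter, advance=None):
    """Split text on delimiter, following the loop structure of the C++ helper.

    advance is how far to move past each match; None means len(delimiter).
    """
    if not delimiter:
        raise ValueError("empty delimiter")
    step = len(delimiter) if advance is None else advance
    parts = []
    last = 0
    nxt = text.find(delimiter, last)
    while nxt != -1:
        parts.append(text[last:nxt])
        last = nxt + step
        nxt = text.find(delimiter, last)
    parts.append(text[last:])
    return parts


print(split("a;b;c", ";", advance=1))  # single-character delimiter
print(split("a::b", "::", advance=1))  # mirrors `last = next + 1`
print(split("a::b", "::"))             # advances by the delimiter length
```

In the port, advancing by one leaves the tail of a two-character delimiter attached to the next token, which suggests `last = next + delimiter.size()` would be the safer form in the header if multi-character delimiters are ever expected.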
[GitHub] [incubator-mxnet] pengzhao-intel commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
pengzhao-intel commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299751199 ## File path: docs/tutorials/gluon/performance.md ## @@ -0,0 +1,483 @@ + + + + + + + + + + + + + + + + + +# Gluon Performance Tips & Tricks + +Compared to traditional machine learning methods, the field of deep-learning has increased model accuracy across a wide range of tasks, but it has also increased the amount of computation required for model training and inference. Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution of networks, but it can sometimes be hard to write code that uses the hardware to its full potential. We will be looking at a few simple tips and trick in this tutorial that you can use to speed up training and ultimately save on training costs. Review comment: "but it can sometimes be hard to write code that uses the hardware to its full potential." The framework is the bridge between HW and SW, so writing the code is supposed to be easy; performance is a separate matter. "We will be looking at a few simple tips and trick in this tutorial that you can use to speed up training and ultimately save on training costs." Does this document only cover training? If so, how about changing the title to "Gluon Performance Tips ... for training"?
[GitHub] [incubator-mxnet] zoeygxy commented on a change in pull request #15390: [Numpy fix-doc]modify numpy doc
zoeygxy commented on a change in pull request #15390: [Numpy fix-doc]modify numpy doc URL: https://github.com/apache/incubator-mxnet/pull/15390#discussion_r299753290 ## File path: python/mxnet/_numpy_op_doc.py ## @@ -32,24 +33,70 @@ def _np_reshape(a, newshape, order='C'): an integer, then the result will be a 1-D array of that length. One shape dimension can be -1. In this case, the value is inferred from the length of the array and remaining dimensions. -order : {'C'}, optional +order : {'C', 'F', 'A'}, optional Read the elements of `a` using this index order, and place the elements into the reshaped array using this index order. 'C' means to read / write the elements using C-like index order, with the last axis index changing fastest, back to the first -axis index changing slowest. Other order types such as 'F'/'A' -may be added in the future. +axis index changing slowest. 'F' means to read / write the +elements using Fortran-like index order, with the first index +changing fastest, and the last index changing slowest. Note that +the 'C' and 'F' options take no account of the memory layout of +the underlying array, and only refer to the order of indexing. +'A' means to read / write the elements in Fortran-like index +order if `a` is Fortran *contiguous* in memory, C-like order +otherwise. Returns --- reshaped_array : ndarray -It will be always a copy of the original array. This behavior is different -from the official NumPy package where views of the original array may be -generated. +This will be a new view object if possible; otherwise, it will +be a copy. Note there is no guarantee of the *memory layout* (C- or +Fortran- contiguous) of the returned array. -See Also + +Notes +- +It is not always possible to change the shape of an array without +copying the data. 
If you want an error to be raised when the data is copied, +you should assign the new shape to the shape attribute of the array:: + + >>> a = np.zeros((10, 2)) + # A transpose makes the array non-contiguous + >>> b = a.T + # Taking a view makes it possible to modify the shape without modifying + # the initial object. + +>>> a = np.arange(6).reshape((3, 2)) +>>> a +array([[0., 1.], + [2., 3.], + [4., 5.]]) + +You can think of reshaping as first raveling the array (using the given +index order), then inserting the elements from the raveled array into the +new array using the same kind of index ordering as was used for the +raveling. + +>>> np.reshape(a, (2, 3)) # C-like index ordering +array([[0., 1., 2.], + [3., 4., 5.]]) + +- order only support C-order +- input not support scalar +- not support zero-size shape + +Examples -ndarray.reshape : Equivalent method. +>>> a = np.array([[1,2,3], [4,5,6]]) +>>> np.reshape(a, 6) +array([1., 2., 3., 4., 5., 6.]) + +>>> np.reshape(a, (3,-1)) # the unspecified value is inferred to be 2 Review comment: Have you tested this? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
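The `-1` rule in the docstring above ("the value is inferred from the length of the array and remaining dimensions") comes down to one integer division. The sketch below is a pure-Python illustration with a hypothetical helper name, `infer_shape`; it does not call MXNet or NumPy and says nothing about whether the MXNet operator passes the doctest the reviewer is asking about.

```python
def infer_shape(size, newshape):
    """Resolve a single -1 entry in newshape for an array with `size` elements."""
    if isinstance(newshape, int):
        newshape = (newshape,)
    unknown = [i for i, d in enumerate(newshape) if d == -1]
    if len(unknown) > 1:
        raise ValueError("only one dimension can be -1")
    known = 1
    for d in newshape:
        if d != -1:
            known *= d
    if unknown:
        # The missing dimension is whatever makes the element counts match.
        if known == 0 or size % known != 0:
            raise ValueError("cannot reshape size {} into {}".format(size, newshape))
        resolved = list(newshape)
        resolved[unknown[0]] = size // known
        return tuple(resolved)
    if known != size:
        raise ValueError("cannot reshape size {} into {}".format(size, newshape))
    return tuple(newshape)


print(infer_shape(6, (3, -1)))
print(infer_shape(6, 6))
```

For the 2x3 array in the quoted example, `(3, -1)` resolves to `(3, 2)` because 6 elements divided by 3 rows leaves 2 columns.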
[GitHub] [incubator-mxnet] pengzhao-intel commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
pengzhao-intel commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299749824 ## File path: docs/tutorials/index.md ## @@ -91,6 +91,7 @@ Select API: * [Image similiarity search with InfoGAN](/tutorials/gluon/info_gan.html) * Practitioner Guides * [Gotchas using NumPy](/tutorials/gluon/gotchas_numpy_in_mxnet.html) +* [Performance Tips & Tricks](/tutorials/gluon/performance.html) Review comment: Better to align this with the title used on the page, "Gluon Performance Tips & Tricks".
[GitHub] [incubator-mxnet] xinyu-intel opened a new pull request #15448: [MKLDNN]Enhance Quantization APIs and Tutorial
xinyu-intel opened a new pull request #15448: [MKLDNN]Enhance Quantization APIs and Tutorial URL: https://github.com/apache/incubator-mxnet/pull/15448 ## Description ## - Create a MKL-DNN specific user-level api `quantize_model_mkldnn` which combines fusion and quantization. - Enable `resnet50_v1b` quantized model. - Split `quantize_model` API into three parts to make it flexible for users to integrate quantization flow into their project: 1)`quantize_graph`: quantize fp32 model to int8 model w/o calibration and return a collector for collecting calibration information in the next step. 2)[outside api]: users need only add a few lines together with mod.forward for collecting calibration information. 3)`calib_graph`: generate calibrated model based on filled collector. - Draft a tutorial to introduce **How to quantize custom models for production-level inference with MKL-DNN backend**. @pengzhao-intel @TaoLv @ZhennanQin @ciyongch ## Checklist ## ### Essentials ### Please feel free to remove inapplicable items for your PR. - [ ] The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant [JIRA issue](https://issues.apache.org/jira/projects/MXNET/issues) created (except PRs with tiny changes) - [ ] Changes are complete (i.e. I finished coding on this PR) - [ ] All changes have test coverage: - Unit tests are added for small changes to verify correctness (e.g. adding a new operator) - Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore) - Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL) - [ ] Code is well-documented: - For user-facing API changes, API doc string has been updated. - For new C++ functions in header files, their functionalities and arguments are documented. 
- For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable - Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html - [ ] To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change ### Changes ### - [ ] Feature1, tests, (and when applicable, API doc) - [ ] Feature2, tests, (and when applicable, API doc) ## Comments ## - If this change is a backward incompatible change, why must this change be made. - Interesting edge cases to note here This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] larroy commented on issue #15369: Fix build with system's openmp
larroy commented on issue #15369: Fix build with system's openmp URL: https://github.com/apache/incubator-mxnet/pull/15369#issuecomment-507901448 @szha Understood. Until then, it would be beneficial to be able to replicate that with CMake by having a selector for the OpenMP library.
[GitHub] [incubator-mxnet] larroy commented on issue #15424: fixed config.mk and Makefile bugs for installing mkl
larroy commented on issue #15424: fixed config.mk and Makefile bugs for installing mkl URL: https://github.com/apache/incubator-mxnet/pull/15424#issuecomment-507901172 Shouldn't USE_STATIC_MKL appear in the Makefile directly then?
[incubator-mxnet-site] branch asf-site updated: Bump the publish timestamp.
This is an automated email from the ASF dual-hosted git repository. marcoabreu pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/incubator-mxnet-site.git The following commit(s) were added to refs/heads/asf-site by this push: new 3884f6a Bump the publish timestamp. 3884f6a is described below commit 3884f6abde723dfdf2b43299fd4baa9ff5038f7e Author: mxnet-ci AuthorDate: Wed Jul 3 01:17:26 2019 + Bump the publish timestamp. --- date.txt | 1 + 1 file changed, 1 insertion(+) diff --git a/date.txt b/date.txt new file mode 100644 index 000..e835a92 --- /dev/null +++ b/date.txt @@ -0,0 +1 @@ +Wed Jul 3 01:17:26 UTC 2019
[GitHub] [incubator-mxnet] Kangzf1996 removed a comment on issue #15392: Fails to make -j4
Kangzf1996 removed a comment on issue #15392: Fails to make -j4 URL: https://github.com/apache/incubator-mxnet/issues/15392#issuecomment-506820105 > @Kangzf1996 , Wow. This seems to be using MXNet 0.10.0 which was released in May 2017. Isn't this project compatible with latest MXNet? @vdantu Hi, I also get errors when using MXNet 1.4.0. When I run `make -j4`, the error is as follows: compilation terminated. make: *** [Makefile:461: build/src/operator/nn/mkldnn/mkldnn_copy.o] Error 1
[GitHub] [incubator-mxnet] Kangzf1996 commented on issue #15392: Fails to make -j4
Kangzf1996 commented on issue #15392: Fails to make -j4 URL: https://github.com/apache/incubator-mxnet/issues/15392#issuecomment-507890195 > @Kangzf1996 , Wow. This seems to be using MXNet 0.10.0 which was released in May 2017. Isn't this project compatible with latest MXNet? @vdantu Hi, I also get errors when using MXNet 1.4.0. When I run `make -j4`, the error is as follows: compilation terminated. make: *** [Makefile:461: build/src/operator/nn/mkldnn/mkldnn_copy.o] Error 1
[GitHub] [incubator-mxnet] TaoLv commented on issue #15424: fixed config.mk and Makefile bugs for installing mkl
TaoLv commented on issue #15424: fixed config.mk and Makefile bugs for installing mkl URL: https://github.com/apache/incubator-mxnet/pull/15424#issuecomment-507889995 I think the lowercase BLAS check in the Makefile was added for runtime feature detection. It's not really used for BLAS linkage. Please correct me if I'm wrong, @larroy. I tried `USE_BLAS=mkl` on the make command line: the MKL `.a` files appear in the link line and there are no MKL `.so` files in the ldd output. Can you double check, @nuslq?
[GitHub] [incubator-mxnet] larroy commented on issue #15405: Fix memory leak in NaiveEngine
larroy commented on issue #15405: Fix memory leak in NaiveEngine URL: https://github.com/apache/incubator-mxnet/pull/15405#issuecomment-507887002 @mxnet-label-bot add [pr-awaiting-merge]
[GitHub] [incubator-mxnet] wkcn commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
wkcn commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299727333 ## File path: tests/tutorials/test_tutorials.py ## @@ -114,6 +114,9 @@ def test_gluon_save_load_params(): def test_gluon_hybrid(): assert _test_tutorial_nb('gluon/hybrid') + +def test_gluon_hybrid(): Review comment: Thanks for your contribution! This function should probably be named `test_gluon_performance`; as written, it redefines `test_gluon_hybrid`.
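The practical consequence of the duplicated name flagged in this review is that Python rebinds a function name on each `def`, so only the last definition in the module is reachable. The sketch below is a minimal illustration under stated assumptions: the bodies record a label instead of calling `_test_tutorial_nb`, and the `calls` list is a hypothetical stand-in for the test runner.

```python
calls = []


def test_gluon_hybrid():
    calls.append("gluon/hybrid")


# Same name again: this definition replaces the one above.
def test_gluon_hybrid():
    calls.append("gluon/performance")


test_gluon_hybrid()
print(calls)
```

Only the second body runs, so in the test file the `gluon/hybrid` notebook would silently stop being tested until the new function is renamed.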
[GitHub] [incubator-mxnet] IvyBazan removed a comment on issue #15445: MXNet export broken link
IvyBazan removed a comment on issue #15445: MXNet export broken link URL: https://github.com/apache/incubator-mxnet/issues/15445#issuecomment-507884878 @mxnet-label-bot Website
[GitHub] [incubator-mxnet] IvyBazan edited a comment on issue #15445: MXNet export broken link
IvyBazan edited a comment on issue #15445: MXNet export broken link URL: https://github.com/apache/incubator-mxnet/issues/15445#issuecomment-507884878 @mxnet-label-bot Website
[GitHub] [incubator-mxnet] IvyBazan commented on issue #15445: MXNet export broken link
IvyBazan commented on issue #15445: MXNet export broken link URL: https://github.com/apache/incubator-mxnet/issues/15445#issuecomment-507884878 @mxnet-label-bot add Website
[GitHub] [incubator-mxnet] Caenorst commented on a change in pull request #15335: enable TensorRT integration with cpp api
Caenorst commented on a change in pull request #15335: enable TensorRT integration with cpp api URL: https://github.com/apache/incubator-mxnet/pull/15335#discussion_r299727739 ## File path: cpp-package/include/mxnet-cpp/contrib.h ## @@ -0,0 +1,116 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +/*! +* Copyright (c) 2019 by Contributors +* \file contrib.h +* \brief utility function to enable some contrib features +* \author Haohuan Wang +*/ +#ifndef MXNET_CPP_CONTRIB_H_ +#define MXNET_CPP_CONTRIB_H_ + +#include +#include +#include +#include +#include "mxnet-cpp/symbol.h" + +namespace mxnet { +namespace cpp { +namespace details { + + /*! 
+ * split a string with the given delimiter + * @param str string to be parsed + * @param delimiter delimiter + * @return delimited list of string + */ + inline std::vector split(const std::string& str, const std::string& delimiter) { +std::vector splitted; +size_t last = 0; +size_t next = 0; +while ((next = str.find(delimiter, last)) != std::string::npos) { + splitted.push_back(str.substr(last, next - last)); + last = next + 1; +} +splitted.push_back(str.substr(last)); +return splitted; + } + +} // namespace details + +namespace contrib { + + // needs to be same with + // https://github.com/apache/incubator-mxnet/blob/1c874cfc807cee755c38f6486e8e0f4d94416cd8/src/operator/subgraph/tensorrt/tensorrt-inl.h#L190 + static const std::string TENSORRT_SUBGRAPH_PARAM_IDENTIFIER = "subgraph_params_names"; + // needs to be same with + // https://github.com/apache/incubator-mxnet/blob/master/src/operator/subgraph/tensorrt/tensorrt.cc#L244 + static const std::string TENSORRT_SUBGRAPH_PARAM_PREFIX = "subgraph_param_"; Review comment: I see, ignore my original comment then This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
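One detail in the quoted `split` helper: it advances with `last = next + 1`, which matches a one-character delimiter. Below is a rough Python port of the same find-and-slice loop that advances by the delimiter length instead. It is not part of the PR, and the name `split_on` is made up for this sketch.

```python
def split_on(text, delimiter):
    """Split text on delimiter, mirroring the find/substr loop in the C++ helper."""
    if not delimiter:
        raise ValueError("delimiter must be non-empty")
    parts = []
    last = 0
    while True:
        nxt = text.find(delimiter, last)
        if nxt == -1:
            break
        parts.append(text[last:nxt])
        last = nxt + len(delimiter)  # skip the whole delimiter, not one character
    parts.append(text[last:])
    return parts


print(split_on("a::b::c", "::"))
```

For a single-character delimiter the two versions behave the same, so this only matters if the helper is ever called with a longer separator.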
[GitHub] [incubator-mxnet] thomelane commented on issue #15396: [TUTORIAL] Gluon and Sparse NDArray
thomelane commented on issue #15396: [TUTORIAL] Gluon and Sparse NDArray URL: https://github.com/apache/incubator-mxnet/pull/15396#issuecomment-507884716 thanks for the reviews @eric-haibin-lin @aaronmarkham. updates made.
[GitHub] [incubator-mxnet] anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API
anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299717110 ## File path: amalgamation/python/mxnet_predict.py ## @@ -133,15 +199,38 @@ def __init__(self, symbol_file, handle = PredictorHandle() param_raw_bytes = bytearray(param_raw_bytes) ptr = (ctypes.c_char * len(param_raw_bytes)).from_buffer(param_raw_bytes) -_check_call(_LIB.MXPredCreate( + +# data types +num_provided_arg_types = 0 +# provided type argument names +provided_arg_type_names = ctypes.POINTER(ctypes.c_char_p)() +# provided types +provided_arg_type_data = ctypes.POINTER(mx_uint)() +if type_dict is not None: +provided_arg_type_names = [] +provided_arg_type_data = [] +for k, v in type_dict.items(): +v = np.dtype(v).type +if v in _DTYPE_NP_TO_MX: +provided_arg_type_names.append(k) Review comment: I don't think there is an easy way to pass a map to the C API; this is how we pass a map of name-value pairs to the C API today. This is also how it is done in other places in MXNet. Would be interested to see the issue.
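The "map as parallel arrays" convention mentioned in this comment can be sketched with plain `ctypes`. This is only an illustration of the idea, not the PR's code: `dict_to_c_arrays` is a hypothetical helper, the element type `c_int` is an assumption, and the integer type codes in the example call are placeholders.

```python
import ctypes


def dict_to_c_arrays(type_dict):
    """Flatten {name: type_code} into a (count, names, codes) triple of ctypes values."""
    names = list(type_dict.keys())
    codes = [type_dict[name] for name in names]
    # One C string per key, and one integer per value, in matching order.
    c_names = (ctypes.c_char_p * len(names))(*[n.encode("utf-8") for n in names])
    c_codes = (ctypes.c_int * len(codes))(*codes)
    return ctypes.c_uint(len(names)), c_names, c_codes


count, c_names, c_codes = dict_to_c_arrays({"data": 2, "softmax_label": 0})
print(count.value, [n.decode() for n in c_names], list(c_codes))
```

The C side then receives a length plus two arrays and can rebuild the mapping by index, which is why the Python wrapper collects names and values into separate lists.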
[GitHub] [incubator-mxnet] anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API
anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299719073 ## File path: amalgamation/python/mxnet_predict.py ## @@ -160,10 +249,18 @@ def forward(self, **kwargs): >>> predictor.forward(data=mydata) >>> out = predictor.get_output(0) """ +if self.type_dict and len(self.type_dict) != len(kwargs.items()): +raise ValueError("number of kwargs should be same as len of type_dict" \ + "Please check your forward pass inputs" \ + "or type_dict passed to Predictor instantiation") + for k, v in kwargs.items(): if not isinstance(v, np.ndarray): raise ValueError("Expect numpy ndarray as input") -v = np.asarray(v, dtype=np.float32, order='C') +if self.type_dict and k in self.type_dict: +v = np.asarray(v, dtype=self.type_dict[k], order='C') +else: +v = np.asarray(v, dtype=np.float32, order='C') Review comment: If an unsupported type is used, it will go into the if clause and fail. If no type is provided, the input is silently converted to FP32.
[GitHub] [incubator-mxnet] anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API
anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299722242 ## File path: include/mxnet/c_predict_api.h ## @@ -85,6 +85,44 @@ MXNET_DLL int MXPredCreate(const char* symbol_json_str, const mx_uint* input_shape_data, PredictorHandle* out); +/*! + * \brief create a predictor + * \param symbol_json_str The JSON string of the symbol. + * \param param_bytes The in-memory raw bytes of parameter ndarray file. + * \param param_size The size of parameter ndarray file. + * \param dev_type The device type, 1: cpu, 2: gpu + * \param dev_id The device id of the predictor. + * \param num_input_nodes Number of input nodes to the net. + *For feedforward net, this is 1. + * \param input_keys The name of the input argument. + *For feedforward net, this is {"data"} + * \param input_shape_indptr Index pointer of shapes of each input node. + *The length of this array = num_input_nodes + 1. + *For feedforward net that takes 4 dimensional input, this is {0, 4}. + * \param input_shape_data A flattened data of shapes of each input node. + *For feedforward net that takes 4 dimensional input, this is the shape data. + * \param num_provided_arg_dtypes + *The length of provided_arg_dtypes. + * \param provided_arg_dtype_names + *The provided_arg_dtype_names the names of args for which dtypes are provided. + * \param provided_arg_dtypes + *The provided_arg_dtypes the dtype provided + * \param out The created predictor handle. + * \return 0 when success, -1 when failure. + */ +MXNET_DLL int MXPredCreateEx(const char* symbol_json_str, + const void* param_bytes, + int param_size, + int dev_type, int dev_id, + mx_uint num_input_nodes, Review comment: This is just to keep the same interface as MXPredCreate. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
[GitHub] [incubator-mxnet] anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API
anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299725608 ## File path: src/c_api/c_predict_api.cc ## @@ -210,19 +249,31 @@ int _CreatePartialOut(const char* symbol_json_str, std::vector arg_arrays, aux_arrays; for (size_t i = 0; i < arg_shapes.size(); ++i) { -NDArray nd = NDArray(arg_shapes[i], ctx); +NDArray nd; +if (result_arg_types[i] != -1) { + nd = NDArray(arg_shapes[i], ctx, false, result_arg_types[i]); +} else { + nd = NDArray(arg_shapes[i], ctx); +} if (arg_params.count(arg_names[i]) != 0) { CopyFromTo(arg_params[arg_names[i]], &nd); } arg_arrays.push_back(nd); } + for (size_t i = 0; i < aux_shapes.size(); ++i) { -NDArray nd = NDArray(aux_shapes[i], ctx); +NDArray nd; +if (result_aux_types[i] != -1) { Review comment: This was added because I was foreseeing the AMP change, and I didn't even have the ``` CHECK(infer_type_complete) << "The type information is not enough, please provide input arg_types " "with provided_arg_dtype_names and provided_arg_dtypes"; ``` Will add a test for AMP for the C Predict API and then get back here.
[GitHub] [incubator-mxnet] anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API
anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299718649 ## File path: amalgamation/python/mxnet_predict.py ## @@ -160,10 +249,18 @@ def forward(self, **kwargs): >>> predictor.forward(data=mydata) >>> out = predictor.get_output(0) """ +if self.type_dict and len(self.type_dict) != len(kwargs.items()): +raise ValueError("number of kwargs should be same as len of type_dict" \ + "Please check your forward pass inputs" \ + "or type_dict passed to Predictor instantiation") + for k, v in kwargs.items(): if not isinstance(v, np.ndarray): raise ValueError("Expect numpy ndarray as input") -v = np.asarray(v, dtype=np.float32, order='C') +if self.type_dict and k in self.type_dict: +v = np.asarray(v, dtype=self.type_dict[k], order='C') Review comment: I think this is row-major only: it converts all inputs to row-major format, which MXNet expects.
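For readers less familiar with the `order='C'` argument in the quoted diff, here is a small sketch of what row-major layout means, written with nested lists so it runs without NumPy. The helper names are made up for illustration, and the column-major version is included only for contrast.

```python
def flatten_row_major(matrix):
    """Flatten a list of rows so each row's elements stay contiguous (C order)."""
    return [value for row in matrix for value in row]


def flatten_column_major(matrix):
    """Flatten the same data column by column (Fortran order), for comparison."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    return [matrix[r][c] for c in range(n_cols) for r in range(n_rows)]


m = [[1, 2, 3],
     [4, 5, 6]]
print(flatten_row_major(m))
print(flatten_column_major(m))
```

A buffer handed to a C API that assumes row-major data has to look like the first result, which is why the wrapper normalises every input before passing it on.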
[GitHub] [incubator-mxnet] anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API
anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299725759 ## File path: src/c_api/c_predict_api.cc ## @@ -444,6 +538,20 @@ int MXPredGetOutputShape(PredictorHandle handle, API_END(); } +int MXPredGetOutputType(PredictorHandle handle, +mx_uint out_index, +int* out_dtype) { + MXAPIPredictor* p = static_cast(handle); + API_BEGIN(); + CHECK_LT(out_index, p->out_arrays.size()) +<< "Index exceed number of outputs"; Review comment: okay sure. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API
anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299719401 ## File path: amalgamation/python/mxnet_predict.py ## @@ -160,10 +249,18 @@ def forward(self, **kwargs): >>> predictor.forward(data=mydata) >>> out = predictor.get_output(0) """ +if self.type_dict and len(self.type_dict) != len(kwargs.items()): +raise ValueError("number of kwargs should be same as len of type_dict" \ + "Please check your forward pass inputs" \ + "or type_dict passed to Predictor instantiation") + for k, v in kwargs.items(): if not isinstance(v, np.ndarray): raise ValueError("Expect numpy ndarray as input") -v = np.asarray(v, dtype=np.float32, order='C') +if self.type_dict and k in self.type_dict: +v = np.asarray(v, dtype=self.type_dict[k], order='C') +else: +v = np.asarray(v, dtype=np.float32, order='C') Review comment: Not sure I understand. Can you elaborate?
[GitHub] [incubator-mxnet] anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API
anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299716688 ## File path: amalgamation/python/mxnet_predict.py ## @@ -25,17 +25,74 @@ import os import sys +from array import array import ctypes import logging import numpy as np +# pylint: disable= no-member +_DTYPE_NP_TO_MX = { +None: -1, Review comment: This is similar to the Symbol API and NDArray API, where None is mapped to -1.
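A small sketch of the kind of lookup table under discussion, using dtype names as strings so it runs without NumPy. The integer codes follow MXNet's usual type flags as far as I know, so treat them as an assumption, and `dtype_code` is a hypothetical helper rather than a function from the PR.

```python
# Assumed MXNet type flags; None maps to -1, meaning "unknown, to be inferred".
DTYPE_NAME_TO_MX = {
    None: -1,
    "float32": 0,
    "float64": 1,
    "float16": 2,
    "uint8": 3,
    "int32": 4,
    "int8": 5,
    "int64": 6,
}


def dtype_code(name):
    """Return the integer code for a dtype name, rejecting unsupported names."""
    if name not in DTYPE_NAME_TO_MX:
        raise TypeError("unsupported dtype: {!r}".format(name))
    return DTYPE_NAME_TO_MX[name]


print(dtype_code("float16"), dtype_code(None))
```

Keeping `None` in the table lets callers pass "no dtype given" through the same path as a concrete dtype, with -1 acting as the sentinel on the C side.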
[GitHub] [incubator-mxnet] anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API
anirudh2290 commented on a change in pull request #15245: FP16 Support for C Predict API URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299722802 ## File path: src/c_api/c_predict_api.cc ## @@ -187,21 +206,41 @@ int _CreatePartialOut(const char* symbol_json_str, try { mxnet::ShapeVector in_shapes; +nnvm::DTypeVector in_types; for (std::string key : sym.ListInputNames(Symbol::kAll)) { if (known_shape.count(key) != 0) { in_shapes.push_back(known_shape[key]); } else { in_shapes.emplace_back(); } } + +for (std::string key : sym.ListInputNames(Symbol::kAll)) { + if (arg_types.count(key) != 0) { +in_types.push_back(arg_types[key]); + } else if (aux_types.count(key) != 0) { +in_types.push_back(aux_types[key]); + } +} nnvm::Graph g; g.outputs = sym.outputs; g = mxnet::exec::InferShape(std::move(g), std::move(in_shapes), "__shape__"); +g = mxnet::exec::InferType(std::move(g), std::move(in_types), "__dtype__"); bool infer_complete = (g.GetAttr("shape_num_unknown_nodes") == 0); +// This is tricky for AMP Use case, for example, with only weights input types +// cannot be inferred in AMP. Thus for AMP converted model type_dict will be +// required +bool infer_type_complete = (g.GetAttr("dtype_num_unknown_nodes") == 0); CHECK(infer_complete) << "The shape information of is not enough to get the shapes"; +CHECK(infer_type_complete) +<< "The type information is not enough, please provide input arg_types " + "with provided_arg_dtype_names and provided_arg_dtypes"; Review comment: hmm, i thought about this, but this is not limited to the python user, C Predict API can be used from Python, CPP etc, calling _CreatePartialOut and therefore i used provided_arg_dtype_names and provided_arg_dtypes which are the API param names. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
[GitHub] [incubator-mxnet] thomelane commented on issue #15427: [TUTORIAL] Gluon performance tips and tricks
thomelane commented on issue #15427: [TUTORIAL] Gluon performance tips and tricks URL: https://github.com/apache/incubator-mxnet/pull/15427#issuecomment-507882866 thanks for all the reviews @aaronmarkham @pengzhao-intel @ptrendx. updates have been made.
[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299724871 ## File path: docs/tutorials/gluon/performance.md ## @@ -0,0 +1,483 @@ + + + + + + + + + + + + + + + + + +# Gluon Performance Tips & Tricks + +Compared to traditional machine learning methods, the field of deep-learning has increased model accuracy across a wide range of tasks, but it has also increased the amount of computation required for model training and inference. Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution of networks, but it can sometimes be hard to write code that uses the hardware to its full potential. We will be looking at a few simple tips and trick in this tutorial that you can use to speed up training and ultimately save on training costs. + +We'll start by writing some code to train an image classification network for the CIFAR-10 dataset, and then benchmark the throughput of the network in terms of samples processed per second. After some performance analysis, we'll identify the bottlenecks (i.e. the components limiting throughput) and improve the training speed step-by-step. We'll bring together all the tips and tricks at the end and evaluate our performance gains. + + +```python +from __future__ import print_function +import multiprocessing +import time +import mxnet as mx +import numpy as np +``` + +An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this tutorial. You are likely to get difference results and find different bottlenecks on other hardware, but these tips and tricks should still help improve training speed for bottleneck components. A GPU is recommended for this example. + + +```python +ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu() +print("Using {} context.".format(ctx)) +``` + +Using gpu(0) context. + + +We'll use the `CIFAR10` dataset provided out-of-the-box with Gluon. 
+ + +```python +dataset = mx.gluon.data.vision.CIFAR10(train=True) +print('{} samples'.format(len(dataset))) +``` + +50000 samples + + +So we can learn how to identify training bottlenecks, let's intentionally introduce a short `sleep` into the data loading pipeline. We transform each 32x32 CIFAR-10 image to 244x244 so we can use it with the ResNet-50 network designed for ImageNet. [CIFAR-10 specific ResNet networks](https://gluon-cv.mxnet.io/api/model_zoo.html#gluoncv.model_zoo.get_cifar_resnet) exist but we use the more standard ImageNet variants in this example. + + +```python +def transform_fn(x): +time.sleep(0.01) # artificial slow-down +image = mx.image.imresize(x, w=244, h=244) +return image.astype('float32').transpose((2, 0, 1)) + +dataset = dataset.transform_first(transform_fn) +``` + +Setting our batch size to 16, we can create the `DataLoader`. + + +```python +batch_size = 16 +dataloader = mx.gluon.data.DataLoader(dataset, + batch_size=batch_size, + shuffle=True, + last_batch="discard") +print('{} batches'.format(len(dataloader))) +``` + +3125 batches + + +Up next, we create all of the other components required for training, such as the network, the loss function, the evaluation metric and parameter trainer. + + +```python +net = mx.gluon.model_zoo.vision.resnet50_v2(pretrained=False, ctx=ctx) +net.initialize(mx.init.Xavier(magnitude=2.3), ctx=ctx) +loss_fn = mx.gluon.loss.SoftmaxCrossEntropyLoss() +metric = mx.metric.Accuracy() +learning_rate = 0.001 +trainer = mx.gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': learning_rate}) +``` + +## Initial Benchmark + +As a starting point, let's benchmark the throughput of our training loop: calculating the average samples per second across 25 iterations, where each iteration is a batch of 16 samples. We'll run a single forward pass through the network before starting our benchmark timer to avoid including shape inference and lazy initialization in the throughput calculations.
+ + +```python +def single_forward(net, dataloader, dtype='float32'): +data, label = next(iter(dataloader)) +data = data.astype(dtype) +data = data.as_in_context(ctx) +pred = net(data) +pred.wait_to_read() +``` + + +```python +single_forward(net, dataloader) +iters = 25 +num_samples = 0 +num_iters = 0 +start_time = time.time() +for iter_idx, (data, label) in enumerate(dataloader): +num_samples += data.shape[0] +num_iters += 1 +data = data.as_in_context(ctx) +label = label.as_in_context(ctx) +with mx.autograd.record(): +pred = net(data) +loss = loss_fn(pred, label) +loss.backward() +trainer.step(data.shape[0]) +metric.update(label, pred) +print('.', end='') +if num_iters >= iters: +break +mx.nd.waitall() +end_time = time.time() +total_time = end_time - start_tim
[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299724835 ## File path: docs/tutorials/gluon/performance.md ## @@ -0,0 +1,483 @@ + + + + + + + + + + + + + + + + + +# Gluon Performance Tips & Tricks + +Compared to traditional machine learning methods, the field of deep-learning has increased model accuracy across a wide range of tasks, but it has also increased the amount of computation required for model training and inference. Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution of networks, but it can sometimes be hard to write code that uses the hardware to its full potential. We will be looking at a few simple tips and trick in this tutorial that you can use to speed up training and ultimately save on training costs. + +We'll start by writing some code to train an image classification network for the CIFAR-10 dataset, and then benchmark the throughput of the network in terms of samples processed per second. After some performance analysis, we'll identify the bottlenecks (i.e. the components limiting throughput) and improve the training speed step-by-step. We'll bring together all the tips and tricks at the end and evaluate our performance gains. + + +```python +from __future__ import print_function +import multiprocessing +import time +import mxnet as mx +import numpy as np +``` + +An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this tutorial. You are likely to get difference results and find different bottlenecks on other hardware, but these tips and tricks should still help improve training speed for bottleneck components. A GPU is recommended for this example. + + +```python +ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu() +print("Using {} context.".format(ctx)) +``` + +Using gpu(0) context. + + +We'll use the `CIFAR10` dataset provided out-of-the-box with Gluon. 
+ + +```python +dataset = mx.gluon.data.vision.CIFAR10(train=True) +print('{} samples'.format(len(dataset))) +``` + +50000 samples + + +So we can learn how to identify training bottlenecks, let's intentionally introduce a short `sleep` into the data loading pipeline. We transform each 32x32 CIFAR-10 image to 244x244 so we can use it with the ResNet-50 network designed for ImageNet. [CIFAR-10 specific ResNet networks](https://gluon-cv.mxnet.io/api/model_zoo.html#gluoncv.model_zoo.get_cifar_resnet) exist but we use the more standard ImageNet variants in this example. + + +```python +def transform_fn(x): +time.sleep(0.01) # artificial slow-down +image = mx.image.imresize(x, w=244, h=244) +return image.astype('float32').transpose((2, 0, 1)) + +dataset = dataset.transform_first(transform_fn) +``` + +Setting our batch size to 16, we can create the `DataLoader`. + + +```python +batch_size = 16 +dataloader = mx.gluon.data.DataLoader(dataset, + batch_size=batch_size, + shuffle=True, + last_batch="discard") +print('{} batches'.format(len(dataloader))) +``` + +3125 batches + + +Up next, we create all of the other components required for training, such as the network, the loss function, the evaluation metric and parameter trainer. + + +```python +net = mx.gluon.model_zoo.vision.resnet50_v2(pretrained=False, ctx=ctx) +net.initialize(mx.init.Xavier(magnitude=2.3), ctx=ctx) +loss_fn = mx.gluon.loss.SoftmaxCrossEntropyLoss() +metric = mx.metric.Accuracy() +learning_rate = 0.001 +trainer = mx.gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': learning_rate}) +``` + +## Initial Benchmark + +As a starting point, let's benchmark the throughput of our training loop: calculating the average samples per second across 25 iterations, where each iteration is a batch of 16 samples. We'll run a single forward pass through the network before starting our benchmark timer to avoid including shape inference and lazy initialization in the throughput calculations.
+ + +```python +def single_forward(net, dataloader, dtype='float32'): +data, label = next(iter(dataloader)) +data = data.astype(dtype) +data = data.as_in_context(ctx) +pred = net(data) +pred.wait_to_read() +``` + + +```python +single_forward(net, dataloader) +iters = 25 +num_samples = 0 +num_iters = 0 +start_time = time.time() +for iter_idx, (data, label) in enumerate(dataloader): +num_samples += data.shape[0] +num_iters += 1 +data = data.as_in_context(ctx) +label = label.as_in_context(ctx) +with mx.autograd.record(): +pred = net(data) +loss = loss_fn(pred, label) +loss.backward() +trainer.step(data.shape[0]) +metric.update(label, pred) +print('.', end='') +if num_iters >= iters: +break +mx.nd.waitall() +end_time = time.time() +total_time = end_time - start_tim
[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299723669 ## File path: docs/tutorials/gluon/performance.md ## @@ -0,0 +1,483 @@ + + + + + + + + + + + + + + + + + +# Gluon Performance Tips & Tricks + +Compared to traditional machine learning methods, the field of deep-learning has increased model accuracy across a wide range of tasks, but it has also increased the amount of computation required for model training and inference. Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution of networks, but it can sometimes be hard to write code that uses the hardware to its full potential. We will be looking at a few simple tips and trick in this tutorial that you can use to speed up training and ultimately save on training costs. + +We'll start by writing some code to train an image classification network for the CIFAR-10 dataset, and then benchmark the throughput of the network in terms of samples processed per second. After some performance analysis, we'll identify the bottlenecks (i.e. the components limiting throughput) and improve the training speed step-by-step. We'll bring together all the tips and tricks at the end and evaluate our performance gains. + + +```python +from __future__ import print_function +import multiprocessing +import time +import mxnet as mx +import numpy as np +``` + +An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this tutorial. You are likely to get difference results and find different bottlenecks on other hardware, but these tips and tricks should still help improve training speed for bottleneck components. A GPU is recommended for this example. + + +```python +ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu() +print("Using {} context.".format(ctx)) +``` + +Using gpu(0) context. + + +We'll use the `CIFAR10` dataset provided out-of-the-box with Gluon. 
+ + +```python +dataset = mx.gluon.data.vision.CIFAR10(train=True) +print('{} samples'.format(len(dataset))) +``` + +50000 samples + + +So we can learn how to identify training bottlenecks, let's intentionally introduce a short `sleep` into the data loading pipeline. We transform each 32x32 CIFAR-10 image to 244x244 so we can use it with the ResNet-50 network designed for ImageNet. [CIFAR-10 specific ResNet networks](https://gluon-cv.mxnet.io/api/model_zoo.html#gluoncv.model_zoo.get_cifar_resnet) exist but we use the more standard ImageNet variants in this example. + + +```python +def transform_fn(x): +time.sleep(0.01) # artificial slow-down +image = mx.image.imresize(x, w=244, h=244) +return image.astype('float32').transpose((2, 0, 1)) + +dataset = dataset.transform_first(transform_fn) +``` + +Setting our batch size to 16, we can create the `DataLoader`. + + +```python +batch_size = 16 +dataloader = mx.gluon.data.DataLoader(dataset, + batch_size=batch_size, + shuffle=True, + last_batch="discard") +print('{} batches'.format(len(dataloader))) +``` + +3125 batches + + +Up next, we create all of the other components required for training, such as the network, the loss function, the evaluation metric and parameter trainer. + + +```python +net = mx.gluon.model_zoo.vision.resnet50_v2(pretrained=False, ctx=ctx) +net.initialize(mx.init.Xavier(magnitude=2.3), ctx=ctx) +loss_fn = mx.gluon.loss.SoftmaxCrossEntropyLoss() +metric = mx.metric.Accuracy() +learning_rate = 0.001 +trainer = mx.gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': learning_rate}) +``` + +## Initial Benchmark + +As a starting point, let's benchmark the throughput of our training loop: calculating the average samples per second across 25 iterations, where each iteration is a batch of 16 samples. We'll run a single forward pass through the network before starting our benchmark timer to avoid including shape inference and lazy initialization in the throughput calculations.
+ + +```python +def single_forward(net, dataloader, dtype='float32'): +data, label = next(iter(dataloader)) +data = data.astype(dtype) +data = data.as_in_context(ctx) +pred = net(data) +pred.wait_to_read() +``` + + +```python +single_forward(net, dataloader) +iters = 25 +num_samples = 0 +num_iters = 0 +start_time = time.time() +for iter_idx, (data, label) in enumerate(dataloader): +num_samples += data.shape[0] +num_iters += 1 +data = data.as_in_context(ctx) +label = label.as_in_context(ctx) +with mx.autograd.record(): +pred = net(data) +loss = loss_fn(pred, label) +loss.backward() +trainer.step(data.shape[0]) +metric.update(label, pred) +print('.', end='') +if num_iters >= iters: +break +mx.nd.waitall() +end_time = time.time() +total_time = end_time - start_tim
[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299722331

## File path: docs/tutorials/gluon/performance.md ##
@@ -0,0 +1,483 @@
+
+# Gluon Performance Tips & Tricks
+
+Compared to traditional machine learning methods, the field of deep-learning has increased model accuracy across a wide range of tasks, but it has also increased the amount of computation required for model training and inference. Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution of networks, but it can sometimes be hard to write code that uses the hardware to its full potential. We will be looking at a few simple tips and trick in this tutorial that you can use to speed up training and ultimately save on training costs.
+
+We'll start by writing some code to train an image classification network for the CIFAR-10 dataset, and then benchmark the throughput of the network in terms of samples processed per second. After some performance analysis, we'll identify the bottlenecks (i.e. the components limiting throughput) and improve the training speed step-by-step. We'll bring together all the tips and tricks at the end and evaluate our performance gains.
+
+
+```python
+from __future__ import print_function
+import multiprocessing
+import time
+import mxnet as mx
+import numpy as np
+```
+
+An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this tutorial. You are likely to get difference results and find different bottlenecks on other hardware, but these tips and tricks should still help improve training speed for bottleneck components. A GPU is recommended for this example.

Review comment:
Good catch.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
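The "samples processed per second" metric that the quoted tutorial intro describes can be sketched without any framework. This is only an illustration under stated assumptions: `measure_throughput`, `fake_batches` and `step_fn` are hypothetical names, not part of MXNet or this pull request. A real GPU benchmark would also need to wait for asynchronous work to finish (for example with `mx.nd.waitall()`) before stopping the timer.

```python
import time


def measure_throughput(batches, step_fn, max_iters=25):
    """Time up to `max_iters` batches and report samples per second.

    `batches` is any iterable of sized batches and `step_fn` is a callable
    doing the per-batch work. Both are hypothetical stand-ins for a real
    data loader and training step.
    """
    num_samples = 0
    start = time.perf_counter()
    for num_iters, batch in enumerate(batches, start=1):
        step_fn(batch)
        num_samples += len(batch)
        if num_iters >= max_iters:
            break
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse timers.
    rate = num_samples / elapsed if elapsed > 0 else float("inf")
    return num_samples, elapsed, rate


def fake_batches(num_batches, batch_size):
    """Yield dummy batches (lists of zeros) standing in for real data."""
    for _ in range(num_batches):
        yield [0] * batch_size


if __name__ == "__main__":
    samples, seconds, rate = measure_throughput(
        fake_batches(num_batches=100, batch_size=16),
        step_fn=lambda batch: time.sleep(0.001),  # pretend per-batch work
    )
    print("{} samples in {:.3f}s ({:.1f} samples/sec)".format(samples, seconds, rate))
```

Stopping after a fixed number of iterations keeps the measurement short while still averaging over several batches.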
[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299722411

## File path: docs/tutorials/gluon/performance.md ##
@@ -0,0 +1,483 @@
+
+# Gluon Performance Tips & Tricks
+
+Compared to traditional machine learning methods, the field of deep-learning has increased model accuracy across a wide range of tasks, but it has also increased the amount of computation required for model training and inference. Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution of networks, but it can sometimes be hard to write code that uses the hardware to its full potential. We will be looking at a few simple tips and trick in this tutorial that you can use to speed up training and ultimately save on training costs.
+
+We'll start by writing some code to train an image classification network for the CIFAR-10 dataset, and then benchmark the throughput of the network in terms of samples processed per second. After some performance analysis, we'll identify the bottlenecks (i.e. the components limiting throughput) and improve the training speed step-by-step. We'll bring together all the tips and tricks at the end and evaluate our performance gains.
+
+
+```python
+from __future__ import print_function
+import multiprocessing
+import time
+import mxnet as mx
+import numpy as np
+```
+
+An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this tutorial. You are likely to get difference results and find different bottlenecks on other hardware, but these tips and tricks should still help improve training speed for bottleneck components. A GPU is recommended for this example.
+
+
+```python
+ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu()
+print("Using {} context.".format(ctx))
+```
+
+Using gpu(0) context.
+
+
+We'll use the `CIFAR10` dataset provided out-of-the-box with Gluon.
+
+
+```python
+dataset = mx.gluon.data.vision.CIFAR10(train=True)
+print('{} samples'.format(len(dataset)))
+```
+
+50000 samples
+
+
+So we can learn how to identify training bottlenecks, let's intentionally introduce a short `sleep` into the data loading pipeline. We transform each 32x32 CIFAR-10 image to 244x244 so we can use it with the ResNet-50 network designed for ImageNet. [CIFAR-10 specific ResNet networks](https://gluon-cv.mxnet.io/api/model_zoo.html#gluoncv.model_zoo.get_cifar_resnet) exist but we use the more standard ImageNet variants in this example.

Review comment:
Good catch. Changed, rerun notebook and updated stats.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards, Apache Git Services
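To make the cost of the artificial `sleep` concrete, here is a back-of-the-envelope sketch. The helper names are hypothetical. The inputs are the CIFAR-10 training set size (50,000 images) plus the batch size of 16 and the 0.01-second per-sample sleep used elsewhere in this tutorial. Perfect scaling across workers is an assumption made purely for illustration.

```python
def num_batches(num_samples, batch_size, last_batch="discard"):
    """Count batches per epoch; 'discard' drops a trailing partial batch.

    Any other `last_batch` value is treated here as keeping the partial batch.
    """
    full, remainder = divmod(num_samples, batch_size)
    if last_batch == "discard" or remainder == 0:
        return full
    return full + 1


def min_loading_seconds(num_samples, sleep_per_sample, num_workers=1):
    """Lower bound on time spent sleeping in the transform for one epoch.

    Assumes the sleep is paid once per sample and that work divides evenly
    across `num_workers` (an idealised assumption).
    """
    return num_samples * sleep_per_sample / num_workers


if __name__ == "__main__":
    print(num_batches(50000, 16), "batches")                       # 3125 batches
    print(min_loading_seconds(50000, 0.01), "s with 1 worker")     # about 500 s
    print(min_loading_seconds(50000, 0.01, 8), "s with 8 workers")  # about 62.5 s
```

With a single loading process the sleep alone caps throughput at roughly 100 samples per second, whatever the speed of the network.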
[GitHub] [incubator-mxnet] aaronmarkham commented on issue #15433: remove comments from nano instructions
aaronmarkham commented on issue #15433: remove comments from nano instructions URL: https://github.com/apache/incubator-mxnet/pull/15433#issuecomment-507879412 > Shouldn't you also remove the Scala&Java instructions from the building from source part if you do not include prerequisites for Java install and remove the `These instructions also cover how to setup MXNet's Java Inference API.` remark in the introduction? Yep! Good catch. Thanks. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] roywei commented on issue #15429: Operator Performance Regression on CPU
roywei commented on issue #15429: Operator Performance Regression on CPU URL: https://github.com/apache/incubator-mxnet/issues/15429#issuecomment-507878408 Thanks @ciyongch , setting the environment variables did reduce the variances. I have updated the document(1st sheet): https://docs.google.com/spreadsheets/d/1_eezNWbrBAm3s3i6G1m0Rd3YYdTEnmKlYtn4klqdyN0/edit#gid=196553607 With the current data Dot and Dropout is not a big concern now. Relu's regression is something we have to accept as otherwise it could lead to nans and bugs. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray
thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray
URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r299717760

## File path: docs/tutorials/sparse/train_gluon.md ##
@@ -0,0 +1,469 @@
+
+# Sparse NDArrays with Gluon
+
+When working on machine learning problems, you may encounter situations where the input data is sparse (i.e. the majority of values are zero). One example of this is in recommendation systems. You could have millions of user and product features, but only a few of these features are present for each sample. Without special treatment, the sheer magnitude of the feature space can lead to out-of-memory situations and cause significant slowdowns when training and making predictions.
+
+MXNet supports a number of sparse storage types (often called 'stype' for short) for these situations. In this tutorial, we'll start by generating some sparse data, write it to disk in the LibSVM format and then read back using the [`LibSVMIter`](https://mxnet.incubator.apache.org/api/python/io/io.html) for training. We use the Gluon API to train the model and leverage sparse storage types such as [`CSRNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=csrndarray#mxnet.ndarray.sparse.CSRNDArray) and [`RowSparseNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=rowsparsendarray#mxnet.ndarray.sparse.RowSparseNDArray) to maximise performance and memory efficiency.
+
+
+```python
+import mxnet as mx
+import numpy as np
+import time
+```
+
+### Generating Sparse Data
+
+You will most likely have a sparse dataset in mind already if you're reading this tutorial, but let's create a dummy dataset to use in the examples that follow. Using `rand_ndarray` we will generate 1000 samples, each with 1,000,000 features of which 99.999% of values will be zero (i.e. 10 non-zero features for each sample). We take this as our input data for training and calculate a label based on an arbitrary rule: whether the feature sum is higher than average.
+
+
+```python
+num_samples = 1000
+num_features = 1000000
+data = mx.test_utils.rand_ndarray((num_samples, num_features), stype='csr', density=0.00001)
+# generate label: 1 if row sum above average, 0 otherwise.
+label = data.sum(axis=1) > data.sum(axis=1).mean()
+```
+
+
+```python
+print(type(data))
+print(data[:10].asnumpy())
+print('{:,.0f} elements'.format(np.product(data.shape)))
+print('{:,.0f} non-zero elements'.format(data.data.size))
+```
+
+
+[[0. 0. 0. ... 0. 0. 0.]
+ [0. 0. 0. ... 0. 0. 0.]
+ [0. 0. 0. ... 0. 0. 0.]
+ ...
+ [0. 0. 0. ... 0. 0. 0.]
+ [0. 0. 0. ... 0. 0. 0.]
+ [0. 0. 0. ... 0. 0. 0.]]
+1,000,000,000 elements
+10,000 non-zero elements
+
+
+Our storage type is CSR (Compressed Sparse Row) which is the ideal type for sparse data along multiple axes. See [this in-depth tutorial](https://mxnet.incubator.apache.org/versions/master/tutorials/sparse/csr.html) for more information. Just to confirm the generation process ran correctly, we can see that the vast majority of values are indeed zero. One of the first questions to ask would be how much memory is saved by storing this data in a [`CSRNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=csrndarray#mxnet.ndarray.sparse.CSRNDArray) versus a standard [`NDArray`](https://mxnet.incubator.apache.org/versions/master/api/python/ndarray/sparse.html?highlight=ndarray#module-mxnet.ndarray). Since sparse arrays are constructed from many components (e.g. `data`, `indices` and `indptr`) we define a function called `get_nbytes` to calculate the number of bytes taken in memory to store an array. We compare the same data stored in a standard [`NDArray`](https://mxnet.incubator.apache.org/versions/master/api/python/ndarray/sparse.html?highlight=ndarray#module-mxnet.ndarray) (with `data.tostype('default')`) to the [`CSRNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=csrndarray#mxnet.ndarray.sparse.CSRNDArray).
+
+
+```python
+def get_nbytes(array):
+    fn = lambda a: a.size * np.dtype(a).itemsize
+    if isinstance(array, mx.ndarray.sparse.CSRNDArray):
+        return fn(array.data) + fn(array.indices) + fn(array.indptr)
+    elif isinstance(array, mx.ndarray.sparse.RowSparseNDArray):
+        return fn(array.data) + fn(array.indices)
+    elif isinstance(array, mx.ndarray.NDArray):
+        return fn(array)
+    else:
+        raise TypeError('{} not supported'.format(type(array)))
+```
+
+
+```python
+print('NDarray:', get_nbytes(data.tostype('default'))/1000000, 'MBs')
+print('CSRNDArray', get_nbytes(data)/1000000, 'MBs')
+```
+
+NDarray: 4000.0 MBs
+CSRNDArray 0.128008 MBs
+
+
+Given the extremely high sparsity of the data, we observe a huge memory saving here! 0.13 MBs versus 4 GBs: ~30,000 times
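The two figures printed above can be sanity-checked with plain arithmetic, without MXNet installed. This is a minimal sketch, assuming 4-byte (float32) values and 8-byte (int64) `indices` and `indptr` entries. Those sizes are an assumption inferred from the reported 0.128008 MB, and the helper names are hypothetical.

```python
def dense_nbytes(num_rows, num_cols, value_bytes=4):
    """Bytes needed to store every element of a dense 2-D array."""
    return num_rows * num_cols * value_bytes


def csr_nbytes(num_rows, nnz, value_bytes=4, index_bytes=8):
    """Bytes needed for the three CSR components.

    data:    one value per non-zero element
    indices: one column index per non-zero element
    indptr:  one row offset per row, plus one closing offset
    """
    data = nnz * value_bytes
    indices = nnz * index_bytes
    indptr = (num_rows + 1) * index_bytes
    return data + indices + indptr


if __name__ == "__main__":
    dense = dense_nbytes(num_rows=1000, num_cols=1000000)
    sparse = csr_nbytes(num_rows=1000, nnz=10000)
    print("dense: {:.1f} MB".format(dense / 1e6))
    print("csr:   {:.6f} MB".format(sparse / 1e6))
    print("ratio: {:.0f}x".format(dense / sparse))
```

Under these assumptions the CSR total comes to 128,008 bytes against 4,000,000,000 bytes for the dense array, which matches the quoted output and gives a ratio of a little over 31,000.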
[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray
thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r299716855 ## File path: docs/tutorials/sparse/train_gluon.md ## @@ -0,0 +1,469 @@ + + + + + + + + + + + + + + + + + + +# Sparse NDArrays with Gluon + +When working on machine learning problems, you may encounter situations where the input data is sparse (i.e. the majority of values are zero). One example of this is in recommendation systems. You could have millions of user and product features, but only a few of these features are present for each sample. Without special treatment, the sheer magnitude of the feature space can lead to out-of-memory situations and cause significant slowdowns when training and making predictions. + +MXNet supports a number of sparse storage types (often called 'stype' for short) for these situations. In this tutorial, we'll start by generating some sparse data, write it to disk in the LibSVM format and then read back using the [`LibSVMIter`](https://mxnet.incubator.apache.org/api/python/io/io.html) for training. We use the Gluon API to train the model and leverage sparse storage types such as [`CSRNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=csrndarray#mxnet.ndarray.sparse.CSRNDArray) and [`RowSparseNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=rowsparsendarray#mxnet.ndarray.sparse.RowSparseNDArray) to maximise performance and memory efficiency. + + +```python +import mxnet as mx +import numpy as np +import time +``` + +### Generating Sparse Data + +You will most likely have a sparse dataset in mind already if you're reading this tutorial, but let's create a dummy dataset to use in the examples that follow. Using `rand_ndarray` we will generate 1000 samples, each with 1,000,000 features of which 99.999% of values will be zero (i.e. 10 non-zero features for each sample). 
[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray
thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r299716886 ## File path: docs/tutorials/sparse/train_gluon.md ## @@ -0,0 +1,469 @@ + + + + + + + + + + + + + + + + + + +# Sparse NDArrays with Gluon + +When working on machine learning problems, you may encounter situations where the input data is sparse (i.e. the majority of values are zero). One example of this is in recommendation systems. You could have millions of user and product features, but only a few of these features are present for each sample. Without special treatment, the sheer magnitude of the feature space can lead to out-of-memory situations and cause significant slowdowns when training and making predictions. + +MXNet supports a number of sparse storage types (often called 'stype' for short) for these situations. In this tutorial, we'll start by generating some sparse data, write it to disk in the LibSVM format and then read back using the [`LibSVMIter`](https://mxnet.incubator.apache.org/api/python/io/io.html) for training. We use the Gluon API to train the model and leverage sparse storage types such as [`CSRNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=csrndarray#mxnet.ndarray.sparse.CSRNDArray) and [`RowSparseNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=rowsparsendarray#mxnet.ndarray.sparse.RowSparseNDArray) to maximise performance and memory efficiency. + + +```python +import mxnet as mx +import numpy as np +import time +``` + +### Generating Sparse Data + +You will most likely have a sparse dataset in mind already if you're reading this tutorial, but let's create a dummy dataset to use in the examples that follow. Using `rand_ndarray` we will generate 1000 samples, each with 1,000,000 features of which 99.999% of values will be zero (i.e. 10 non-zero features for each sample). 
[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray
thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r299716928 ## File path: docs/tutorials/sparse/train_gluon.md ## @@ -0,0 +1,469 @@ + + + + + + + + + + + + + + + + + + +# Sparse NDArrays with Gluon + +When working on machine learning problems, you may encounter situations where the input data is sparse (i.e. the majority of values are zero). One example of this is in recommendation systems. You could have millions of user and product features, but only a few of these features are present for each sample. Without special treatment, the sheer magnitude of the feature space can lead to out-of-memory situations and cause significant slowdowns when training and making predictions. + +MXNet supports a number of sparse storage types (often called 'stype' for short) for these situations. In this tutorial, we'll start by generating some sparse data, write it to disk in the LibSVM format and then read back using the [`LibSVMIter`](https://mxnet.incubator.apache.org/api/python/io/io.html) for training. We use the Gluon API to train the model and leverage sparse storage types such as [`CSRNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=csrndarray#mxnet.ndarray.sparse.CSRNDArray) and [`RowSparseNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=rowsparsendarray#mxnet.ndarray.sparse.RowSparseNDArray) to maximise performance and memory efficiency. + + +```python +import mxnet as mx +import numpy as np +import time +``` + +### Generating Sparse Data + +You will most likely have a sparse dataset in mind already if you're reading this tutorial, but let's create a dummy dataset to use in the examples that follow. Using `rand_ndarray` we will generate 1000 samples, each with 1,000,000 features of which 99.999% of values will be zero (i.e. 10 non-zero features for each sample). 
[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray
thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r299716683 ## File path: docs/tutorials/sparse/train_gluon.md ## @@ -0,0 +1,469 @@ + + + + + + + + + + + + + + + + + + +# Sparse NDArrays with Gluon + +When working on machine learning problems, you may encounter situations where the input data is sparse (i.e. the majority of values are zero). One example of this is in recommendation systems. You could have millions of user and product features, but only a few of these features are present for each sample. Without special treatment, the sheer magnitude of the feature space can lead to out-of-memory situations and cause significant slowdowns when training and making predictions. + +MXNet supports a number of sparse storage types (often called 'stype' for short) for these situations. In this tutorial, we'll start by generating some sparse data, write it to disk in the LibSVM format and then read back using the [`LibSVMIter`](https://mxnet.incubator.apache.org/api/python/io/io.html) for training. We use the Gluon API to train the model and leverage sparse storage types such as [`CSRNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=csrndarray#mxnet.ndarray.sparse.CSRNDArray) and [`RowSparseNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=rowsparsendarray#mxnet.ndarray.sparse.RowSparseNDArray) to maximise performance and memory efficiency. + + +```python +import mxnet as mx +import numpy as np +import time +``` + +### Generating Sparse Data + +You will most likely have a sparse dataset in mind already if you're reading this tutorial, but let's create a dummy dataset to use in the examples that follow. Using `rand_ndarray` we will generate 1000 samples, each with 1,000,000 features of which 99.999% of values will be zero (i.e. 10 non-zero features for each sample). 
[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray
thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r299716459 ## File path: docs/tutorials/sparse/train_gluon.md ## @@ -0,0 +1,469 @@ + + + + + + + + + + + + + + + + + + +# Sparse NDArrays with Gluon + +When working on machine learning problems, you may encounter situations where the input data is sparse (i.e. the majority of values are zero). One example of this is in recommendation systems. You could have millions of user and product features, but only a few of these features are present for each sample. Without special treatment, the sheer magnitude of the feature space can lead to out-of-memory situations and cause significant slowdowns when training and making predictions. + +MXNet supports a number of sparse storage types (often called 'stype' for short) for these situations. In this tutorial, we'll start by generating some sparse data, write it to disk in the LibSVM format and then read back using the [`LibSVMIter`](https://mxnet.incubator.apache.org/api/python/io/io.html) for training. We use the Gluon API to train the model and leverage sparse storage types such as [`CSRNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=csrndarray#mxnet.ndarray.sparse.CSRNDArray) and [`RowSparseNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=rowsparsendarray#mxnet.ndarray.sparse.RowSparseNDArray) to maximise performance and memory efficiency. + + +```python +import mxnet as mx +import numpy as np +import time +``` + +### Generating Sparse Data + +You will most likely have a sparse dataset in mind already if you're reading this tutorial, but let's create a dummy dataset to use in the examples that follow. Using `rand_ndarray` we will generate 1000 samples, each with 1,000,000 features of which 99.999% of values will be zero (i.e. 10 non-zero features for each sample). 
We take this as our input data for training and calculate a label based on an arbitrary rule: whether the feature sum is higher than average. + + +```python +num_samples = 1000 +num_features = 1000000 +data = mx.test_utils.rand_ndarray((num_samples, num_features), stype='csr', density=0.00001) +# generate label: 1 if row sum above average, 0 otherwise. +label = data.sum(axis=1) > data.sum(axis=1).mean() +``` + + +```python +print(type(data)) +print(data[:10].asnumpy()) +print('{:,.0f} elements'.format(np.product(data.shape))) +print('{:,.0f} non-zero elements'.format(data.data.size)) +``` + + +[[0. 0. 0. ... 0. 0. 0.] + [0. 0. 0. ... 0. 0. 0.] + [0. 0. 0. ... 0. 0. 0.] + ... + [0. 0. 0. ... 0. 0. 0.] + [0. 0. 0. ... 0. 0. 0.] + [0. 0. 0. ... 0. 0. 0.]] +1,000,000,000 elements +10,000 non-zero elements + + +Our storage type is CSR (Compressed Sparse Row) which is the ideal type for sparse data along multiple axes. See [this in-depth tutorial](https://mxnet.incubator.apache.org/versions/master/tutorials/sparse/csr.html) for more information. Just to confirm the generation process ran correctly, we can see that the vast majority of values are indeed zero. One of the first questions to ask would be how much memory is saved by storing this data in a [`CSRNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=csrndarray#mxnet.ndarray.sparse.CSRNDArray) versus a standard [`NDArray`](https://mxnet.incubator.apache.org/versions/master/api/python/ndarray/sparse.html?highlight=ndarray#module-mxnet.ndarray). Since sparse arrays are constructed from many components (e.g. `data`, `indices` and `indptr`) we define a function called `get_nbytes` to calculate the number of bytes taken in memory to store an array. 
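As a rough cross-check of the byte counting described in the quoted tutorial text, the sketch below works out the size of the three CSR buffers by hand from the element counts printed above (1,000,000,000 elements, 10,000 non-zero). The helper names `csr_nbytes` and `dense_nbytes` are hypothetical, and the dtypes (4-byte `float32` values, 8-byte `int64` indices) are assumptions rather than something the quoted text states. Only NumPy is used, so it runs without MXNet.

```python
import numpy as np


def csr_nbytes(nnz, num_rows, value_dtype="float32", index_dtype="int64"):
    """Bytes used by the three CSR buffers: data, indices and indptr.

    The dtypes are assumed defaults, not taken from the quoted tutorial.
    """
    value_size = np.dtype(value_dtype).itemsize
    index_size = np.dtype(index_dtype).itemsize
    data_bytes = nnz * value_size               # one stored value per non-zero
    indices_bytes = nnz * index_size            # one column index per non-zero
    indptr_bytes = (num_rows + 1) * index_size  # one offset per row, plus an end marker
    return data_bytes + indices_bytes + indptr_bytes


def dense_nbytes(shape, value_dtype="float32"):
    """Bytes used by a dense array that stores every element explicitly."""
    return int(np.prod(shape)) * np.dtype(value_dtype).itemsize


sparse_bytes = csr_nbytes(nnz=10_000, num_rows=1000)
dense_bytes = dense_nbytes((1000, 1_000_000))
print("CSR:  ", sparse_bytes / 1e6, "MB")
print("dense:", dense_bytes / 1e6, "MB")
print("ratio: {:,.0f}x".format(dense_bytes / sparse_bytes))
```

Under these assumptions the CSR buffers come to 128,008 bytes against 4,000,000,000 bytes for the dense layout, which agrees with the figures the quoted tutorial goes on to report.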
We compare the same data stored in a standard [`NDArray`](https://mxnet.incubator.apache.org/versions/master/api/python/ndarray/sparse.html?highlight=ndarray#module-mxnet.ndarray) (with `data.tostype('default')`) to the [`CSRNDArray`](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html?highlight=csrndarray#mxnet.ndarray.sparse.CSRNDArray). + + +```python +def get_nbytes(array): +fn = lambda a: a.size * np.dtype(a).itemsize +if isinstance(array, mx.ndarray.sparse.CSRNDArray): +return fn(array.data) + fn(array.indices) + fn(array.indptr) +elif isinstance(array, mx.ndarray.sparse.RowSparseNDArray): +return fn(array.data) + fn(array.indices) +elif isinstance(array, mx.ndarray.NDArray): +return fn(array) +else: +raise TypeError('{} not supported'.format(type(array))) +``` + + +```python +print('NDarray:', get_nbytes(data.tostype('default'))/1000000, 'MBs') +print('CSRNDArray', get_nbytes(data)/1000000, 'MBs') +``` + +NDarray: 4000.0 MBs +CSRNDArray 0.128008 MBs + + +Given the extremely high sparsity of the data, we observe a huge memory saving here! 0.13 MBs versus 4 GBs: ~30,000 times
[GitHub] [incubator-mxnet] pengzhao-intel commented on issue #15303: Fix amalgamation failure.
pengzhao-intel commented on issue #15303: Fix amalgamation failure. URL: https://github.com/apache/incubator-mxnet/pull/15303#issuecomment-507873807 @ZhennanQin @TaoLv to help review this change :) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray
thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r299716247 ## File path: docs/tutorials/sparse/train_gluon.md ## @@ -0,0 +1,469 @@
[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
ptrendx commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299716110 ## File path: docs/tutorials/gluon/performance.md ## @@ -0,0 +1,483 @@ + + + + + + + + + + + + + + + + + +# Gluon Performance Tips & Tricks + +Compared to traditional machine learning methods, the field of deep-learning has increased model accuracy across a wide range of tasks, but it has also increased the amount of computation required for model training and inference. Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution of networks, but it can sometimes be hard to write code that uses the hardware to its full potential. We will be looking at a few simple tips and tricks in this tutorial that you can use to speed up training and ultimately save on training costs. + +We'll start by writing some code to train an image classification network for the CIFAR-10 dataset, and then benchmark the throughput of the network in terms of samples processed per second. After some performance analysis, we'll identify the bottlenecks (i.e. the components limiting throughput) and improve the training speed step-by-step. We'll bring together all the tips and tricks at the end and evaluate our performance gains. + + +```python +from __future__ import print_function +import multiprocessing +import time +import mxnet as mx +import numpy as np +``` + +An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this tutorial. You are likely to get different results and find different bottlenecks on other hardware, but these tips and tricks should still help improve training speed for bottleneck components. A GPU is recommended for this example. + + +```python +ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu() +print("Using {} context.".format(ctx)) +``` + +Using gpu(0) context. + + +We'll use the `CIFAR10` dataset provided out-of-the-box with Gluon. 
+ + +```python +dataset = mx.gluon.data.vision.CIFAR10(train=True) +print('{} samples'.format(len(dataset))) +``` + +50000 samples + + +So we can learn how to identify training bottlenecks, let's intentionally introduce a short `sleep` into the data loading pipeline. We transform each 32x32 CIFAR-10 image to 244x244 so we can use it with the ResNet-50 network designed for ImageNet. [CIFAR-10 specific ResNet networks](https://gluon-cv.mxnet.io/api/model_zoo.html#gluoncv.model_zoo.get_cifar_resnet) exist but we use the more standard ImageNet variants in this example. + + +```python +def transform_fn(x): +time.sleep(0.01) # artificial slow-down +image = mx.image.imresize(x, w=244, h=244) +return image.astype('float32').transpose((2, 0, 1)) + +dataset = dataset.transform_first(transform_fn) +``` + +Setting our batch size to 16, we can create the `DataLoader`. + + +```python +batch_size = 16 +dataloader = mx.gluon.data.DataLoader(dataset, + batch_size=batch_size, + shuffle=True, + last_batch="discard") +print('{} batches'.format(len(dataloader))) +``` + +3125 batches + + +Up next, we create all of the other components required for training, such as the network, the loss function, the evaluation metric and parameter trainer. + + +```python +net = mx.gluon.model_zoo.vision.resnet50_v2(pretrained=False, ctx=ctx) +net.initialize(mx.init.Xavier(magnitude=2.3), ctx=ctx) +loss_fn = mx.gluon.loss.SoftmaxCrossEntropyLoss() +metric = mx.metric.Accuracy() +learning_rate = 0.001 +trainer = mx.gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': learning_rate}) +``` + +## Initial Benchmark + +As a starting point, let's benchmark the throughput of our training loop: calculating the average samples per second across 25 iterations, where each iteration is a batch of 16 samples. We'll run a single forward pass through the network before starting our benchmark timer to avoid including shape inference and lazy initialization in the throughput calculations. 
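The benchmarking idea in the paragraph above (warm up once, then time a fixed number of batches and report samples per second) can be sketched without any framework. This is only an illustration under stated assumptions: `measure_throughput` and `step_fn` are hypothetical names, and the workload is simulated with `time.sleep` instead of a real training step.

```python
import itertools
import time


def measure_throughput(batches, step_fn, warmup=1):
    """Return samples per second for step_fn, excluding the warm-up batches.

    Hypothetical helper: step_fn must block until its work is finished,
    otherwise the timer only measures how fast work was queued.
    """
    batches = iter(batches)
    # Warm-up runs one-off costs (such as lazy initialization) outside the timer.
    for batch in itertools.islice(batches, warmup):
        step_fn(batch)
    num_samples = 0
    start = time.perf_counter()
    for batch in batches:
        step_fn(batch)
        num_samples += len(batch)
    elapsed = time.perf_counter() - start
    return num_samples / elapsed


# Simulated workload: 1 warm-up batch plus 25 timed batches of 16 samples each.
fake_batches = [list(range(16)) for _ in range(26)]
throughput = measure_throughput(fake_batches, lambda batch: time.sleep(0.001))
print("{:.0f} samples/sec".format(throughput))
```

With an asynchronous engine, the step function (or the code just before the timer is stopped) has to wait for pending computation; the quoted benchmark does this with `mx.nd.waitall()`.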
+ + +```python +def single_forward(net, dataloader, dtype='float32'): +data, label = next(iter(dataloader)) +data = data.astype(dtype) +data = data.as_in_context(ctx) +pred = net(data) +pred.wait_to_read() +``` + + +```python +single_forward(net, dataloader) +iters = 25 +num_samples = 0 +num_iters = 0 +start_time = time.time() +for iter_idx, (data, label) in enumerate(dataloader): +num_samples += data.shape[0] +num_iters += 1 +data = data.as_in_context(ctx) +label = label.as_in_context(ctx) +with mx.autograd.record(): +pred = net(data) +loss = loss_fn(pred, label) +loss.backward() +trainer.step(data.shape[0]) +metric.update(label, pred) +print('.', end='') +if num_iters >= iters: +break +mx.nd.waitall() +end_time = time.time() +total_time = end_time - start_time
[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
ptrendx commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299715652 ## File path: docs/tutorials/gluon/performance.md ## @@ -0,0 +1,483 @@
[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray
thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r299715639 ## File path: docs/tutorials/sparse/train_gluon.md ## @@ -0,0 +1,469 @@
[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray
thomelane commented on a change in pull request #15396: [TUTORIAL] Gluon and Sparse NDArray URL: https://github.com/apache/incubator-mxnet/pull/15396#discussion_r299715527 ## File path: docs/tutorials/sparse/train_gluon.md ## @@ -0,0 +1,469 @@
[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
ptrendx commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299715062 ## File path: docs/tutorials/gluon/performance.md ## @@ -0,0 +1,483 @@ + + + + + + + + + + + + + + + + + +# Gluon Performance Tips & Tricks + +Compared to traditional machine learning methods, the field of deep-learning has increased model accuracy across a wide range of tasks, but it has also increased the amount of computation required for model training and inference. Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution of networks, but it can sometimes be hard to write code that uses the hardware to its full potential. We will be looking at a few simple tips and trick in this tutorial that you can use to speed up training and ultimately save on training costs. + +We'll start by writing some code to train an image classification network for the CIFAR-10 dataset, and then benchmark the throughput of the network in terms of samples processed per second. After some performance analysis, we'll identify the bottlenecks (i.e. the components limiting throughput) and improve the training speed step-by-step. We'll bring together all the tips and tricks at the end and evaluate our performance gains. + + +```python +from __future__ import print_function +import multiprocessing +import time +import mxnet as mx +import numpy as np +``` + +An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this tutorial. You are likely to get difference results and find different bottlenecks on other hardware, but these tips and tricks should still help improve training speed for bottleneck components. A GPU is recommended for this example. + + +```python +ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu() +print("Using {} context.".format(ctx)) +``` + +Using gpu(0) context. + + +We'll use the `CIFAR10` dataset provided out-of-the-box with Gluon. 
+ + +```python +dataset = mx.gluon.data.vision.CIFAR10(train=True) +print('{} samples'.format(len(dataset))) +``` + +5 samples + + +So we can learn how to identify training bottlenecks, let's intentionally introduce a short `sleep` into the data loading pipeline. We transform each 32x32 CIFAR-10 image to 244x244 so we can use it with the ResNet-50 network designed for ImageNet. [CIFAR-10 specific ResNet networks](https://gluon-cv.mxnet.io/api/model_zoo.html#gluoncv.model_zoo.get_cifar_resnet) exist but we use the more standard ImageNet variants in this example. + + +```python +def transform_fn(x): +time.sleep(0.01) # artificial slow-down +image = mx.image.imresize(x, w=244, h=244) +return image.astype('float32').transpose((2, 0, 1)) + +dataset = dataset.transform_first(transform_fn) +``` + +Setting our batch size to 16, we can create the `DataLoader`. + + +```python +batch_size = 16 +dataloader = mx.gluon.data.DataLoader(dataset, + batch_size=batch_size, + shuffle=True, + last_batch="discard") +print('{} batches'.format(len(dataloader))) +``` + +3125 batches + + +Up next, we create all of the other components required for training, such as the network, the loss function, the evaluation metric and parameter trainer. + + +```python +net = mx.gluon.model_zoo.vision.resnet50_v2(pretrained=False, ctx=ctx) +net.initialize(mx.init.Xavier(magnitude=2.3), ctx=ctx) +loss_fn = mx.gluon.loss.SoftmaxCrossEntropyLoss() +metric = mx.metric.Accuracy() +learning_rate = 0.001 +trainer = mx.gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': learning_rate}) +``` + +## Initial Benchmark + +As a starting point, let's benchmark the throughput of our training loop: calculating the average samples per second across 25 iterations, where each iteration is a batch of 16 samples. We'll run a single forward pass through the network before starting our benchmark timer to avoid including shape inference and lazy initialization in the throughput calculations. 
+ + +```python +def single_forward(net, dataloader, dtype='float32'): +data, label = next(iter(dataloader)) +data = data.astype(dtype) +data = data.as_in_context(ctx) +pred = net(data) +pred.wait_to_read() +``` + + +```python +single_forward(net, dataloader) +iters = 25 +num_samples = 0 +num_iters = 0 +start_time = time.time() +for iter_idx, (data, label) in enumerate(dataloader): +num_samples += data.shape[0] +num_iters += 1 +data = data.as_in_context(ctx) +label = label.as_in_context(ctx) +with mx.autograd.record(): +pred = net(data) +loss = loss_fn(pred, label) +loss.backward() +trainer.step(data.shape[0]) +metric.update(label, pred) +print('.', end='') +if num_iters >= iters: +break +mx.nd.waitall() +end_time = time.time() +total_time = end_time - start_time
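The measurement pattern described in the quoted tutorial (one warm-up pass, then timing a fixed number of batches) can be sketched without MXNet as follows. `measure_throughput`, `step_fn` and `fake_step` are hypothetical names, and the stand-in step only sleeps. With an asynchronous engine the caller would also need to wait for pending work before stopping the timer, as the quoted code does with `mx.nd.waitall()`.

```python
import time


def measure_throughput(step_fn, batch_size, iters=25):
    """Return samples per second over `iters` calls to `step_fn`, after one warm-up call."""
    step_fn()                                  # warm-up: excluded from the timing
    start = time.perf_counter()
    for _ in range(iters):
        step_fn()
    elapsed = time.perf_counter() - start
    return batch_size * iters / elapsed


def fake_step():
    time.sleep(0.001)                          # stand-in for one training iteration


rate = measure_throughput(fake_step, batch_size=16, iters=5)
print("{:.0f} samples/sec".format(rate))
```

Keeping the warm-up call outside the timed region is the design point: one-off costs such as lazy initialization would otherwise be averaged into the reported rate.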
[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
ptrendx commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299714081

## File path: docs/tutorials/gluon/performance.md
@@ -0,0 +1,483 @@
+ + + + + + + + + + + + + + + + +
+# Gluon Performance Tips & Tricks
+
+Compared to traditional machine learning methods, the field of deep-learning has increased model accuracy across a wide range of tasks, but it has also increased the amount of computation required for model training and inference. Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution of networks, but it can sometimes be hard to write code that uses the hardware to its full potential. We will be looking at a few simple tips and trick in this tutorial that you can use to speed up training and ultimately save on training costs.
+
+We'll start by writing some code to train an image classification network for the CIFAR-10 dataset, and then benchmark the throughput of the network in terms of samples processed per second. After some performance analysis, we'll identify the bottlenecks (i.e. the components limiting throughput) and improve the training speed step-by-step. We'll bring together all the tips and tricks at the end and evaluate our performance gains.
+
+
+```python
+from __future__ import print_function
+import multiprocessing
+import time
+import mxnet as mx
+import numpy as np
+```
+
+An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this tutorial. You are likely to get difference results and find different bottlenecks on other hardware, but these tips and tricks should still help improve training speed for bottleneck components. A GPU is recommended for this example.
+
+
+```python
+ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu()
+print("Using {} context.".format(ctx))
+```
+
+Using gpu(0) context.
+
+
+We'll use the `CIFAR10` dataset provided out-of-the-box with Gluon.
+
+
+```python
+dataset = mx.gluon.data.vision.CIFAR10(train=True)
+print('{} samples'.format(len(dataset)))
+```
+
+5 samples
+
+
+So we can learn how to identify training bottlenecks, let's intentionally introduce a short `sleep` into the data loading pipeline. We transform each 32x32 CIFAR-10 image to 244x244 so we can use it with the ResNet-50 network designed for ImageNet. [CIFAR-10 specific ResNet networks](https://gluon-cv.mxnet.io/api/model_zoo.html#gluoncv.model_zoo.get_cifar_resnet) exist but we use the more standard ImageNet variants in this example.

Review comment:
The original ImageNet ResNet actually works on 224x224 images, not 244x244.
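As a rough check on the 224 vs. 244 point raised in the review comment, the sketch below traces the spatial size through the downsampling steps of an ImageNet-style ResNet. It assumes the usual layout (a 7x7 stride-2 convolution with padding 3, a 3x3 stride-2 max-pool with padding 1, then three stride-2 stages). The helper names `conv_out` and `resnet_feature_size` are hypothetical, and nothing here calls MXNet.

```python
def conv_out(size, kernel, stride, pad):
    """Output length of a convolution or pooling window along one axis."""
    return (size + 2 * pad - kernel) // stride + 1


def resnet_feature_size(size):
    """Spatial size before global pooling, assuming the standard ImageNet ResNet layout."""
    size = conv_out(size, kernel=7, stride=2, pad=3)   # stem convolution
    size = conv_out(size, kernel=3, stride=2, pad=1)   # stem max-pool
    for _ in range(3):                                 # three stride-2 stages
        size = conv_out(size, kernel=3, stride=2, pad=1)
    return size


for side in (224, 244):
    print(side, "->", resnet_feature_size(side))
```

Under these assumptions a 224x224 input ends at a 7x7 feature map and a 244x244 input at 8x8. The larger input has roughly 19% more pixels (244² / 224² ≈ 1.19), so every convolution does correspondingly more work per image.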
[GitHub] [incubator-mxnet] ptrendx commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
ptrendx commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299713848

## File path: docs/tutorials/gluon/performance.md
@@ -0,0 +1,483 @@
+ + + + + + + + + + + + + + + + +
+# Gluon Performance Tips & Tricks
+
+Compared to traditional machine learning methods, the field of deep-learning has increased model accuracy across a wide range of tasks, but it has also increased the amount of computation required for model training and inference. Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution of networks, but it can sometimes be hard to write code that uses the hardware to its full potential. We will be looking at a few simple tips and trick in this tutorial that you can use to speed up training and ultimately save on training costs.
+
+We'll start by writing some code to train an image classification network for the CIFAR-10 dataset, and then benchmark the throughput of the network in terms of samples processed per second. After some performance analysis, we'll identify the bottlenecks (i.e. the components limiting throughput) and improve the training speed step-by-step. We'll bring together all the tips and tricks at the end and evaluate our performance gains.
+
+
+```python
+from __future__ import print_function
+import multiprocessing
+import time
+import mxnet as mx
+import numpy as np
+```
+
+An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this tutorial. You are likely to get difference results and find different bottlenecks on other hardware, but these tips and tricks should still help improve training speed for bottleneck components. A GPU is recommended for this example.

Review comment:
```suggestion
An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this tutorial. You are likely to get different results and find different bottlenecks on other hardware, but these tips and tricks should still help improve training speed for bottleneck components. A GPU is recommended for this example.
```
[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299712376

## File path: docs/tutorials/gluon/performance.md
@@ -0,0 +1,485 @@
+ + + + + + + + + + + + + + + + +
+# Gluon Performance Tips & Tricks
+
+Compared to traditional machine learning methods, the field of deep-learning has increased model accuracy across a wide range of tasks, but it has also increased the amount of computation required for model training and inference. Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution of networks, but it can sometimes be hard to write code that uses the hardware to its full potential. We will be looking at a few simple tips and trick in this tutorial that you can use to speed up training and ultimately save on training costs.
+
+We'll start by writing some code to train an image classification network for the CIFAR-10 dataset, and then benchmark the throughput of the network in terms of samples processed per second. After some performance analysis, we'll identify the bottlenecks (i.e. the components limiting throughput) and improve the training speed step-by-step. We'll bring together all the tips and tricks at the end and evaluate our performance gains.
+
+
+```python
+from __future__ import print_function
+import multiprocessing
+import time
+import mxnet as mx
+import numpy as np
+from PIL import Image

Review comment:
Great catch! Added that to perform a slow rotation augmentation (that isn't an MXNet transform), but changed to a `sleep` instead. Switched out for MXNet function, removed dependency, and re-ran the notebook.
[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299711755 ## File path: docs/tutorials/gluon/performance.md ## @@ -0,0 +1,485 @@ + + + + + + + + + + + + + + + + + +# Gluon Performance Tips & Tricks + +Compared to traditional machine learning methods, the field of deep-learning has increased model accuracy across a wide range of tasks, but it has also increased the amount of computation required for model training and inference. Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution of networks, but it can sometimes be hard to write code that uses the hardware to its full potential. We will be looking at a few simple tips and trick in this tutorial that you can use to speed up training and ultimately save on training costs. + +We'll start by writing some code to train an image classification network for the CIFAR-10 dataset, and then benchmark the throughput of the network in terms of samples processed per second. After some performance analysis, we'll identify the bottlenecks (i.e. the components limiting throughput) and improve the training speed step-by-step. We'll bring together all the tips and tricks at the end and evaluate our performance gains. + + +```python +from __future__ import print_function +import multiprocessing +import time +import mxnet as mx +import numpy as np +from PIL import Image +``` + +An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this tutorial. You are likely to get difference results and find different bottlenecks on other hardware, but these tips and tricks should still help improve training speed for bottleneck components. A GPU is recommended for this example. + + +```python +ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu() +print("Using {} context.".format(ctx)) +``` + +Using gpu(0) context. 
+ + +We'll use the `CIFAR10` dataset provided out-of-the-box with Gluon. + + +```python +dataset = mx.gluon.data.vision.CIFAR10(train=True) +print('{} samples'.format(len(dataset))) +``` + +5 samples + + +So we can learn how to identify training bottlenecks, let's intentionally introduce a short `sleep` into the data loading pipeline. We transform each 32x32 CIFAR-10 image to 244x244 so we can use it with the ResNet-50 network designed for ImageNet. [CIFAR-10 specific ResNet networks](https://gluon-cv.mxnet.io/api/model_zoo.html#gluoncv.model_zoo.get_cifar_resnet) exist but we use the more standard ImageNet variants in this example. + + +```python +def transform_fn(x): +image = Image.fromarray(x.asnumpy()) +time.sleep(0.01) # artificial slow-down +image = image.resize(size=(244, 244), resample=Image.BICUBIC) +return np.array(image).astype('float32').transpose((2, 0, 1)) + +dataset = dataset.transform_first(transform_fn) +``` + +Setting our batch size to 16, we can create the `DataLoader`. + + +```python +batch_size = 16 +dataloader = mx.gluon.data.DataLoader(dataset, + batch_size=batch_size, + shuffle=True, + last_batch="discard") +print('{} batches'.format(len(dataloader))) +``` + +3125 batches + + +Up next, we create all of the other components required for training, such as the network, the loss function, the evaluation metric and parameter trainer. + + +```python +net = mx.gluon.model_zoo.vision.resnet50_v2(pretrained=False, ctx=ctx) +net.initialize(mx.init.Xavier(magnitude=2.3), ctx=ctx) +loss_fn = mx.gluon.loss.SoftmaxCrossEntropyLoss() +metric = mx.metric.Accuracy() +learning_rate = 0.001 +trainer = mx.gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': learning_rate}) +``` + +## Initial Benchmark + +As a starting point, let's benchmark the throughput of our training loop: calculating the average samples per second across 25 iterations, where each iteration is a batch of 16 samples. 
We'll run a single forward pass through the network before starting our benchmark timer to avoid including shape inference and lazy initialization in the throughput calculations. + + +```python +def single_forward(net, dataloader, dtype='float32'): +data, label = next(iter(dataloader)) +data = data.astype(dtype) +data = data.as_in_context(ctx) +pred = net(data) +pred.wait_to_read() +``` + + +```python +single_forward(net, dataloader) +iters = 25 +num_samples = 0 +num_iters = 0 +start_time = time.time() +for iter_idx, (data, label) in enumerate(dataloader): +num_samples += data.shape[0] +num_iters += 1 +data = data.as_in_context(ctx) +label = label.as_in_context(ctx) +with mx.autograd.record(): +pred = net(data) +loss = loss_fn(pred, label) +loss.backward() +trainer.step(data.shape[0]) +metric.update(label, pred) +print('.', end='') +if num_iters >= ite
[GitHub] [incubator-mxnet] thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks
thomelane commented on a change in pull request #15427: [TUTORIAL] Gluon performance tips and tricks URL: https://github.com/apache/incubator-mxnet/pull/15427#discussion_r299711812 ## File path: docs/tutorials/gluon/performance.md ## @@ -0,0 +1,485 @@ + + + + + + + + + + + + + + + + + +# Gluon Performance Tips & Tricks + +Compared to traditional machine learning methods, the field of deep-learning has increased model accuracy across a wide range of tasks, but it has also increased the amount of computation required for model training and inference. Specialised hardware chips, such as GPUs and FPGAs, can speed up the execution of networks, but it can sometimes be hard to write code that uses the hardware to its full potential. We will be looking at a few simple tips and trick in this tutorial that you can use to speed up training and ultimately save on training costs. + +We'll start by writing some code to train an image classification network for the CIFAR-10 dataset, and then benchmark the throughput of the network in terms of samples processed per second. After some performance analysis, we'll identify the bottlenecks (i.e. the components limiting throughput) and improve the training speed step-by-step. We'll bring together all the tips and tricks at the end and evaluate our performance gains. + + +```python +from __future__ import print_function +import multiprocessing +import time +import mxnet as mx +import numpy as np +from PIL import Image +``` + +An Amazon EC2 p3.2xlarge instance was used to benchmark the code in this tutorial. You are likely to get difference results and find different bottlenecks on other hardware, but these tips and tricks should still help improve training speed for bottleneck components. A GPU is recommended for this example. + + +```python +ctx = mx.gpu() if mx.test_utils.list_gpus() else mx.cpu() +print("Using {} context.".format(ctx)) +``` + +Using gpu(0) context. 
+ + +We'll use the `CIFAR10` dataset provided out-of-the-box with Gluon. + + +```python +dataset = mx.gluon.data.vision.CIFAR10(train=True) +print('{} samples'.format(len(dataset))) +``` + +5 samples + + +So we can learn how to identify training bottlenecks, let's intentionally introduce a short `sleep` into the data loading pipeline. We transform each 32x32 CIFAR-10 image to 244x244 so we can use it with the ResNet-50 network designed for ImageNet. [CIFAR-10 specific ResNet networks](https://gluon-cv.mxnet.io/api/model_zoo.html#gluoncv.model_zoo.get_cifar_resnet) exist but we use the more standard ImageNet variants in this example. + + +```python +def transform_fn(x): +image = Image.fromarray(x.asnumpy()) +time.sleep(0.01) # artificial slow-down +image = image.resize(size=(244, 244), resample=Image.BICUBIC) +return np.array(image).astype('float32').transpose((2, 0, 1)) + +dataset = dataset.transform_first(transform_fn) +``` + +Setting our batch size to 16, we can create the `DataLoader`. + + +```python +batch_size = 16 +dataloader = mx.gluon.data.DataLoader(dataset, + batch_size=batch_size, + shuffle=True, + last_batch="discard") +print('{} batches'.format(len(dataloader))) +``` + +3125 batches + + +Up next, we create all of the other components required for training, such as the network, the loss function, the evaluation metric and parameter trainer. + + +```python +net = mx.gluon.model_zoo.vision.resnet50_v2(pretrained=False, ctx=ctx) +net.initialize(mx.init.Xavier(magnitude=2.3), ctx=ctx) +loss_fn = mx.gluon.loss.SoftmaxCrossEntropyLoss() +metric = mx.metric.Accuracy() +learning_rate = 0.001 +trainer = mx.gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': learning_rate}) +``` + +## Initial Benchmark + +As a starting point, let's benchmark the throughput of our training loop: calculating the average samples per second across 25 iterations, where each iteration is a batch of 16 samples. 
We'll run a single forward pass through the network before starting our benchmark timer to avoid including shape inference and lazy initialization in the throughput calculations. + + +```python +def single_forward(net, dataloader, dtype='float32'): +data, label = next(iter(dataloader)) +data = data.astype(dtype) +data = data.as_in_context(ctx) +pred = net(data) +pred.wait_to_read() +``` + + +```python +single_forward(net, dataloader) +iters = 25 +num_samples = 0 +num_iters = 0 +start_time = time.time() +for iter_idx, (data, label) in enumerate(dataloader): +num_samples += data.shape[0] +num_iters += 1 +data = data.as_in_context(ctx) +label = label.as_in_context(ctx) +with mx.autograd.record(): +pred = net(data) +loss = loss_fn(pred, label) +loss.backward() +trainer.step(data.shape[0]) +metric.update(label, pred) +print('.', end='') +if num_iters >= ite
[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API
sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299706065

## File path: amalgamation/python/mxnet_predict.py
@@ -160,10 +249,18 @@ def forward(self, **kwargs):
         >>> predictor.forward(data=mydata)
         >>> out = predictor.get_output(0)
         """
+        if self.type_dict and len(self.type_dict) != len(kwargs.items()):
+            raise ValueError("number of kwargs should be same as len of type_dict" \
+                             "Please check your forward pass inputs" \
+                             "or type_dict passed to Predictor instantiation")
+
         for k, v in kwargs.items():
             if not isinstance(v, np.ndarray):
                 raise ValueError("Expect numpy ndarray as input")
-            v = np.asarray(v, dtype=np.float32, order='C')
+            if self.type_dict and k in self.type_dict:
+                v = np.asarray(v, dtype=self.type_dict[k], order='C')
+            else:
+                v = np.asarray(v, dtype=np.float32, order='C')

Review comment:
If the user-provided type is not supported by MXNet, it is silently converted to FP32. Is this expected?
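One way to avoid the silent fallback questioned here is to reject dtypes that have no mapping instead of defaulting to float32. The sketch below is illustrative only: `SUPPORTED_DTYPES` and `resolve_dtype` are hypothetical names, the table is a small stand-in rather than the PR's `_DTYPE_NP_TO_MX`, and plain strings are used in place of NumPy types to keep it self-contained.

```python
# Hypothetical stand-in for a dtype table; the integer codes are illustrative.
SUPPORTED_DTYPES = {"float32": 0, "float64": 1, "float16": 2}


def resolve_dtype(name, requested):
    """Return the requested dtype, raising if it is not in the table."""
    if requested not in SUPPORTED_DTYPES:
        raise ValueError(
            "Unsupported dtype {!r} for input {!r}; expected one of {}".format(
                requested, name, sorted(SUPPORTED_DTYPES)))
    return requested


print(resolve_dtype("data", "float16"))
try:
    resolve_dtype("data", "complex64")
except ValueError as err:
    print("rejected:", err)
```

Failing loudly trades convenience for predictability: the caller learns at the call site that the requested precision is not being honoured.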
[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API
sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299709898

## File path: src/c_api/c_predict_api.cc
@@ -444,6 +538,20 @@ int MXPredGetOutputShape(PredictorHandle handle,
   API_END();
 }

+int MXPredGetOutputType(PredictorHandle handle,
+                        mx_uint out_index,
+                        int* out_dtype) {
+  MXAPIPredictor* p = static_cast<MXAPIPredictor*>(handle);
+  API_BEGIN();
+  CHECK_LT(out_index, p->out_arrays.size())
+    << "Index exceed number of outputs";

Review comment:
nit: Can we make the message easier to comprehend?
[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API
sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299709638

## File path: src/c_api/c_predict_api.cc
@@ -210,19 +249,31 @@ int _CreatePartialOut(const char* symbol_json_str,
   std::vector<NDArray> arg_arrays, aux_arrays;
   for (size_t i = 0; i < arg_shapes.size(); ++i) {
-    NDArray nd = NDArray(arg_shapes[i], ctx);
+    NDArray nd;
+    if (result_arg_types[i] != -1) {
+      nd = NDArray(arg_shapes[i], ctx, false, result_arg_types[i]);
+    } else {
+      nd = NDArray(arg_shapes[i], ctx);
+    }
     if (arg_params.count(arg_names[i]) != 0) {
       CopyFromTo(arg_params[arg_names[i]], &nd);
     }
     arg_arrays.push_back(nd);
   }
+
   for (size_t i = 0; i < aux_shapes.size(); ++i) {
-    NDArray nd = NDArray(aux_shapes[i], ctx);
+    NDArray nd;
+    if (result_aux_types[i] != -1) {

Review comment:
Since we are doing this kind of type check throughout, could we instead set a default DType for all params when users do not provide arg types? Then we could get rid of all these checks, and types would always be available. Thoughts?
[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API
sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299707221

## File path: include/mxnet/c_predict_api.h
@@ -85,6 +85,44 @@ MXNET_DLL int MXPredCreate(const char* symbol_json_str,
                            const mx_uint* input_shape_data,
                            PredictorHandle* out);

+/*!
+ * \brief create a predictor
+ * \param symbol_json_str The JSON string of the symbol.
+ * \param param_bytes The in-memory raw bytes of parameter ndarray file.
+ * \param param_size The size of parameter ndarray file.
+ * \param dev_type The device type, 1: cpu, 2: gpu
+ * \param dev_id The device id of the predictor.
+ * \param num_input_nodes Number of input nodes to the net.
+ *    For feedforward net, this is 1.
+ * \param input_keys The name of the input argument.
+ *    For feedforward net, this is {"data"}
+ * \param input_shape_indptr Index pointer of shapes of each input node.
+ *    The length of this array = num_input_nodes + 1.
+ *    For feedforward net that takes 4 dimensional input, this is {0, 4}.
+ * \param input_shape_data A flattened data of shapes of each input node.
+ *    For feedforward net that takes 4 dimensional input, this is the shape data.
+ * \param num_provided_arg_dtypes
+ *    The length of provided_arg_dtypes.
+ * \param provided_arg_dtype_names
+ *    The provided_arg_dtype_names the names of args for which dtypes are provided.
+ * \param provided_arg_dtypes
+ *    The provided_arg_dtypes the dtype provided
+ * \param out The created predictor handle.
+ * \return 0 when success, -1 when failure.
+ */
+MXNET_DLL int MXPredCreateEx(const char* symbol_json_str,
+                             const void* param_bytes,
+                             int param_size,
+                             int dev_type, int dev_id,
+                             mx_uint num_input_nodes,

Review comment:
Why are these params non-constant?
[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API
sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299706532

## File path: amalgamation/python/mxnet_predict.py
@@ -160,10 +249,18 @@ def forward(self, **kwargs):
         >>> predictor.forward(data=mydata)
         >>> out = predictor.get_output(0)
         """
+        if self.type_dict and len(self.type_dict) != len(kwargs.items()):
+            raise ValueError("number of kwargs should be same as len of type_dict" \
+                             "Please check your forward pass inputs" \
+                             "or type_dict passed to Predictor instantiation")
+
         for k, v in kwargs.items():
             if not isinstance(v, np.ndarray):
                 raise ValueError("Expect numpy ndarray as input")
-            v = np.asarray(v, dtype=np.float32, order='C')
+            if self.type_dict and k in self.type_dict:
+                v = np.asarray(v, dtype=self.type_dict[k], order='C')
+            else:
+                v = np.asarray(v, dtype=np.float32, order='C')

Review comment:
nit: It would be better to keep all the dtypes, including the default (i.e., np.float32), in the map you are maintaining and remove all explicit np.dtype uses.
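The suggestion to keep the default in the map could look roughly like the following. This is a sketch under assumptions: `DEFAULT_DTYPE` and `build_type_map` are hypothetical names, the input names are made up for illustration, and strings stand in for NumPy dtypes.

```python
DEFAULT_DTYPE = "float32"  # assumed default, mirroring the fallback in the quoted diff


def build_type_map(input_names, type_dict=None):
    """Give every input an explicit dtype so callers never branch on a missing key."""
    type_dict = type_dict or {}
    return {name: type_dict.get(name, DEFAULT_DTYPE) for name in input_names}


type_map = build_type_map(["data", "softmax_label"], {"data": "float16"})
print(type_map)
```

With the map fully populated up front, the conversion loop can always look up `type_map[k]`, and the `if ... else` around the two `np.asarray` calls is no longer needed.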
[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API
sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299708932

## File path: src/c_api/c_predict_api.cc
@@ -187,21 +206,41 @@ int _CreatePartialOut(const char* symbol_json_str,
   try {
     mxnet::ShapeVector in_shapes;
+    nnvm::DTypeVector in_types;
     for (std::string key : sym.ListInputNames(Symbol::kAll)) {
       if (known_shape.count(key) != 0) {
         in_shapes.push_back(known_shape[key]);
       } else {
         in_shapes.emplace_back();
       }
     }
+
+    for (std::string key : sym.ListInputNames(Symbol::kAll)) {
+      if (arg_types.count(key) != 0) {
+        in_types.push_back(arg_types[key]);
+      } else if (aux_types.count(key) != 0) {
+        in_types.push_back(aux_types[key]);
+      }
+    }
     nnvm::Graph g;
     g.outputs = sym.outputs;
     g = mxnet::exec::InferShape(std::move(g), std::move(in_shapes), "__shape__");
+    g = mxnet::exec::InferType(std::move(g), std::move(in_types), "__dtype__");
     bool infer_complete = (g.GetAttr("shape_num_unknown_nodes") == 0);
+    // This is tricky for AMP Use case, for example, with only weights input types
+    // cannot be inferred in AMP. Thus for AMP converted model type_dict will be
+    // required
+    bool infer_type_complete = (g.GetAttr("dtype_num_unknown_nodes") == 0);
     CHECK(infer_complete)
       << "The shape information of is not enough to get the shapes";
+    CHECK(infer_type_complete)
+      << "The type information is not enough, please provide input arg_types "
+         "with provided_arg_dtype_names and provided_arg_dtypes";

Review comment:
I think this will not be clear to an MXNet user: they are not setting any provided_arg_dtype_names or provided_arg_dtypes parameters, so if something fails, how would they debug it?
[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API
sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299699493

## File path: amalgamation/python/mxnet_predict.py
@@ -25,17 +25,74 @@
 import os
 import sys
+from array import array
 import ctypes
 import logging
 import numpy as np

+# pylint: disable= no-member
+_DTYPE_NP_TO_MX = {
+    None: -1,

Review comment:
Should None be Float32 by default?
[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API
sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299705745

## File path: amalgamation/python/mxnet_predict.py
@@ -160,10 +249,18 @@ def forward(self, **kwargs):
         >>> predictor.forward(data=mydata)
         >>> out = predictor.get_output(0)
         """
+        if self.type_dict and len(self.type_dict) != len(kwargs.items()):
+            raise ValueError("number of kwargs should be same as len of type_dict" \
+                             "Please check your forward pass inputs" \
+                             "or type_dict passed to Predictor instantiation")
+
         for k, v in kwargs.items():
             if not isinstance(v, np.ndarray):
                 raise ValueError("Expect numpy ndarray as input")
-            v = np.asarray(v, dtype=np.float32, order='C')
+            if self.type_dict and k in self.type_dict:
+                v = np.asarray(v, dtype=self.type_dict[k], order='C')

Review comment:
Can you help me understand the importance of column-major order here?
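For context on the question above: in NumPy, `order='C'` requests row-major (C-style) layout, while `order='F'` is column-major (Fortran-style). The pure-Python sketch below, with hypothetical helper names, shows how the two orders flatten the same 2x3 matrix differently, which is why code that hands a raw buffer to a C library needs a known layout.

```python
def flatten_row_major(matrix):
    """C order: walk each row left to right, so the last index varies fastest."""
    return [value for row in matrix for value in row]


def flatten_column_major(matrix):
    """Fortran order: walk each column top to bottom, so the first index varies fastest."""
    return [row[col] for col in range(len(matrix[0])) for row in matrix]


matrix = [[1, 2, 3],
          [4, 5, 6]]
print(flatten_row_major(matrix))     # [1, 2, 3, 4, 5, 6]
print(flatten_column_major(matrix))  # [1, 4, 2, 5, 3, 6]
```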
[GitHub] [incubator-mxnet] sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API
sandeep-krishnamurthy commented on a change in pull request #15245: FP16 Support for C Predict API
URL: https://github.com/apache/incubator-mxnet/pull/15245#discussion_r299704452

## File path: amalgamation/python/mxnet_predict.py
@@ -133,15 +199,38 @@ def __init__(self, symbol_file,
         handle = PredictorHandle()
         param_raw_bytes = bytearray(param_raw_bytes)
         ptr = (ctypes.c_char * len(param_raw_bytes)).from_buffer(param_raw_bytes)
-        _check_call(_LIB.MXPredCreate(
+
+        # data types
+        num_provided_arg_types = 0
+        # provided type argument names
+        provided_arg_type_names = ctypes.POINTER(ctypes.c_char_p)()
+        # provided types
+        provided_arg_type_data = ctypes.POINTER(mx_uint)()
+        if type_dict is not None:
+            provided_arg_type_names = []
+            provided_arg_type_data = []
+            for k, v in type_dict.items():
+                v = np.dtype(v).type
+                if v in _DTYPE_NP_TO_MX:
+                    provided_arg_type_names.append(k)

Review comment:
Are we depending on the index of the element here? I remember we had issues in the past from relying on position, because of differences in how Python 2 and Python 3 maintain element order. Should we use a map (name -> type) here instead?
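The concern about positional dependence can be sidestepped by deriving both lists from the same sequence of pairs in one pass. A minimal sketch, assuming a hypothetical `split_type_dict` helper, strings in place of NumPy dtypes, and sorting by name so the order is the same on every Python version:

```python
def split_type_dict(type_dict):
    """Return parallel (names, dtypes) lists whose positions are guaranteed to match."""
    pairs = sorted(type_dict.items())          # deterministic order by argument name
    names = [name for name, _ in pairs]
    dtypes = [dtype for _, dtype in pairs]
    return names, dtypes


names, dtypes = split_type_dict({"weight": "float16", "data": "float32"})
print(names, dtypes)
```

Because both lists come from the same `pairs` sequence, entry `i` of `names` always describes entry `i` of `dtypes`, regardless of how the dictionary itself orders its keys.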
[GitHub] [incubator-mxnet] apeforest commented on issue #15288: [MXNET-978] Higher order gradient for sigmoid
apeforest commented on issue #15288: [MXNET-978] Higher order gradient for sigmoid
URL: https://github.com/apache/incubator-mxnet/pull/15288#issuecomment-507862358

@kshitij12345 could you approve the PR if everything looks good to you now? thx
[GitHub] [incubator-mxnet] IvyBazan opened a new issue #15447: C API doxygen broken links
IvyBazan opened a new issue #15447: C API doxygen broken links
URL: https://github.com/apache/incubator-mxnet/issues/15447

URL - https://mxnet.incubator.apache.org/versions/master/doxygen/group__c__api__attributes.html
Broken Links
─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_0.png (HTTP_404)

URL - https://mxnet.incubator.apache.org/versions/master/doxygen/group__c__api__convolution.html
Broken Links
─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_1.png (HTTP_404)
─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_2.png (HTTP_404)
─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_3.png (HTTP_404)
─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_4.png (HTTP_404)
─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_5.png (HTTP_404)
─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_6.png (HTTP_404)
─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_7.png (HTTP_404)

URL - https://mxnet.incubator.apache.org/versions/master/doxygen/group__c__api__softmax.html
Broken Links
─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_8.png (HTTP_404)
─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_9.png (HTTP_404)

URL - https://mxnet.incubator.apache.org/versions/master/doxygen/group__c__api__pooling.html
Broken Links
─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_10.png (HTTP_404)
─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_11.png (HTTP_404)
─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_7.png (HTTP_404)

URL - https://mxnet.incubator.apache.org/versions/master/doxygen/group__c__api__lrn.html
Broken Links
─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_12.png (HTTP_404)
─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_13.png (HTTP_404)
─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_14.png (HTTP_404)

URL - https://mxnet.incubator.apache.org/versions/master/doxygen/group__c__api__batch__normalization.html
Broken Links
─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_15.png (HTTP_404)
─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_16.png (HTTP_404)
─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_17.png (HTTP_404)
─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_18.png (HTTP_404)

URL - https://mxnet.incubator.apache.org/versions/master/doxygen/group__c__api__inner__product.html
Broken Links
─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_19.png (HTTP_404)

URL - https://mxnet.incubator.apache.org/versions/master/doxygen/group__c__api__rnn.html
Broken Links
─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_0.png (HTTP_404)

URL - https://mxnet.incubator.apache.org/versions/master/doxygen/group__c__api__types__generic.html
Broken Links
─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_20.png (HTTP_404)
─ https://mxnet.incubator.apache.org/versions/master/doxygen/form_21.png (HTTP_404)
[GitHub] [incubator-mxnet] mxnet-label-bot commented on issue #15447: C API doxygen broken links
mxnet-label-bot commented on issue #15447: C API doxygen broken links URL: https://github.com/apache/incubator-mxnet/issues/15447#issuecomment-507857022 Hey, this is the MXNet Label Bot. Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it. Here are my recommended labels: Doc
[GitHub] [incubator-mxnet] IvyBazan commented on issue #15446: Clojure NDArray broken link
IvyBazan commented on issue #15446: Clojure NDArray broken link
URL: https://github.com/apache/incubator-mxnet/issues/15446#issuecomment-507856166

URL - https://mxnet.incubator.apache.org/versions/master/api/clojure/docs/org.apache.clojure-mxnet.symbol-api.html
Broken Links
─ https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html%3E (HTTP_404)
─ https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html%3E (HTTP_404)
[GitHub] [incubator-mxnet] mxnet-label-bot commented on issue #15446: Clojure NDArray broken link
mxnet-label-bot commented on issue #15446: Clojure NDArray broken link URL: https://github.com/apache/incubator-mxnet/issues/15446#issuecomment-507856108 Hey, this is the MXNet Label Bot. Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it. Here are my recommended labels: Doc
[GitHub] [incubator-mxnet] IvyBazan opened a new issue #15446: Clojure NDArray broken link
IvyBazan opened a new issue #15446: Clojure NDArray broken link
URL: https://github.com/apache/incubator-mxnet/issues/15446

URL - https://mxnet.incubator.apache.org/versions/master/api/clojure/docs/org.apache.clojure-mxnet.ndarray-api.html
Broken Links
─ https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html%3E (HTTP_404)
─ https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html%3E (HTTP_404)

URL should be: https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
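The broken link in this report differs from the intended one only by a trailing `%3E`, the percent-encoded form of `>`, which suggests a closing angle bracket was swallowed into the URL somewhere in the doc pipeline. As a minimal sketch of how such links could be spotted and normalized (the function name `strip_trailing_gt` is hypothetical and not part of MXNet or its doc tooling):

```python
import re

# Matches one or more trailing ">" characters, raw or percent-encoded,
# at the very end of the string. Case-insensitive so "%3e" also matches.
_TRAILING_GT = re.compile(r"(?:%3E|>)+$", re.IGNORECASE)


def strip_trailing_gt(url):
    """Return url with any trailing '>' or '%3E' removed.

    Other percent-escapes in the URL are left untouched.
    """
    return _TRAILING_GT.sub("", url)


broken = "https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html%3E"
fixed = strip_trailing_gt(broken)
```

This only repairs the symptom in a link list; the underlying fix would be in whatever source text produced the stray bracket.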
[GitHub] [incubator-mxnet] mxnet-label-bot commented on issue #15445: MXNet export broken link
mxnet-label-bot commented on issue #15445: MXNet export broken link URL: https://github.com/apache/incubator-mxnet/issues/15445#issuecomment-507854516 Hey, this is the MXNet Label Bot. Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it. Here are my recommended labels: Doc
[GitHub] [incubator-mxnet] IvyBazan opened a new issue #15445: MXNet export broken link
IvyBazan opened a new issue #15445: MXNet export broken link
URL: https://github.com/apache/incubator-mxnet/issues/15445

URL - https://mxnet.incubator.apache.org/tutorials/onnx/export_mxnet_to_onnx.html
Broken Links
─ http://data.mxnet.io/models/imagenet/ (HTTP_404)