[incubator-mxnet] branch master updated: fixed practioners to practitioners (#8534)

2017-11-03 Thread jxie
This is an automated email from the ASF dual-hosted git repository.

jxie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new 53e891b  fixed practioners to practitioners (#8534)
53e891b is described below

commit 53e891b7bd9b417a95ccc8a0ed72dfd676397ed4
Author: thinksanky <31976455+thinksa...@users.noreply.github.com>
AuthorDate: Fri Nov 3 22:17:39 2017 -0700

fixed practioners to practitioners (#8534)
---
 docs/_static/mxnet-theme/index.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/_static/mxnet-theme/index.html 
b/docs/_static/mxnet-theme/index.html
index 7e645ef..da5cea6 100644
--- a/docs/_static/mxnet-theme/index.html
+++ b/docs/_static/mxnet-theme/index.html
@@ -6,7 +6,7 @@
 is a flexible and efficient library for deep 
learning.
 
 Building a high-performance deep learning library requires 
many system-level design decisions. In this design note, we share the rationale 
for the specific 
-choices made when designing MXNet. We imagine that these 
insights may be useful to both deep learning practioners and builders of other 
deep learning systems.
+choices made when designing MXNet. We imagine that these 
insights may be useful to both deep learning practitioners and builders of 
other deep learning systems.
 
 
 

-- 
To stop receiving notification emails like this one, please contact
"comm...@mxnet.apache.org".


[GitHub] piiswrong closed pull request #8534: fixed practioners to practitioners

2017-11-03 Thread GitBox
piiswrong closed pull request #8534: fixed practioners to practitioners
URL: https://github.com/apache/incubator-mxnet/pull/8534
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/docs/_static/mxnet-theme/index.html 
b/docs/_static/mxnet-theme/index.html
index 7e645efd8d..da5cea6f95 100644
--- a/docs/_static/mxnet-theme/index.html
+++ b/docs/_static/mxnet-theme/index.html
@@ -6,7 +6,7 @@
 is a flexible and efficient library for deep 
learning.
 
 Building a high-performance deep learning library requires 
many system-level design decisions. In this design note, we share the rationale 
for the specific 
-choices made when designing MXNet. We imagine that these 
insights may be useful to both deep learning practioners and builders of other 
deep learning systems.
+choices made when designing MXNet. We imagine that these 
insights may be useful to both deep learning practitioners and builders of 
other deep learning systems.
 
 
 


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] zhanghang1989 commented on issue #8494: Autograd bug in mxnet-cu80: 0.12

2017-11-03 Thread GitBox
zhanghang1989 commented on issue #8494: Autograd bug in mxnet-cu80: 0.12
URL: 
https://github.com/apache/incubator-mxnet/issues/8494#issuecomment-341869939
 
 
   Put all steps that require grads in this scope, including the model forward pass.
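   A minimal sketch of that pattern (the Dense model and L2Loss below are
   hypothetical, used only to make the scope visible):
   ```
   import mxnet as mx
   from mxnet import autograd, gluon

   net = gluon.nn.Dense(1)
   net.initialize()
   loss_fn = gluon.loss.L2Loss()

   x = mx.nd.random_normal(shape=(4, 8))
   y = mx.nd.random_normal(shape=(4, 1))

   with autograd.record():    # every step that needs gradients goes here,
       out = net(x)           # including the model forward pass
       loss = loss_fn(out, y)
   loss.backward()            # backward can happen outside the scope
   ```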
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] roggiezhang-nv commented on issue #8494: Autograd bug in mxnet-cu80: 0.12

2017-11-03 Thread GitBox
roggiezhang-nv commented on issue #8494: Autograd bug in mxnet-cu80: 0.12
URL: 
https://github.com/apache/incubator-mxnet/issues/8494#issuecomment-341868942
 
 
   I tried and that won't help.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] cjolivier01 closed pull request #8492: Fix for issue #8491, elemwise_mul nan behavior

2017-11-03 Thread GitBox
cjolivier01 closed pull request #8492: Fix for issue #8491, elemwise_mul nan 
behavior
URL: https://github.com/apache/incubator-mxnet/pull/8492
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/src/operator/tensor/elemwise_binary_op.cc 
b/src/operator/tensor/elemwise_binary_op.cc
index 00c5e10f24..931132b425 100644
--- a/src/operator/tensor/elemwise_binary_op.cc
+++ b/src/operator/tensor/elemwise_binary_op.cc
@@ -86,50 +86,5 @@ bool ElemwiseBinaryOp::BackwardUseInStorageType(const 
nnvm::NodeAttrs& attrs,
   return true;
 }
 
-bool ElemwiseBinaryOp::AllowLRDenseInputWithSparseOutputStorageType(const nnvm::NodeAttrs& attrs,
-                                                                    const int dev_mask,
-                                                                    DispatchMode* dispatch_mode,
-                                                                    std::vector<int> *in_attrs,
-                                                                    std::vector<int> *out_attrs) {
-  CHECK_EQ(in_attrs->size(), 2U) << " in operator " << attrs.name;
-  CHECK_EQ(out_attrs->size(), 1U) << " in operator " << attrs.name;
-  const auto& lhs_stype = in_attrs->at(0);
-  const auto& rhs_stype = in_attrs->at(1);
-  auto& out_stype = out_attrs->at(0);
-  bool dispatched = false;
-  const bool invalid_ctx = dev_mask != mshadow::cpu::kDevMask;
-  const auto dispatch_ex = invalid_ctx ? DispatchMode::kFComputeFallback :
-                           DispatchMode::kFComputeEx;
-  if (!dispatched && lhs_stype == kDefaultStorage && rhs_stype == kDefaultStorage) {
-    // dns, dns -> dns
-    dispatched = storage_type_assign(&out_stype, kDefaultStorage,
-                                     dispatch_mode, DispatchMode::kFCompute);
-  }
-  if (!dispatched) {
-    if ((lhs_stype == kRowSparseStorage && rhs_stype == kRowSparseStorage) ||
-        (lhs_stype == kRowSparseStorage && rhs_stype == kDefaultStorage) ||
-        (lhs_stype == kDefaultStorage && rhs_stype == kRowSparseStorage)) {
-      // rsp, rsp -> rsp
-      // rsp, dns -> rsp
-      // dns, rsp -> rsp
-      dispatched = storage_type_assign(&out_stype, kRowSparseStorage,
-                                       dispatch_mode, dispatch_ex);
-    } else if (lhs_stype == kCSRStorage && rhs_stype == kCSRStorage) {
-      dispatched = storage_type_assign(&out_stype, kCSRStorage,
-                                       dispatch_mode, dispatch_ex);
-    } else if (lhs_stype == kCSRStorage || rhs_stype == kCSRStorage) {
-      dispatched = storage_type_assign(&out_stype, kCSRStorage,
-                                       dispatch_mode, DispatchMode::kFComputeFallback);
-    }
-  }
-  if (!dispatched) {
-    dispatch_fallback(out_attrs, dispatch_mode);
-  }
-  if (*dispatch_mode == DispatchMode::kFComputeFallback) {
-    LogStorageFallback(attrs, dev_mask, in_attrs, out_attrs);
-  }
-  return true;
-}
-
 }  // namespace op
 }  // namespace mxnet
diff --git a/src/operator/tensor/elemwise_binary_op.h 
b/src/operator/tensor/elemwise_binary_op.h
index 6a1cc02c87..b8b5bd1390 100644
--- a/src/operator/tensor/elemwise_binary_op.h
+++ b/src/operator/tensor/elemwise_binary_op.h
@@ -273,11 +273,55 @@ class ElemwiseBinaryOp : public OpBase {
* \param out_attrs Output storage attributes
* \return true if handled
*/
+  template<bool rhs_dense_ok = true>
   static bool AllowLRDenseInputWithSparseOutputStorageType(const nnvm::NodeAttrs& attrs,
                                                            int dev_mask,
                                                            DispatchMode* dispatch_mode,
                                                            std::vector<int> *in_attrs,
-                                                           std::vector<int> *out_attrs);
+                                                           std::vector<int> *out_attrs) {
+    CHECK_EQ(in_attrs->size(), 2U) << " in operator " << attrs.name;
+    CHECK_EQ(out_attrs->size(), 1U) << " in operator " << attrs.name;
+    const auto& lhs_stype = in_attrs->at(0);
+    const auto& rhs_stype = in_attrs->at(1);
+    auto& out_stype = out_attrs->at(0);
+    bool dispatched = false;
+    const bool invalid_ctx = dev_mask != mshadow::cpu::kDevMask;
+    const auto dispatch_ex = invalid_ctx ? DispatchMode::kFComputeFallback :
+                             DispatchMode::kFComputeEx;
+    if (!dispatched && lhs_stype == kDefaultStorage && rhs_stype == kDefaultStorage) {
+      // dns, dns -> dns
+      dispatched = storage_type_assign(&out_stype, kDefaultStorage,
+                                       dispatch_mode, DispatchMode::kFCompute);
+    }
+    if (!dispatched) {
+      if ((lhs_stype == kRowSparseStorage && rhs_stype == kRowSparseStorage) ||
+          (rhs_dense_ok && lhs_stype ==

[incubator-mxnet] branch master updated: Fix for issue #8491, elemwise_mul nan behavior (#8492)

2017-11-03 Thread cjolivier01
This is an automated email from the ASF dual-hosted git repository.

cjolivier01 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new 52d809c  Fix for issue #8491, elemwise_mul nan behavior (#8492)
52d809c is described below

commit 52d809c9288e70927534dc783c439dfd65df7de7
Author: Chris Olivier 
AuthorDate: Fri Nov 3 20:29:06 2017 -0700

Fix for issue #8491, elemwise_mul nan behavior (#8492)

* Fix for issue #8491, elemwise_mul nan behavior
https://github.com/apache/incubator-mxnet/issues/8491

* Also make nan * 0 -> dense

* Also make nan * 0 -> dense

* Add Scala package dev tools for deploy (#8498)

* [scala] do not print op definition during compiling by default

* [scala] add dev tools for changing artifactId

* [scala] add scala make clean

* Getting rid of maybe_uninitialized warnings (#8318)

* Fix of log10_grad, log2_grad (#8502)

* add default hash to ndarray (#8476)

* bump up version (#8488)

* fix makenonlossgrad bug (#8508)

* fix expand_dims if axis< 0 (#8489)

* fix expand_dims if axis< 0

* Update test_operator.py

* Correct Initialization Description in Finetune Tutorial. (#8517)

* Trigger rebuild
---
 src/operator/tensor/elemwise_binary_op.cc   | 45 
 src/operator/tensor/elemwise_binary_op.h| 46 -
 src/operator/tensor/elemwise_binary_op_basic.cc |  7 ++--
 tests/python/unittest/test_sparse_operator.py   | 11 --
 4 files changed, 57 insertions(+), 52 deletions(-)
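
Why nan forces a dense result, sketched against the 0.12 sparse Python API
(the elemwise_mul/tostype/stype usage below is illustrative, not taken from the patch):

```
import numpy as np
import mxnet as mx

dense = mx.nd.array([[np.nan, 1.0], [2.0, 3.0]])                # may hold nan
sparse = mx.nd.array([[0.0, 0.0], [4.0, 0.0]]).tostype('row_sparse')

out = mx.nd.elemwise_mul(dense, sparse)
print(out.asnumpy())  # nan * 0 == nan, so unstored rows cannot be dropped
print(out.stype)      # expected 'default' (dense) once this fix is in
```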

diff --git a/src/operator/tensor/elemwise_binary_op.cc 
b/src/operator/tensor/elemwise_binary_op.cc
index 00c5e10..931132b 100644
--- a/src/operator/tensor/elemwise_binary_op.cc
+++ b/src/operator/tensor/elemwise_binary_op.cc
@@ -86,50 +86,5 @@ bool ElemwiseBinaryOp::BackwardUseInStorageType(const 
nnvm::NodeAttrs& attrs,
   return true;
 }
 
-bool ElemwiseBinaryOp::AllowLRDenseInputWithSparseOutputStorageType(const nnvm::NodeAttrs& attrs,
-                                                                    const int dev_mask,
-                                                                    DispatchMode* dispatch_mode,
-                                                                    std::vector<int> *in_attrs,
-                                                                    std::vector<int> *out_attrs) {
-  CHECK_EQ(in_attrs->size(), 2U) << " in operator " << attrs.name;
-  CHECK_EQ(out_attrs->size(), 1U) << " in operator " << attrs.name;
-  const auto& lhs_stype = in_attrs->at(0);
-  const auto& rhs_stype = in_attrs->at(1);
-  auto& out_stype = out_attrs->at(0);
-  bool dispatched = false;
-  const bool invalid_ctx = dev_mask != mshadow::cpu::kDevMask;
-  const auto dispatch_ex = invalid_ctx ? DispatchMode::kFComputeFallback :
-                           DispatchMode::kFComputeEx;
-  if (!dispatched && lhs_stype == kDefaultStorage && rhs_stype == kDefaultStorage) {
-    // dns, dns -> dns
-    dispatched = storage_type_assign(&out_stype, kDefaultStorage,
-                                     dispatch_mode, DispatchMode::kFCompute);
-  }
-  if (!dispatched) {
-    if ((lhs_stype == kRowSparseStorage && rhs_stype == kRowSparseStorage) ||
-        (lhs_stype == kRowSparseStorage && rhs_stype == kDefaultStorage) ||
-        (lhs_stype == kDefaultStorage && rhs_stype == kRowSparseStorage)) {
-      // rsp, rsp -> rsp
-      // rsp, dns -> rsp
-      // dns, rsp -> rsp
-      dispatched = storage_type_assign(&out_stype, kRowSparseStorage,
-                                       dispatch_mode, dispatch_ex);
-    } else if (lhs_stype == kCSRStorage && rhs_stype == kCSRStorage) {
-      dispatched = storage_type_assign(&out_stype, kCSRStorage,
-                                       dispatch_mode, dispatch_ex);
-    } else if (lhs_stype == kCSRStorage || rhs_stype == kCSRStorage) {
-      dispatched = storage_type_assign(&out_stype, kCSRStorage,
-                                       dispatch_mode, DispatchMode::kFComputeFallback);
-    }
-  }
-  if (!dispatched) {
-    dispatch_fallback(out_attrs, dispatch_mode);
-  }
-  if (*dispatch_mode == DispatchMode::kFComputeFallback) {
-    LogStorageFallback(attrs, dev_mask, in_attrs, out_attrs);
-  }
-  return true;
-}
-
 }  // namespace op
 }  // namespace mxnet
diff --git a/src/operator/tensor/elemwise_binary_op.h 
b/src/operator/tensor/elemwise_binary_op.h
index 6a1cc02..b8b5bd1 100644
--- a/src/operator/tensor/elemwise_binary_op.h
+++ b/src/operator/tensor/elemwise_binary_op.h
@@ -273,11 +273,55 @@ class ElemwiseBinaryOp : public OpBase {
* \param out_attrs Output storage attributes
* \return true if handled
*/
+  template<bool rhs_dense_ok = true>
   static bool AllowLRDenseInputWithSparseOutputStorageType(const 
nnvm::NodeAttrs& attrs,
 

[incubator-mxnet] branch master updated: Move OpenMP calculations into its own class (#8475)

2017-11-03 Thread cjolivier01
This is an automated email from the ASF dual-hosted git repository.

cjolivier01 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new 90ba62a  Move OpenMP calculations into its own class (#8475)
90ba62a is described below

commit 90ba62a9eae9061d3763d594a552a367e550fe4f
Author: Chris Olivier 
AuthorDate: Fri Nov 3 20:28:41 2017 -0700

Move OpenMP calculations into its own class (#8475)

* Different calls into OMP class

* Copy from 'tuner' branch

* Fix for _OPENMP not defined, class still exists

* fix bracket

* Added more comments

* fix condition when CSRNDArray is used in NDArrayIter with shuffle=True 
(#8374)

* fix condition when CSRNDArray is used in NDArrayIter with shuffle=True

* fix indent

* add unittest for invalid inputs

* replace AssertErr with NotImplErr

* update comment

* trigger ci

* Fix urllib bug affecting Python 3.x in the finetune tutorial (#8229)

* changed url references from dmlc to apache/incubator-mxnet

* added python 3 support for dataset download with urllib

* added python 3 support for dataset download with urllib

* removed unintended edits

* Build fix, add more documentation

* lint

* commiting v12 changess (#8478)

* Simplified unary/binary math operators (#8361) (#8361)

* bump up version (#8488)

* fix makenonlossgrad bug (#8508)

* fix expand_dims if axis< 0 (#8489)

* fix expand_dims if axis< 0

* Update test_operator.py

* Trigger build
---
 include/mxnet/engine.h  |  5 --
 src/engine/naive_engine.cc  | 14 +-
 src/engine/openmp.cc| 83 
 src/engine/openmp.h | 85 +
 src/engine/thread_pool.h|  4 +-
 src/engine/threaded_engine.h| 15 +-
 src/engine/threaded_engine_perdevice.cc |  6 +--
 src/nnvm/legacy_op_util.cc  |  6 ++-
 8 files changed, 181 insertions(+), 37 deletions(-)

diff --git a/include/mxnet/engine.h b/include/mxnet/engine.h
index 30aecbd..4048d5a 100644
--- a/include/mxnet/engine.h
+++ b/include/mxnet/engine.h
@@ -272,11 +272,6 @@ class MXNET_API Engine {
* \return Number of OMP threads that should be used per worker
*/
   virtual int num_omp_threads_per_worker() const = 0;
-
-  /*! \brief Set the number of OMP threads that should be used per worker
-   * \param num_threads_per_worker Number of OMP threads to be used per worker
-   */
-  virtual void set_num_omp_threads_per_worker(int num_omp_threads_per_worker) 
= 0;
 };  // class Engine
 #endif  // DMLC_USE_CXX11
 }  // namespace mxnet
diff --git a/src/engine/naive_engine.cc b/src/engine/naive_engine.cc
index a31101c..7e3554a 100644
--- a/src/engine/naive_engine.cc
+++ b/src/engine/naive_engine.cc
@@ -26,7 +26,7 @@
 #include 
 #include "./engine_impl.h"
 #include "./profiler.h"
-#include "threaded_engine.h"
+#include "./openmp.h"
 
 namespace mxnet {
 namespace engine {
@@ -47,7 +47,6 @@ class NaiveEngine final : public Engine {
   };
 
   NaiveEngine() {
-
set_num_omp_threads_per_worker(ThreadedEngine::DefaultOMPThreadsPerWorker());
   }
   // virtual destructor
   virtual ~NaiveEngine() {
@@ -193,14 +192,7 @@ class NaiveEngine final : public Engine {
* \return Number of OMP threads that should be used per worker
*/
   int num_omp_threads_per_worker() const override {
-return num_omp_threads_per_worker_;
-  }
-
-  /*! \brief Set the number of OMP threads that should be used per worker
-   * \param num_threads_per_worker Number of OMP threads to be used per worker
-   */
-  void set_num_omp_threads_per_worker(int num_threads_per_worker) override {
-num_omp_threads_per_worker_ = num_threads_per_worker;
+return OpenMP::Get()->GetRecommendedOMPThreadCount();
   }
 
  private:
@@ -218,8 +210,6 @@ class NaiveEngine final : public Engine {
   mshadow::Stream<mshadow::cpu> cpu_stream_;
   // GPU streams
   std::vector<mshadow::Stream<mshadow::gpu>*> streams_;
-  /*! \brief Number of OMP threads to be used per worker */
-  int num_omp_threads_per_worker_{0};
 };  // class NaiveEngine
 
 
diff --git a/src/engine/openmp.cc b/src/engine/openmp.cc
new file mode 100644
index 000..a605f97
--- /dev/null
+++ b/src/engine/openmp.cc
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the 

[GitHub] cjolivier01 closed pull request #8475: Move OpenMP calculations into its own class

2017-11-03 Thread GitBox
cjolivier01 closed pull request #8475: Move OpenMP calculations into its own 
class
URL: https://github.com/apache/incubator-mxnet/pull/8475
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/include/mxnet/engine.h b/include/mxnet/engine.h
index 30aecbdd8c..4048d5a1a3 100644
--- a/include/mxnet/engine.h
+++ b/include/mxnet/engine.h
@@ -272,11 +272,6 @@ class MXNET_API Engine {
* \return Number of OMP threads that should be used per worker
*/
   virtual int num_omp_threads_per_worker() const = 0;
-
-  /*! \brief Set the number of OMP threads that should be used per worker
-   * \param num_threads_per_worker Number of OMP threads to be used per worker
-   */
-  virtual void set_num_omp_threads_per_worker(int num_omp_threads_per_worker) 
= 0;
 };  // class Engine
 #endif  // DMLC_USE_CXX11
 }  // namespace mxnet
diff --git a/src/engine/naive_engine.cc b/src/engine/naive_engine.cc
index a31101c353..7e3554ab1c 100644
--- a/src/engine/naive_engine.cc
+++ b/src/engine/naive_engine.cc
@@ -26,7 +26,7 @@
 #include 
 #include "./engine_impl.h"
 #include "./profiler.h"
-#include "threaded_engine.h"
+#include "./openmp.h"
 
 namespace mxnet {
 namespace engine {
@@ -47,7 +47,6 @@ class NaiveEngine final : public Engine {
   };
 
   NaiveEngine() {
-
set_num_omp_threads_per_worker(ThreadedEngine::DefaultOMPThreadsPerWorker());
   }
   // virtual destructor
   virtual ~NaiveEngine() {
@@ -193,14 +192,7 @@ class NaiveEngine final : public Engine {
* \return Number of OMP threads that should be used per worker
*/
   int num_omp_threads_per_worker() const override {
-return num_omp_threads_per_worker_;
-  }
-
-  /*! \brief Set the number of OMP threads that should be used per worker
-   * \param num_threads_per_worker Number of OMP threads to be used per worker
-   */
-  void set_num_omp_threads_per_worker(int num_threads_per_worker) override {
-num_omp_threads_per_worker_ = num_threads_per_worker;
+return OpenMP::Get()->GetRecommendedOMPThreadCount();
   }
 
  private:
@@ -218,8 +210,6 @@ class NaiveEngine final : public Engine {
   mshadow::Stream<mshadow::cpu> cpu_stream_;
   // GPU streams
   std::vector<mshadow::Stream<mshadow::gpu>*> streams_;
-  /*! \brief Number of OMP threads to be used per worker */
-  int num_omp_threads_per_worker_{0};
 };  // class NaiveEngine
 
 
diff --git a/src/engine/openmp.cc b/src/engine/openmp.cc
new file mode 100644
index 00..a605f977b6
--- /dev/null
+++ b/src/engine/openmp.cc
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+#include 
+#include 
+#include 
+#include 
+#include "./openmp.h"
+
+namespace mxnet {
+namespace engine {
+
+#if defined(__i386__) || defined(_M_X86) || defined(_M_X64) || 
defined(__x86_64__)
+#define ARCH_IS_INTEL_X86
+#endif
+
+OpenMP *OpenMP::Get() {
+  static OpenMP openMP;
+  return &openMP;
+}
+
+OpenMP::OpenMP()
+  : omp_num_threads_set_in_environment(dmlc::GetEnv("OMP_NUM_THREADS", 
INT_MIN) == INT_MIN) {
+#ifdef _OPENMP
+  if (!omp_num_threads_set_in_environment) {
+omp_set_nested(true);
+omp_set_dynamic(false);
+  }
+  const int max = dmlc::GetEnv("MXNET_OMP_MAX_THREADS", INT_MIN);
+  if (max != INT_MIN) {
+omp_thread_max_ = max;
+  } else {
+#ifdef ARCH_IS_INTEL_X86
+omp_thread_max_ = omp_get_num_procs() >> 1;
+#endif
+  }
+#else
+  enabled_ = false;
+  omp_thread_max_ = 1;
+#endif
+}
+
+int OpenMP::GetRecommendedOMPThreadCount() const {
+#ifdef _OPENMP
+  if (omp_num_threads_set_in_environment) {
+return omp_get_max_threads();
+  }
+  if (enabled_) {
+#ifdef ARCH_IS_INTEL_X86
+    // x86 does hyperthreading, but due to cache issues, it's faster to only use # true CPUs
+const int thread_count = omp_get_max_threads() >> 1;
+#else
+const int thread_count = omp_get_max_threads();
+#endif
+if (!omp_thread_max_ || thread_count < omp_thread_max_) {
+  return thread_count;
+}
+return omp_thread_max_;
+  }
+  return 1;
+#else
+  return 1;
+#endif
+}
+
+}  // namespace engine
+}  // namespace mxnet
+
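
As a usage note on the quoted constructor logic: OMP_NUM_THREADS, when set, is
honored as-is; otherwise MXNET_OMP_MAX_THREADS caps the recommended count, which
on x86 defaults to half the processors because of hyperthreading. A hypothetical
way to exercise the cap (values here are arbitrary):

```
import os

os.environ['MXNET_OMP_MAX_THREADS'] = '4'   # cap MXNet's recommended OMP threads
# os.environ['OMP_NUM_THREADS'] = '8'       # or: let the OpenMP runtime decide

import mxnet as mx  # the engine consults these variables when it starts up
```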

[GitHub] squidszyd commented on issue #8531: How to get the correlation result of two feature maps?

2017-11-03 Thread GitBox
squidszyd commented on issue #8531: How to get the correlation result of two 
feature maps?
URL: 
https://github.com/apache/incubator-mxnet/issues/8531#issuecomment-341863402
 
 
   @zhreshold By correlation, I mean F2 acts like a correlation kernel (or 
convolution kernel) that slides on F1. For example,
   ```
1 1 1 2 2
   F1 = 2 3 4 1 1
0 0 0 2 3 
   
0 1 0
   F2 = 1 0 1
0 1 0
   ``` 
   Then, the correlation result should be
   ```
   R = F1 * F2 = 7 5 9
   ```
   where 
   ```
   7 = 1 + 2 + 4 + 0
   5 = 1 + 3 + 1 + 0
   9 = 2 + 4 + 1 + 2
   ```
   In the above example, stride = 1, pad = 0, dilate = 0, thus there are three 
correlation positions at F1
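   The same computation in plain NumPy (stride 1, no padding), for reference:
   ```
   import numpy as np

   F1 = np.array([[1, 1, 1, 2, 2],
                  [2, 3, 4, 1, 1],
                  [0, 0, 0, 2, 3]])
   F2 = np.array([[0, 1, 0],
                  [1, 0, 1],
                  [0, 1, 0]])

   kw = F2.shape[1]
   # slide F2 across F1 and take the elementwise-product sum at each position
   R = np.array([(F1[:, j:j + kw] * F2).sum()
                 for j in range(F1.shape[1] - kw + 1)])
   print(R)  # [7 5 9]
   ```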


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] masahi commented on issue #8537: [Gluon] F.pad(...) doesn't accept symbol input even in HybridBlock?

2017-11-03 Thread GitBox
masahi commented on issue #8537: [Gluon] F.pad(...) doesn't accept symbol input 
even in HybridBlock?
URL: 
https://github.com/apache/incubator-mxnet/issues/8537#issuecomment-341861303
 
 
   cc @zhanghang1989 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] masahi opened a new issue #8537: [Gluon] F.pad(...) doesn't accept symbol input even in HybridBlock?

2017-11-03 Thread GitBox
masahi opened a new issue #8537: [Gluon] F.pad(...) doesn't accept symbol input 
even in HybridBlock?
URL: https://github.com/apache/incubator-mxnet/issues/8537
 
 
   Hi, I am trying to dump a json string from 
[this](https://github.com/apache/incubator-mxnet/tree/master/example/gluon/style_transfer)
 very nice style transfer model, written in gluon API.
   
   I modified the net definition so that all blocks inherit from HybridBlock 
and use HybridSequential. The modified one is 
[here](https://gist.github.com/masahi/9c70bfc86fe0d63b1b690d991f5e8916).
   
   I think this is enough to dump a json from gluon model. But when I try this,
   
   ```
   style_model = net.Net(ngf=args.ngf)
   x = mx.sym.var("x")
   y = style_model(x)
   y.save("style_model.json")
   ```
   
   It gives me the following error:
   ```
   Traceback (most recent call last):
 File "main.py", line 234, in 
   main()
 File "main.py", line 223, in main
   evaluate(args)
 File "main.py", line 146, in evaluate
   y = style_model(x)
 File 
"C:\Users\masah\Anaconda3\lib\site-packages\mxnet-0.12.0-py3.6.egg\mxnet\gluon\block.py",
 line 290, in __call__
   return self.forward(*args)
 File "D:\project\dev\incubator-mxnet\example\gluon\style_transfer\net.py", 
line 231, in forward
   return self.model(input)
 File 
"C:\Users\masah\Anaconda3\lib\site-packages\mxnet-0.12.0-py3.6.egg\mxnet\gluon\block.py",
 line 290, in __call__
   return self.forward(*args)
 File 
"C:\Users\masah\Anaconda3\lib\site-packages\mxnet-0.12.0-py3.6.egg\mxnet\gluon\block.py",
 line 481, in forward
   return self.hybrid_forward(symbol, x, *args, **params)
 File 
"C:\Users\masah\Anaconda3\lib\site-packages\mxnet-0.12.0-py3.6.egg\mxnet\gluon\nn\basic_layers.py",
 line 107, in hybrid_forward
   x = block(x)
 File 
"C:\Users\masah\Anaconda3\lib\site-packages\mxnet-0.12.0-py3.6.egg\mxnet\gluon\block.py",
 line 290, in __call__
   return self.forward(*args)
 File 
"C:\Users\masah\Anaconda3\lib\site-packages\mxnet-0.12.0-py3.6.egg\mxnet\gluon\block.py",
 line 481, in forward
   return self.hybrid_forward(symbol, x, *args, **params)
 File 
"C:\Users\masah\Anaconda3\lib\site-packages\mxnet-0.12.0-py3.6.egg\mxnet\gluon\nn\basic_layers.py",
 line 107, in hybrid_forward
   x = block(x)
 File 
"C:\Users\masah\Anaconda3\lib\site-packages\mxnet-0.12.0-py3.6.egg\mxnet\gluon\block.py",
 line 290, in __call__
   return self.forward(*args)
 File "D:\project\dev\incubator-mxnet\example\gluon\style_transfer\net.py", 
line 139, in forward
   x = self.pad(x)
 File 
"C:\Users\masah\Anaconda3\lib\site-packages\mxnet-0.12.0-py3.6.egg\mxnet\gluon\block.py",
 line 290, in __call__
   return self.forward(*args)
 File "D:\project\dev\incubator-mxnet\example\gluon\style_transfer\net.py", 
line 60, in forward
   return F.pad(x, mode='reflect', pad_width=self.pad_width)
 File "", line 112, in pad
   AssertionError: Argument data must have NDArray type, but got 
   ```
   
   The error is occurring at 
[here](https://gist.github.com/masahi/9c70bfc86fe0d63b1b690d991f5e8916#file-net_modified-py-L60),
 inside HybridBlock.
   
   Am I missing something, or is F.pad(...) indeed supposed to only accept 
NDArray inputs?
   
   Thanks
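   
   For reference, here is a minimal sketch of the pattern that does serialize
   (the ReflectPad block is hypothetical). Judging from the traceback, one
   likely cause is a block that overrides forward() directly with F fixed to
   mx.nd; keeping the pad inside hybrid_forward lets F bind to mx.sym for
   Symbol inputs:
   ```
   import mxnet as mx
   from mxnet.gluon import HybridBlock

   class ReflectPad(HybridBlock):
       def __init__(self, pad_width, **kwargs):
           super(ReflectPad, self).__init__(**kwargs)
           self.pad_width = pad_width

       def hybrid_forward(self, F, x):
           # F is mx.nd for NDArray inputs and mx.sym for Symbol inputs,
           # so the same line works imperatively and symbolically
           return F.pad(x, mode='reflect', pad_width=self.pad_width)

   pad = ReflectPad((0, 0, 0, 0, 1, 1, 1, 1))  # 4D input: pad H and W by 1
   y = pad(mx.sym.var("x"))
   y.save("pad.json")
   ```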


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha closed issue #7307: resnet

2017-11-03 Thread GitBox
szha closed issue #7307: resnet
URL: https://github.com/apache/incubator-mxnet/issues/7307
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha closed issue #7335: Exception in threads kills entire process

2017-11-03 Thread GitBox
szha closed issue #7335: Exception in threads kills entire process
URL: https://github.com/apache/incubator-mxnet/issues/7335
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha closed issue #7333: Imperative Programs Tend to be More Flexible (but do not make much sense ;)

2017-11-03 Thread GitBox
szha closed issue #7333: Imperative Programs Tend to be More Flexible (but do 
not make much sense ;)
URL: https://github.com/apache/incubator-mxnet/issues/7333
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha closed issue #6475: Error in .infer_shape for deconvolution when the input is 5D batch-num_filter-z-y-x

2017-11-03 Thread GitBox
szha closed issue #6475: Error in .infer_shape for deconvolution when the input 
is 5D batch-num_filter-z-y-x
URL: https://github.com/apache/incubator-mxnet/issues/6475
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha closed issue #7189: Gradients are wrong for is_train=False

2017-11-03 Thread GitBox
szha closed issue #7189: Gradients are wrong for is_train=False
URL: https://github.com/apache/incubator-mxnet/issues/7189
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on issue #7307: resnet

2017-11-03 Thread GitBox
szha commented on issue #7307: resnet
URL: 
https://github.com/apache/incubator-mxnet/issues/7307#issuecomment-341857615
 
 
   This issue is closed due to lack of activity in the last 90 days. Feel free 
to ping me to reopen if this is still an active issue. Thanks!
   Also, do please check out our [forum](https://discuss.mxnet.io/) (and 
[Chinese version](https://discuss.gluon.ai/)) for general "how-to" questions.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on issue #7189: Gradients are wrong for is_train=False

2017-11-03 Thread GitBox
szha commented on issue #7189: Gradients are wrong for is_train=False
URL: 
https://github.com/apache/incubator-mxnet/issues/7189#issuecomment-341857610
 
 
   This issue is closed due to lack of activity in the last 90 days. Feel free 
to ping me to reopen if this is still an active issue. Thanks!
   Also, do please check out our [forum](https://discuss.mxnet.io/) (and 
[Chinese version](https://discuss.gluon.ai/)) for general "how-to" questions.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on issue #7335: Exception in threads kills entire process

2017-11-03 Thread GitBox
szha commented on issue #7335: Exception in threads kills entire process
URL: 
https://github.com/apache/incubator-mxnet/issues/7335#issuecomment-341857619
 
 
   This issue is closed due to lack of activity in the last 90 days. Feel free 
to ping me to reopen if this is still an active issue. Thanks!
   Also, do please check out our [forum](https://discuss.mxnet.io/) (and 
[Chinese version](https://discuss.gluon.ai/)) for general "how-to" questions.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on issue #6475: Error in .infer_shape for deconvolution when the input is 5D batch-num_filter-z-y-x

2017-11-03 Thread GitBox
szha commented on issue #6475: Error in .infer_shape for deconvolution when the 
input is 5D batch-num_filter-z-y-x
URL: 
https://github.com/apache/incubator-mxnet/issues/6475#issuecomment-341857611
 
 
   This issue is closed due to lack of activity in the last 90 days. Feel free 
to ping me to reopen if this is still an active issue. Thanks!
   Also, do please check out our [forum](https://discuss.mxnet.io/) (and 
[Chinese version](https://discuss.gluon.ai/)) for general "how-to" questions.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on issue #7333: Imperative Programs Tend to be More Flexible (but do not make much sense ;)

2017-11-03 Thread GitBox
szha commented on issue #7333: Imperative Programs Tend to be More Flexible 
(but do not make much sense ;)
URL: 
https://github.com/apache/incubator-mxnet/issues/7333#issuecomment-341857606
 
 
   This issue is closed due to lack of activity in the last 90 days. Feel free 
to ping me to reopen if this is still an active issue. Thanks!
   Also, do please check out our [forum](https://discuss.mxnet.io/) (and 
[Chinese version](https://discuss.gluon.ai/)) for general "how-to" questions.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] rahul003 commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
rahul003 commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148720845
 
 

 ##
 File path: src/kvstore/comm.h
 ##
 @@ -635,6 +656,302 @@ class CommDevice : public Comm {
   bool inited_;
 };
 
+#if MXNET_USE_NCCL
+class CommNCCL : public Comm {
+ public:
+  CommNCCL() {
+inited_ = false;
+pinned_ctx_ = Context::CPUPinned(0);
+  }
+
+  virtual ~CommNCCL() {
+for (auto e : nccl_data_) {
+  cudaStreamDestroy(e.second.stream);
+  ncclCommDestroy(e.second.comm);
+}
+  }
+
+  void Init(int key, const NDArrayStorageType stype, const TShape& shape,
+int dtype = mshadow::kFloat32, Context pinned_ctx = 
Context::CPUPinned(0)) override {
+if (stype == kDefaultStorage) {
+  sorted_key_attrs_.push_back(std::make_tuple(key, shape, dtype));
+} else {
+  LOG(FATAL) << "NCCL KVStore does not support sparse storage type";
+}
+  }
+
+  const NDArray& Reduce(int key, const std::vector<NDArray>& src,
+int priority) override {
+// avoid extra copy for single device, but it may bring problems for
+// abnormal usage of kvstore
+if (src.size() == 1) {
+  return src[0];
+}
+
+if (!inited_) {
+      std::vector<Context> devs;
+  for (const auto& a : src) {
+devs.push_back(a.ctx());
+  }
+  InitNCCL(devs);
+  InitMergeBuffer(devs);
+}
+
+    std::vector<int> dev_ids;
+for (auto e : src) {
+  dev_ids.push_back(e.ctx().dev_id);
+}
+std::sort(dev_ids.begin(), dev_ids.end());
+CHECK(device_ids_ == dev_ids) << "NCCL KVStore supports only single set of 
devices";
+
+auto& buf = merge_buf_[key];
+int root = buf.merged.ctx().dev_id;
+size_t root_id = -1;
+for (size_t i = 0; i < src.size(); ++i) {
+  if (src[i].ctx().dev_id == root) {
+root_id = i;
+break;
+  }
+}
+
+auto& reduce = buf.merged;
+
+    std::vector<Engine::VarHandle> const_vars;
+for (size_t i = 0; i < src.size(); ++i) {
+  const_vars.push_back(src[i].var());
+}
+Engine::Get()->PushSync([src, reduce, root_id, this](RunContext rctx) {
+  {
+        std::lock_guard<std::mutex> l(Storage::Get()->GetMutex(Context::kGPU));
+int root = nccl_data_[src[root_id].ctx().dev_id].rank;
+ncclGroupStart();
+for (size_t i = 0; i < src.size(); ++i) {
+  NCCLEntry cur = nccl_data_[src[i].ctx().dev_id];
+  if (i == root_id) {
+  MSHADOW_TYPE_SWITCH(src[i].dtype(), DType,
 
 Review comment:
   Ya, please indent the Engine Pushes similar to how it is in the existing 
code base. 
   Also, it might be better to move the contents of the callback to a different 
function to make it cleaner.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ptrendx commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
ptrendx commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148915871
 
 

 ##
 File path: src/kvstore/comm.h
 ##
 @@ -635,6 +656,302 @@ class CommDevice : public Comm {
   bool inited_;
 };
 
+#if MXNET_USE_NCCL
+class CommNCCL : public Comm {
+ public:
+  CommNCCL() {
+inited_ = false;
+pinned_ctx_ = Context::CPUPinned(0);
+  }
+
+  virtual ~CommNCCL() {
+for (auto e : nccl_data_) {
+  cudaStreamDestroy(e.second.stream);
+  ncclCommDestroy(e.second.comm);
+}
+  }
+
+  void Init(int key, const NDArrayStorageType stype, const TShape& shape,
+int dtype = mshadow::kFloat32, Context pinned_ctx = 
Context::CPUPinned(0)) override {
+if (stype == kDefaultStorage) {
+  sorted_key_attrs_.push_back(std::make_tuple(key, shape, dtype));
+} else {
+  LOG(FATAL) << "NCCL KVStore does not support sparse storage type";
+}
+  }
+
+  const NDArray& Reduce(int key, const std::vector<NDArray>& src,
+int priority) override {
+// avoid extra copy for single device, but it may bring problems for
+// abnormal usage of kvstore
+if (src.size() == 1) {
+  return src[0];
+}
+
+if (!inited_) {
+      std::vector<Context> devs;
+  for (const auto& a : src) {
+devs.push_back(a.ctx());
+  }
+  InitNCCL(devs);
+  InitMergeBuffer(devs);
+}
+
+    std::vector<int> dev_ids;
+for (auto e : src) {
+  dev_ids.push_back(e.ctx().dev_id);
+}
+std::sort(dev_ids.begin(), dev_ids.end());
+CHECK(device_ids_ == dev_ids) << "NCCL KVStore supports only single set of 
devices";
+
+auto& buf = merge_buf_[key];
+int root = buf.merged.ctx().dev_id;
+size_t root_id = -1;
+for (size_t i = 0; i < src.size(); ++i) {
+  if (src[i].ctx().dev_id == root) {
+root_id = i;
+break;
+  }
+}
+
+auto& reduce = buf.merged;
+
+    std::vector<Engine::VarHandle> const_vars;
+for (size_t i = 0; i < src.size(); ++i) {
+  const_vars.push_back(src[i].var());
+}
+Engine::Get()->PushSync([src, reduce, root_id, this](RunContext rctx) {
+  {
+        std::lock_guard<std::mutex> l(Storage::Get()->GetMutex(Context::kGPU));
+int root = nccl_data_[src[root_id].ctx().dev_id].rank;
+ncclGroupStart();
+for (size_t i = 0; i < src.size(); ++i) {
+  NCCLEntry cur = nccl_data_[src[i].ctx().dev_id];
+  if (i == root_id) {
+  MSHADOW_TYPE_SWITCH(src[i].dtype(), DType,
+            ncclReduce(src[i].data().dptr<DType>(),
+                       reduce.data().dptr<DType>(),
+src[i].shape().Size(),
+GetNCCLType(src[i].dtype()),
+ncclSum,
+root,
+cur.comm,
+cur.stream););
+  } else {
+  MSHADOW_TYPE_SWITCH(src[i].dtype(), DType,
+            ncclReduce(src[i].data().dptr<DType>(),
+NULL,
+src[i].shape().Size(),
+GetNCCLType(src[i].dtype()),
+ncclSum,
+root,
+cur.comm,
+cur.stream););
+  }
+}
+ncclGroupEnd();
+  }
+},
+Context::CPU(),
+const_vars,
+{reduce.var()},
+FnProperty::kCPUPrioritized,
+priority,
+PROFILER_MESSAGE("KVStoreReduce"));
+
+return buf.merged;
+  }
+
+  void CommSync(const std::vector<NDArray*>& dst,
+                int priority) override {
+    std::vector<Engine::VarHandle> const_vars;
+    std::vector<Engine::VarHandle> mutate_vars;
+for (size_t i = 0; i < dst.size(); ++i) {
+mutate_vars.push_back(dst[i]->var());
+}
+Engine::Get()->PushSync([this](RunContext rctx) {
+  for (auto cur : nccl_data_) {
+CUDA_CALL(cudaSetDevice(cur.second.dev_id));
+CUDA_CALL(cudaStreamSynchronize(cur.second.stream));
+  }
+},
+Context::CPU(),
+const_vars,
+mutate_vars,
+FnProperty::kCPUPrioritized,
+priority,
+PROFILER_MESSAGE("KVStoreStreamSync"));
+  }
+
+  void CommSync(const std::vector<NDArray>& dst,
+                int priority) override {
+    std::vector<Engine::VarHandle> const_vars;
+    std::vector<Engine::VarHandle> mutate_vars;
+for (size_t i = 0; i < dst.size(); ++i) {
+mutate_vars.push_back(dst[i].var());
+}
+Engine::Get()->PushSync([this](RunContext rctx) {
+  for (auto cur : nccl_data_) {
+CUDA_CALL(cudaSetDevice(cur.second.dev_id));
+CUDA_CALL(cudaStreamSynchronize(cur.second.stream));
+  }
+},
+Context::CPU(),
+const_vars,
+mutate_vars,
+FnProperty::kCPUPrioritized,
+priority,
+

[GitHub] ptrendx commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
ptrendx commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148915254
 
 

 ##
 File path: include/mxnet/kvstore.h
 ##
 @@ -162,7 +162,7 @@ class KVStore {
* \param priority Priority of the action.
*/
   virtual void Pull(const std::vector<int>& keys,
-                    const std::vector<NDArray*>& values,
+                    const std::vector<NDArray>& values,
 
 Review comment:
   No, and I changed it back (although this makes some of the other functions 
kind of ugly when you need to support both pointers and references to ndarrays).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ptrendx commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
ptrendx commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148915172
 
 


[GitHub] ptrendx commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
ptrendx commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148911813
 
 


[GitHub] ptrendx commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
ptrendx commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148911419
 
 

 ##
 File path: src/kvstore/comm.h
 ##
 @@ -635,6 +656,302 @@ class CommDevice : public Comm {
   bool inited_;
 };
 
+#if MXNET_USE_NCCL
+class CommNCCL : public Comm {
+ public:
+  CommNCCL() {
+inited_ = false;
+pinned_ctx_ = Context::CPUPinned(0);
+  }
+
+  virtual ~CommNCCL() {
+for (auto e : nccl_data_) {
+  cudaStreamDestroy(e.second.stream);
+  ncclCommDestroy(e.second.comm);
+}
+  }
+
+  void Init(int key, const NDArrayStorageType stype, const TShape& shape,
+int dtype = mshadow::kFloat32, Context pinned_ctx = 
Context::CPUPinned(0)) override {
+if (stype == kDefaultStorage) {
+  sorted_key_attrs_.push_back(std::make_tuple(key, shape, dtype));
+} else {
+  LOG(FATAL) << "NCCL KVStore does not support sparse storage type";
+}
+  }
+
+  const NDArray& Reduce(int key, const std::vector& src,
+int priority) override {
+// avoid extra copy for single device, but it may bring problems for
+// abnormal usage of kvstore
+if (src.size() == 1) {
+  return src[0];
+}
+
+if (!inited_) {
+  std::vector devs;
+  for (const auto& a : src) {
+devs.push_back(a.ctx());
+  }
+  InitNCCL(devs);
+  InitMergeBuffer(devs);
+}
+
+std::vector dev_ids;
+for (auto e : src) {
+  dev_ids.push_back(e.ctx().dev_id);
+}
+std::sort(dev_ids.begin(), dev_ids.end());
+CHECK(device_ids_ == dev_ids) << "NCCL KVStore supports only single set of 
devices";
+
+auto& buf = merge_buf_[key];
+int root = buf.merged.ctx().dev_id;
+size_t root_id = -1;
+for (size_t i = 0; i < src.size(); ++i) {
+  if (src[i].ctx().dev_id == root) {
+root_id = i;
+break;
+  }
+}
+
+auto& reduce = buf.merged;
+
+std::vector const_vars;
+for (size_t i = 0; i < src.size(); ++i) {
+  const_vars.push_back(src[i].var());
+}
+Engine::Get()->PushSync([src, reduce, root_id, this](RunContext rctx) {
+  {
+std::lock_guard 
l(Storage::Get()->GetMutex(Context::kGPU));
+int root = nccl_data_[src[root_id].ctx().dev_id].rank;
+ncclGroupStart();
+for (size_t i = 0; i < src.size(); ++i) {
+  NCCLEntry cur = nccl_data_[src[i].ctx().dev_id];
+  if (i == root_id) {
+  MSHADOW_TYPE_SWITCH(src[i].dtype(), DType,
+  ncclReduce(src[i].data().dptr(),
+reduce.data().dptr(),
+src[i].shape().Size(),
+GetNCCLType(src[i].dtype()),
+ncclSum,
+root,
+cur.comm,
+cur.stream););
+  } else {
+  MSHADOW_TYPE_SWITCH(src[i].dtype(), DType,
+  ncclReduce(src[i].data().dptr(),
+NULL,
+src[i].shape().Size(),
+GetNCCLType(src[i].dtype()),
+ncclSum,
+root,
+cur.comm,
+cur.stream););
+  }
+}
+ncclGroupEnd();
+  }
+},
+Context::CPU(),
+const_vars,
+{reduce.var()},
+FnProperty::kCPUPrioritized,
+priority,
+PROFILER_MESSAGE("KVStoreReduce"));
+
+return buf.merged;
+  }
+
+  void CommSync(const std::vector<NDArray*>& dst,
+int priority) override {
+std::vector<Engine::VarHandle> const_vars;
+std::vector<Engine::VarHandle> mutate_vars;
+for (size_t i = 0; i < dst.size(); ++i) {
+mutate_vars.push_back(dst[i]->var());
+}
+Engine::Get()->PushSync([this](RunContext rctx) {
+  for (auto cur : nccl_data_) {
+CUDA_CALL(cudaSetDevice(cur.second.dev_id));
+CUDA_CALL(cudaStreamSynchronize(cur.second.stream));
+  }
+},
+Context::CPU(),
+const_vars,
+mutate_vars,
+FnProperty::kCPUPrioritized,
+priority,
+PROFILER_MESSAGE("KVStoreStreamSync"));
+  }
+
+  void CommSync(const std::vector<NDArray>& dst,
+int priority) override {
+std::vector<Engine::VarHandle> const_vars;
+std::vector<Engine::VarHandle> mutate_vars;
+for (size_t i = 0; i < dst.size(); ++i) {
+mutate_vars.push_back(dst[i].var());
+}
+Engine::Get()->PushSync([this](RunContext rctx) {
+  for (auto cur : nccl_data_) {
+CUDA_CALL(cudaSetDevice(cur.second.dev_id));
+CUDA_CALL(cudaStreamSynchronize(cur.second.stream));
+  }
+},
+Context::CPU(),
+const_vars,
+mutate_vars,
+FnProperty::kCPUPrioritized,
+priority,
+

[GitHub] rahul003 commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
rahul003 commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148909048
 
 

 ##
 File path: src/kvstore/comm.h
 ##
 @@ -32,6 +35,21 @@
 #include "mxnet/ndarray.h"
 #include "../ndarray/ndarray_function.h"
 #include "../operator/tensor/sparse_retain-inl.h"
+
+#if MXNET_USE_NCCL
 
 Review comment:
   Ok, then we can leave them as is


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ptrendx commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
ptrendx commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148906667
 
 

 ##
 File path: src/kvstore/comm.h
 ##
 @@ -635,6 +656,302 @@ class CommDevice : public Comm {
   bool inited_;
 };
 
+#if MXNET_USE_NCCL
+class CommNCCL : public Comm {
+ public:
+  CommNCCL() {
+inited_ = false;
+pinned_ctx_ = Context::CPUPinned(0);
+  }
+
+  virtual ~CommNCCL() {
+for (auto e : nccl_data_) {
+  cudaStreamDestroy(e.second.stream);
+  ncclCommDestroy(e.second.comm);
+}
+  }
+
+  void Init(int key, const NDArrayStorageType stype, const TShape& shape,
+int dtype = mshadow::kFloat32, Context pinned_ctx = 
Context::CPUPinned(0)) override {
+if (stype == kDefaultStorage) {
+  sorted_key_attrs_.push_back(std::make_tuple(key, shape, dtype));
+} else {
+  LOG(FATAL) << "NCCL KVStore does not support sparse storage type";
+}
+  }
+
+  const NDArray& Reduce(int key, const std::vector<NDArray>& src,
+int priority) override {
+// avoid extra copy for single device, but it may bring problems for
+// abnormal usage of kvstore
+if (src.size() == 1) {
+  return src[0];
+}
+
+if (!inited_) {
+  std::vector<Context> devs;
+  for (const auto& a : src) {
+devs.push_back(a.ctx());
+  }
+  InitNCCL(devs);
+  InitMergeBuffer(devs);
+}
+
+std::vector<int> dev_ids;
+for (auto e : src) {
+  dev_ids.push_back(e.ctx().dev_id);
+}
+std::sort(dev_ids.begin(), dev_ids.end());
+CHECK(device_ids_ == dev_ids) << "NCCL KVStore supports only single set of 
devices";
+
+auto& buf = merge_buf_[key];
+int root = buf.merged.ctx().dev_id;
+size_t root_id = -1;
+for (size_t i = 0; i < src.size(); ++i) {
+  if (src[i].ctx().dev_id == root) {
+root_id = i;
+break;
+  }
+}
+
+auto& reduce = buf.merged;
+
+std::vector<Engine::VarHandle> const_vars;
+for (size_t i = 0; i < src.size(); ++i) {
+  const_vars.push_back(src[i].var());
+}
+Engine::Get()->PushSync([src, reduce, root_id, this](RunContext rctx) {
+  {
+std::lock_guard<std::mutex> l(Storage::Get()->GetMutex(Context::kGPU));
+int root = nccl_data_[src[root_id].ctx().dev_id].rank;
+ncclGroupStart();
+for (size_t i = 0; i < src.size(); ++i) {
+  NCCLEntry cur = nccl_data_[src[i].ctx().dev_id];
+  if (i == root_id) {
+  MSHADOW_TYPE_SWITCH(src[i].dtype(), DType,
+  ncclReduce(src[i].data().dptr<DType>(),
+reduce.data().dptr<DType>(),
+src[i].shape().Size(),
+GetNCCLType(src[i].dtype()),
+ncclSum,
+root,
+cur.comm,
+cur.stream););
+  } else {
+  MSHADOW_TYPE_SWITCH(src[i].dtype(), DType,
+  ncclReduce(src[i].data().dptr<DType>(),
+NULL,
+src[i].shape().Size(),
+GetNCCLType(src[i].dtype()),
+ncclSum,
+root,
+cur.comm,
+cur.stream););
+  }
+}
+ncclGroupEnd();
+  }
+},
+Context::CPU(),
+const_vars,
+{reduce.var()},
+FnProperty::kCPUPrioritized,
+priority,
+PROFILER_MESSAGE("KVStoreReduce"));
+
+return buf.merged;
+  }
+
+  void CommSync(const std::vector<NDArray*>& dst,
+int priority) override {
+std::vector<Engine::VarHandle> const_vars;
+std::vector<Engine::VarHandle> mutate_vars;
+for (size_t i = 0; i < dst.size(); ++i) {
+mutate_vars.push_back(dst[i]->var());
+}
+Engine::Get()->PushSync([this](RunContext rctx) {
+  for (auto cur : nccl_data_) {
+CUDA_CALL(cudaSetDevice(cur.second.dev_id));
+CUDA_CALL(cudaStreamSynchronize(cur.second.stream));
+  }
+},
+Context::CPU(),
+const_vars,
+mutate_vars,
+FnProperty::kCPUPrioritized,
+priority,
+PROFILER_MESSAGE("KVStoreStreamSync"));
+  }
+
+  void CommSync(const std::vector<NDArray>& dst,
+int priority) override {
+std::vector<Engine::VarHandle> const_vars;
+std::vector<Engine::VarHandle> mutate_vars;
+for (size_t i = 0; i < dst.size(); ++i) {
+mutate_vars.push_back(dst[i].var());
+}
+Engine::Get()->PushSync([this](RunContext rctx) {
+  for (auto cur : nccl_data_) {
+CUDA_CALL(cudaSetDevice(cur.second.dev_id));
+CUDA_CALL(cudaStreamSynchronize(cur.second.stream));
+  }
+},
+Context::CPU(),
+const_vars,
+mutate_vars,
+FnProperty::kCPUPrioritized,
+priority,
+

[GitHub] ptrendx commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
ptrendx commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148906600
 
 

 ##
 File path: src/kvstore/comm.h
 ##
 @@ -635,6 +656,302 @@ class CommDevice : public Comm {
   bool inited_;
 };
 
+#if MXNET_USE_NCCL
+class CommNCCL : public Comm {
+ public:
+  CommNCCL() {
+inited_ = false;
+pinned_ctx_ = Context::CPUPinned(0);
+  }
+
+  virtual ~CommNCCL() {
+for (auto e : nccl_data_) {
+  cudaStreamDestroy(e.second.stream);
+  ncclCommDestroy(e.second.comm);
+}
+  }
+
+  void Init(int key, const NDArrayStorageType stype, const TShape& shape,
+int dtype = mshadow::kFloat32, Context pinned_ctx = 
Context::CPUPinned(0)) override {
+if (stype == kDefaultStorage) {
+  sorted_key_attrs_.push_back(std::make_tuple(key, shape, dtype));
+} else {
+  LOG(FATAL) << "NCCL KVStore does not support sparse storage type";
+}
+  }
+
+  const NDArray& Reduce(int key, const std::vector<NDArray>& src,
+int priority) override {
+// avoid extra copy for single device, but it may bring problems for
+// abnormal usage of kvstore
+if (src.size() == 1) {
+  return src[0];
+}
+
+if (!inited_) {
+  std::vector<Context> devs;
+  for (const auto& a : src) {
+devs.push_back(a.ctx());
+  }
+  InitNCCL(devs);
+  InitMergeBuffer(devs);
+}
+
+std::vector<int> dev_ids;
+for (auto e : src) {
+  dev_ids.push_back(e.ctx().dev_id);
+}
+std::sort(dev_ids.begin(), dev_ids.end());
+CHECK(device_ids_ == dev_ids) << "NCCL KVStore supports only single set of 
devices";
+
+auto& buf = merge_buf_[key];
+int root = buf.merged.ctx().dev_id;
+size_t root_id = -1;
+for (size_t i = 0; i < src.size(); ++i) {
+  if (src[i].ctx().dev_id == root) {
+root_id = i;
+break;
+  }
+}
+
+auto& reduce = buf.merged;
+
+std::vector<Engine::VarHandle> const_vars;
+for (size_t i = 0; i < src.size(); ++i) {
+  const_vars.push_back(src[i].var());
+}
+Engine::Get()->PushSync([src, reduce, root_id, this](RunContext rctx) {
+  {
+std::lock_guard<std::mutex> l(Storage::Get()->GetMutex(Context::kGPU));
+int root = nccl_data_[src[root_id].ctx().dev_id].rank;
+ncclGroupStart();
+for (size_t i = 0; i < src.size(); ++i) {
+  NCCLEntry cur = nccl_data_[src[i].ctx().dev_id];
+  if (i == root_id) {
+  MSHADOW_TYPE_SWITCH(src[i].dtype(), DType,
+  ncclReduce(src[i].data().dptr<DType>(),
+reduce.data().dptr<DType>(),
+src[i].shape().Size(),
+GetNCCLType(src[i].dtype()),
+ncclSum,
+root,
+cur.comm,
+cur.stream););
+  } else {
+  MSHADOW_TYPE_SWITCH(src[i].dtype(), DType,
+  ncclReduce(src[i].data().dptr<DType>(),
+NULL,
+src[i].shape().Size(),
+GetNCCLType(src[i].dtype()),
+ncclSum,
+root,
+cur.comm,
+cur.stream););
+  }
+}
+ncclGroupEnd();
+  }
+},
+Context::CPU(),
+const_vars,
+{reduce.var()},
+FnProperty::kCPUPrioritized,
+priority,
+PROFILER_MESSAGE("KVStoreReduce"));
+
+return buf.merged;
+  }
+
+  void CommSync(const std::vector<NDArray*>& dst,
+int priority) override {
+std::vector<Engine::VarHandle> const_vars;
+std::vector<Engine::VarHandle> mutate_vars;
+for (size_t i = 0; i < dst.size(); ++i) {
+mutate_vars.push_back(dst[i]->var());
+}
+Engine::Get()->PushSync([this](RunContext rctx) {
+  for (auto cur : nccl_data_) {
+CUDA_CALL(cudaSetDevice(cur.second.dev_id));
+CUDA_CALL(cudaStreamSynchronize(cur.second.stream));
+  }
+},
+Context::CPU(),
+const_vars,
 
 Review comment:
   Ok.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ptrendx commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
ptrendx commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148905967
 
 

 ##
 File path: src/kvstore/comm.h
 ##
 @@ -635,6 +656,302 @@ class CommDevice : public Comm {
   bool inited_;
 };
 
+#if MXNET_USE_NCCL
+class CommNCCL : public Comm {
+ public:
+  CommNCCL() {
+inited_ = false;
+pinned_ctx_ = Context::CPUPinned(0);
+  }
+
+  virtual ~CommNCCL() {
+for (auto e : nccl_data_) {
+  cudaStreamDestroy(e.second.stream);
+  ncclCommDestroy(e.second.comm);
+}
+  }
+
+  void Init(int key, const NDArrayStorageType stype, const TShape& shape,
+int dtype = mshadow::kFloat32, Context pinned_ctx = 
Context::CPUPinned(0)) override {
+if (stype == kDefaultStorage) {
+  sorted_key_attrs_.push_back(std::make_tuple(key, shape, dtype));
+} else {
+  LOG(FATAL) << "NCCL KVStore does not support sparse storage type";
+}
+  }
+
+  const NDArray& Reduce(int key, const std::vector<NDArray>& src,
+int priority) override {
+// avoid extra copy for single device, but it may bring problems for
+// abnormal usage of kvstore
+if (src.size() == 1) {
+  return src[0];
+}
+
+if (!inited_) {
+  std::vector<Context> devs;
+  for (const auto& a : src) {
+devs.push_back(a.ctx());
+  }
+  InitNCCL(devs);
+  InitMergeBuffer(devs);
+}
+
+std::vector<int> dev_ids;
+for (auto e : src) {
+  dev_ids.push_back(e.ctx().dev_id);
+}
+std::sort(dev_ids.begin(), dev_ids.end());
+CHECK(device_ids_ == dev_ids) << "NCCL KVStore supports only single set of 
devices";
 
 Review comment:
   Yes. Handling multiple sets of devices can be done, but not with the 
structure imposed by the `Comm` class. Basically in order to keep the benefits 
of batching I need to ensure that the root for the reduction will be the same 
for the whole batch, but I know who participates only during the actual 
push/pull, not during Init, and all of the data structures are initialized only 
once during the first push. This BTW should also be checked in the `device` 
kvstore (and currently is not), otherwise you can do something like this:
   ```
   >>> import mxnet as mx
   >>> kv = mx.kv.create("device")
   >>> shape = (2,3)
   >>> kv.init(4, mx.nd.ones(shape))
   >>> gpus = [mx.gpu(i) for i in range(2)]
   >>> b = [mx.nd.ones(shape, gpu) for gpu in gpus]
   >>> kv.push(4, b)
   >>> a = mx.nd.zeros(shape)
   >>> kv.pull(4, out = a)
   >>> a
   [[ 2.  2.  2.]
[ 2.  2.  2.]]
   
   >>> gpus = [mx.gpu(i) for i in range(4)]
   >>> 
   >>> b = [mx.nd.ones(shape, gpu) for gpu in gpus]
   >>> kv.push(4, b)
   Segmentation fault
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] piiswrong closed pull request #8533: Tweak a markdown list in note_engine.md

2017-11-03 Thread GitBox
piiswrong closed pull request #8533: Tweak a markdown list in note_engine.md
URL: https://github.com/apache/incubator-mxnet/pull/8533
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/docs/architecture/note_engine.md b/docs/architecture/note_engine.md
index dc0b84aa53..237d8d61b7 100644
--- a/docs/architecture/note_engine.md
+++ b/docs/architecture/note_engine.md
@@ -274,7 +274,7 @@ most existing code can be scheduled by the dependency 
engine in two steps:
 
 
 1. Allocate the variable tags associated with resources like memory blob, 
PRNGS.
-   - Call `push` with the execution function as the original code to 
execute, and put the variable tags of
+2. Call `push` with the execution function as the original code to execute, 
and put the variable tags of
   corresponding resources correctly in `read_vars` and `mutate_vars`.
 
 ## Implementing the Generic Dependency Engine
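For illustration, a toy Python mock of that two-step calling convention (the `ToyEngine` class and its method names are hypothetical stand-ins for the C++ `Engine::NewVariable`/`Engine::PushSync` API, not part of MXNet):

```python
class ToyEngine(object):
    """Toy mock of the dependency engine's scheduling interface."""
    def new_variable(self):
        # Step 1: allocate an opaque tag identifying a resource.
        return object()

    def push(self, exec_fn, read_vars, mutate_vars):
        # A real engine defers exec_fn until every tag in read_vars and
        # mutate_vars is free; this mock just runs it immediately.
        exec_fn()

engine = ToyEngine()
a_tag, b_tag = engine.new_variable(), engine.new_variable()

def work():
    print("b = a + 1")

# Step 2: push the original code together with the tags of everything it
# reads (read_vars) and everything it writes (mutate_vars).
engine.push(work, read_vars=[a_tag], mutate_vars=[b_tag])
```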


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[incubator-mxnet] branch master updated: Tweak a markdown list in note_engine.md (#8533)

2017-11-03 Thread jxie
This is an automated email from the ASF dual-hosted git repository.

jxie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new dd278f3  Tweak a markdown list in note_engine.md (#8533)
dd278f3 is described below

commit dd278f3fbbf7d1763a56ecbb5ae8caae09900045
Author: Brad Bowman 
AuthorDate: Fri Nov 3 22:57:24 2017 +0100

Tweak a markdown list in note_engine.md (#8533)
---
 docs/architecture/note_engine.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/architecture/note_engine.md b/docs/architecture/note_engine.md
index dc0b84a..237d8d6 100644
--- a/docs/architecture/note_engine.md
+++ b/docs/architecture/note_engine.md
@@ -274,7 +274,7 @@ most existing code can be scheduled by the dependency 
engine in two steps:
 
 
 1. Allocate the variable tags associated with resources like memory blob, 
PRNGS.
-   - Call `push` with the execution function as the original code to 
execute, and put the variable tags of
+2. Call `push` with the execution function as the original code to execute, 
and put the variable tags of
   corresponding resources correctly in `read_vars` and `mutate_vars`.
 
 ## Implementing the Generic Dependency Engine

-- 
To stop receiving notification emails like this one, please contact
['"comm...@mxnet.apache.org" '].


[GitHub] ptrendx commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
ptrendx commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148901124
 
 

 ##
 File path: src/kvstore/comm.h
 ##
 @@ -46,8 +64,8 @@ class Comm {
   /**
* \brief init key with the data shape and storage shape
*/
-  virtual void Init(int key, const NDArrayStorageType stype,
-const TShape& shape, int dtype = mshadow::kFloat32) = 0;
+  virtual void Init(int key, const NDArrayStorageType stype, const TShape& 
shape,
+  int dtype = mshadow::kFloat32, Context pinned_ctx = 
Context::CPUPinned(0)) = 0;
 
 Review comment:
   That was a relict of a previous iteration of the code - I will remove it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha opened a new pull request #8536: rename output layer of resnet

2017-11-03 Thread GitBox
szha opened a new pull request #8536: rename output layer of resnet
URL: https://github.com/apache/incubator-mxnet/pull/8536
 
 
   ## Description ##
   This is a missed change from #8446
   
   ## Checklist ##
   ### Essentials ###
   - [x] Passed code style checking (`make lint`)
   - [x] Changes are complete (i.e. I finished coding on this PR)
   - [x] All changes have test coverage
   - [x] For user-facing API changes, API doc string has been updated.
   - [x] To my best knowledge, examples are either not affected by this change, 
or have been fixed to be compatible with this change
   
   ### Changes ###
   - reorganize and rename output layer of resnetv2


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on issue #8532: mxnet-mkl (v0.12.0) crash when using (conda-installed) numpy with MKL

2017-11-03 Thread GitBox
szha commented on issue #8532: mxnet-mkl (v0.12.0) crash when using 
(conda-installed) numpy with MKL
URL: 
https://github.com/apache/incubator-mxnet/issues/8532#issuecomment-341834234
 
 
   @ykim362 what's your recommendation on this? MKL2018 doesn't distribute .a, 
which means that I can't hide any of its symbols when loading.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] eric-haibin-lin commented on a change in pull request #8460: cpu sparse embedding op

2017-11-03 Thread GitBox
eric-haibin-lin commented on a change in pull request #8460: cpu sparse 
embedding op
URL: https://github.com/apache/incubator-mxnet/pull/8460#discussion_r148897629
 
 

 ##
 File path: src/operator/tensor/indexing_op.cc
 ##
 @@ -95,6 +95,77 @@ Examples::
 .add_argument("weight", "NDArray-or-Symbol", "The embedding weight matrix.")
 .add_arguments(EmbeddingParam::__FIELDS__());
 
+NNVM_REGISTER_OP(_contrib_SparseEmbedding)
+.describe(R"code(Maps integer indices to vector representations (embeddings).
+
+This operator maps words to real-valued vectors in a high-dimensional space,
+called word embeddings. These embeddings can capture semantic and syntactic 
properties of the words.
+For example, it has been noted that in the learned embedding spaces, similar 
words tend
+to be close to each other and dissimilar words far apart.
+
+For an input array of shape (d1, ..., dK),
+the shape of an output array is (d1, ..., dK, output_dim).
+All the input values should be integers in the range [0, input_dim).
+
+If the input_dim is ip0 and output_dim is op0, then shape of the embedding 
weight matrix must be
+(ip0, op0).
+
+The storage type of weight must be `row_sparse`, and the gradient of the 
weight will be of
+`row_sparse` storage type, too.
+
+.. Note::
+
+`SparseEmbedding` is designed for the use case where `input_dim` is very 
large (e.g. 100k).
+The `row_sparse` weight cannot be used in a `BucketingModule`.
+The operator is only available on CPU.
+
+Examples::
+
+  input_dim = 4
+  output_dim = 5
+
+  // Each row in weight matrix y represents a word. So, y = (w0,w1,w2,w3)
+  y = [[  0.,   1.,   2.,   3.,   4.],
+   [  5.,   6.,   7.,   8.,   9.],
+   [ 10.,  11.,  12.,  13.,  14.],
+   [ 15.,  16.,  17.,  18.,  19.]]
+
+  // Input array x represents n-grams(2-gram). So, x = [(w1,w3), (w0,w2)]
+  x = [[ 1.,  3.],
+   [ 0.,  2.]]
+
+  // Mapped input x to its vector representation y.
+  SparseEmbedding(x, y, 4, 5) = [[[  5.,   6.,   7.,   8.,   9.],
+ [ 15.,  16.,  17.,  18.,  19.]],
+
+[[  0.,   1.,   2.,   3.,   4.],
+ [ 10.,  11.,  12.,  13.,  14.]]]
+
 
 Review comment:
   Cool will add this


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] mbaijal opened a new pull request #8535: Fix pip-test, libgfortran3 installation needs flag '-y'

2017-11-03 Thread GitBox
mbaijal opened a new pull request #8535: Fix pip-test,  libgfortran3 
installation needs flag '-y'
URL: https://github.com/apache/incubator-mxnet/pull/8535
 
 
   ## Description ##
   This is a fix to a previous PR by me where I added a step to pip-test to 
instal libgfortran3. 
   Though this ran fine for a while, it now started failing because it needed a 
runtime 'yes'. This PR adds the flag -y
   
   ## Checklist ##
   ### Essentials ###
   - [ ] Passed code style checking (`make lint`)
   - [ ] Changes are complete (i.e. I finished coding on this PR)
   - [ ] All changes have test coverage
   - [ ] To my best knowledge, examples are either not affected by this change, 
or have been fixed to be compatible with this change
   
   ### Changes ###
   - [ ] Added flag -y to libgfortran3 installation
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ptrendx commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
ptrendx commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148887739
 
 

 ##
 File path: src/kvstore/comm.h
 ##
 @@ -32,6 +35,21 @@
 #include "mxnet/ndarray.h"
 #include "../ndarray/ndarray_function.h"
 #include "../operator/tensor/sparse_retain-inl.h"
+
+#if MXNET_USE_NCCL
 
 Review comment:
  Linter would complain about a system header coming after local headers (I 
assume those are the 2 `#if`s you would want merged).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ptrendx commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
ptrendx commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148887766
 
 

 ##
 File path: src/kvstore/comm.h
 ##
 @@ -32,6 +35,21 @@
 #include "mxnet/ndarray.h"
 #include "../ndarray/ndarray_function.h"
 #include "../operator/tensor/sparse_retain-inl.h"
+
+#if MXNET_USE_NCCL
+#include "../common/cuda_utils.h"
+
+#ifndef NCCL_MAJOR
+#define NCCL_MAJOR 1
 
 Review comment:
   Sure.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ptrendx commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
ptrendx commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148887182
 
 

 ##
 File path: tests/python/gpu/test_nccl.py
 ##
 @@ -0,0 +1,42 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: skip-file
 
 Review comment:
  Most current tests have pylint disabled; I copied that part from them. Sure, 
I will do that.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ptrendx commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
ptrendx commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148887226
 
 

 ##
 File path: tests/python/gpu/test_nccl.py
 ##
 @@ -0,0 +1,42 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# pylint: skip-file
+import mxnet as mx
+import numpy as np
+import unittest
+
+shapes = [(10), (100), (1000), (1), (10), (2,2), (2,3,4,5,6,7,8)]
+keys = [1,2,3,4,5,6,7]
+gpus = range(1,1+len(mx.test_utils.list_gpus()))
+
+@unittest.skip("Test requires NCCL library installed and enabled during build")
+def test_nccl_pushpull():
+    for shape, key in zip(shapes, keys):
+        for n_gpus in gpus:
+            kv_nccl = mx.kv.create('nccl')
+            a = mx.nd.ones(shape, mx.gpu(0))
+            cur_key = str(key*max(gpus)+n_gpus)
+            kv_nccl.init(cur_key, a)
+            arr_list = [mx.nd.ones(shape, mx.gpu(x)) for x in xrange(n_gpus)]
 
 Review comment:
   Sure.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] ptrendx commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
ptrendx commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148886763
 
 

 ##
 File path: src/kvstore/kvstore_nccl.h
 ##
 @@ -0,0 +1,182 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+/**
+ * @file   kvstore_nccl.h
+ * @brief  NCCL implementation of KVStore
+ */
+#ifndef MXNET_KVSTORE_KVSTORE_NCCL_H_
+#define MXNET_KVSTORE_KVSTORE_NCCL_H_
+
+#if MXNET_USE_NCCL
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "./comm.h"
+#include "./kvstore_local.h"
+
+
+namespace mxnet {
+namespace kvstore {
+
+/**
+ * \brief store data in local machine using NCCL
+ */
+class KVStoreNCCL : public KVStoreLocal {
 
 Review comment:
    Currently no - it is future work. The biggest problem is how to bootstrap 
NCCL in a multi-node scenario, and I do not yet understand MXNet's distributed 
kvstore well enough to use it for that task.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] benqua commented on issue #8297: [scala] Make accuracy idependant of output size (fix #8226)

2017-11-03 Thread GitBox
benqua commented on issue #8297: [scala] Make accuracy idependant of output 
size (fix #8226)
URL: https://github.com/apache/incubator-mxnet/pull/8297#issuecomment-341805007
 
 
   done. let's see...


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on issue #8475: Move OpenMP calculations into its own class

2017-11-03 Thread GitBox
szha commented on issue #8475: Move OpenMP calculations into its own class
URL: https://github.com/apache/incubator-mxnet/pull/8475#issuecomment-341774298
 
 
   And still merging? 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] zhreshold commented on issue #8503: Does mxnet support total variation?

2017-11-03 Thread GitBox
zhreshold commented on issue #8503: Does mxnet support total variation?
URL: 
https://github.com/apache/incubator-mxnet/issues/8503#issuecomment-341793369
 
 
   You are welcome to contribute one.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] zhreshold commented on issue #8506: Monitor Weights in Float16 (mx.nd.norm not implemented for float16)

2017-11-03 Thread GitBox
zhreshold commented on issue #8506: Monitor Weights in Float16 (mx.nd.norm not 
implemented for float16)
URL: 
https://github.com/apache/incubator-mxnet/issues/8506#issuecomment-341792778
 
 
   Currently dot doesn't support fp16, which will be fixed later.
   In the meantime, your hack is valid and won't cause too much trouble since 
it's just a monitor callback. You can use whatever metric you want.
   Not sure if `mx.nd.mean(d ** 2)` will be faster in fp16 compared to the cast 
to fp32.
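
   A minimal sketch of both options as `stat_func` callbacks for `mx.mon.Monitor` (the function names are made up; shown only to contrast the two approaches):

   ```python
   import mxnet as mx

   # Option 1: cast to fp32 first, then use the (currently fp32-only) norm.
   def stat_norm_fp32(d):
       d = d.astype('float32')
       return mx.nd.norm(d) / (d.size ** 0.5)

   # Option 2: stay in fp16 and monitor the mean of squares instead.
   def stat_meansq_fp16(d):
       return mx.nd.mean(d ** 2)

   mon = mx.mon.Monitor(interval=100, stat_func=stat_norm_fp32)
   ```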


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] zhreshold commented on issue #8531: How to get the correlation result of two feature maps?

2017-11-03 Thread GitBox
zhreshold commented on issue #8531: How to get the correlation result of two 
feature maps?
URL: 
https://github.com/apache/incubator-mxnet/issues/8531#issuecomment-341789286
 
 
   I am confused. For correlation, F1 and F2 must have the same shape.
   For convolution, F2 should be (C, 1, 3, 3).
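
   A rough NDArray sketch of that convolution reading (shapes taken from the question; `C = 16` is arbitrary, and the depthwise `num_group=C` grouping is an assumption about the intent):

   ```python
   import mxnet as mx

   C = 16
   F1 = mx.nd.random_uniform(shape=(1, C, 10, 10))  # "data" feature map
   F2 = mx.nd.random_uniform(shape=(1, C, 3, 3))    # "kernel" feature map

   # Depthwise convolution: reshape F2 to (C, 1, 3, 3) and use it as the
   # weight, with num_group=C so channel i of F1 is filtered by kernel i.
   weight = F2.reshape((C, 1, 3, 3))
   out = mx.nd.Convolution(data=F1, weight=weight, no_bias=True,
                           kernel=(3, 3), num_filter=C, num_group=C)
   print(out.shape)  # (1, C, 8, 8)
   ```

   In the symbolic version the same idea works by feeding F2's symbol in as the `weight` input of `mx.sym.Convolution`; gradients then flow into both F1 and F2 like for any other op inputs.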


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] zhreshold commented on issue #8532: mxnet-mkl (v0.12.0) crash when using (conda-installed) numpy with MKL

2017-11-03 Thread GitBox
zhreshold commented on issue #8532: mxnet-mkl (v0.12.0) crash when using 
(conda-installed) numpy with MKL
URL: 
https://github.com/apache/incubator-mxnet/issues/8532#issuecomment-341788060
 
 
   @szha 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] zhreshold commented on issue #8519: More details to the windows build process

2017-11-03 Thread GitBox
zhreshold commented on issue #8519: More details to the windows build process
URL: https://github.com/apache/incubator-mxnet/pull/8519#issuecomment-341787629
 
 
   @yajiedesign 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] zhreshold commented on issue #8523: multiprocessing Gluon DataLoader

2017-11-03 Thread GitBox
zhreshold commented on issue #8523: multiprocessing Gluon DataLoader
URL: https://github.com/apache/incubator-mxnet/pull/8523#issuecomment-341787339
 
 
   Is it finished?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] sandeep-krishnamurthy commented on issue #8534: fixed practioners to practitioners

2017-11-03 Thread GitBox
sandeep-krishnamurthy commented on issue #8534: fixed practioners to 
practitioners
URL: https://github.com/apache/incubator-mxnet/pull/8534#issuecomment-341780472
 
 
   Thanks. LGTM. Will merge after CI tests.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] piiswrong closed pull request #8182: Make gluon.Block cooperative in multiple inheritance setting

2017-11-03 Thread GitBox
piiswrong closed pull request #8182: Make gluon.Block cooperative in multiple 
inheritance setting
URL: https://github.com/apache/incubator-mxnet/pull/8182
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/python/mxnet/gluon/block.py b/python/mxnet/gluon/block.py
index def5d145f8..b4c534289d 100644
--- a/python/mxnet/gluon/block.py
+++ b/python/mxnet/gluon/block.py
@@ -162,13 +162,16 @@ def forward(self, x):
 dense0 = nn.Dense(20)
 dense1 = nn.Dense(20, params=dense0.collect_params())
 """
-    def __init__(self, prefix=None, params=None):
+    def __init__(self, prefix=None, params=None, **kwargs):
         self._empty_prefix = prefix == ''
         self._prefix, self._params = _BlockScope.create(prefix, params, self._alias())
         self._name = self._prefix[:-1] if self._prefix.endswith('_') else self._prefix
         self._scope = _BlockScope(self)
         self._children = []
 
+        # Cooperative design for multiple inheritance
+        super(Block, self).__init__(**kwargs)
+
     def __repr__(self):
         s = '{name}(\n{modstr}\n)'
         modstr = '\n'.join(['  ({key}): {block}'.format(key=key,
@@ -319,8 +322,9 @@ class HybridBlock(Block):
     Refer `Hybrid tutorial `_ to see
     the end-to-end usage.
     """
-    def __init__(self, prefix=None, params=None):
-        super(HybridBlock, self).__init__(prefix=prefix, params=params)
+    def __init__(self, prefix=None, params=None, **kwargs):
+        # Cooperative design for multiple inheritance
+        super(HybridBlock, self).__init__(prefix=prefix, params=params, **kwargs)
         self._reg_params = {}
         self._cached_graph = ()
         self._cached_op = None
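
As a quick illustration of what the cooperative `**kwargs` forwarding enables, a sketch with a hypothetical mixin (not part of the PR):

```python
from mxnet import gluon

class TagMixin(object):
    """Hypothetical mixin that consumes its own kwarg and cooperates."""
    def __init__(self, tag=None, **kwargs):
        self.tag = tag
        super(TagMixin, self).__init__(**kwargs)

# Block now forwards unknown kwargs up the MRO, so the mixin's `tag`
# argument reaches TagMixin even though Dense knows nothing about it.
class TaggedDense(gluon.nn.Dense, TagMixin):
    pass

layer = TaggedDense(10, tag='example')
print(layer.tag)  # 'example'
```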


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] thinksanky opened a new pull request #8534: fixed practioners to practitioners

2017-11-03 Thread GitBox
thinksanky opened a new pull request #8534: fixed practioners to practitioners
URL: https://github.com/apache/incubator-mxnet/pull/8534
 
 
   ## Description ##
   (Brief description on what this PR is about)
   
   Fixed practioners to practitioners in the main landing page
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[incubator-mxnet] branch master updated: Fix a description (#8501)

2017-11-03 Thread jxie
This is an automated email from the ASF dual-hosted git repository.

jxie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/master by this push:
 new b8745e9  Fix a description (#8501)
b8745e9 is described below

commit b8745e90a7f594696cda79d3d95fc605de31e995
Author: Kenta Kubo <601636+kkk...@users.noreply.github.com>
AuthorDate: Sat Nov 4 02:35:55 2017 +0900

Fix a description (#8501)
---
 tools/coreml/converter/_layers.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/coreml/converter/_layers.py 
b/tools/coreml/converter/_layers.py
index 29562aa..fe00232 100644
--- a/tools/coreml/converter/_layers.py
+++ b/tools/coreml/converter/_layers.py
@@ -422,7 +422,7 @@ def convert_pooling(net, node, module, builder):
 
 
 def convert_batchnorm(net, node, module, builder):
-"""Convert a transpose layer from mxnet to coreml.
+"""Convert a batchnorm layer from mxnet to coreml.
 
 Parameters
 --

-- 
To stop receiving notification emails like this one, please contact
['"comm...@mxnet.apache.org" '].


[GitHub] piiswrong closed pull request #8325: Fix typo in gluon l1loss docstring

2017-11-03 Thread GitBox
piiswrong closed pull request #8325: Fix typo in gluon l1loss docstring
URL: https://github.com/apache/incubator-mxnet/pull/8325
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/python/mxnet/gluon/loss.py b/python/mxnet/gluon/loss.py
index 5d8c8b7d1b..614025cd35 100644
--- a/python/mxnet/gluon/loss.py
+++ b/python/mxnet/gluon/loss.py
@@ -138,7 +138,7 @@ def hybrid_forward(self, F, pred, label, 
sample_weight=None):
 class L1Loss(Loss):
 r"""Calculates the mean absolute error between `pred` and `label`.
 
-.. math:: L = \frac{1}{2} \sum_i \vert {pred}_i - {label}_i \vert.
+.. math:: L = \sum_i \vert {pred}_i - {label}_i \vert.
 
 `pred` and `label` can have arbitrary shape as long as they have the same
 number of elements.


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] rahul003 commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
rahul003 commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148720629
 
 

 ##
 File path: src/kvstore/comm.h
 ##
 @@ -46,8 +64,8 @@ class Comm {
   /**
* \brief init key with the data shape and storage shape
*/
-  virtual void Init(int key, const NDArrayStorageType stype,
-const TShape& shape, int dtype = mshadow::kFloat32) = 0;
+  virtual void Init(int key, const NDArrayStorageType stype, const TShape& 
shape,
+  int dtype = mshadow::kFloat32, Context pinned_ctx = 
Context::CPUPinned(0)) = 0;
 
 Review comment:
   Are you changing the pinnned context to something other than CPU in 
   kVStoreNCCL? Or is this change just to generalize the function?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bohdankornienko commented on issue #6147: [API docs submission] GridGenerator

2017-11-03 Thread GitBox
bohdankornienko commented on issue #6147: [API docs submission] GridGenerator
URL: 
https://github.com/apache/incubator-mxnet/issues/6147#issuecomment-341741503
 
 
   Could anyone explain in formulas/pseudo-code how GridGenerator is 
calculated?
   Would appreciate that explanation.
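
   For what it's worth, assuming the affine mode follows the spatial-transformer convention (Jaderberg et al., 2015), the generated grid would be roughly:

   ```
   \begin{pmatrix} x^{s} \\ y^{s} \end{pmatrix}
   =
   \begin{pmatrix} \theta_{11} & \theta_{12} & \theta_{13} \\
                   \theta_{21} & \theta_{22} & \theta_{23} \end{pmatrix}
   \begin{pmatrix} x^{t} \\ y^{t} \\ 1 \end{pmatrix}
   ```

   where (x^t, y^t) are target-grid coordinates normalized to [-1, 1], (x^s, y^s) are the source coordinates to sample from, and the six thetas are the affine parameters given to the operator.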


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bowman opened a new pull request #8533: Tweak a markdown list in note_engine.md

2017-11-03 Thread GitBox
bowman opened a new pull request #8533: Tweak a markdown list in note_engine.md
URL: https://github.com/apache/incubator-mxnet/pull/8533
 
 
   This is a minor change to the markdown to make the "two step" list into a 
numbered list.
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] fhieber opened a new issue #8532: mxnet-mkl (v0.12.0) crash when using (conda-installed) numpy with MKL

2017-11-03 Thread GitBox
fhieber opened a new issue #8532: mxnet-mkl (v0.12.0) crash when using 
(conda-installed) numpy with MKL
URL: https://github.com/apache/incubator-mxnet/issues/8532
 
 
   We have observed crashes with any mkl-enabled pip package of mxnet-0.12.0 in 
combination with numpy if installed through conda (which by default also uses 
MKL).
   
   In this case, mxnet trainings crash with the following error message:
   ```
   OMP: Error #15: Initializing libiomp5.so, but found libiomp5.so already 
initialized.
   OMP: Hint: This means that multiple copies of the OpenMP runtime have been 
linked into the program. That is dangerous, since it can degrade performance or 
cause incorrect results. The best thing to do is to ensure that only a single 
OpenMP runtime is linked into the process, e.g. by avoiding static linking of 
the OpenMP runtime in any library. As an unsafe, unsupported, undocumented 
workaround you
   OMP: Hint: This means that multiple copies of the OpenMP runtime have been 
linked into the program. That is dangerous, since it can degrade performance or 
cause incorrect results. The best thing to do is to ensure that only a single 
OpenMP runtime is linked into the process, e.g. by avoiding static linking of 
the OpenMP runtime in any library. As an unsafe, unsupported, undocumented 
workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to 
allow the program to continue to execute, but that may cause crashes or 
silently produce incorrect results. For more information, please see 
http://www.intel.com/software/products/support/.
   ```
   
   Numpy from conda links against the libmkl_rt.so, distributed through conda:
   ```
   libmkl_rt.so => 
/opt/conda/lib/python3.6/site-packages/numpy/core/../../../../libmkl_rt.so 
(0x7f05256e)
libmkl_rt.so => 
/opt/conda/lib/python3.6/site-packages/numpy/linalg/../../../../libmkl_rt.so 
(0x7f367e1d5000)
libmkl_rt.so => 
/opt/conda/lib/python3.6/site-packages/numpy/linalg/../../../../libmkl_rt.so 
(0x7fd39e751000)
   ```
   
   whereas MXNet links to its own .so:
   ```
   ldd /opt/conda/lib/python3.6/site-packages/mxnet/libmxnet.so
   [...]
libmklml_intel.so => 
/opt/conda/lib/python3.6/site-packages/mxnet/libmklml_intel.so 
(0x7f8c85b94000)
libiomp5.so => /opt/conda/lib/python3.6/site-packages/mxnet/libiomp5.so 
(0x7f8c857f1000)
   [...]
   ```
   
   This prevents people from using numpy w/ MKL in combination with 
mxnet-mkl==0.12.0.
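
   A rough repro sketch under those assumptions (conda numpy with MKL plus pip `mxnet-mkl`; the exact statement that triggers the abort may vary):

   ```python
   # Each import pulls in its own OpenMP runtime: conda's numpy loads the conda
   # libiomp5.so via libmkl_rt.so, while mxnet-mkl bundles a second libiomp5.so.
   import numpy as np
   import mxnet as mx

   # Using both in one process then aborts with the "OMP: Error #15" above.
   a = mx.nd.ones((2, 2))
   b = np.ones((2, 2), dtype=np.float32)
   print((a.asnumpy() + b).sum())
   ```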
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] zhanghang1989 commented on issue #8494: Autograd bug in mxnet-cu80: 0.12

2017-11-03 Thread GitBox
zhanghang1989 commented on issue #8494: Autograd bug in mxnet-cu80: 0.12
URL: 
https://github.com/apache/incubator-mxnet/issues/8494#issuecomment-341733157
 
 
   Put `loss.backward()` inside the scope of `autograd.record()`
   
https://github.com/zhanghang1989/MXNet-Gluon-Style-Transfer/blob/master/main.py#L167-L176
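
   A minimal Gluon sketch of that arrangement (toy network and data, purely illustrative):

   ```python
   import mxnet as mx
   from mxnet import autograd, gluon

   net = gluon.nn.Dense(1)
   net.initialize()
   loss_fn = gluon.loss.L2Loss()
   x, y = mx.nd.ones((4, 2)), mx.nd.ones((4, 1))

   with autograd.record():
       loss = loss_fn(net(x), y)
       loss.backward()  # called inside the record() scope, as suggested above
   ```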


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] javelinjs commented on issue #8297: [scala] Make accuracy idependant of output size (fix #8226)

2017-11-03 Thread GitBox
javelinjs commented on issue #8297: [scala] Make accuracy idependant of output 
size (fix #8226)
URL: https://github.com/apache/incubator-mxnet/pull/8297#issuecomment-341723565
 
 
   Sorry to see that. Please rebase again, I think the CI is OK now.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] anjishnu commented on issue #8422: Implementing a new activation function in Gluon

2017-11-03 Thread GitBox
anjishnu commented on issue #8422: Implementing a new activation function in 
Gluon
URL: 
https://github.com/apache/incubator-mxnet/issues/8422#issuecomment-341718165
 
 
   Would it make sense if I contributed any activation functions I implement as 
Gluon blocks, for instance?
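
   For instance, a simple sketch of an activation wrapped as a block (Swish chosen arbitrarily as the example):

   ```python
   import mxnet as mx
   from mxnet.gluon import HybridBlock

   class Swish(HybridBlock):
       """x * sigmoid(x), implemented as a reusable Gluon block."""
       def hybrid_forward(self, F, x):
           return x * F.sigmoid(x)

   act = Swish()
   act.initialize()
   print(act(mx.nd.array([-1.0, 0.0, 1.0])))
   ```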


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] anjishnu closed issue #8422: Implementing a new activation function in Gluon

2017-11-03 Thread GitBox
anjishnu closed issue #8422: Implementing a new activation function in Gluon
URL: https://github.com/apache/incubator-mxnet/issues/8422
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] anjishnu commented on issue #8422: Implementing a new activation function in Gluon

2017-11-03 Thread GitBox
anjishnu commented on issue #8422: Implementing a new activation function in 
Gluon
URL: 
https://github.com/apache/incubator-mxnet/issues/8422#issuecomment-341717793
 
 
   Thanks Mathias, I think that what I have in mind is quite simple so it 
should be implementable with mx.nd.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Bumblebee1964 commented on issue #8424: windows binaries

2017-11-03 Thread GitBox
Bumblebee1964 commented on issue #8424: windows binaries 
URL: 
https://github.com/apache/incubator-mxnet/issues/8424#issuecomment-341709010
 
 
   For those using the C++ API, there are no real docs and only limited 
examples. I figured out a few issues: how to load a pretrained model, how to 
modify it, and how to predict directly in memory without a file. So if you have 
how-to questions I might be able to help you. It would be good to share 
knowledge. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Bumblebee1964 commented on issue #8424: windows binaries

2017-11-03 Thread GitBox
Bumblebee1964 commented on issue #8424: windows binaries 
URL: 
https://github.com/apache/incubator-mxnet/issues/8424#issuecomment-341709010
 
 
   For those using the C++ API, there are no real docs and only limited 
examples. So if you have how-to questions I might be able to help you. It would 
be good to share knowledge. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Bumblebee1964 commented on issue #8424: windows binaries

2017-11-03 Thread GitBox
Bumblebee1964 commented on issue #8424: windows binaries 
URL: 
https://github.com/apache/incubator-mxnet/issues/8424#issuecomment-341708702
 
 
   Welcome to the C++ API club. If you want to use the C++ API you need to 
build the library yourself, because some include files are generated by the 
build process. These are unfortunately not distributed with the binaries.
   I made a short list of steps (with the help of a webpage) and could build 
the library until 0.95, but since 0.11 the build breaks for VS2013/15/17 on 
Win10 64-bit (#8155). So if anyone has it working please share it. I have been 
stuck for a month now.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] squidszyd opened a new issue #8531: How to get the correlation result of two feature maps?

2017-11-03 Thread GitBox
squidszyd opened a new issue #8531: How to get the correlation result of two 
feature maps?
URL: https://github.com/apache/incubator-mxnet/issues/8531
 
 
   Suppose I have two feature maps F1 and F2 output by a network. I want to 
compute convolution of F1 and F2. Assume that F1 has shape (1, C, 10, 10) and 
F2 has shape (1, C, 3, 3) and the wanted result should have shape (1, C, 8, 8) 
if pad = 0, stride = 1 and dilate = 1.
   How to implement this using MXNet?
   I have come up with one possible way using mx.sym.Correlation, but I cannot 
get the idea from reading the doc.
   Or, can I set the weight of an mx.sym.Convolution layer to F2, and the data 
to F1? Would this interfere with the propagation of grads when training?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha commented on issue #7074: Does anyone know PVAnet? How to configure its CRelu structure using mxnet

2017-11-03 Thread GitBox
szha commented on issue #7074: Does anyone know PVAnet? How to configure its 
CRelu structure using mxnet
URL: 
https://github.com/apache/incubator-mxnet/issues/7074#issuecomment-341689709
 
 
   This issue is closed due to lack of activity in the last 90 days. Feel free 
to ping me to reopen if this is still an active issue. Thanks!
   Also, do please check out our [forum](https://discuss.mxnet.io/) (and 
[Chinese version](https://discuss.gluon.ai/)) for general "how-to" questions.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] szha closed issue #7074: Does anyone know PVAnet? How to configure its CRelu structure using mxnet

2017-11-03 Thread GitBox
szha closed issue #7074: Does anyone know PVAnet? How to configure its CRelu 
structure using mxnet
URL: https://github.com/apache/incubator-mxnet/issues/7074
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] tutnixzursache commented on issue #8530: Fatal JVM Error due to Exception in CustomOpProp#inferTypeEntry

2017-11-03 Thread GitBox
tutnixzursache commented on issue #8530: Fatal JVM Error due to Exception in 
CustomOpProp#inferTypeEntry
URL: 
https://github.com/apache/incubator-mxnet/issues/8530#issuecomment-341686488
 
 
   I suggest adding an Unknown value to the DType object:
   `val Unknown = Value(-1, "unknown")`
 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] mxmxlwlw commented on issue #8529: What is the functionality of OrderMutation

2017-11-03 Thread GitBox
mxmxlwlw commented on issue #8529: What is the functionality of OrderMutation
URL: 
https://github.com/apache/incubator-mxnet/issues/8529#issuecomment-341651443
 
 
   And I want to give a suggestion: please don't implement the pass in just one 
function. Make it a class and split it across more than one function, or just 
give us more docs about it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] mxmxlwlw opened a new issue #8529: What is the functionality of OrderMutation

2017-11-03 Thread GitBox
mxmxlwlw opened a new issue #8529: What is the functionality of OrderMutation
URL: https://github.com/apache/incubator-mxnet/issues/8529
 
 
   Hi,
   I noticed that the OrderMutation pass of the graph isn't used. I guess 
that's maybe because of the dependency engine. I want to know how to enforce 
write and read ordering by using OrderMutation. Thanks a lot.




[GitHub] msiraj83 commented on issue #3427: Anyway to avoid the requirement of using CXXABI_1.3.8

2017-11-03 Thread GitBox
msiraj83 commented on issue #3427: Anyway to avoid the requirement of using 
CXXABI_1.3.8
URL: 
https://github.com/apache/incubator-mxnet/issues/3427#issuecomment-341650384
 
 
   @ruotianluo I also faced the same problem. I followed @piiswrong's command 
and found the GCC 4.9 version of libstdc++ in /usr/lib/x86_64/libstdc++.so.6, 
so I created a soft link from 
/usr/local/MATLAB/MATLAB_Production_Server/R2015a/sys/os/glnxa64/libstdc++.so.6 
to /usr/lib/x86_64/libstdc++.so.6.0.17.
   This solved my problem.
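
   A minimal sketch of that workaround in script form (the paths are the ones 
from the comment above and should be adjusted for your own install; the backup 
step is an addition, not part of the original comment):

   ```python
   import os

   # MATLAB ships an old libstdc++ that lacks CXXABI_1.3.8 (GCC 4.9);
   # point it at the newer system copy instead.
   matlab_lib = ('/usr/local/MATLAB/MATLAB_Production_Server/R2015a'
                 '/sys/os/glnxa64/libstdc++.so.6')
   system_lib = '/usr/lib/x86_64/libstdc++.so.6.0.17'

   os.rename(matlab_lib, matlab_lib + '.bak')  # keep a backup of the original
   os.symlink(system_lib, matlab_lib)          # MATLAB now loads the system library
   ```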




[GitHub] solin319 commented on issue #8423: Re-implement segnet in MXnet

2017-11-03 Thread GitBox
solin319 commented on issue #8423: Re-implement segnet in MXnet
URL: https://github.com/apache/incubator-mxnet/pull/8423#issuecomment-341647873
 
 
   @zhreshold 
   Yes, the test results match the paper.
   On the CamVid data set, we used 367 images for training and 233 for testing, 
as in the paper.
   The training command is `python train_segnet.py --gpus 0,1,2,3 --lr=0.01 
--network=segnet_basic`.
   The result is shown below.
   
![segnet_basic](https://user-images.githubusercontent.com/13029886/32365889-ba8925fe-c0b6-11e7-86b7-83794c983c8a.png)
   We also trained with segnet_basic_with_drop; its result is shown below.
   
![segnet_basic_with_drop](https://user-images.githubusercontent.com/13029886/32365911-dbc16312-c0b6-11e7-94c5-d37e55791901.png)
   The test accuracy is similar to the paper's.




[GitHub] akturtle commented on issue #8431: I can't figure out why ImageRecordIter use only 1 thread for decoding

2017-11-03 Thread GitBox
akturtle commented on issue #8431: I can't figure out why ImageRecordIter use 
only 1 thread for decoding
URL: 
https://github.com/apache/incubator-mxnet/issues/8431#issuecomment-341645452
 
 
   I am hitting the same problem. How can I fix it?




[GitHub] rahul003 commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
rahul003 commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148720629
 
 

 ##########
 File path: src/kvstore/comm.h
 ##########
 @@ -46,8 +64,8 @@ class Comm {
   /**
    * \brief init key with the data shape and storage shape
    */
 -  virtual void Init(int key, const NDArrayStorageType stype,
 -                    const TShape& shape, int dtype = mshadow::kFloat32) = 0;
 +  virtual void Init(int key, const NDArrayStorageType stype, const TShape& shape,
 +                    int dtype = mshadow::kFloat32, Context pinned_ctx = Context::CPUPinned(0)) = 0;
 
 Review comment:
   Are you ever changing the pinned context to something other than CPU? Or is 
this change just to generalize the function?




[GitHub] rahul003 commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
rahul003 commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148724498
 
 

 ##########
 File path: src/kvstore/comm.h
 ##########
 @@ -635,6 +656,302 @@ class CommDevice : public Comm {
   bool inited_;
 };
 
+#if MXNET_USE_NCCL
+class CommNCCL : public Comm {
+ public:
+  CommNCCL() {
+    inited_ = false;
+    pinned_ctx_ = Context::CPUPinned(0);
+  }
+
+  virtual ~CommNCCL() {
+    for (auto e : nccl_data_) {
+      cudaStreamDestroy(e.second.stream);
+      ncclCommDestroy(e.second.comm);
+    }
+  }
+
+  void Init(int key, const NDArrayStorageType stype, const TShape& shape,
+            int dtype = mshadow::kFloat32,
+            Context pinned_ctx = Context::CPUPinned(0)) override {
+    if (stype == kDefaultStorage) {
+      sorted_key_attrs_.push_back(std::make_tuple(key, shape, dtype));
+    } else {
+      LOG(FATAL) << "NCCL KVStore does not support sparse storage type";
+    }
+  }
+
+  const NDArray& Reduce(int key, const std::vector<NDArray>& src,
+                        int priority) override {
+    // avoid extra copy for single device, but it may bring problems for
+    // abnormal usage of kvstore
+    if (src.size() == 1) {
+      return src[0];
+    }
+
+    if (!inited_) {
+      std::vector<Context> devs;
+      for (const auto& a : src) {
+        devs.push_back(a.ctx());
+      }
+      InitNCCL(devs);
+      InitMergeBuffer(devs);
+    }
+
+    std::vector<int> dev_ids;
+    for (auto e : src) {
+      dev_ids.push_back(e.ctx().dev_id);
+    }
+    std::sort(dev_ids.begin(), dev_ids.end());
+    CHECK(device_ids_ == dev_ids) << "NCCL KVStore supports only single set of devices";
+
+    auto& buf = merge_buf_[key];
+    int root = buf.merged.ctx().dev_id;
+    size_t root_id = -1;
+    for (size_t i = 0; i < src.size(); ++i) {
+      if (src[i].ctx().dev_id == root) {
+        root_id = i;
+        break;
+      }
+    }
+
+    auto& reduce = buf.merged;
+
+    std::vector<Engine::VarHandle> const_vars;
+    for (size_t i = 0; i < src.size(); ++i) {
+      const_vars.push_back(src[i].var());
+    }
+    Engine::Get()->PushSync([src, reduce, root_id, this](RunContext rctx) {
+      {
+        std::lock_guard<std::mutex> l(Storage::Get()->GetMutex(Context::kGPU));
+        int root = nccl_data_[src[root_id].ctx().dev_id].rank;
+        ncclGroupStart();
+        for (size_t i = 0; i < src.size(); ++i) {
+          NCCLEntry cur = nccl_data_[src[i].ctx().dev_id];
+          if (i == root_id) {
+            MSHADOW_TYPE_SWITCH(src[i].dtype(), DType,
+                ncclReduce(src[i].data().dptr<DType>(),
+                           reduce.data().dptr<DType>(),
+                           src[i].shape().Size(),
+                           GetNCCLType(src[i].dtype()),
+                           ncclSum,
+                           root,
+                           cur.comm,
+                           cur.stream););
+          } else {
+            MSHADOW_TYPE_SWITCH(src[i].dtype(), DType,
+                ncclReduce(src[i].data().dptr<DType>(),
+                           NULL,
+                           src[i].shape().Size(),
+                           GetNCCLType(src[i].dtype()),
+                           ncclSum,
+                           root,
+                           cur.comm,
+                           cur.stream););
+          }
+        }
+        ncclGroupEnd();
+      }
+    },
+    Context::CPU(),
+    const_vars,
+    {reduce.var()},
+    FnProperty::kCPUPrioritized,
+    priority,
+    PROFILER_MESSAGE("KVStoreReduce"));
+
+    return buf.merged;
+  }
+
+  void CommSync(const std::vector<NDArray*>& dst,
+                int priority) override {
+    std::vector<Engine::VarHandle> const_vars;
+    std::vector<Engine::VarHandle> mutate_vars;
+    for (size_t i = 0; i < dst.size(); ++i) {
+      mutate_vars.push_back(dst[i]->var());
+    }
+    Engine::Get()->PushSync([this](RunContext rctx) {
+      for (auto cur : nccl_data_) {
+        CUDA_CALL(cudaSetDevice(cur.second.dev_id));
+        CUDA_CALL(cudaStreamSynchronize(cur.second.stream));
+      }
+    },
+    Context::CPU(),
+    const_vars,
+    mutate_vars,
+    FnProperty::kCPUPrioritized,
+    priority,
+    PROFILER_MESSAGE("KVStoreStreamSync"));
+  }
+
+  void CommSync(const std::vector<NDArray>& dst,
+                int priority) override {
+    std::vector<Engine::VarHandle> const_vars;
+    std::vector<Engine::VarHandle> mutate_vars;
+    for (size_t i = 0; i < dst.size(); ++i) {
+      mutate_vars.push_back(dst[i].var());
+    }
+    Engine::Get()->PushSync([this](RunContext rctx) {
+      for (auto cur : nccl_data_) {
+        CUDA_CALL(cudaSetDevice(cur.second.dev_id));
+        CUDA_CALL(cudaStreamSynchronize(cur.second.stream));
+      }
+    },
+    Context::CPU(),
+    const_vars,
+    mutate_vars,
+    FnProperty::kCPUPrioritized,
+    priority,
+

[GitHub] rahul003 commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
rahul003 commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148692803
 
 

 ##########
 File path: src/kvstore/comm.h
 ##########
 [quoted diff context omitted; identical to the src/kvstore/comm.h excerpt quoted in full above]
 
 Review comment:
   Do you want to check here that the set of devices doesn't change during 
training?




[GitHub] rahul003 commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
rahul003 commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148720845
 
 

 ##########
 File path: src/kvstore/comm.h
 ##########
 [quoted diff context omitted; identical to the src/kvstore/comm.h excerpt quoted in full above]
 
   Ya, please indent the Engine Pushes similar to how it is in the existing 
code base. 
   Also, it might be better to move the contents of callback to a different 
function to make it cleaner.




[GitHub] rahul003 commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
rahul003 commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148723493
 
 

 ##########
 File path: src/kvstore/comm.h
 ##########
 [quoted diff context omitted; identical to the src/kvstore/comm.h excerpt quoted in full above]

[GitHub] rahul003 commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
rahul003 commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148724330
 
 

 ##########
 File path: src/kvstore/comm.h
 ##########
 [quoted diff context omitted; identical to the src/kvstore/comm.h excerpt quoted in full above]

[GitHub] rahul003 commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
rahul003 commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148699052
 
 

 ##########
 File path: src/kvstore/comm.h
 ##########
 [quoted diff context omitted; identical to the src/kvstore/comm.h excerpt quoted in full above]

[GitHub] rahul003 commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
rahul003 commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148720552
 
 

 ##########
 File path: src/kvstore/comm.h
 ##########
 @@ -32,6 +35,21 @@
 #include "mxnet/ndarray.h"
 #include "../ndarray/ndarray_function.h"
 #include "../operator/tensor/sparse_retain-inl.h"
+
+#if MXNET_USE_NCCL
+#include "../common/cuda_utils.h"
+
+#ifndef NCCL_MAJOR
+#define NCCL_MAJOR 1
 
 Review comment:
   Could you add a comment explaining this?




[GitHub] rahul003 commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
rahul003 commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148691972
 
 

 ##########
 File path: src/kvstore/comm.h
 ##########
 @@ -32,6 +35,21 @@
 #include "mxnet/ndarray.h"
 #include "../ndarray/ndarray_function.h"
 #include "../operator/tensor/sparse_retain-inl.h"
+
+#if MXNET_USE_NCCL
 
 Review comment:
   Can't we merge the two ```#if MXNET_USE_NCCL``` blocks?




[GitHub] rahul003 commented on a change in pull request #8294: NCCL integration

2017-11-03 Thread GitBox
rahul003 commented on a change in pull request #8294: NCCL integration
URL: https://github.com/apache/incubator-mxnet/pull/8294#discussion_r148698941
 
 

 ##
 File path: src/kvstore/comm.h
 ##
 @@ -635,6 +656,302 @@ class CommDevice : public Comm {
   bool inited_;
 };
 
+#if MXNET_USE_NCCL
+class CommNCCL : public Comm {
+ public:
+  CommNCCL() {
+inited_ = false;
+pinned_ctx_ = Context::CPUPinned(0);
+  }
+
+  virtual ~CommNCCL() {
+for (auto e : nccl_data_) {
+  cudaStreamDestroy(e.second.stream);
+  ncclCommDestroy(e.second.comm);
+}
+  }
+
+  void Init(int key, const NDArrayStorageType stype, const TShape& shape,
+int dtype = mshadow::kFloat32, Context pinned_ctx = 
Context::CPUPinned(0)) override {
+if (stype == kDefaultStorage) {
+  sorted_key_attrs_.push_back(std::make_tuple(key, shape, dtype));
+} else {
+  LOG(FATAL) << "NCCL KVStore does not support sparse storage type";
+}
+  }
+
+  const NDArray& Reduce(int key, const std::vector& src,
+int priority) override {
+// avoid extra copy for single device, but it may bring problems for
+// abnormal usage of kvstore
+if (src.size() == 1) {
+  return src[0];
+}
+
+if (!inited_) {
+  std::vector devs;
+  for (const auto& a : src) {
+devs.push_back(a.ctx());
+  }
+  InitNCCL(devs);
+  InitMergeBuffer(devs);
+}
+
+std::vector dev_ids;
+for (auto e : src) {
+  dev_ids.push_back(e.ctx().dev_id);
+}
+std::sort(dev_ids.begin(), dev_ids.end());
+CHECK(device_ids_ == dev_ids) << "NCCL KVStore supports only single set of 
devices";
+
+auto& buf = merge_buf_[key];
+int root = buf.merged.ctx().dev_id;
+size_t root_id = -1;
+for (size_t i = 0; i < src.size(); ++i) {
+  if (src[i].ctx().dev_id == root) {
+root_id = i;
+break;
+  }
+}
+
+auto& reduce = buf.merged;
+
+std::vector const_vars;
+for (size_t i = 0; i < src.size(); ++i) {
+  const_vars.push_back(src[i].var());
+}
+Engine::Get()->PushSync([src, reduce, root_id, this](RunContext rctx) {
+  {
+std::lock_guard 
l(Storage::Get()->GetMutex(Context::kGPU));
+int root = nccl_data_[src[root_id].ctx().dev_id].rank;
+ncclGroupStart();
+for (size_t i = 0; i < src.size(); ++i) {
+  NCCLEntry cur = nccl_data_[src[i].ctx().dev_id];
+  if (i == root_id) {
+  MSHADOW_TYPE_SWITCH(src[i].dtype(), DType,
+  ncclReduce(src[i].data().dptr(),
+reduce.data().dptr(),
+src[i].shape().Size(),
+GetNCCLType(src[i].dtype()),
+ncclSum,
+root,
+cur.comm,
+cur.stream););
+  } else {
+  MSHADOW_TYPE_SWITCH(src[i].dtype(), DType,
+  ncclReduce(src[i].data().dptr(),
+NULL,
+src[i].shape().Size(),
+GetNCCLType(src[i].dtype()),
+ncclSum,
+root,
+cur.comm,
+cur.stream););
+  }
+}
+ncclGroupEnd();
+  }
+},
+Context::CPU(),
+const_vars,
+{reduce.var()},
+FnProperty::kCPUPrioritized,
+priority,
+PROFILER_MESSAGE("KVStoreReduce"));
+
+return buf.merged;
+  }
+
+  void CommSync(const std::vector& dst,
+int priority) override {
+std::vector const_vars;
+std::vector mutate_vars;
+for (size_t i = 0; i < dst.size(); ++i) {
+mutate_vars.push_back(dst[i]->var());
+}
+Engine::Get()->PushSync([this](RunContext rctx) {
+  for (auto cur : nccl_data_) {
+CUDA_CALL(cudaSetDevice(cur.second.dev_id));
+CUDA_CALL(cudaStreamSynchronize(cur.second.stream));
+  }
+},
+Context::CPU(),
+const_vars,
+mutate_vars,
+FnProperty::kCPUPrioritized,
+priority,
+PROFILER_MESSAGE("KVStoreStreamSync"));
+  }
+
+  void CommSync(const std::vector& dst,
+int priority) override {
+std::vector const_vars;
+std::vector mutate_vars;
+for (size_t i = 0; i < dst.size(); ++i) {
+mutate_vars.push_back(dst[i].var());
+}
+Engine::Get()->PushSync([this](RunContext rctx) {
+  for (auto cur : nccl_data_) {
+CUDA_CALL(cudaSetDevice(cur.second.dev_id));
+CUDA_CALL(cudaStreamSynchronize(cur.second.stream));
+  }
+},
+Context::CPU(),
+const_vars,
+mutate_vars,
+FnProperty::kCPUPrioritized,
+priority,
+

[GitHub] edmBernard commented on issue #8427: Bug: ThreadedEnginePerDevice: speed variation with mixed Mxnet API and Amalgamation API

2017-11-03 Thread GitBox
edmBernard commented on issue #8427: Bug: ThreadedEnginePerDevice: speed 
variation with mixed Mxnet API and Amalgamation API
URL: 
https://github.com/apache/incubator-mxnet/issues/8427#issuecomment-341637172
 
 
   I just realized that ThreadedEnginePerDevice should have the same impact as 
NaiveEngine, since I don't use a GPU :(
   In separate scripts I got these timings:
   
   | MXNET_ENGINE_TYPE | API | time (s) |
   | -- | -- | -- |
   | `NaiveEngine` | Module API | 12.8s |
   | `NaiveEngine` | Amalgamation | 12.3s |
   | `ThreadedEngine` | Module API | 0.9s |
   | `ThreadedEngine` | Amalgamation | 1.1s |
   
   If I launch the same API several times in the same script, I get the same 
time on every iteration.
   
   The times change if I mix Amalgamation and the Module API in the same script:
   
   | MXNET_ENGINE_TYPE | API | time (s) |
   | -- | -- | -- |
   | `NaiveEngine` | Amalgamation / Module API / Amalgamation | 12.6s / 12.9s / 12.6s |
   | `ThreadedEngine` | Amalgamation / Module API / Amalgamation | 1.4s / 0.9s / 0.9s |
   | `ThreadedEnginePerDevice` | Amalgamation / Module API / Amalgamation | 12.8s / 0.7s / 0.78s |
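   
   For readers reproducing these numbers, a minimal sketch of how the engine is 
selected: MXNET_ENGINE_TYPE must be set before mxnet is imported. The benchmark 
body below is an illustrative stand-in, not the script from this issue.
   
   ```python
   import os
   # Must be set before importing mxnet; valid values include
   # 'NaiveEngine', 'ThreadedEngine', and 'ThreadedEnginePerDevice'.
   os.environ['MXNET_ENGINE_TYPE'] = 'ThreadedEngine'
   
   import time
   import mxnet as mx
   
   # Illustrative CPU workload standing in for the Module/Amalgamation runs above.
   x = mx.nd.ones((1024, 1024))
   start = time.time()
   for _ in range(100):
       x = mx.nd.dot(x, x) * 1e-3   # queue work on the engine
   x.wait_to_read()                 # block until the engine drains the queue
   print('elapsed: %.2fs' % (time.time() - start))
   ```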
   
   
   
   




[GitHub] liumilan commented on issue #8500: program crash when run sparse model predict

2017-11-03 Thread GitBox
liumilan commented on issue #8500: program crash when run sparse model predict
URL: 
https://github.com/apache/incubator-mxnet/issues/8500#issuecomment-341638446
 
 
   OK.
   And one more question:
   the training data size is now 98994453 and the batch size is 200, and it 
takes 2 hours to finish 1 epoch, which seems too slow.
   Is it possible to speed up sparse training?
   Other sparse LR trainers should be much faster.
   @eric-haibin-lin 
   




[GitHub] roggiezhang commented on issue #8494: Autograd bug in mxnet-cu80: 0.12

2017-11-03 Thread GitBox
roggiezhang commented on issue #8494: Autograd bug in mxnet-cu80: 0.12
URL: 
https://github.com/apache/incubator-mxnet/issues/8494#issuecomment-341635996
 
 
   There is no need to adjust the weights of vgg, since I only use it for 
feature extraction. The code logic is OK: I already made it work on 0.11.3, but 
not on 0.12, which is why I started this thread. I cannot find anything useful 
in the source code of the Python API. I debugged and found the difference after 
[executing the 
graph](https://github.com/apache/incubator-mxnet/blob/396943e22661f03867d103d134416541e7e4f2bb/python/mxnet/gluon/block.py#L394).
 On 0.12, the grad that I attached disappears after this line; on 0.11.3 it does not.
   
   By the way, here is the rough logic of my code:
   
   1. Build the pre-trained vgg network.
   2. Get the input and output symbols of interest.
   3. Use those inputs and outputs to build a SymbolBlock.
   4. Open autograd.record(), attach the grad for the input, and feed the input 
to the SymbolBlock created above. Lastly, compute the loss and run the backward 
pass (see the sketch at the end of this message).
   
   In theory, I should get a correct grad for the input (that's how it works in 
0.11.3), but it's broken in 0.12. Since it's related to native code, I'd have 
to ask you guys for help.
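   
   For context, a minimal sketch of the pattern in steps 1-4 above; the internal 
layer name and the loss are illustrative placeholders, not the code from this 
issue:
   
   ```python
   import mxnet as mx
   from mxnet import autograd, gluon
   
   # 1.-3. Build a feature extractor from a pre-trained vgg by slicing its symbol.
   vgg = gluon.model_zoo.vision.vgg19(pretrained=True)
   data = mx.sym.var('data')
   out = vgg(data).get_internals()['vgg0_dense0_fwd_output']  # hypothetical internal output
   feat_net = gluon.SymbolBlock(out, data, params=vgg.collect_params())
   
   # 4. Attach a grad to the input, record, and backpropagate to the input.
   x = mx.nd.random_uniform(shape=(1, 3, 224, 224))
   x.attach_grad()
   with autograd.record():
       feat = feat_net(x)
       loss = (feat ** 2).sum()  # stand-in for the real loss
   loss.backward()
   print(x.grad)  # expected: the gradient w.r.t. the input; the report above says it vanishes on 0.12
   ```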




[GitHub] eric-haibin-lin closed issue #8442: some questions about sparse lr examples

2017-11-03 Thread GitBox
eric-haibin-lin closed issue #8442: some questions about sparse lr examples
URL: https://github.com/apache/incubator-mxnet/issues/8442
 
 
   




[GitHub] szha commented on issue #8494: Autograd bug in mxnet-cu80: 0.12

2017-11-03 Thread GitBox
szha commented on issue #8494: Autograd bug in mxnet-cu80: 0.12
URL: 
https://github.com/apache/incubator-mxnet/issues/8494#issuecomment-341633045
 
 
   I haven't had the time to look into the details of the code yet. I found 
[the part that your pseudo-code corresponds 
to](https://github.com/roggiezhang-nv/mxnet-sample/blob/master/python/ml/vision/style_transfer/classic.py#L103-L111),
 though I haven't found any apparent problem. How are you adjusting the weights, 
given that your code doesn't use a trainer?
   
   BTW you may want to check out @zhanghang1989's work on 
[style-transfer](https://github.com/zhanghang1989/MXNet-Gluon-Style-Transfer).




[GitHub] Prasad9 commented on issue #8484: MXNet-to-CoreML module: Fixed KeyError for Reshape

2017-11-03 Thread GitBox
Prasad9 commented on issue #8484: MXNet-to-CoreML module: Fixed KeyError for 
Reshape
URL: https://github.com/apache/incubator-mxnet/pull/8484#issuecomment-341361564
 
 
   @tqchen , @piiswrong , I am using MXNet version 0.12.0. When I save the 
symbol file in JSON format, I get output in this format for various types of 
layers:
   
   ```
   {
     "op": "Convolution",
     "name": "convolution5",
     "attr": {
       "kernel": "(3, 3)",
       "num_filter": "16",
       "stride": "(1, 1)"
     },
     "inputs": [[6, 0, 0], [7, 0, 0], [8, 0, 0]]
   },
   {
     "op": "Activation",
     "name": "activation8",
     "attr": {"act_type": "relu"},
     "inputs": [[15, 0, 0]]
   },
   {
     "op": "FullyConnected",
     "name": "fullyconnected5",
     "attr": {"num_hidden": "2"},
     "inputs": [[16, 0, 0], [17, 0, 0], [18, 0, 0]]
   },
   {
     "op": "Reshape",
     "name": "reshape1",
     "attr": {"shape": "(0, 2)"},
     "inputs": [[19, 0, 0]]
   },
   {
     "op": "transpose",
     "name": "transpose0",
     "attr": {"axes": "(1, 0)"},
     "inputs": [[20, 0, 0]]
   },
   {
     "op": "SoftmaxOutput",
     "name": "softmax",
     "inputs": [[22, 0, 0], [23, 0, 0]]
   }
   ```
   So all these layers currently use the keyword `attr` when the symbol file is 
generated with MXNet 0.12.0. I checked the 
[_layers.py](https://github.com/apache/incubator-mxnet/blob/master/tools/coreml/converter/_layers.py)
 file in various branches like dev, master, ci-test, mli-patch-1, etc., and in 
all these branches the code for the other layers (i.e. transpose, reshape, 
activation, etc.) seems to use `attr` only. I don't understand why you want it 
replaced with `attrs`.
   
   Otherwise, please let me know if you are referring to some other `attr` that 
has to be replaced with `attrs`.
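   
   As an aside, a small sketch of how a converter can tolerate both spellings; 
later MXNet versions serialize operator parameters under `attrs` where 0.12.0 
writes `attr`. The tiny network below is illustrative:
   
   ```python
   import json
   import mxnet as mx
   
   # Build a tiny symbol and inspect its serialized nodes.
   data = mx.sym.var('data')
   net = mx.sym.FullyConnected(data, num_hidden=2, name='fullyconnected5')
   nodes = json.loads(net.tojson())['nodes']
   
   for node in nodes:
       # Fall back between the two keys so the converter works across versions.
       params = node.get('attrs', node.get('attr', {}))
       print(node['op'], node['name'], params)
   ```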



