[incubator-mxnet] branch master updated: Add LANS optimizer (#18620)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git The following commit(s) were added to refs/heads/master by this push: new d6c3578 Add LANS optimizer (#18620) d6c3578 is described below commit d6c35785a870ac6e0b42903d7e27de2c9a6efdbe Author: Shuai Zheng AuthorDate: Sat Jun 27 13:25:03 2020 -0700 Add LANS optimizer (#18620) * add lans optimizer * fix * fix Co-authored-by: Zheng --- python/mxnet/ndarray/contrib.py | 78 +++ python/mxnet/optimizer/__init__.py | 6 +- python/mxnet/optimizer/lans.py | 220 ++ src/operator/contrib/multi_lans-inl.h | 385 src/operator/contrib/multi_lans.cc | 267 ++ src/operator/contrib/multi_lans.cu | 287 src/operator/contrib/multi_sum_sq-inl.h | 13 +- src/operator/contrib/multi_sum_sq.cc| 20 +- src/operator/contrib/multi_sum_sq.cu| 16 +- tests/python/unittest/test_optimizer.py | 30 +++ 10 files changed, 1302 insertions(+), 20 deletions(-) diff --git a/python/mxnet/ndarray/contrib.py b/python/mxnet/ndarray/contrib.py index 2ff422f..0975013 100644 --- a/python/mxnet/ndarray/contrib.py +++ b/python/mxnet/ndarray/contrib.py @@ -680,3 +680,81 @@ def multi_mp_lamb_update(weights, grads, mean, var, weights32, step_count, learning_rates=lrs, wds=wds, **kwargs) + + +def multi_lans_update(weights, grads, mean, var, step_count, + lrs, wds, out=None, num_tensors=0, **kwargs): +"""Given a list of gradients, update weights, mean and variance of multiple tensors +following LANS Optimizer implementation. 
+ +Parameters +-- +weights : List of NDArrays containing the input weights of multiple tensors + +grads : List of NDArrays containing input gradients + +mean : List of NDArrays containing mean of multiple tensors to be updated + +var : List of NDArrays containing variance of multiple tensors to be updated + +step_count : List of scalars with the number of update step for each tensor + +lrs : List of learning rates (one for each tensor) + +wds : List of weight decays (one for each tensor) + +out: List of NDArrays where the updated weights will be stored + +num_tensors : Number of NDArrays/tensors in the list +""" + +if not num_tensors: +num_tensors = len(weights) +temp_list = _flatten_list(zip(weights, grads, mean, var)) +return ndarray._internal._multi_lans_update(*temp_list, +out=out, +num_tensors=num_tensors, +step_count=step_count, +learning_rates=lrs, +wds=wds, +**kwargs) + + +def multi_mp_lans_update(weights, grads, mean, var, weights32, step_count, + lrs, wds, out=None, num_tensors=0, **kwargs): +"""Given a list of gradients, update weights, mean and variance of multiple tensors +following LANS Optimizer implementation, and using Mixed-Precision. 
+ +Parameters +-- +weights : List of NDArrays containing the input weights of multiple tensors + +grads : List of NDArrays containing input gradients + +mean : List of NDArrays containing mean of multiple tensors to be updated + +var : List of NDArrays containing variance of multiple tensors to be updated + +weights32 : Master copy of weights in FP32 + +step_count : List of scalars with the number of update step for each tensor + +lrs : List of learning rates (one for each tensor) + +wds : List of weight decays (one for each tensor) + +out: List of NDArrays where the updated weights will be stored + +num_tensors : Number of NDArrays/tensors in the list +""" + +if not num_tensors: +num_tensors = len(weights) +temp_list = _flatten_list(zip(weights, grads, mean, var, weights32)) +return ndarray._internal._multi_mp_lans_update(*temp_list, + out=out, + num_tensors=num_tensors, + step_count=step_count, + learning_rates=lrs, + wds=wds, + **kwargs) diff --git a/python/mxnet/optimizer/__init__.py b/python/mxnet/opti
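The `_flatten_list(zip(...))` call in the new helpers interleaves the per-tensor argument lists before handing them to the fused `_multi_lans_update` kernel, which expects its arguments as `(weight_0, grad_0, mean_0, var_0, weight_1, grad_1, ...)`. A minimal stand-alone sketch of that interleaving (plain Python, illustrative values only — `flatten_list` here is a stand-in for the private MXNet helper):

```python
# Stand-in for mxnet.ndarray.contrib._flatten_list: flattens the tuples
# produced by zip() into one interleaved argument list for the fused kernel.
def flatten_list(nested):
    return [item for group in nested for item in group]

# Illustrative placeholders for the per-tensor NDArray lists.
weights = ["w0", "w1"]
grads   = ["g0", "g1"]
mean    = ["m0", "m1"]
var     = ["v0", "v1"]

temp_list = flatten_list(zip(weights, grads, mean, var))
print(temp_list)  # ['w0', 'g0', 'm0', 'v0', 'w1', 'g1', 'm1', 'v1']
```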
[incubator-mxnet] branch master updated: add epsilon to adamax (#18532)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git The following commit(s) were added to refs/heads/master by this push: new e4c93e3 add epsilon to adamax (#18532) e4c93e3 is described below commit e4c93e3e3a68559cb38e4ff92c9e0bf9c9cdd0bf Author: Shuai Zheng AuthorDate: Wed Jun 24 22:03:39 2020 -0700 add epsilon to adamax (#18532) Co-authored-by: Ubuntu --- python/mxnet/optimizer/adamax.py | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/python/mxnet/optimizer/adamax.py b/python/mxnet/optimizer/adamax.py index a2ffd9c..d7bc2d1 100644 --- a/python/mxnet/optimizer/adamax.py +++ b/python/mxnet/optimizer/adamax.py @@ -37,7 +37,7 @@ class Adamax(Optimizer): grad = clip(grad * rescale_grad, clip_gradient) + wd * weight m = beta1 * m_t + (1 - beta1) * grad u = maximum(beta2 * u, abs(grad)) -weight -= lr / (1 - beta1**t) * m / u +weight -= lr / (1 - beta1**t) * m / (u + epsilon) This optimizer accepts the following parameters in addition to those accepted by :class:`.Optimizer`. @@ -58,13 +58,14 @@ class Adamax(Optimizer): When use_fused_step=False, step is called, otherwise, fused_step is called. """ -def __init__(self, learning_rate=0.002, beta1=0.9, beta2=0.999, +def __init__(self, learning_rate=0.002, beta1=0.9, beta2=0.999, epsilon=1e-8, use_fused_step=False, **kwargs): super(Adamax, self).__init__(learning_rate=learning_rate, use_fused_step=use_fused_step, **kwargs) self.beta1 = beta1 self.beta2 = beta2 +self.epsilon = epsilon def create_state(self, index, weight): return (zeros(weight.shape, weight.context, dtype=weight.dtype), # mean @@ -107,5 +108,5 @@ class Adamax(Optimizer): var[:] = maximum(self.beta2 * var, NDabs(grad)) # update weight -d = mean / var +d = mean / (var + self.epsilon) weight[:] -= lr * d
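The fix can be checked against a plain-NumPy transcription of the documented update rule. `adamax_step` below is an illustrative sketch, not the MXNet API; it mirrors the docstring formulas from the diff, including the newly added `epsilon` that keeps the update finite when `u` is zero:

```python
import numpy as np

def adamax_step(weight, grad, m, u, t, lr=0.002, beta1=0.9, beta2=0.999,
                epsilon=1e-8, wd=0.0, rescale_grad=1.0):
    # Update rule as documented in the patched Adamax docstring.
    grad = grad * rescale_grad + wd * weight
    m = beta1 * m + (1 - beta1) * grad
    u = np.maximum(beta2 * u, np.abs(grad))
    # The epsilon term is what this commit adds: without it, a zero
    # gradient history (u == 0) produces a 0/0 division.
    weight = weight - lr / (1 - beta1 ** t) * m / (u + epsilon)
    return weight, m, u

# With an all-zero gradient, u stays 0; epsilon keeps the step finite.
w, m, u = adamax_step(np.zeros(3), np.zeros(3), np.zeros(3), np.zeros(3), t=1)
```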
[incubator-mxnet] branch master updated (74fcb99 -> 4b86c32)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 74fcb99 redirect api reference on v-master to v1.6 (#18607) add 4b86c32 Allow input reordering duing Gluon / CachedOp graph transformations (#17949) No new revisions were added by this update. Summary of changes: src/imperative/cached_op.cc| 58 ++ src/imperative/cached_op.h | 34 +--- src/imperative/cached_op_threadsafe.cc | 4 ++- src/imperative/naive_cached_op.cc | 3 +- tests/python/gpu/test_fusion.py| 35 5 files changed, 94 insertions(+), 40 deletions(-)
[incubator-mxnet] branch master updated: Allow input reordering during Gluon / CachedOp graph transformations (#17949)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git The following commit(s) were added to refs/heads/master by this push: new 4b86c32 Allow input reordering duing Gluon / CachedOp graph transformations (#17949) 4b86c32 is described below commit 4b86c32832a994e76b97dfc58c8a672db87e721d Author: mk-61 <56651474+mk...@users.noreply.github.com> AuthorDate: Tue Jun 23 13:49:06 2020 -0700 Allow input reordering duing Gluon / CachedOp graph transformations (#17949) * Initial commit of input reordering in Gluon * Add test for Gluon input reorder * Fix backward in CachedOp for input reordering * Fix test_input_reorder for backward pass * Fix merge error in NaiveCachedOp * Include correct header for std::iota Co-authored-by: Vladimir Cherepanov --- src/imperative/cached_op.cc| 58 ++ src/imperative/cached_op.h | 34 +--- src/imperative/cached_op_threadsafe.cc | 4 ++- src/imperative/naive_cached_op.cc | 3 +- tests/python/gpu/test_fusion.py| 35 5 files changed, 94 insertions(+), 40 deletions(-) diff --git a/src/imperative/cached_op.cc b/src/imperative/cached_op.cc index 83e8d31..7b3a5d3 100644 --- a/src/imperative/cached_op.cc +++ b/src/imperative/cached_op.cc @@ -147,10 +147,9 @@ bool CachedOp::CheckDynamicShapeExists(const Context& default_ctx, auto& state = state_ptr.get_state(); nnvm::Graph& g = state.info.fwd_graph; - ShapeVector shape_inputs; - shape_inputs.reserve(inputs.size()); - for (auto input : inputs) { -shape_inputs.emplace_back(input->shape()); + ShapeVector shape_inputs(inputs.size()); + for (size_t i = 0; i < inputs.size(); ++i) { +shape_inputs[i] = inputs[state.info.input_map[i]]->shape(); } // We leverage the shape inference pass to detect whether dynamic shape exists. 
// If so, the pass will fail with `contain_dynamic_shape = true`, @@ -176,16 +175,13 @@ bool CachedOp::SetForwardGraph( CHECK_EQ(inputs.size(), num_inputs()); nnvm::Graph& g = info->fwd_graph; - ShapeVector shape_inputs; - DTypeVector dtype_inputs; - StorageTypeVector storage_type_inputs; - shape_inputs.reserve(inputs.size()); - dtype_inputs.reserve(inputs.size()); - storage_type_inputs.reserve(inputs.size()); - for (auto input : inputs) { -shape_inputs.emplace_back(input->shape()); -dtype_inputs.emplace_back(input->dtype()); -storage_type_inputs.emplace_back(input->storage_type()); + ShapeVector shape_inputs(inputs.size()); + DTypeVector dtype_inputs(inputs.size()); + StorageTypeVector storage_type_inputs(inputs.size()); + for (size_t i = 0; i < inputs.size(); ++i) { +shape_inputs[i] = inputs[info->input_map[i]]->shape(); +dtype_inputs[i] = inputs[info->input_map[i]]->dtype(); +storage_type_inputs[i] = inputs[info->input_map[i]]->storage_type(); } bool match = true; @@ -321,9 +317,10 @@ bool CachedOp::SetBackwardGraph( if (info->bwd_input_eid[i] == kEidNotExist) { continue; } -shapes[info->bwd_input_eid[i]] = inputs[i]->shape(); -dtypes[info->bwd_input_eid[i]] = inputs[i]->dtype(); -stypes[info->bwd_input_eid[i]] = inputs[i]->storage_type(); +size_t oi = BwdOriginalInput(info->input_map, i); +shapes[info->bwd_input_eid[i]] = inputs[oi]->shape(); +dtypes[info->bwd_input_eid[i]] = inputs[oi]->dtype(); +stypes[info->bwd_input_eid[i]] = inputs[oi]->storage_type(); } std::pair node_range, entry_range; @@ -649,22 +646,22 @@ OpStatePtr CachedOp::StaticForward( if (config_.static_shape) { for (auto i : config_.param_indices) { auto nid = idx.input_nodes()[i]; - if (!arrays[idx.entry_id(nid, 0)]->IsSame(*inputs[i])) { + if (!arrays[idx.entry_id(nid, 0)]->IsSame(*inputs[state.info.input_map[i]])) { match = false; auto ptr = &state.buff[idx.entry_id(nid, 0)]; CHECK_EQ(arrays[idx.entry_id(nid, 0)], ptr); -*arrays[idx.entry_id(nid, 0)] = *inputs[i]; +*arrays[idx.entry_id(nid, 
0)] = *inputs[state.info.input_map[i]]; state.dynamic_entries[idx.entry_id(nid, 0)] = false; } } for (auto i : config_.data_indices) { auto eid = idx.entry_id(idx.input_nodes()[i], 0); - arrays[eid] = inputs[i]; + arrays[eid] = inputs[state.info.input_map[i]]; } } else { for (size_t i = 0; i < num_inputs(); ++i) { auto nid = idx.input_nodes()[i]; - arrays[idx.entry_id(nid, 0)] = inputs[i]; + arrays[idx.entry_id(nid, 0)] = inputs[state.info.input_map[i]]; } } @@ -714,6 +711,7 @@ OpStatePtr CachedOp::DynamicForward( std::lock_guard lock(state.mutex)
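The recurring pattern in this patch is one level of indirection: position `i` in the transformed graph is resolved through `input_map` back to the caller's original argument order, so `shape_inputs[i]` reads `inputs[input_map[i]]` rather than `inputs[i]`. A toy Python illustration of the same lookup (all names and values hypothetical):

```python
# (name, shape) pairs in the caller's original argument order.
inputs = [("x", (2, 3)), ("w", (3, 4)), ("b", (4,))]

# After a graph transformation the op consumes them in a different order;
# input_map[i] gives the original position of the i-th transformed input.
input_map = [2, 0, 1]  # transformed graph wants b, x, w

# Mirrors: shape_inputs[i] = inputs[input_map[i]]->shape()
shape_inputs = [inputs[input_map[i]][1] for i in range(len(inputs))]
print(shape_inputs)  # [(4,), (2, 3), (3, 4)]
```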
[incubator-mxnet] branch master updated (a1db5b2 -> 5df0025)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from a1db5b2 Update .codecov.yml (#18497) add 5df0025 Fix race condition in FusedOp (#18498) No new revisions were added by this update. Summary of changes: src/operator/fusion/fused_op.cc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
[incubator-mxnet] branch master updated: Fix race condition in FusedOp (#18498)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git The following commit(s) were added to refs/heads/master by this push: new 5df0025 Fix race condition in FusedOp (#18498) 5df0025 is described below commit 5df002567dd2e9ebcfeb620a9ba55adbded743da Author: Przemyslaw Tredak AuthorDate: Fri Jun 5 19:55:06 2020 -0700 Fix race condition in FusedOp (#18498) --- src/operator/fusion/fused_op.cc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/operator/fusion/fused_op.cc b/src/operator/fusion/fused_op.cc index 2ac0b53..ee470cf 100644 --- a/src/operator/fusion/fused_op.cc +++ b/src/operator/fusion/fused_op.cc @@ -61,6 +61,7 @@ FusedOp::FusedOp(const nnvm::NodeAttrs* attrs, const FusedOpConfig& config) : bool FusedOp::InferShape(const nnvm::NodeAttrs &attrs, std::vector *in_attrs, std::vector *out_attrs) { + std::lock_guard lock(my_mutex_); subgraph_.attrs.erase("shape"); subgraph_.attrs.erase("shape_inputs"); std::vector input_shapes(*in_attrs); @@ -95,7 +96,6 @@ bool FusedOp::InferShape(const nnvm::NodeAttrs &attrs, inferred = inferred && !op::shape_is_none(attr); } if (inferred) { -std::lock_guard lock(my_mutex_); intermediate_shapes_.push_back({*in_attrs, *out_attrs, shapes}); } return inferred; @@ -104,6 +104,7 @@ bool FusedOp::InferShape(const nnvm::NodeAttrs &attrs, bool FusedOp::InferType(const nnvm::NodeAttrs &attrs, std::vector *in_attrs, std::vector *out_attrs) { + std::lock_guard lock(my_mutex_); subgraph_.attrs.erase("dtype"); subgraph_.attrs.erase("dtype_inputs"); std::vector input_types(*in_attrs); @@ -138,7 +139,6 @@ bool FusedOp::InferType(const nnvm::NodeAttrs &attrs, inferred = inferred && !op::type_is_none(attr); } if (inferred) { -std::lock_guard lock(my_mutex_); intermediate_dtypes_.push_back({*in_attrs, *out_attrs, types}); } return inferred;
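The change widens the critical section: the `lock_guard` is now taken at the top of `InferShape`/`InferType` rather than only around the final `push_back`, because those methods also mutate shared `subgraph_` attributes before the append. A Python sketch of the same locking pattern (hypothetical class, for illustration only):

```python
import threading

class FusedOpLike:
    # The lock covers the whole inference call, not just the final append,
    # so every mutation of shared state happens inside the critical section.
    def __init__(self):
        self._mutex = threading.Lock()
        self.attrs = {}
        self.intermediate_shapes = []

    def infer_shape(self, in_shapes):
        with self._mutex:
            # Shared-state mutations that were previously unprotected.
            self.attrs.pop("shape", None)
            shapes = list(in_shapes)
            # Previously only this append held the lock.
            self.intermediate_shapes.append(shapes)
            return shapes
```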
[incubator-mxnet] branch v1.x updated: Revert PR 17767 for fixing GPU memory usage regression (#18283) (#18309)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a commit to branch v1.x in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git The following commit(s) were added to refs/heads/v1.x by this push: new d621e50 Revert PR 17767 for fixing GPU memory usage regression (#18283) (#18309) d621e50 is described below commit d621e50862a96d259135fcfac0098f7709ee0f00 Author: Ziyi Mu AuthorDate: Fri May 29 14:51:17 2020 -0700 Revert PR 17767 for fixing GPU memory usage regression (#18283) (#18309) * Revert "Fix and optimize handling of vectorized memory accesses (#17767)" This reverts commit 5542d03695b4a2589afb88acf128d4ba8ac94d0d. * add license to reverted file --- 3rdparty/mshadow/mshadow/base.h| 48 +++ 3rdparty/mshadow/mshadow/half2.h | 162 +++ src/common/cuda_vectorization.cuh | 283 -- src/operator/mshadow_op.h | 67 + src/operator/tensor/elemwise_binary_op.cuh | 322 - src/operator/tensor/elemwise_binary_op.h | 206 ++--- src/operator/tensor/elemwise_binary_op_basic.cu| 23 +- src/operator/tensor/elemwise_binary_scalar_op.cuh | 207 - src/operator/tensor/elemwise_binary_scalar_op.h| 75 + .../tensor/elemwise_binary_scalar_op_basic.cu | 9 +- .../tensor/elemwise_binary_scalar_op_extended.cu | 15 +- src/operator/tensor/elemwise_sum.cu| 112 +-- src/operator/tensor/elemwise_sum.h | 12 + src/operator/tensor/elemwise_unary_op.cuh | 127 src/operator/tensor/elemwise_unary_op.h| 56 ++-- src/operator/tensor/elemwise_unary_op_basic.cu | 1 - src/operator/tensor/elemwise_unary_op_pow.cu | 1 - src/operator/tensor/elemwise_unary_op_trig.cu | 1 - tests/python/unittest/test_operator.py | 81 +- 19 files changed, 464 insertions(+), 1344 deletions(-) diff --git a/3rdparty/mshadow/mshadow/base.h b/3rdparty/mshadow/mshadow/base.h index 6469bbc..9f53857 100755 --- a/3rdparty/mshadow/mshadow/base.h +++ b/3rdparty/mshadow/mshadow/base.h @@ -295,6 +295,7 @@ extern "C" { } #include "./half.h" +#include "./half2.h" #include "./bfloat.h" #define 
MSHADOW_HALF_BF_OPERATOR(RTYPE, OP) \ MSHADOW_XINLINE RTYPE operator OP(mshadow::half::half_t a, mshadow::bfloat::bf16_t b) { \ @@ -409,6 +410,11 @@ struct DataType { #endif }; template<> +struct DataType { + static const int kFlag = kFloat16; + static const int kLanes = 2; +}; +template<> struct DataType { static const int kFlag = kBfloat16; static const int kLanes = 1; @@ -1161,6 +1167,48 @@ struct minimum { } #endif +#define MSHADOW_TYPE_SWITCH_WITH_HALF2(type, DType, ...) \ + switch (type) { \ + case mshadow::kFloat32: \ +{ \ + typedef float DType;\ + {__VA_ARGS__} \ +} \ +break;\ + case mshadow::kFloat64: \ +{ \ + typedef double DType; \ + {__VA_ARGS__} \ +} \ +break;\ + case mshadow::kFloat16: \ +{ \ + typedef mshadow::half::half2_t DType; \ + {__VA_ARGS__} \ +} \ +break;\ + case mshadow::kUint8: \ +{ \ + typedef uint8_t DType; \ + {__VA_ARGS__} \ +} \ +break;\ + case mshadow::kInt32: \ +{ \ + typedef int32_t DType; \ + {__VA_ARGS__} \ +} \ +break;\ + case mshadow::kInt64: \ +{
[incubator-mxnet] branch master updated (5343aef -> 4827de8)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 5343aef [Numpy] Fix gluon activations (#18370) add 4827de8 Improve the backward mirroring implementation (#18228) No new revisions were added by this update. Summary of changes: ci/windows/test_py3_cpu.ps1 | 6 + ci/windows/test_py3_gpu.ps1 | 7 + docs/static_site/src/pages/api/faq/env_var.md | 6 +- example/image-classification/README.md| 11 +- python/mxnet/rnn/rnn_cell.py | 5 + src/executor/exec_pass.h | 37 +- src/executor/graph_executor.cc| 128 +++-- src/executor/graph_executor.h | 8 +- src/imperative/cached_op.h| 2 +- src/imperative/imperative.cc | 2 +- src/nnvm/gradient.cc | 709 +- src/nnvm/plan_memory.cc | 15 +- src/operator/nn/activation-inl.h | 9 +- src/operator/nn/activation.cc | 50 +- src/operator/nn/activation.cu | 46 +- src/operator/nn/cudnn/cudnn_batch_norm-inl.h | 16 +- tests/python/unittest/test_memory_opt.py | 202 17 files changed, 1009 insertions(+), 250 deletions(-) create mode 100644 tests/python/unittest/test_memory_opt.py
[incubator-mxnet] branch master updated: Fix races in block scope (#17749)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git The following commit(s) were added to refs/heads/master by this push: new f4d0290 Fix races in block scope (#17749) f4d0290 is described below commit f4d0290fc2cd5763aa5a9c890e4d3dcd4ea6ec6b Author: Haozheng Fan AuthorDate: Thu May 21 07:16:22 2020 +0800 Fix races in block scope (#17749) * Add tests * Fix block_scope Co-authored-by: Haibin Lin Co-authored-by: Lin --- python/mxnet/gluon/block.py| 25 +++-- python/mxnet/name.py | 9 tests/python/unittest/test_thread_local.py | 36 ++ 3 files changed, 54 insertions(+), 16 deletions(-) diff --git a/python/mxnet/gluon/block.py b/python/mxnet/gluon/block.py index 6d9ea9a..ded66a7 100644 --- a/python/mxnet/gluon/block.py +++ b/python/mxnet/gluon/block.py @@ -52,8 +52,9 @@ class _BlockScope(object): def __init__(self, block): self._block = weakref.ref(block) if block is not None else None self._counter = {} -self._old_scope = None -self._name_scope = None +self._local = threading.local() +self._local._old_scope = None +self._local._name_scope = None @staticmethod def create(prefix, params, hint): @@ -96,23 +97,23 @@ class _BlockScope(object): block = self._block() if block is None or block._empty_prefix: return self -self._old_scope = getattr(_BlockScope._current, "value", None) +self._local._old_scope = getattr(_BlockScope._current, "value", None) _BlockScope._current.value = self -self._name_scope = _name.Prefix(block.prefix) -self._name_scope.__enter__() -self._profiler_scope = _profiler.Scope(block._profiler_scope_name) -self._profiler_scope.__enter__() +self._local._name_scope = _name.Prefix(block.prefix) +self._local._name_scope.__enter__() +self._local._profiler_scope = _profiler.Scope(block._profiler_scope_name) +self._local._profiler_scope.__enter__() return self def __exit__(self, ptype, value, trace): block = self._block() if block is None 
or block._empty_prefix: return -self._name_scope.__exit__(ptype, value, trace) -self._name_scope = None -self._profiler_scope.__exit__(ptype, value, trace) -self._profiler_scope = None -_BlockScope._current.value = self._old_scope +self._local._name_scope.__exit__(ptype, value, trace) +self._local._name_scope = None +self._local._profiler_scope.__exit__(ptype, value, trace) +self._local._profiler_scope = None +_BlockScope._current.value = self._local._old_scope def _gather_type_ctx_info(args): diff --git a/python/mxnet/name.py b/python/mxnet/name.py index b276c72..e39752e 100644 --- a/python/mxnet/name.py +++ b/python/mxnet/name.py @@ -30,7 +30,8 @@ class NameManager(with_metaclass(_MXClassPropertyMetaClass, object)): def __init__(self): self._counter = {} -self._old_manager = None +self._local = threading.local() +self._local._old_manager = None def get(self, name, hint): """Get the canonical name for a symbol. @@ -66,13 +67,13 @@ class NameManager(with_metaclass(_MXClassPropertyMetaClass, object)): def __enter__(self): if not hasattr(NameManager._current, "value"): NameManager._current.value = NameManager() -self._old_manager = NameManager._current.value +self._local._old_manager = NameManager._current.value NameManager._current.value = self return self def __exit__(self, ptype, value, trace): -assert self._old_manager -NameManager._current.value = self._old_manager +assert self._local._old_manager +NameManager._current.value = self._local._old_manager #pylint: disable=no-self-argument @classproperty diff --git a/tests/python/unittest/test_thread_local.py b/tests/python/unittest/test_thread_local.py index 5423249..975ad2a 100644 --- a/tests/python/unittest/test_thread_local.py +++ b/tests/python/unittest/test_thread_local.py @@ -222,3 +222,39 @@ def test_np_global_shape(): finally: set_np_shape(0) +def test_blockscope_multithread(): +event = threading.Event() +status = [False] + +class dummy_block(object): +def __init__(self, prefix): +self.prefix = prefix 
+self._profiler_scope_name = prefix +self._empty_prefix = False + +def f(scope): +try: +with scope: +event.wait() +except: +status[0] = True + +def g(scope): +wit
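The fix above moves the saved `_old_scope` / `_name_scope` state onto a `threading.local()`, so two threads entering the same `_BlockScope` object no longer clobber each other's save/restore bookkeeping. A minimal standalone sketch of the pattern (a hypothetical `Scope` class, not MXNet's actual implementation):

```python
import threading

class Scope:
    """Context manager whose save/restore state is kept per-thread."""
    _current = threading.local()  # the active scope, tracked per thread

    def __init__(self):
        # Saved "previous scope" lives in thread-local storage, so concurrent
        # threads entering this same Scope instance cannot overwrite each
        # other's saved state (the race the commit above fixes).
        self._local = threading.local()

    def __enter__(self):
        self._local.old_scope = getattr(Scope._current, "value", None)
        Scope._current.value = self
        return self

    def __exit__(self, ptype, value, trace):
        Scope._current.value = self._local.old_scope

def use_scope(scope, results, idx):
    # Each thread checks that, inside the `with`, the active scope is its own.
    with scope:
        results[idx] = Scope._current.value is scope

shared = Scope()
results = [False, False]
threads = [threading.Thread(target=use_scope, args=(shared, results, i))
           for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)  # both threads observed consistent scope state
```

With instance attributes instead of `self._local`, a second thread's `__enter__` would overwrite the first thread's saved `old_scope`, corrupting the restore on exit.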
[incubator-mxnet] branch master updated (7ab326c -> 7f5df07)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 7ab326c [numpy] add dlpack functions to npx (#18342) add 7f5df07 [BUGFIX] Remove Profiler from the runtime feature list, since its always built (#18308) No new revisions were added by this update. Summary of changes: include/mxnet/libinfo.h | 1 - perl-package/AI-MXNet/lib/AI/MXNet/RunTime.pm | 3 +-- python/mxnet/runtime.py | 2 +- src/libinfo.cc| 1 - 4 files changed, 2 insertions(+), 5 deletions(-)
[incubator-mxnet] branch master updated (37280e4 -> 09224c4)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 37280e4 Fix deferred compute mode for operators using new FFI (#18284) add 09224c4 Add a timeout to the storage profiler in case mem_counters_ is not yet initialized (#18306) No new revisions were added by this update. Summary of changes: src/profiler/storage_profiler.h | 14 ++ 1 file changed, 14 insertions(+)
[incubator-mxnet] branch master updated: add gelu doc (#18274)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git The following commit(s) were added to refs/heads/master by this push: new 8a5886a add gelu doc (#18274) 8a5886a is described below commit 8a5886a6770808db78ae62f3fbfe887c507c47de Author: Haibin Lin AuthorDate: Tue May 12 14:37:40 2020 -0700 add gelu doc (#18274) Co-authored-by: Lin --- src/operator/leaky_relu.cc | 1 + 1 file changed, 1 insertion(+) diff --git a/src/operator/leaky_relu.cc b/src/operator/leaky_relu.cc index d3ed234..681ca44 100644 --- a/src/operator/leaky_relu.cc +++ b/src/operator/leaky_relu.cc @@ -150,6 +150,7 @@ when the input is negative and has a slope of one when input is positive. The following modified ReLU Activation functions are supported: - *elu*: Exponential Linear Unit. `y = x > 0 ? x : slope * (exp(x)-1)` +- *gelu*: Gaussian Error Linear Unit. `y = 0.5 * x * (1 + erf(x / sqrt(2)))` - *selu*: Scaled Exponential Linear Unit. `y = lambda * (x > 0 ? x : alpha * (exp(x) - 1))` where *lambda = 1.0507009873554804934193349852946* and *alpha = 1.6732632423543772848170429916717*. - *leaky*: Leaky ReLU. `y = x > 0 ? x : slope * x`
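The formulas documented above can be checked numerically. A hedged NumPy sketch of the listed variants (erf-based GELU, ELU, SELU, leaky ReLU); the function names and the default `slope` value are illustrative, not MXNet's API:

```python
import numpy as np
from math import erf, sqrt

def gelu(x):
    # y = 0.5 * x * (1 + erf(x / sqrt(2)))
    return 0.5 * x * (1.0 + np.vectorize(erf)(x / sqrt(2.0)))

def elu(x, slope=0.25):
    # y = x > 0 ? x : slope * (exp(x) - 1)
    return np.where(x > 0, x, slope * (np.exp(x) - 1.0))

def selu(x):
    # y = lambda * (x > 0 ? x : alpha * (exp(x) - 1))
    lam = 1.0507009873554804934193349852946
    alpha = 1.6732632423543772848170429916717
    return lam * np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def leaky(x, slope=0.25):
    # y = x > 0 ? x : slope * x
    return np.where(x > 0, x, slope * x)

x = np.array([-2.0, 0.0, 2.0])
print(gelu(x))  # gelu(0) == 0; gelu(x) approaches x for large positive x
```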
[incubator-mxnet] branch master updated: Fix interleave matmul doc (#18260)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git The following commit(s) were added to refs/heads/master by this push: new de51058 Fix interleave matmul doc (#18260) de51058 is described below commit de510582438ad5fad576eba1b85c845b0ba9989c Author: Haibin Lin AuthorDate: Sat May 9 23:06:27 2020 -0700 Fix interleave matmul doc (#18260) * fix doc * fix doc * fix axis Co-authored-by: Lin --- src/operator/contrib/transformer.cc | 22 -- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/src/operator/contrib/transformer.cc b/src/operator/contrib/transformer.cc index 58826a2..1abd2a0 100644 --- a/src/operator/contrib/transformer.cc +++ b/src/operator/contrib/transformer.cc @@ -655,14 +655,16 @@ the input must be a single tensor of interleaved projections of queries, keys and values following the layout: (seq_length, batch_size, num_heads * head_dim * 3) -the equivalent code would be: -tmp = mx.nd.reshape(queries_keys_values, shape=(0, 0, num_heads, 3, -1)) -q_proj = mx.nd.transpose(tmp[:,:,:,0,:], axes=(1, 2, 0, 3)) -q_proj = mx.nd.reshape(q_proj, shape=(-1, 0, 0), reverse=True) -q_proj = mx.nd.contrib.div_sqrt_dim(q_proj) -k_proj = mx.nd.transpose(tmp[:,:,:,1,:], axes=(1, 2, 0, 3)) -k_proj = mx.nd.reshap(k_proj, shape=(-1, 0, 0), reverse=True) -output = mx.nd.batch_dot(q_proj, k_proj, transpose_b=True) +the equivalent code would be:: + + tmp = mx.nd.reshape(queries_keys_values, shape=(0, 0, num_heads, 3, -1)) + q_proj = mx.nd.transpose(tmp[:,:,:,0,:], axes=(1, 2, 0, 3)) + q_proj = mx.nd.reshape(q_proj, shape=(-1, 0, 0), reverse=True) + q_proj = mx.nd.contrib.div_sqrt_dim(q_proj) + k_proj = mx.nd.transpose(tmp[:,:,:,1,:], axes=(1, 2, 0, 3)) + k_proj = mx.nd.reshape(k_proj, shape=(-1, 0, 0), reverse=True) + output = mx.nd.batch_dot(q_proj, k_proj, transpose_b=True) + )code" ADD_FILELINE) .set_num_inputs(1) .set_num_outputs(1) @@ 
-703,9 +705,9 @@ the equivalent code would be: tmp = mx.nd.reshape(queries_keys_values, shape=(0, 0, num_heads, 3, -1)) v_proj = mx.nd.transpose(tmp[:,:,:,2,:], axes=(1, 2, 0, 3)) v_proj = mx.nd.reshape(v_proj, shape=(-1, 0, 0), reverse=True) -output = mx.nd.batch_dot(attention, v_proj, transpose_b=True) +output = mx.nd.batch_dot(attention, v_proj) output = mx.nd.reshape(output, shape=(-1, num_heads, 0, 0), reverse=True) -output = mx.nd.transpose(output, axes=(0, 2, 1, 3)) +output = mx.nd.transpose(output, axes=(2, 0, 1, 3)) output = mx.nd.reshape(output, shape=(0, 0, -1)) )code" ADD_FILELINE) .set_num_inputs(2)
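The corrected docstring above describes de-interleaving a single `(seq_length, batch_size, num_heads * head_dim * 3)` tensor into query/key projections and taking their batched dot product. A NumPy sketch of that shape arithmetic (small made-up dimensions; `proj` is an illustrative helper, not part of the operator):

```python
import numpy as np

seq_len, batch, num_heads, head_dim = 4, 2, 3, 5
rng = np.random.default_rng(0)
qkv = rng.standard_normal((seq_len, batch, num_heads * head_dim * 3))

# Split interleaved projections: (seq, batch, heads, 3, head_dim)
tmp = qkv.reshape(seq_len, batch, num_heads, 3, head_dim)

def proj(i):
    # (seq, batch, heads, head_dim) -> (batch, heads, seq, head_dim)
    #                               -> (batch * heads, seq, head_dim)
    return tmp[:, :, :, i, :].transpose(1, 2, 0, 3).reshape(-1, seq_len, head_dim)

q = proj(0) / np.sqrt(head_dim)  # the div_sqrt_dim step
k = proj(1)

# batch_dot(q, k, transpose_b=True): one score matrix per (batch, head)
scores = np.einsum('bqd,bkd->bqk', q, k)
print(scores.shape)  # (batch * num_heads, seq_len, seq_len)
```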
[incubator-mxnet] branch master updated (fb73a17 -> e796ae9)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from fb73a17 Switch to C++17 and modernize toolchain + CI (#17984) add e796ae9 Integrate Horovod training API as part of MXNet native distributed training API (#17531) No new revisions were added by this update. Summary of changes: ci/docker/runtime_functions.sh | 5 +- .../{cifar10_dist.py => cifar10_kvstore_hvd.py}| 243 - python/mxnet/gluon/trainer.py | 1 + python/mxnet/kvstore/__init__.py | 1 + python/mxnet/kvstore/horovod.py| 161 ++ python/mxnet/kvstore/kvstore.py| 3 + tests/nightly/dist_device_sync_kvstore_horovod.py | 80 +++ tests/nightly/test_distributed_training-gpu.sh | 11 +- tools/launch.py| 63 +++--- 9 files changed, 429 insertions(+), 139 deletions(-) copy example/distributed_training/{cifar10_dist.py => cifar10_kvstore_hvd.py} (52%) create mode 100644 python/mxnet/kvstore/horovod.py create mode 100644 tests/nightly/dist_device_sync_kvstore_horovod.py
[incubator-mxnet] branch master updated (da95add -> 6cc990c)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from da95add Fix vector access out of bound in MKLDNNConvolutionBackward (#17997) add 6cc990c Revert "[MXNET-#16795] Byteps-KVStore: Intergrate Byteps into mxnet as new type of kvstore backend (#17555)" (#17998) No new revisions were added by this update. Summary of changes: ci/docker/runtime_functions.sh | 19 -- ci/jenkins/Jenkins_steps.groovy | 14 -- ci/jenkins/Jenkinsfile_edge | 2 +- ci/jenkins/Jenkinsfile_unix_gpu | 1 - python/mxnet/kvstore/__init__.py | 1 - python/mxnet/kvstore/base.py | 9 +- python/mxnet/kvstore/byteps.py | 255 --- tests/nightly/dist_device_sync_kvstore_byteps.py | 114 -- tools/byteps_launcher.py | 195 - tools/launch.py | 17 +- 10 files changed, 3 insertions(+), 624 deletions(-) delete mode 100644 python/mxnet/kvstore/byteps.py delete mode 100644 tests/nightly/dist_device_sync_kvstore_byteps.py delete mode 100644 tools/byteps_launcher.py
[incubator-mxnet] branch master updated (c244f9f -> 5adcbf8)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from c244f9f [MXNET-#16795] Byteps-KVStore: Intergrate Byteps into mxnet as new type of kvstore backend (#17555) add 5adcbf8 GPU gemms true fp16 (#17466) No new revisions were added by this update. Summary of changes: docs/static_site/src/pages/api/faq/env_var.md | 4 ++ src/operator/contrib/transformer.cu | 30 +-- src/operator/linalg_impl.h| 53 ++- tests/python/gpu/test_gluon_gpu.py| 21 +++ 4 files changed, 95 insertions(+), 13 deletions(-)
[incubator-mxnet] branch master updated (ff234db -> c244f9f)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from ff234db Skip test_gluon_data.py on OSX (#17969) add c244f9f [MXNET-#16795] Byteps-KVStore: Intergrate Byteps into mxnet as new type of kvstore backend (#17555) No new revisions were added by this update. Summary of changes: ci/docker/runtime_functions.sh | 19 ++ ci/jenkins/Jenkins_steps.groovy| 14 ++ ci/jenkins/Jenkinsfile_edge| 2 +- ci/jenkins/Jenkinsfile_unix_gpu| 1 + python/mxnet/kvstore/__init__.py | 1 + python/mxnet/kvstore/base.py | 9 +- python/mxnet/kvstore/byteps.py | 255 + ...ustom.py => dist_device_sync_kvstore_byteps.py} | 52 +++-- tools/byteps_launcher.py | 195 tools/launch.py| 17 +- 10 files changed, 545 insertions(+), 20 deletions(-) create mode 100644 python/mxnet/kvstore/byteps.py copy tests/nightly/{dist_device_sync_kvstore_custom.py => dist_device_sync_kvstore_byteps.py} (58%) create mode 100644 tools/byteps_launcher.py
[incubator-mxnet] branch master updated (84b0ddd -> 03b8146)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 84b0ddd Add USE_DIST_KVSTORE=ON to GPU build (#17911) add 03b8146 Skip test_kvstore_gpu.test_rsp_push_pull (#17983) No new revisions were added by this update. Summary of changes: tests/python/gpu/test_kvstore_gpu.py | 1 + 1 file changed, 1 insertion(+)
[incubator-mxnet] branch master updated (1b107a0 -> 84b0ddd)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 1b107a0 Remove redundant condition in np_matrix_op.cc (#17933) add 84b0ddd Add USE_DIST_KVSTORE=ON to GPU build (#17911) No new revisions were added by this update. Summary of changes: ci/docker/runtime_functions.sh | 12 ++-- .../nightly/test_distributed_training-gpu.sh | 34 ++ 2 files changed, 25 insertions(+), 21 deletions(-) copy scala-package/examples/scripts/neuralstyle_end2end/run_test_end2end.sh => tests/nightly/test_distributed_training-gpu.sh (55%) mode change 100644 => 100755
[incubator-mxnet] branch master updated (66ee118 -> 792011e)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 66ee118 Fix Windows GPU CI (#17962) add 792011e Omit kNullOp req when comparing changed NDArrays in static_shape=True (#17966) No new revisions were added by this update. Summary of changes: src/imperative/cached_op.cc | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
[incubator-mxnet] branch master updated: Use FP32 copy of weights for norm (multitensor LAMB optimizer) (#17700)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git The following commit(s) were added to refs/heads/master by this push: new 8e39518 Use FP32 copy of weights for norm (multitensor LAMB optimizer) (#17700) 8e39518 is described below commit 8e3951876b3598c8b52606a467add5f239d88b38 Author: MoisesHer <50716238+moises...@users.noreply.github.com> AuthorDate: Mon Mar 23 09:55:24 2020 -0700 Use FP32 copy of weights for norm (multitensor LAMB optimizer) (#17700) * Use fp32 copy of weights for computing norm in LAMB optimizer * Fix cpplint --- src/operator/contrib/multi_lamb-inl.h | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/src/operator/contrib/multi_lamb-inl.h b/src/operator/contrib/multi_lamb-inl.h index 7fb186f..256445a 100644 --- a/src/operator/contrib/multi_lamb-inl.h +++ b/src/operator/contrib/multi_lamb-inl.h @@ -282,10 +282,14 @@ inline void MultiLAMB(const nnvm::NodeAttrs& attrs, FillMultiLAMBKernelParam (attrs, ctx, inputs, outputs, &kernel_params); -// create vector of TBlob with all the weights contiguous -std::vector weights; +// create vector of TBlob with all the weights contiguous to compute the norm +// if mixed precision, use fp32 copy +std::vector weights_for_norm; +int position_weights = 0; +if (!std::is_same::value) + position_weights = input_stride - 1; for (size_t index = 0; index < kernel_params.ntensors; ++index) { -weights.emplace_back(inputs[index*input_stride]); + weights_for_norm.emplace_back(inputs[index * input_stride + position_weights]); } // Calculate amount of temporary storage (temp_g, r1, r2, block_to_tensor, block_to_chunk) @@ -327,7 +331,7 @@ inline void MultiLAMB(const nnvm::NodeAttrs& attrs, Tensor block_to_chunk(reinterpret_cast(&workspace[pos_wspace]), Shape1(kernel_params.nchunks), s); -MultiSumSqRun(weights, kernel_params.ntensors, r1.dptr_, ctx); +MultiSumSqRun(weights_for_norm, 
kernel_params.ntensors, r1.dptr_, ctx); CallKernel1(s, kernel_params, param, temp_g.dptr_, block_to_tensor.dptr_, block_to_chunk.dptr_);
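The motivation for the change above is that accumulating a sum of squares directly in fp16 quickly overflows or loses precision, whereas the fp32 master copy of the weights gives an accurate norm. A small NumPy illustration of that numerical hazard (contrived values chosen to force fp16 overflow; not the optimizer's actual kernel):

```python
import numpy as np

# 1000 weights of magnitude 16: sum of squares is 256000, which exceeds the
# fp16 maximum (~65504), so an fp16 accumulator overflows to inf.
w16 = np.full(1000, 16.0, dtype=np.float16)

norm_fp16 = np.sqrt(np.sum(w16 * w16, dtype=np.float16))  # fp16 accumulator
w32 = w16.astype(np.float32)                              # fp32 master copy
norm_fp32 = np.sqrt(np.sum(w32 * w32))                    # finite, accurate

print(norm_fp16, norm_fp32)
```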
[incubator-mxnet] branch master updated (2f358fd -> b133899)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 2f358fd [Numpy] Add op fmax, fmin, fmod (#17567) add b133899 Use multi-tensor sumSQ in clip_global_norm (#17652) No new revisions were added by this update. Summary of changes: python/mxnet/gluon/utils.py| 24 tests/python/gpu/test_gluon_gpu.py | 14 +- 2 files changed, 25 insertions(+), 13 deletions(-)
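Global-norm clipping computes one joint L2 norm over all gradient arrays (the per-tensor sum-of-squares step is what the fused multi-tensor sumSQ kernel accelerates on GPU), then rescales every array when that norm exceeds the threshold. A hedged NumPy sketch of the logic; the function name and in-place contract are illustrative, not `gluon.utils.clip_global_norm`'s exact API:

```python
import numpy as np

def clip_global_norm(arrays, max_norm):
    """Scale all arrays in place so their joint L2 norm is at most max_norm."""
    # One sum of squares per tensor, then combine into the global norm.
    total = float(np.sqrt(sum(float(np.sum(a * a)) for a in arrays)))
    if total > max_norm:
        scale = max_norm / total
        for a in arrays:
            a *= scale
    return total  # the norm before clipping

grads = [np.array([3.0, 0.0]), np.array([0.0, 4.0])]  # global norm = 5
norm = clip_global_norm(grads, 1.0)
print(norm)      # 5.0
print(grads[0])  # scaled by 1/5
```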
[incubator-mxnet] branch master updated (91c4516 -> 9993738)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 91c4516 [numpy] Add np.random.pareto and np.random.power (#17517) add 9993738 Partitioning Gluon HybridBlocks (#15969) No new revisions were added by this update. Summary of changes: python/mxnet/gluon/block.py | 80 +-- tests/python/unittest/test_subgraph_op.py | 789 +- 2 files changed, 508 insertions(+), 361 deletions(-)
[incubator-mxnet] branch master updated (b1e4911 -> 65aab9e)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from b1e4911 Multithreaded Inference Support (#16654) add 65aab9e Add p3 KVStore (#15124) No new revisions were added by this update. Summary of changes: ci/docker/runtime_functions.sh | 1 + docs/static_site/src/pages/api/faq/env_var.md| 5 + include/mxnet/c_api.h| 42 include/mxnet/kvstore.h | 28 +++ python/mxnet/error.py| 1 + python/mxnet/gluon/trainer.py| 2 +- python/mxnet/kvstore/base.py | 4 +- python/mxnet/kvstore/kvstore.py | 33 ++- python/mxnet/model.py| 21 +- src/c_api/c_api.cc | 52 + src/kvstore/kvstore.cc | 9 +- src/kvstore/kvstore_dist.h | 170 --- src/kvstore/kvstore_local.h | 40 src/kvstore/p3store_dist.h | 256 +++ tests/nightly/dist_device_sync_kvstore.py| 12 +- tests/nightly/dist_device_sync_kvstore_custom.py | 2 +- tools/launch.py | 4 + 17 files changed, 571 insertions(+), 111 deletions(-) create mode 100644 src/kvstore/p3store_dist.h
[incubator-mxnet] branch master updated (a1b0ff2 -> 3ef8935)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from a1b0ff2 update mkl to 2020.0 (#17355) add 3ef8935 [LICENSE] fix cpp predcit license (#17377) No new revisions were added by this update. Summary of changes: .../predict-cpp/image-classification-predict.cc| 23 +- .../nightly/apache_rat_license_check/rat-excludes | 3 ++- tools/license_header.py| 3 +++ 3 files changed, 14 insertions(+), 15 deletions(-)
[incubator-mxnet] 01/02: [BUGFIX] fix model zoo parallel download (#17372)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a commit to branch v1.6.x in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git commit 69f4f3161364f290be053ecbd48931a40bd7ab68 Author: Haibin Lin AuthorDate: Thu Jan 23 19:32:21 2020 -0800 [BUGFIX] fix model zoo parallel download (#17372) * use temp file * fix dependency * Update model_store.py * Update test_gluon_model_zoo.py * remove NamedTempFile --- python/mxnet/gluon/model_zoo/model_store.py | 22 +--- python/mxnet/gluon/utils.py | 30 +++ tests/python/unittest/test_gluon_model_zoo.py | 16 ++ 3 files changed, 52 insertions(+), 16 deletions(-) diff --git a/python/mxnet/gluon/model_zoo/model_store.py b/python/mxnet/gluon/model_zoo/model_store.py index 11ac47b..6da7dd1 100644 --- a/python/mxnet/gluon/model_zoo/model_store.py +++ b/python/mxnet/gluon/model_zoo/model_store.py @@ -22,8 +22,11 @@ __all__ = ['get_model_file', 'purge'] import os import zipfile import logging +import tempfile +import uuid +import shutil -from ..utils import download, check_sha1 +from ..utils import download, check_sha1, replace_file from ... 
import base, util _model_sha1 = {name: checksum for checksum, name in [ @@ -103,16 +106,21 @@ def get_model_file(name, root=os.path.join(base.data_dir(), 'models')): util.makedirs(root) -zip_file_path = os.path.join(root, file_name+'.zip') repo_url = os.environ.get('MXNET_GLUON_REPO', apache_repo_url) if repo_url[-1] != '/': repo_url = repo_url + '/' + +random_uuid = str(uuid.uuid4()) +temp_zip_file_path = os.path.join(root, file_name+'.zip'+random_uuid) download(_url_format.format(repo_url=repo_url, file_name=file_name), - path=zip_file_path, - overwrite=True) -with zipfile.ZipFile(zip_file_path) as zf: -zf.extractall(root) -os.remove(zip_file_path) + path=temp_zip_file_path, overwrite=True) +with zipfile.ZipFile(temp_zip_file_path) as zf: +temp_dir = tempfile.mkdtemp(dir=root) +zf.extractall(temp_dir) +temp_file_path = os.path.join(temp_dir, file_name+'.params') +replace_file(temp_file_path, file_path) +shutil.rmtree(temp_dir) +os.remove(temp_zip_file_path) if check_sha1(file_path, sha1_hash): return file_path diff --git a/python/mxnet/gluon/utils.py b/python/mxnet/gluon/utils.py index 81a8dba..63e11ea 100644 --- a/python/mxnet/gluon/utils.py +++ b/python/mxnet/gluon/utils.py @@ -21,7 +21,7 @@ from __future__ import absolute_import __all__ = ['split_data', 'split_and_load', 'clip_global_norm', - 'check_sha1', 'download'] + 'check_sha1', 'download', 'replace_file'] import os import sys @@ -35,7 +35,7 @@ import requests import numpy as np from .. import ndarray -from ..util import is_np_shape, is_np_array +from ..util import is_np_shape, is_np_array, makedirs from .. import numpy as _mx_np # pylint: disable=reimported @@ -209,8 +209,14 @@ def check_sha1(filename, sha1_hash): if not sys.platform.startswith('win32'): # refer to https://github.com/untitaker/python-atomicwrites -def _replace_atomic(src, dst): -"""Implement atomic os.replace with linux and OSX. Internal use only""" +def replace_file(src, dst): +"""Implement atomic os.replace with linux and OSX. 
+ +Parameters +-- +src : source file path +dst : destination file path +""" try: os.rename(src, dst) except OSError: @@ -252,11 +258,17 @@ else: finally: raise OSError(msg) -def _replace_atomic(src, dst): +def replace_file(src, dst): """Implement atomic os.replace with windows. + refer to https://docs.microsoft.com/en-us/windows/desktop/api/winbase/nf-winbase-movefileexw The function fails when one of the process(copy, flush, delete) fails. -Internal use only""" + +Parameters +-- +src : source file path +dst : destination file path +""" _handle_errors(ctypes.windll.kernel32.MoveFileExW( _str_to_unicode(src), _str_to_unicode(dst), _windows_default_flags | _MOVEFILE_REPLACE_EXISTING @@ -264,7 +276,7 @@ else: def download(url, path=None, overwrite=False, sha1_hash=None, retries=5, verify_ssl=True): -"""Download an given URL +"""Download a given URL Parameters -- @@ -310,7 +322,7 @@ def download(url, pa
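The diff above fixes parallel model-zoo downloads by writing to a uniquely named temp file and then atomically replacing the destination. The pattern generalizes; a self-contained sketch (function and file names are illustrative, and `os.replace` stands in for the platform-specific `replace_file` helper in the diff):

```python
import os
import tempfile
import uuid

# Sketch of the download-then-atomic-replace pattern the fix adopts, so
# concurrent processes fetching the same file never observe a partial write.
def fetch_atomically(write_payload, root, final_name):
    os.makedirs(root, exist_ok=True)
    # 1. Write into a uniquely named temp file so parallel callers don't collide.
    temp_path = os.path.join(root, final_name + '.' + str(uuid.uuid4()))
    write_payload(temp_path)
    # 2. os.replace is atomic on POSIX and Windows: readers see either the
    #    old file or the complete new one, never a half-written file.
    final_path = os.path.join(root, final_name)
    os.replace(temp_path, final_path)
    return final_path

def write_payload(path):
    with open(path, 'w') as f:
        f.write('params')

path = fetch_atomically(write_payload, tempfile.mkdtemp(), 'model.params')
```

The key property is that the shared destination name is only ever touched by an atomic rename, so two processes racing on the same model file both end up with a valid copy.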
[incubator-mxnet] 02/02: Update symbol.py (#17408)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a commit to branch v1.6.x in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git commit 2c61787c78a51ea87d46c4820746a33b70fca64c Author: Haibin Lin AuthorDate: Thu Jan 23 15:36:47 2020 -0800 Update symbol.py (#17408) --- python/mxnet/contrib/amp/lists/symbol.py | 4 1 file changed, 4 insertions(+) diff --git a/python/mxnet/contrib/amp/lists/symbol.py b/python/mxnet/contrib/amp/lists/symbol.py index 2146853..d501a7d 100644 --- a/python/mxnet/contrib/amp/lists/symbol.py +++ b/python/mxnet/contrib/amp/lists/symbol.py @@ -591,6 +591,10 @@ WIDEST_TYPE_CASTS = [ '_contrib_dgl_graph_compact', '_contrib_dgl_subgraph', '_contrib_edge_id', +'_contrib_interleaved_matmul_encdec_qk', +'_contrib_interleaved_matmul_encdec_valatt', +'_contrib_interleaved_matmul_selfatt_qk', +'_contrib_interleaved_matmul_selfatt_valatt', 'where', '_sparse_where', '_sparse_broadcast_add',
[incubator-mxnet] branch v1.6.x updated (1cb738a -> 2c61787)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch v1.6.x in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 1cb738a Update ps-lite LICENSE (#17351) (#17370) new 69f4f31 [BUGFIX] fix model zoo parallel download (#17372) new 2c61787 Update symbol.py (#17408) The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: python/mxnet/contrib/amp/lists/symbol.py | 4 python/mxnet/gluon/model_zoo/model_store.py | 22 +--- python/mxnet/gluon/utils.py | 30 +++ tests/python/unittest/test_gluon_model_zoo.py | 16 ++ 4 files changed, 56 insertions(+), 16 deletions(-)
[incubator-mxnet] branch master updated (e1435a3 -> e1779f4)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from e1435a3 [NumPy] Add NumPy support for norm (#17014) add e1779f4 [BUILD] pslite fix link zmq (#17427) No new revisions were added by this update. Summary of changes: CMakeLists.txt | 8 ++-- 1 file changed, 2 insertions(+), 6 deletions(-)
[incubator-mxnet] branch master updated (5e64e96 -> 6bab3c4)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 5e64e96 adding docs for 64bit C APIs of large tensor (#17309) add 6bab3c4 Update ps-lite LICENSE (#17351) No new revisions were added by this update. Summary of changes: LICENSE | 1 + 1 file changed, 1 insertion(+)
[incubator-mxnet] branch master updated (7b349dd -> 6b9a1da)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 7b349dd grouping large array tests based on type and updating nightly CI function (#17305) add 6b9a1da Multi-tensor LAMB (#16893) No new revisions were added by this update. Summary of changes: python/mxnet/ndarray/contrib.py | 76 +++ python/mxnet/optimizer/optimizer.py | 121 --- python/mxnet/test_utils.py | 77 --- src/operator/contrib/multi_lamb-inl.h | 359 src/operator/contrib/multi_lamb.cc | 251 ++ src/operator/contrib/multi_lamb.cu | 261 +++ src/operator/contrib/multi_sum_sq-inl.h | 4 + src/operator/contrib/multi_sum_sq.cc| 6 + src/operator/contrib/multi_sum_sq.cu| 29 ++- tests/python/unittest/test_optimizer.py | 46 +++- 10 files changed, 1157 insertions(+), 73 deletions(-) mode change 100644 => 100755 python/mxnet/optimizer/optimizer.py mode change 100644 => 100755 python/mxnet/test_utils.py create mode 100644 src/operator/contrib/multi_lamb-inl.h create mode 100644 src/operator/contrib/multi_lamb.cc create mode 100644 src/operator/contrib/multi_lamb.cu mode change 100644 => 100755 tests/python/unittest/test_optimizer.py
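The multi-tensor LAMB commit above fuses the per-tensor update into batched kernels. A compact sketch of the per-tensor step those kernels implement (hyperparameter names follow the LAMB paper; this is an illustration, not the MXNet operator's exact signature):

```python
import numpy as np

# One LAMB step for a single tensor: Adam-style moments, then a layer-wise
# trust ratio ||w|| / ||update|| scaling the learning rate.
def lamb_step(w, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-6, wd=0.01):
    # First and second moment estimates with bias correction.
    m[:] = beta1 * m + (1 - beta1) * g
    v[:] = beta2 * v + (1 - beta2) * g * g
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    update = m_hat / (np.sqrt(v_hat) + eps) + wd * w
    # Layer-wise trust ratio keeps large-batch steps proportionate per layer.
    w_norm = float(np.linalg.norm(w))
    u_norm = float(np.linalg.norm(update))
    trust = w_norm / u_norm if w_norm > 0 and u_norm > 0 else 1.0
    w -= lr * trust * update
    return w
```

The fused operator applies this step to every tensor in one launch, which is why it needs the multi-tensor sum-of-squares machinery touched in the same diff.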
[incubator-mxnet] branch master updated (058de55 -> 3971938)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 058de55 Fix image display in python autograd tutorial (#17243) add 3971938 Fix #17267, add expected and got datatype for concat error msgs (#17271) No new revisions were added by this update. Summary of changes: src/operator/nn/concat.cc | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-)
[incubator-mxnet] branch master updated: fix typo (#17277)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git The following commit(s) were added to refs/heads/master by this push: new 2002d60 fix typo (#17277) 2002d60 is described below commit 2002d6065d5668f9de4d4050c8cf750dd6cc5ca8 Author: Chaitanya Prakash Bapat AuthorDate: Sat Jan 11 21:25:07 2020 -0800 fix typo (#17277) --- cpp-package/include/mxnet-cpp/ndarray.h | 8 include/mxnet/c_api.h | 4 ++-- include/mxnet/ndarray.h | 4 ++-- perl-package/AI-MXNetCAPI/mxnet.i | 4 ++-- src/ndarray/ndarray_function.cu | 2 +- 5 files changed, 11 insertions(+), 11 deletions(-) diff --git a/cpp-package/include/mxnet-cpp/ndarray.h b/cpp-package/include/mxnet-cpp/ndarray.h index c4d51c5..0a9a412 100644 --- a/cpp-package/include/mxnet-cpp/ndarray.h +++ b/cpp-package/include/mxnet-cpp/ndarray.h @@ -251,7 +251,7 @@ class NDArray { NDArray &operator%=(const NDArray &src); NDArray ArgmaxChannel(); /*! - * \brief Do a synchronize copy from a continugous CPU memory region. + * \brief Do a synchronize copy from a contiguous CPU memory region. * * This function will call WaitToWrite before the copy is performed. * This is useful to copy data from existing memory region that are @@ -262,7 +262,7 @@ class NDArray { */ void SyncCopyFromCPU(const mx_float *data, size_t size); /*! - * \brief Do a synchronize copy from a continugous CPU memory region. + * \brief Do a synchronize copy from a contiguous CPU memory region. * * This function will call WaitToWrite before the copy is performed. * This is useful to copy data from existing memory region that are @@ -272,7 +272,7 @@ class NDArray { */ void SyncCopyFromCPU(const std::vector &data); /*! - * \brief Do a synchronize copy to a continugous CPU memory region. + * \brief Do a synchronize copy to a contiguous CPU memory region. * * This function will call WaitToRead before the copy is performed. 
* This is useful to copy data from existing memory region that are @@ -283,7 +283,7 @@ class NDArray { */ void SyncCopyToCPU(mx_float *data, size_t size = 0); /*! - * \brief Do a synchronize copy to a continugous CPU memory region. + * \brief Do a synchronize copy to a contiguous CPU memory region. * * This function will call WaitToRead before the copy is performed. * This is useful to copy data from existing memory region that are diff --git a/include/mxnet/c_api.h b/include/mxnet/c_api.h index 27b420f..33d79bd 100644 --- a/include/mxnet/c_api.h +++ b/include/mxnet/c_api.h @@ -723,7 +723,7 @@ MXNET_DLL int MXNDArrayLoadFromBuffer(const void *ndarray_buffer, const char*** out_names); /*! - * \brief Perform a synchronize copy from a continugous CPU memory region. + * \brief Perform a synchronize copy from a contiguous CPU memory region. * * This function will call WaitToWrite before the copy is performed. * This is useful to copy data from existing memory region that are @@ -737,7 +737,7 @@ MXNET_DLL int MXNDArraySyncCopyFromCPU(NDArrayHandle handle, const void *data, size_t size); /*! - * \brief Perform a synchronize copyto a continugous CPU memory region. + * \brief Perform a synchronize copyto a contiguous CPU memory region. * * This function will call WaitToRead before the copy is performed. * This is useful to copy data from existing memory region that are diff --git a/include/mxnet/ndarray.h b/include/mxnet/ndarray.h index 1b0b119..3e780a1 100644 --- a/include/mxnet/ndarray.h +++ b/include/mxnet/ndarray.h @@ -483,7 +483,7 @@ class NDArray { */ NDArray Copy(Context ctx) const; /*! - * \brief Do a synchronize copy from a continugous CPU memory region. + * \brief Do a synchronize copy from a contiguous CPU memory region. * * This function will call WaitToWrite before the copy is performed. * This is useful to copy data from existing memory region that are @@ -500,7 +500,7 @@ class NDArray { void SyncCopyFromNDArray(const NDArray &src, int i = -1, int j = -1); /*! 
- * \brief Do a synchronize copy to a continugous CPU memory region. + * \brief Do a synchronize copy to a contiguous CPU memory region. * * This function will call WaitToRead before the copy is performed. * This is useful to copy data from existing memory region that are diff --git a/perl-package/AI-MXNetCAPI/mxnet.i b/perl-package/AI-MXNetCAPI/mxnet.i index e38402c..f35f620 100644 --- a/perl-package/AI-MXNetCAPI/mxnet.i +++ b/perl-package/AI-MXNetCAPI/mxnet.i @@ -510,7 +510,7 @@ int MXNDArrayLoadFromBuffer(const void *in, const char*** out_array); /*! - * \brief Perform a synchronize copy from a continugous C
[incubator-mxnet] branch master updated (f88b1ed -> c3b0baa)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from f88b1ed Add contributors (#17268) add c3b0baa fix lstm layer with projection save params (#17266) No new revisions were added by this update. Summary of changes: python/mxnet/gluon/rnn/rnn_layer.py | 2 +- tests/python/gpu/test_gluon_gpu.py | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-)
[incubator-mxnet] branch master updated (6ba9aad -> ac88f1e)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 6ba9aad Enabling large tensor support for binary broadcast operators (#16755) add ac88f1e [DOC] Add a few tips for running horovod (#17235) No new revisions were added by this update. Summary of changes: docs/static_site/src/pages/api/faq/perf.md | 23 +-- example/distributed_training-horovod/README.md | 8 2 files changed, 21 insertions(+), 10 deletions(-)
[incubator-mxnet] branch master updated (8e946c9 -> 55e222b)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 8e946c9 Implement atleast_1d/2d/3d (#17099) add 55e222b Interleaved MHA for CPU path (#17138) No new revisions were added by this update. Summary of changes: src/operator/contrib/transformer.cc| 549 - tests/python/gpu/test_operator_gpu.py | 317 --- tests/python/unittest/test_operator.py | 324 +++ 3 files changed, 861 insertions(+), 329 deletions(-)
[incubator-mxnet] branch v1.6.x updated: fix norm sparse fallback (#17149)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a commit to branch v1.6.x in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git The following commit(s) were added to refs/heads/v1.6.x by this push: new dafbb11 fix norm sparse fallback (#17149) dafbb11 is described below commit dafbb1107a4dd34950b5f5a4d513ddf51f7c07a8 Author: Hao Jin AuthorDate: Thu Dec 26 07:07:30 2019 +0800 fix norm sparse fallback (#17149) --- src/operator/tensor/broadcast_reduce_norm_value.cc | 2 +- src/operator/tensor/broadcast_reduce_norm_value.cu | 2 +- src/operator/tensor/broadcast_reduce_op.h | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/src/operator/tensor/broadcast_reduce_norm_value.cc b/src/operator/tensor/broadcast_reduce_norm_value.cc index 4cd92d4..9acc157 100644 --- a/src/operator/tensor/broadcast_reduce_norm_value.cc +++ b/src/operator/tensor/broadcast_reduce_norm_value.cc @@ -40,7 +40,7 @@ void L2NormComputeEx(const nnvm::NodeAttrs& attrs, const NormParam& param = nnvm::get(attrs.parsed); mshadow::Stream* s = ctx.get_stream(); const NDArrayStorageType istype = inputs[0].storage_type(); - const mxnet::TShape axis = param.axis.has_value() ? param.axis.value() : mxnet::TShape(); + const mxnet::TShape axis = param.axis.has_value() ? 
param.axis.value() : mxnet::TShape(0, -1); if ((istype == kRowSparseStorage || istype == kCSRStorage) && axis.ndim() == 0 && param.ord == 2) { // l2 norm on the entire array diff --git a/src/operator/tensor/broadcast_reduce_norm_value.cu b/src/operator/tensor/broadcast_reduce_norm_value.cu index 188c93e..735c3d7 100644 --- a/src/operator/tensor/broadcast_reduce_norm_value.cu +++ b/src/operator/tensor/broadcast_reduce_norm_value.cu @@ -39,7 +39,7 @@ void L2NormComputeEx(const nnvm::NodeAttrs& attrs, const NormParam& param = nnvm::get(attrs.parsed); mshadow::Stream* s = ctx.get_stream(); const NDArrayStorageType istype = inputs[0].storage_type(); - const mxnet::TShape axis = param.axis.has_value() ? param.axis.value() : mxnet::TShape(); + const mxnet::TShape axis = param.axis.has_value() ? param.axis.value() : mxnet::TShape(0, -1); if ((istype == kRowSparseStorage || istype == kCSRStorage) && axis.ndim() == 0 && param.ord == 2) { // l2 norm on the entire array diff --git a/src/operator/tensor/broadcast_reduce_op.h b/src/operator/tensor/broadcast_reduce_op.h index 27e2249..799f865 100644 --- a/src/operator/tensor/broadcast_reduce_op.h +++ b/src/operator/tensor/broadcast_reduce_op.h @@ -1152,7 +1152,7 @@ inline bool LpNormStorageType(const nnvm::NodeAttrs& attrs, DispatchMode::kFCompute); } if (param.ord == 2) { -const mxnet::TShape axis = param.axis.has_value() ? param.axis.value() : mxnet::TShape(); +const mxnet::TShape axis = param.axis.has_value() ? param.axis.value() : mxnet::TShape(0, -1); if (!dispatched && (in_stype == kRowSparseStorage || in_stype == kCSRStorage) && axis.ndim() == 0 && param.ord == 2) { // l2 norm: rsp/csr, axis = () -> dns
[incubator-mxnet] branch master updated: fix norm sparse fallback (#17149)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git The following commit(s) were added to refs/heads/master by this push: new 2551a9d fix norm sparse fallback (#17149) 2551a9d is described below commit 2551a9d8c8a4f5fd73c98e56ff79ab5410053d0e Author: Hao Jin AuthorDate: Thu Dec 26 07:07:30 2019 +0800 fix norm sparse fallback (#17149) --- src/operator/tensor/broadcast_reduce_norm_value.cc | 2 +- src/operator/tensor/broadcast_reduce_norm_value.cu | 2 +- src/operator/tensor/broadcast_reduce_op.h | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/src/operator/tensor/broadcast_reduce_norm_value.cc b/src/operator/tensor/broadcast_reduce_norm_value.cc index 4cd92d4..9acc157 100644 --- a/src/operator/tensor/broadcast_reduce_norm_value.cc +++ b/src/operator/tensor/broadcast_reduce_norm_value.cc @@ -40,7 +40,7 @@ void L2NormComputeEx(const nnvm::NodeAttrs& attrs, const NormParam& param = nnvm::get(attrs.parsed); mshadow::Stream* s = ctx.get_stream(); const NDArrayStorageType istype = inputs[0].storage_type(); - const mxnet::TShape axis = param.axis.has_value() ? param.axis.value() : mxnet::TShape(); + const mxnet::TShape axis = param.axis.has_value() ? 
param.axis.value() : mxnet::TShape(0, -1); if ((istype == kRowSparseStorage || istype == kCSRStorage) && axis.ndim() == 0 && param.ord == 2) { // l2 norm on the entire array diff --git a/src/operator/tensor/broadcast_reduce_norm_value.cu b/src/operator/tensor/broadcast_reduce_norm_value.cu index 188c93e..735c3d7 100644 --- a/src/operator/tensor/broadcast_reduce_norm_value.cu +++ b/src/operator/tensor/broadcast_reduce_norm_value.cu @@ -39,7 +39,7 @@ void L2NormComputeEx(const nnvm::NodeAttrs& attrs, const NormParam& param = nnvm::get(attrs.parsed); mshadow::Stream* s = ctx.get_stream(); const NDArrayStorageType istype = inputs[0].storage_type(); - const mxnet::TShape axis = param.axis.has_value() ? param.axis.value() : mxnet::TShape(); + const mxnet::TShape axis = param.axis.has_value() ? param.axis.value() : mxnet::TShape(0, -1); if ((istype == kRowSparseStorage || istype == kCSRStorage) && axis.ndim() == 0 && param.ord == 2) { // l2 norm on the entire array diff --git a/src/operator/tensor/broadcast_reduce_op.h b/src/operator/tensor/broadcast_reduce_op.h index 27e2249..799f865 100644 --- a/src/operator/tensor/broadcast_reduce_op.h +++ b/src/operator/tensor/broadcast_reduce_op.h @@ -1152,7 +1152,7 @@ inline bool LpNormStorageType(const nnvm::NodeAttrs& attrs, DispatchMode::kFCompute); } if (param.ord == 2) { -const mxnet::TShape axis = param.axis.has_value() ? param.axis.value() : mxnet::TShape(); +const mxnet::TShape axis = param.axis.has_value() ? param.axis.value() : mxnet::TShape(0, -1); if (!dispatched && (in_stype == kRowSparseStorage || in_stype == kCSRStorage) && axis.ndim() == 0 && param.ord == 2) { // l2 norm: rsp/csr, axis = () -> dns
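The fix above changes the default axis from `mxnet::TShape()` to `mxnet::TShape(0, -1)`, so that when no axis is specified the shape has `ndim() == 0` and the sparse (rsp/csr) full-array L2-norm path is dispatched correctly. As an illustrative sketch (not the MXNet implementation), the quantity that full-array path computes is simply the L2 norm over every element, which agrees between dense and sparse storage because zero entries contribute nothing:

```python
import numpy as np

def full_array_l2_norm(data):
    # Mirrors the "l2 norm on the entire array" branch guarded by
    # axis.ndim() == 0 && param.ord == 2: sqrt of the sum of squares
    # over all elements, with no axis-wise reduction.
    return np.sqrt(np.sum(np.square(data)))

# For a row-sparse array only the stored rows contribute; zeros add
# nothing, so the dense and sparse code paths must agree on this value.
dense = np.array([[3.0, 0.0],
                  [0.0, 4.0]])
print(full_array_l2_norm(dense))  # 5.0
```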
[incubator-mxnet] branch master updated (814be59 -> f86a8d1)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 814be59 Add #include needed for waitpid (#17078) add f86a8d1 [API] unified API for custom kvstores (#17010) No new revisions were added by this update. Summary of changes: ci/docker/runtime_functions.sh | 1 + python/mxnet/__init__.py | 24 +- python/mxnet/gluon/trainer.py | 50 ++- .../{contrib/amp/lists => kvstore}/__init__.py | 9 +- python/mxnet/kvstore/base.py | 455 + python/mxnet/{ => kvstore}/kvstore.py | 183 - python/mxnet/{ => kvstore}/kvstore_server.py | 6 +- python/mxnet/model.py | 35 +- src/kvstore/kvstore_local.h| 4 +- tests/nightly/dist_device_sync_kvstore_custom.py | 96 + tests/python/unittest/test_gluon_trainer.py| 33 +- tests/python/unittest/test_kvstore_custom.py | 195 + 12 files changed, 937 insertions(+), 154 deletions(-) copy python/mxnet/{contrib/amp/lists => kvstore}/__init__.py (84%) create mode 100644 python/mxnet/kvstore/base.py rename python/mxnet/{ => kvstore}/kvstore.py (85%) rename python/mxnet/{ => kvstore}/kvstore_server.py (97%) create mode 100644 tests/nightly/dist_device_sync_kvstore_custom.py create mode 100644 tests/python/unittest/test_kvstore_custom.py
[incubator-mxnet] branch master updated (696c547 -> bbdc1c3)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 696c547 [BUGFIX] Fix trainer param order (#17068) add bbdc1c3 [reproducibility] multi_sum_sq review, AtomicAdd removal (#17002) No new revisions were added by this update. Summary of changes: src/operator/contrib/multi_sum_sq-inl.h | 10 ++- src/operator/contrib/multi_sum_sq.cc| 20 +++--- src/operator/contrib/multi_sum_sq.cu| 110 tests/python/gpu/test_operator_gpu.py | 30 + 4 files changed, 118 insertions(+), 52 deletions(-)
[incubator-mxnet] branch master updated (042682e -> 696c547)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 042682e [DOC] Fix tutorial link, and better error msg (#17057) add 696c547 [BUGFIX] Fix trainer param order (#17068) No new revisions were added by this update. Summary of changes: python/mxnet/gluon/trainer.py | 5 - tests/python/unittest/test_gluon_trainer.py | 16 2 files changed, 20 insertions(+), 1 deletion(-)
[incubator-mxnet] branch master updated (f045018 -> 042682e)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from f045018 [MXNET-978] Higher Order Gradient Support `logp1`, `expm1`, `square`. (#15416) add 042682e [DOC] Fix tutorial link, and better error msg (#17057) No new revisions were added by this update. Summary of changes: python/mxnet/gluon/parameter.py | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-)
[incubator-mxnet] branch master updated (f701f3f -> 61013a8)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from f701f3f [MXNET-1431] Multiple channel support in Gluon PReLU (#16262) add 61013a8 use env var to control stack trace logging (#17038) No new revisions were added by this update. Summary of changes: .github/ISSUE_TEMPLATE/bug_report.md | 2 +- 3rdparty/dmlc-core| 2 +- CMakeLists.txt| 3 ++- Makefile | 2 ++ ci/docker/runtime_functions.sh| 39 +++ docs/static_site/src/pages/api/faq/env_var.md | 6 + 6 files changed, 51 insertions(+), 3 deletions(-)
[incubator-mxnet] branch v1.6.x updated: [BUGFIX] Fix race condition in kvstore.pushpull (#17007) (#17052)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a commit to branch v1.6.x in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git The following commit(s) were added to refs/heads/v1.6.x by this push: new c675520 [BUGFIX] Fix race condition in kvstore.pushpull (#17007) (#17052) c675520 is described below commit c6755208f4f78d9f4ea095ec2ed8e067c8db1ef1 Author: Przemyslaw Tredak AuthorDate: Wed Dec 11 21:11:57 2019 -0800 [BUGFIX] Fix race condition in kvstore.pushpull (#17007) (#17052) * add back gluon test * fix typo * change back gpu ctx * also handle the case there some are pull and some are pushpull * fix typo --- src/kvstore/kvstore_dist_server.h | 35 +-- tests/nightly/dist_device_sync_kvstore.py | 35 +-- 2 files changed, 43 insertions(+), 27 deletions(-) diff --git a/src/kvstore/kvstore_dist_server.h b/src/kvstore/kvstore_dist_server.h index 65ded79..1dc222c 100644 --- a/src/kvstore/kvstore_dist_server.h +++ b/src/kvstore/kvstore_dist_server.h @@ -364,21 +364,34 @@ class KVStoreDistServer { if (log_verbose_) { LOG(INFO) << "sent response to " << update_buf->request.size() << " workers"; } + /** + * Request can be for either push, pull or pushpull + * If pull flag is set, respond immediately with the updated values + * Otherwise, only send the notification + */ + bool has_pull = false; for (const auto& req : update_buf->request) { -/** - * Request can be for either push, pull or pushpull - * If pull flag is set, respond immediately with the updated values - * Otherwise, only send the notification - */ -if (req.pull) { - DefaultStorageResponse(type, key, req, req_data, server); -} else { +has_pull = has_pull || req.pull; + } + if (has_pull) { +// if there is a pull request, perform WaitToRead() once before DefaultStorageResponse +if (has_multi_precision_copy(type)) CopyFromTo(stored, store_[key]); +stored.WaitToRead(); +for (const auto& req : update_buf->request) { + if (req.pull) { +DefaultStorageResponse(type, key, 
req, req_data, server); + } +} +update_buf->request.clear(); + } else { +// otherwise, send response directly +for (const auto& req : update_buf->request) { server->Response(req); } +update_buf->request.clear(); +if (has_multi_precision_copy(type)) CopyFromTo(stored, store_[key]); +stored.WaitToRead(); } - update_buf->request.clear(); - if (has_multi_precision_copy(type)) CopyFromTo(stored, store_[key]); - stored.WaitToRead(); } else { update_buf->merged.WaitToRead(); } diff --git a/tests/nightly/dist_device_sync_kvstore.py b/tests/nightly/dist_device_sync_kvstore.py index dc2c7bc..f3fe737 100644 --- a/tests/nightly/dist_device_sync_kvstore.py +++ b/tests/nightly/dist_device_sync_kvstore.py @@ -44,7 +44,10 @@ kv = mx.kv.create('dist_device_sync') def init_kv(): # init kv dns keys kv.init(keys, [mx.nd.ones(shape)] * len(keys)) +kv.init('9', mx.nd.ones(shape)) +kv.init('10', mx.nd.ones(shape)) kv.init('99', mx.nd.ones(big_shape)) +kv.init('100', mx.nd.ones(big_shape)) # worker info my_rank = kv.rank nworker = kv.num_workers @@ -55,33 +58,30 @@ def init_kv(): def test_sync_push_pull(): kv, my_rank, nworker = init_kv() num_gpus = 2 -def check_default_keys(kv, my_rank, nworker, nrepeat=3, offset=0, use_pushpull=False): +def check_default_keys(kv, my_rank, nworker, nrepeat=3): # checks pull after push in loop, because behavior during # consecutive pushes doesn't offer any guarantees -for i in range(offset, nrepeat): +for i in range(nrepeat): scale = my_rank + 1 num = (nworker + 1) * nworker * rate * num_gpus / 2 * (i + 1) + 1 arr = [mx.nd.ones(shape, ctx=mx.gpu(j)) * scale for j in range(num_gpus)] val = mx.nd.zeros(shape) -if use_pushpull: -kv.pushpull('3', arr, out=val) -else: -kv.push('3', arr) -kv.pull('3', out=val) +kv.push('9', arr) +kv.pull('9', out=val) +check_diff_to_scalar(val, num) +kv.pushpull('10', arr, out=val) check_diff_to_scalar(val, num) big_arr = [mx.nd.ones(big_shape, ctx=mx.gpu(j)) * scale for j in range(num_gpus)] big_val = mx.nd.zeros(big_shape) 
-if use_pu
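The race-condition fix in the diff above reorders the server's response path: when any queued request has the `pull` flag set, the server now performs the multi-precision copy and `WaitToRead()` once, before answering any pull request, rather than responding inside the loop where a reader could observe a value that was not yet fully written. A simplified Python model of the two branches (the callback names are illustrative stand-ins, not the actual MXNet API):

```python
def respond_to_requests(requests, wait_to_read, send_value, send_ack):
    """Simplified model of KVStoreDistServer's fixed response path.

    requests: list of dicts with a boolean 'pull' flag.
    wait_to_read / send_value / send_ack: stand-ins for
    stored.WaitToRead(), DefaultStorageResponse(), server->Response().
    """
    if any(r["pull"] for r in requests):
        # A pull is present: synchronize *before* any response that
        # carries data goes out to a worker.
        wait_to_read()
        for r in requests:
            if r["pull"]:
                send_value(r)
    else:
        # Push-only: acknowledge immediately, then synchronize.
        for r in requests:
            send_ack(r)
        wait_to_read()
    requests.clear()

# Record call order to show WaitToRead now precedes the data response.
log = []
respond_to_requests(
    [{"pull": True}, {"pull": False}],
    wait_to_read=lambda: log.append("wait"),
    send_value=lambda r: log.append("value"),
    send_ack=lambda r: log.append("ack"),
)
print(log)  # ['wait', 'value']
```

The push-only branch keeps the old ordering (acknowledge, then wait), which is safe because no data is returned to the worker on a plain push.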
[incubator-mxnet] branch master updated (04ebe45 -> 05af5c4)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 04ebe45 Prevent after-fork number of OMP threads being bigger than 1. (#16999) add 05af5c4 [BUGFIX] Fix race condition in kvstore.pushpull (#17007) No new revisions were added by this update. Summary of changes: src/kvstore/kvstore_dist_server.h | 35 +-- tests/nightly/dist_device_sync_kvstore.py | 35 +-- 2 files changed, 43 insertions(+), 27 deletions(-)
[incubator-mxnet] branch v1.6.x updated (ff27b4b -> a576531)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch v1.6.x in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from ff27b4b [OP] changing data type of 't' to int in lamb_update_phase1 (#16903) add a576531 Multi Precision Lamb Update operator (#16885) No new revisions were added by this update. Summary of changes: python/mxnet/optimizer/optimizer.py | 59 src/operator/optimizer_op-inl.h | 159 +++- src/operator/optimizer_op.cc| 90 +- src/operator/optimizer_op.cu| 5 + tests/python/gpu/test_operator_gpu.py | 1 + tests/python/unittest/test_optimizer.py | 14 +-- 6 files changed, 304 insertions(+), 24 deletions(-)
[incubator-mxnet] branch v1.6.x updated (c973f01 -> ff27b4b)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch v1.6.x in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from c973f01 Backport #16895, #16922, #16878, #16979 and #16900 to 1.6 (#17029) add c7d484e Lamb optimizer update (#16715) add ff27b4b [OP] changing data type of 't' to int in lamb_update_phase1 (#16903) No new revisions were added by this update. Summary of changes: python/mxnet/optimizer/optimizer.py | 52 - src/operator/optimizer_op-inl.h | 188 src/operator/optimizer_op.cc| 81 ++ src/operator/optimizer_op.cu| 7 ++ tests/python/unittest/test_optimizer.py | 73 + 5 files changed, 399 insertions(+), 2 deletions(-)
[incubator-mxnet] branch master updated (986a902 -> 248acfa)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 986a902 introduce gradient update handler to the base estimator (#16900) add 248acfa Multi Precision Lamb Update operator (#16885) No new revisions were added by this update. Summary of changes: python/mxnet/optimizer/optimizer.py | 59 src/operator/optimizer_op-inl.h | 159 +++- src/operator/optimizer_op.cc| 90 +- src/operator/optimizer_op.cu| 5 + tests/python/gpu/test_operator_gpu.py | 1 + tests/python/unittest/test_optimizer.py | 14 +-- 6 files changed, 304 insertions(+), 24 deletions(-)
[incubator-mxnet] branch master updated (fcc42de -> ca4939f)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from fcc42de updating MXNet version to 1.6.0 in base.h for C APIs (#16905) add ca4939f [OP] changing data type of 't' to int in lamb_update_phase1 (#16903) No new revisions were added by this update. Summary of changes: src/operator/optimizer_op-inl.h | 16 +--- 1 file changed, 9 insertions(+), 7 deletions(-)
[incubator-mxnet] branch benchmark updated (a47b540 -> cc0c356)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch benchmark in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from a47b540 multi-precision lamb update operator add cc0c356 squad multi-lamb No new revisions were added by this update. Summary of changes: python/mxnet/ndarray/contrib.py | 24 +++ python/mxnet/optimizer/optimizer.py | 92 - python/mxnet/test_utils.py | 80 +--- src/operator/contrib/multi_lamb-inl.h | 332 src/operator/contrib/multi_lamb.cc | 245 +++ src/operator/contrib/multi_lamb.cu | 254 tests/python/unittest/test_optimizer.py | 98 ++ 7 files changed, 1096 insertions(+), 29 deletions(-) create mode 100644 src/operator/contrib/multi_lamb-inl.h create mode 100644 src/operator/contrib/multi_lamb.cc create mode 100644 src/operator/contrib/multi_lamb.cu
[incubator-mxnet] branch master updated: Lamb optimizer update (#16715)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git The following commit(s) were added to refs/heads/master by this push: new 85d3ef3 Lamb optimizer update (#16715) 85d3ef3 is described below commit 85d3ef3a40da20a4aac3030950aa0f37f8cb89c5 Author: Rohit Kumar Srivastava AuthorDate: Sat Nov 23 22:19:07 2019 -0800 Lamb optimizer update (#16715) * initial commit lamb optimizer * fixing base lamb optimizer * adding API doc for Lamb Phase 1 and 2 --- python/mxnet/optimizer/optimizer.py | 52 - src/operator/optimizer_op-inl.h | 186 src/operator/optimizer_op.cc| 81 ++ src/operator/optimizer_op.cu| 7 ++ tests/python/unittest/test_optimizer.py | 73 + 5 files changed, 397 insertions(+), 2 deletions(-) diff --git a/python/mxnet/optimizer/optimizer.py b/python/mxnet/optimizer/optimizer.py index b7311b2..00d130b 100644 --- a/python/mxnet/optimizer/optimizer.py +++ b/python/mxnet/optimizer/optimizer.py @@ -34,14 +34,14 @@ from ..ndarray import (sgd_update, sgd_mom_update, adam_update, rmsprop_update, multi_sgd_update, multi_sgd_mom_update, multi_mp_sgd_update, multi_mp_sgd_mom_update, preloaded_multi_sgd_update, preloaded_multi_sgd_mom_update, preloaded_multi_mp_sgd_update, - preloaded_multi_mp_sgd_mom_update) + preloaded_multi_mp_sgd_mom_update, lamb_update_phase1, lamb_update_phase2) from ..ndarray import sparse from ..random import normal from ..util import is_np_array __all__ = [ 'AdaDelta', 'AdaGrad', 'Adam', 'Adamax', 'DCASGD', 'FTML', 'Ftrl', 'LARS', 'LBSGD', -'NAG', 'NDabs', 'Nadam', 'Optimizer', 'RMSProp', 'SGD', 'SGLD', 'Signum', +'NAG', 'NDabs', 'Nadam', 'Optimizer', 'RMSProp', 'SGD', 'SGLD', 'Signum', 'LAMB', 'Test', 'Updater', 'ccSGD', 'create', 'get_updater', 'register' ] @@ -1244,6 +1244,54 @@ class LBSGD(Optimizer): kwargs = {} sgd_update(weight, grad, out=weight, lr=lr, wd=wd, **kwargs) + +@register +class LAMB(Optimizer): +"""LAMB 
Optimizer. +""" +def __init__(self, learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-6, + lower_bound=None, upper_bound=None, bias_correction=True, **kwargs): +super(LAMB, self).__init__(learning_rate=learning_rate, **kwargs) +self.beta1 = beta1 +self.beta2 = beta2 +self.epsilon = epsilon +self.lower_bound = lower_bound +self.upper_bound = upper_bound +self.bias_correction = bias_correction + + +def create_state(self, index, weight): +stype = weight.stype +dtype = weight.dtype +return (zeros(weight.shape, weight.context, dtype=dtype, stype=stype), +zeros(weight.shape, weight.context, dtype=dtype, stype=stype)) + +def update(self, index, weight, grad, state): +assert(isinstance(weight, NDArray)) +assert(isinstance(grad, NDArray)) +self._update_count(index) +lr = self._get_lr(index) +wd = self._get_wd(index) +t = self._index_update_count[index] + +kwargs = {'beta1': self.beta1, 'beta2': self.beta2, 'epsilon': self.epsilon, + 'bias_correction': self.bias_correction, 't': t, + 'rescale_grad': self.rescale_grad} +mean, var = state +if self.clip_gradient: +kwargs['clip_gradient'] = self.clip_gradient +g = lamb_update_phase1(weight, grad, mean, var, wd=wd, **kwargs) + +kwargs = {} +if self.lower_bound: +kwargs['lower_bound'] = self.lower_bound +if self.upper_bound: +kwargs['upper_bound'] = self.upper_bound +r_1 = weight.norm() +r_2 = g.norm() +lamb_update_phase2(weight, g, r_1, r_2, lr=lr, out=weight, **kwargs) + + # pylint: enable=line-too-long @register class DCASGD(Optimizer): diff --git a/src/operator/optimizer_op-inl.h b/src/operator/optimizer_op-inl.h index c211d32..698f797 100644 --- a/src/operator/optimizer_op-inl.h +++ b/src/operator/optimizer_op-inl.h @@ -1563,6 +1563,192 @@ inline void AdamUpdateEx(const nnvm::NodeAttrs& attrs, } } +struct LambUpdatePhaseOneParam : public dmlc::Parameter { +float beta1; +float beta2; +float epsi
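The `LAMB.update` method in the diff proceeds in two phases: `lamb_update_phase1` computes a bias-corrected Adam-like direction with decoupled weight decay, and `lamb_update_phase2` rescales that direction by the layer-wise trust ratio r_1/r_2, where r_1 = ||weight|| and r_2 = ||g||. A hedged NumPy sketch of that math, assuming `bias_correction=True` and omitting the fused MXNet operators and the optional `lower_bound`/`upper_bound` clamping of r_1:

```python
import numpy as np

def lamb_update(weight, grad, mean, var, t, lr=0.001, beta1=0.9,
                beta2=0.999, epsilon=1e-6, wd=0.0):
    """One LAMB step mirroring lamb_update_phase1/phase2 from the diff.

    Updates weight, mean, and var in place; t is the 1-based step count.
    """
    # Phase 1: Adam moments with bias correction, plus weight decay.
    mean[:] = beta1 * mean + (1.0 - beta1) * grad
    var[:] = beta2 * var + (1.0 - beta2) * grad * grad
    m_hat = mean / (1.0 - beta1 ** t)
    v_hat = var / (1.0 - beta2 ** t)
    g = m_hat / (np.sqrt(v_hat) + epsilon) + wd * weight

    # Phase 2: layer-wise trust ratio r1/r2 rescales the step.
    r1 = np.linalg.norm(weight)
    r2 = np.linalg.norm(g)
    ratio = r1 / r2 if r1 > 0 and r2 > 0 else 1.0
    weight -= lr * ratio * g
    return weight
```

In the actual operator, `lower_bound` and `upper_bound` (when set) clamp r_1 before the ratio is formed, which is why the Python wrapper only forwards them to phase 2 when they are not `None`.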
[incubator-mxnet] branch master updated (47fd3a0 -> 7d4f2f3)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 47fd3a0 Link fixes4 (#16764) add 7d4f2f3 [MXNET-1421] Added (CuDNN)BatchNorm operator to the list of mirrored operators (#16022) No new revisions were added by this update. Summary of changes: src/executor/graph_executor.cc | 2 -- src/operator/nn/cudnn/cudnn_batch_norm-inl.h | 18 +- 2 files changed, 17 insertions(+), 3 deletions(-)
[incubator-mxnet] branch master updated (58b824f -> da33da3)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from 58b824f fix R docs (#16733) add da33da3 Add MXNet Ops for fast multihead attention (#16408) No new revisions were added by this update. Summary of changes: src/common/cuda_utils.h| 74 + src/operator/contrib/transformer-inl.h | 9 + src/operator/contrib/transformer.cc| 270 src/operator/contrib/transformer.cu| 560 + tests/python/gpu/test_operator_gpu.py | 316 ++- 5 files changed, 1228 insertions(+), 1 deletion(-)
[incubator-mxnet] branch master updated (f9baec9 -> b5d07e3)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from f9baec9 [Numpy] implement np.column_stack (#16594) add b5d07e3 Add check if scipy is imported in sparse.py (#16574) No new revisions were added by this update. Summary of changes: python/mxnet/ndarray/sparse.py | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-)
[incubator-mxnet] branch master updated (fc81c64 -> ffec31f)
This is an automated email from the ASF dual-hosted git repository. haibin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git. from fc81c64 Correct Google Analytics Tracker (#16490) add ffec31f Aggregated adamw update (#16398) No new revisions were added by this update. Summary of changes: python/mxnet/ndarray/contrib.py | 56 +++- src/operator/contrib/adamw-inl.h| 368 +--- src/operator/contrib/adamw.cc | 166 +-- src/operator/contrib/adamw.cu | 34 +-- tests/python/gpu/test_operator_gpu.py | 1 + tests/python/unittest/test_contrib_optimizer.py | 236 --- 6 files changed, 666 insertions(+), 195 deletions(-)