This is an automated email from the ASF dual-hosted git repository.

zhasheng pushed a commit to branch v1.3.x
in repository https://gitbox.apache.org/repos/asf/incubator-mxnet.git


The following commit(s) were added to refs/heads/v1.3.x by this push:
     new 05b6dc3  Cherry-pick few commits to release 1.3 branch (#12375)
05b6dc3 is described below

commit 05b6dc31689838ab8d0529b01666193fe4da2d8f
Author: Roshani Nagmote <roshaninagmo...@gmail.com>
AuthorDate: Mon Aug 27 14:03:14 2018 -0700

    Cherry-pick few commits to release 1.3 branch (#12375)
    
    * Add a tutorial for control flow operators. (#12340)
    
    * the first version.
    
    * fix.
    
    * add to test.
    
    * fix.
    
    * fix.
    
    * fix
    
    * fix.
    
    * fix.
    
    * add title.
    
    * add link
    
    * fix.
    
    * Update ONNX API docs references (#12317)
    
    * update onnx API references
    
    * update descriptions
    
    * [MXAPPS-581] Disable an additional long test in the SD nightly (#12343)
    
    * Disable an additional test in the SD nightly that also takes over the
      timeout.
    
    * Documentation update related to sparse support (#12367)
    
    * Update sparse.md
    
    * Update sparse.md
    
    * Update csr.md
    
    * Update row_sparse.md
    
    * Update train.md
---
 docs/api/python/contrib/onnx.md                    |  22 +-
 docs/api/python/ndarray/sparse.md                  |  10 +-
 docs/api/python/symbol/sparse.md                   |   7 +-
 docs/tutorials/control_flow/ControlFlowTutorial.md | 388 +++++++++++++++++++++
 docs/tutorials/index.md                            |   1 +
 docs/tutorials/sparse/csr.md                       |   4 +-
 docs/tutorials/sparse/row_sparse.md                |   7 +-
 docs/tutorials/sparse/train.md                     |   2 +-
 python/mxnet/contrib/onnx/mx2onnx/export_model.py  |   2 +-
 python/mxnet/contrib/onnx/onnx2mx/import_model.py  |   3 +-
 .../straight_dope/test_notebooks_single_gpu.py     |   4 +-
 tests/tutorials/test_tutorials.py                  |   3 +
 12 files changed, 416 insertions(+), 37 deletions(-)

diff --git a/docs/api/python/contrib/onnx.md b/docs/api/python/contrib/onnx.md
index 4499414..f8210ad 100644
--- a/docs/api/python/contrib/onnx.md
+++ b/docs/api/python/contrib/onnx.md
@@ -22,10 +22,9 @@ This document describes all the ONNX-MXNet APIs.
 .. autosummary::
     :nosignatures:
 
-    mxnet.contrib.onnx.import_model
-    mxnet.contrib.onnx.get_model_metadata
-    mxnet.contrib.onnx.import_to_gluon
-    mxnet.contrib.onnx.export_model
+    mxnet.contrib.onnx.onnx2mx.import_model
+    mxnet.contrib.onnx.onnx2mx.import_to_gluon
+    mxnet.contrib.onnx.mx2onnx.export_model
 ```
 
 ## ONNX Tutorials
@@ -33,7 +32,7 @@ This document describes all the ONNX-MXNet APIs.
 ```eval_rst
 .. toctree::
    :maxdepth: 1
-   
+
    /tutorials/onnx/super_resolution.md
    /tutorials/onnx/export_mxnet_to_onnx.md
    /tutorials/onnx/inference_on_onnx_model.md
@@ -43,7 +42,7 @@ This document describes all the ONNX-MXNet APIs.
 ## ONNX Examples
 
 * Face Recognition with 
[ArcFace](https://github.com/onnx/models/tree/master/models/face_recognition/ArcFace)
-* Image Classification with 
[MobileNet](https://github.com/onnx/models/tree/master/models/image_classification/mobilenet),
 
[ResNet](https://github.com/onnx/models/tree/master/models/image_classification/resnet),
 
[SqueezeNet](https://github.com/onnx/models/tree/master/models/image_classification/squeezenet),
 
[VGG](https://github.com/onnx/models/tree/master/models/image_classification/vgg)
 
+* Image Classification with 
[MobileNet](https://github.com/onnx/models/tree/master/models/image_classification/mobilenet),
 
[ResNet](https://github.com/onnx/models/tree/master/models/image_classification/resnet),
 
[SqueezeNet](https://github.com/onnx/models/tree/master/models/image_classification/squeezenet),
 
[VGG](https://github.com/onnx/models/tree/master/models/image_classification/vgg)
 
 ## API Reference
 
@@ -51,11 +50,12 @@ This document describes all the ONNX-MXNet APIs.
 
 ```eval_rst
 
-.. automodule:: mxnet.contrib.onnx.import_model
-.. automodule:: mxnet.contrib.onnx.get_model_metadata
-.. automodule:: mxnet.contrib.onnx.import_to_gluon
-.. automodule:: mxnet.contrib.onnx.export_model
-
+.. automodule:: mxnet.contrib.onnx.onnx2mx.import_model
+    :members: import_model, get_model_metadata
+.. automodule:: mxnet.contrib.onnx.onnx2mx.import_to_gluon
+    :members: import_to_gluon
+.. automodule:: mxnet.contrib.onnx.mx2onnx.export_model
+    :members: export_model
 ```
 
 <script>auto_index("api-reference");</script>
diff --git a/docs/api/python/ndarray/sparse.md 
b/docs/api/python/ndarray/sparse.md
index 85d33b1..2ade059 100644
--- a/docs/api/python/ndarray/sparse.md
+++ b/docs/api/python/ndarray/sparse.md
@@ -16,7 +16,7 @@ This document lists the routines of the *n*-dimensional 
sparse array package:
 ```
 
 The `CSRNDArray` and `RowSparseNDArray` API, defined in the `ndarray.sparse` 
package, provides
-imperative sparse tensor operations on **CPU**.
+imperative sparse tensor operations.
 
 A `CSRNDArray` inherits from `NDArray`, and represents a two-dimensional, 
fixed-size array in compressed sparse row format.
 
@@ -63,16 +63,13 @@ A detailed tutorial is available at
 
 ```eval_rst
 
-.. note:: ``mxnet.ndarray.sparse.RowSparseNDArray`` and 
``mxnet.ndarray.sparse.CSRNDArray`` DO NOT support the ``mxnet.gluon`` 
high-level interface yet.
-
 .. note:: ``mxnet.ndarray.sparse`` is similar to ``mxnet.ndarray`` in some 
aspects. But the differences are not negligible. For instance:
 
-   - Only a subset of operators in ``mxnet.ndarray`` have specialized 
implementations in ``mxnet.ndarray.sparse``.
-     Operators such as Convolution and broadcasting do not have sparse 
implementations yet.
+   - Only a subset of operators in ``mxnet.ndarray`` have efficient sparse 
implementations in ``mxnet.ndarray.sparse``.
+   - If an operator does not occur in the ``mxnet.ndarray.sparse`` namespace, 
that means the operator does not have an efficient sparse implementation yet. 
If sparse inputs are passed to such an operator, it will convert inputs to the 
dense format and fall back to the already available dense implementation.
    - The storage types (``stype``) of sparse operators' outputs depend on the 
storage types of inputs.
      By default the operators not available in ``mxnet.ndarray.sparse`` infer 
"default" (dense) storage type for outputs.
      Please refer to the [API Reference](#api-reference) section for further 
details on specific operators.
-   - GPU support for ``mxnet.ndarray.sparse`` is experimental. Only a few 
sparse operators are supported on GPU such as ``sparse.dot``.
 
 .. note:: ``mxnet.ndarray.sparse.CSRNDArray`` is similar to 
``scipy.sparse.csr_matrix`` in some aspects. But they differ in a few aspects:
 
@@ -559,7 +556,6 @@ We summarize the interface for each class in the following 
sections.
     sgd_update
     sgd_mom_update
     adam_update
-    ftrl_update
     adagrad_update
 ```
 
diff --git a/docs/api/python/symbol/sparse.md b/docs/api/python/symbol/sparse.md
index d26ba07..cd8272c 100644
--- a/docs/api/python/symbol/sparse.md
+++ b/docs/api/python/symbol/sparse.md
@@ -16,7 +16,7 @@ This document lists the routines of the sparse symbolic 
expression package:
 ```
 
 The `Sparse Symbol` API, defined in the `symbol.sparse` package, provides
-sparse neural network graphs and auto-differentiation on CPU.
+sparse neural network graphs and auto-differentiation.
 
 The storage type of a variable is specified by the `stype` attribute of the 
variable.
 The storage type of a symbolic expression is inferred based on the storage 
types of the variables and the operators.
@@ -43,12 +43,11 @@ array([ 1.,  1.],
 .. note:: most operators provided in ``mxnet.symbol.sparse`` are similar to 
those in
    ``mxnet.symbol`` although there are a few differences:
 
-   - Only a subset of operators in ``mxnet.symbol`` have specialized 
implementations in ``mxnet.symbol.sparse``.
-     Operators such as reduction and broadcasting do not have sparse 
implementations yet.
+   - Only a subset of operators in ``mxnet.symbol`` have efficient sparse 
implementations in ``mxnet.symbol.sparse``.
+   - If an operator does not occur in the ``mxnet.symbol.sparse`` namespace, 
that means the operator does not have an efficient sparse implementation yet. 
If sparse inputs are passed to such an operator, it will convert inputs to the 
dense format and fall back to the already available dense implementation.
    - The storage types (``stype``) of sparse operators' outputs depend on the 
storage types of inputs.
      By default the operators not available in ``mxnet.symbol.sparse`` infer 
"default" (dense) storage type for outputs.
      Please refer to the API reference section for further details on specific 
operators.
-   - GPU support for ``mxnet.symbol.sparse`` is experimental.
 
 ```
 
diff --git a/docs/tutorials/control_flow/ControlFlowTutorial.md 
b/docs/tutorials/control_flow/ControlFlowTutorial.md
new file mode 100644
index 0000000..9e4c66f
--- /dev/null
+++ b/docs/tutorials/control_flow/ControlFlowTutorial.md
@@ -0,0 +1,388 @@
+# Hybridize Gluon models with control flows.
+
+MXNet currently provides three control flow operators: `cond`, `foreach` and 
`while_loop`. Like other MXNet operators, they all have a version for NDArray 
and a version for Symbol. These two versions have exactly the same semantics. 
We can take advantage of this and use them in Gluon to hybridize models.
+
+In this tutorial, we use a few examples to demonstrate the use of control flow 
operators in Gluon and show how a model that requires control flow is 
hybridized.
+
+## Prepare running the code
+
+
+```python
+import mxnet as mx
+from mxnet.gluon import HybridBlock
+```
+
+## foreach
+`foreach` is a for loop that iterates over the first dimension of the input 
data (it can be an array or a list of arrays). It is defined with the following 
signature:
+
+```python
+foreach(body, data, init_states, name) => (outputs, states)
+```
+
+It runs the Python function defined in `body` for every slice from the input 
arrays. The signature of the `body` function is defined as follows:
+
+```python
+body(data, states) => (outputs, states)
+```
+
+The inputs of the `body` function have two parts: `data` is a slice of an 
array (if there is only one input array in `foreach`) or a list of slices (if 
there is a list of input arrays); `states` are the arrays from the previous 
iteration. The outputs of the `body` function also have two parts: `outputs` is 
an array or a list of arrays; `states` is the computation states of the current 
iteration. `outputs` from all iterations are concatenated as the outputs of 
`foreach`.
+
+The following pseudocode illustrates the execution of `foreach`.
+
+```python
+def foreach(body, data, init_states):
+    states = init_states
+    outs = []
+
+    for i in range(data.shape[0]):
+        s = data[i]
+        out, states = body(s, states)
+        outs.append(out)
+    outs = mx.nd.stack(*outs)
+    return outs, states
+```
+
+### Example 1: `foreach` works like map
+`foreach` can work like a map function of a functional language. In this case, 
the states of `foreach` can be an empty list, which means the computation 
doesn't carry computation states across iterations.
+
+In this example, we use `foreach` to increase each element's value of an array 
by one.
+
+
+```python
+data = mx.nd.arange(5)
+print(data)
+```
+
+    
+    [ 0.  1.  2.  3.  4.]
+    <NDArray 5 @cpu(0)>
+
+
+
+```python
+def add1(data, _):
+    return data + 1, []
+
+class Map(HybridBlock):
+    def hybrid_forward(self, F, data):
+        out, _ = F.contrib.foreach(add1, data, [])
+        return out
+    
+map_layer = Map()
+out = map_layer(data)
+print(out)
+```
+
+    
+    [[ 1.]
+     [ 2.]
+     [ 3.]
+     [ 4.]
+     [ 5.]]
+    <NDArray 5x1 @cpu(0)>
+
+
+We can hybridize the block and run the computation again. It should generate 
the same result.
+
+
+```python
+map_layer.hybridize()
+out = map_layer(data)
+print(out)
+```
+
+    
+    [[ 1.]
+     [ 2.]
+     [ 3.]
+     [ 4.]
+     [ 5.]]
+    <NDArray 5x1 @cpu(0)>
+
+
+### Example 2: `foreach` works like scan
+`foreach` can work like a scan function in a functional language. In this 
case, the output of the Python function is an empty list.
+
+
+```python
+def sum(data, state):
+    return [], state + data
+
+class Scan(HybridBlock):
+    def hybrid_forward(self, F, data):
+        _, state = F.contrib.foreach(sum, data, F.zeros((1)))
+        return state
+scan_layer = Scan()
+state = scan_layer(data)
+print(data)
+print(state)
+```
+
+    
+    [ 0.  1.  2.  3.  4.]
+    <NDArray 5 @cpu(0)>
+    
+    [ 10.]
+    <NDArray 1 @cpu(0)>
+
+
+
+```python
+scan_layer.hybridize()
+state = scan_layer(data)
+print(state)
+```
+
+    
+    [ 10.]
+    <NDArray 1 @cpu(0)>
+
+
+### Example 3: `foreach` with both outputs and states
+This is probably the most common use case of `foreach`. We extend the previous 
scan example and return both output and states.
+
+
+```python
+def sum(data, state):
+    return state + data, state + data
+
+class ScanV2(HybridBlock):
+    def hybrid_forward(self, F, data):
+        out, state = F.contrib.foreach(sum, data, F.zeros((1)))
+        return out, state
+scan_layer = ScanV2()
+out, state = scan_layer(data)
+print(out)
+print(state)
+```
+
+    
+    [[  0.]
+     [  1.]
+     [  3.]
+     [  6.]
+     [ 10.]]
+    <NDArray 5x1 @cpu(0)>
+    
+    [ 10.]
+    <NDArray 1 @cpu(0)>
+
+
+
+```python
+scan_layer.hybridize()
+out, state = scan_layer(data)
+print(out)
+print(state)
+```
+
+    
+    [[  0.]
+     [  1.]
+     [  3.]
+     [  6.]
+     [ 10.]]
+    <NDArray 5x1 @cpu(0)>
+    
+    [ 10.]
+    <NDArray 1 @cpu(0)>
+
+
+### Example 4: use `foreach` to run an RNN on a variable-length sequence
+Previous examples illustrate `foreach` with simple use cases. Here we show an 
example of processing variable-length sequences with `foreach`. The same idea 
is used by `dynamic_rnn` in TensorFlow for processing variable-length sequences.
+
+
+```python
+class DynamicRNNLayer(HybridBlock):
+    def __init__(self, cell, prefix=None, params=None):
+        super(DynamicRNNLayer, self).__init__(prefix=prefix, params=params)
+        self.cell = cell
+    def hybrid_forward(self, F, inputs, begin_state, valid_length):
+        states = begin_state
+        zeros = []
+        for s in states:
+            zeros.append(F.zeros_like(s))
+        # the last state is the iteration number.
+        states.append(F.zeros((1)))
+        def loop_body(inputs, states):
+            cell_states = states[:-1]
+            # Get the iteration number from the states.
+            iter_no = states[-1]
+            out, new_states = self.cell(inputs, cell_states)
+            # Copy the old state if we have reached the end of a sequence.
+            for i, state in enumerate(cell_states):
+                new_states[i] = F.where(F.broadcast_greater(valid_length, 
iter_no),
+                                        new_states[i], state)
+            new_states.append(iter_no + 1)
+            return out, new_states
+
+        outputs, states = F.contrib.foreach(loop_body, inputs, states)
+        outputs = F.SequenceMask(outputs, sequence_length=valid_length,
+                                 use_sequence_length=True, axis=0)
+        # the last state is the iteration number. We don't need it.
+        return outputs, states[:-1]
+
+
+seq_len = 10
+batch_size = 2
+input_size = 5
+hidden_size = 6
+
+rnn_data = mx.nd.normal(loc=0, scale=1, shape=(seq_len, batch_size, 
input_size))
+init_states = [mx.nd.normal(loc=0, scale=1, shape=(batch_size, hidden_size)) 
for i in range(2)]
+valid_length = mx.nd.round(mx.nd.random.uniform(low=1, high=10, 
shape=(batch_size))) 
+
+lstm = DynamicRNNLayer(mx.gluon.rnn.LSTMCell(hidden_size))
+lstm.initialize()
+res, states = lstm(rnn_data, [x for x in init_states], valid_length)
+
+lstm.hybridize()
+res, states = lstm(rnn_data, [x for x in init_states], valid_length)
+```
+
+## while_loop
+`while_loop` defines a while loop. It has the following signature:
+
+```python
+while_loop(cond, body, loop_vars, max_iterations, name) => (outputs, states)
+```
+
+Instead of running over the first dimension of an array, `while_loop` checks a 
condition function in every iteration and runs a `body` function for 
computation. The signature of the `body` function is defined as follows:
+
+```python
+body(state1, state2, ...) => (outputs, states)
+```
+
+The inputs of the `body` function in `while_loop` are a little different from 
the one in `foreach`. It has a variable number of input arguments. Each input 
argument is a loop variable and the number of arguments is determined by the 
number of loop variables. The outputs of the `body` function also have two 
parts: `outputs` is an array or a list of arrays; `states` are loop variables 
and will be passed to the next iteration as inputs of `body`. Like `foreach`, 
both `outputs` and `states`  [...]
+
+### Example 5: scan with while_loop
+`while_loop` is more general than `foreach`. We can also use it to iterate 
over an array and sum all of its values together. In this example, instead of 
summing over the entire array, we only sum over the first 4 elements.
+
+**Note**: the size of the output arrays in the current implementation of `while_loop` is 
determined by `max_iterations`. As such, even though the while loop in this 
example runs 4 iterations, it still outputs an array of 5 elements. The last 
element in the output array is actually filled with an arbitrary value.
+
+
+```python
+class ScanV2(HybridBlock):
+    def hybrid_forward(self, F, data):
+        def sum(state, i):
+            s = state + data[i]
+            return s, [s, i + 1]
+
+        def sum_cond(state, i):
+            return i < 4
+
+        out, state = F.contrib.while_loop(sum_cond, sum,
+                                          [F.zeros((1)), F.zeros((1))], 
max_iterations=5)
+        return out, state
+scan_layer = ScanV2()
+out, state = scan_layer(data)
+print(out)
+print(state)
+```
+
+    
+    [[ 0.]
+     [ 1.]
+     [ 3.]
+     [ 6.]
+     [ 0.]]
+    <NDArray 5x1 @cpu(0)>
+    [
+    [ 6.]
+    <NDArray 1 @cpu(0)>, 
+    [ 4.]
+    <NDArray 1 @cpu(0)>]
+
+
+## cond
+`cond` defines an if condition. It has the following signature:
+
+```python
+cond(pred, then_func, else_func, name)
+```
+
+`cond` checks `pred`, which is a symbol or an NDArray with one element. If its 
value is true, it calls `then_func`. Otherwise, it calls `else_func`. The 
signatures of `then_func` and `else_func` are as follows:
+
+```python
+func() => [outputs]
+```
+
+`cond` requires that the outputs of `then_func` and `else_func` have the same 
number of Symbols/NDArrays with the same shapes and data types.
+
+### Example 6: skip RNN computation with cond
+Example 4 shows how to process a batch with sequences of different lengths. It 
performs computation for all steps but discards some of the computation results.
+
+In this example, we show how to skip computation after we have reached the end 
of a sequence, whose length is indicated by `length`. The code below only works 
for a batch with one sequence.
+
+
+```python
+class SkipRNNCell(HybridBlock):
+    def __init__(self, cell, prefix=None, params=None):
+        super(SkipRNNCell, self).__init__(prefix=prefix, params=params)
+        self.cell = cell
+    def hybrid_forward(self, F, i, length, data, states):
+        def run_rnn():
+            return self.cell(data, states)
+
+        def copy_states():
+            return F.zeros_like(data), states
+        out, state = F.contrib.cond(i < length, run_rnn, copy_states)
+        return out, state
+
+class RNNLayer(HybridBlock):
+    def __init__(self, cell, prefix=None, params=None):
+        super(RNNLayer, self).__init__(prefix=prefix, params=params)
+        self.cell = SkipRNNCell(cell)
+    def hybrid_forward(self, F, length, data, init_states):
+        def body(data, states):
+            i = states[0]
+            out, states = self.cell(i, length, data, states[1])
+            return out, [i + 1, states]
+        print()
+        out, state = F.contrib.foreach(body, data, [F.zeros((1)), init_states])
+        return out, state
+
+
+seq_len = 5
+batch_size = 1
+input_size = 3
+hidden_size = 3
+
+rnn_data = mx.nd.normal(loc=0, scale=1, shape=(seq_len, batch_size, 
input_size))
+init_states = [mx.nd.normal(loc=0, scale=1, shape=(batch_size, hidden_size)) 
for i in range(2)]
+
+cell = mx.gluon.rnn.LSTMCell(hidden_size)
+layer = RNNLayer(cell)
+layer.initialize()
+
+out, states = layer(mx.nd.array([3]), rnn_data, init_states)
+print(rnn_data)
+print(out)
+```
+
+    ()
+    
+    [[[-1.25296438  0.387312   -0.41055229]]
+    
+     [[ 1.28453672  0.21001032 -0.08666432]]
+    
+     [[ 1.46422136 -1.30581355  0.9344402 ]]
+    
+     [[ 0.5380863  -0.16038011  0.84187603]]
+    
+     [[-1.00553632  3.13221502 -0.4358989 ]]]
+    <NDArray 5x1x3 @cpu(0)>
+    
+    [[[-0.02620504  0.1605694   0.29636264]]
+    
+     [[-0.00474182  0.08719197  0.17757624]]
+    
+     [[ 0.00631597  0.04674901  0.12468992]]
+    
+     [[ 0.          0.          0.        ]]
+    
+     [[ 0.          0.          0.        ]]]
+    <NDArray 5x1x3 @cpu(0)>
+
+
+<!-- INSERT SOURCE DOWNLOAD BUTTONS -->
diff --git a/docs/tutorials/index.md b/docs/tutorials/index.md
index ae08514..1b32333 100644
--- a/docs/tutorials/index.md
+++ b/docs/tutorials/index.md
@@ -96,6 +96,7 @@ Select API:&nbsp;
     * [Fine-Tuning a pre-trained ImageNet model with a new 
dataset](/faq/finetune.html)
     * [Large-Scale Multi-Host Multi-GPU Image 
Classification](/tutorials/vision/large_scale_classification.html)
     * [Importing an ONNX model into 
MXNet](/tutorials/onnx/super_resolution.html)
+    * [Hybridize Gluon models with control 
flows](/tutorials/control_flow/ControlFlowTutorial.html)
 * API Guides
     * Core APIs
         * NDArray
diff --git a/docs/tutorials/sparse/csr.md b/docs/tutorials/sparse/csr.md
index c2842ac..0aede1a 100644
--- a/docs/tutorials/sparse/csr.md
+++ b/docs/tutorials/sparse/csr.md
@@ -512,9 +512,7 @@ Note that in the file the column indices are expected to be 
sorted in ascending
 
 ### GPU Support
 
-By default, `CSRNDArray` operators are executed on CPU. In MXNet, GPU support 
for `CSRNDArray` is experimental with only a few sparse operators such as 
[dot](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html#mxnet.ndarray.sparse.dot).
-
-To create a `CSRNDArray` on a GPU, we need to explicitly specify the context:
+By default, `CSRNDArray` operators are executed on CPU. To create a 
`CSRNDArray` on a GPU, we need to explicitly specify the context:
 
 **Note** If a GPU is not available, an error will be reported in the following 
section. In order to execute it on a cpu, set `gpu_device` to `mx.cpu()`.
 
diff --git a/docs/tutorials/sparse/row_sparse.md 
b/docs/tutorials/sparse/row_sparse.md
index c4cab75..27cc0d3 100644
--- a/docs/tutorials/sparse/row_sparse.md
+++ b/docs/tutorials/sparse/row_sparse.md
@@ -541,12 +541,7 @@ Note that only 
[mxnet.optimizer.SGD](https://mxnet.incubator.apache.org/api/pyth
 
 ### GPU Support
 
-By default, RowSparseNDArray operators are executed on CPU. In MXNet, GPU 
support for RowSparseNDArray is limited
-to a few sparse operators such as 
[sgd_update](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html#mxnet.ndarray.sparse.sgd_update),
-[dot](https://mxnet.incubator.apache.org/api/python/ndarray/sparse.html#mxnet.ndarray.sparse.dot)
 and
-[Embedding](https://mxnet.incubator.apache.org/api/python/ndarray/ndarray.html#mxnet.ndarray.Embedding).
-
-To create a RowSparseNDArray on gpu, we need to explicitly specify the context:
+By default, RowSparseNDArray operators are executed on CPU. To create a 
RowSparseNDArray on gpu, we need to explicitly specify the context:
 
 **Note** If a GPU is not available, an error will be reported in the following 
section. In order to execute it on a cpu, set gpu_device to mx.cpu().
 
diff --git a/docs/tutorials/sparse/train.md b/docs/tutorials/sparse/train.md
index 7472fcd..fde4c0e 100644
--- a/docs/tutorials/sparse/train.md
+++ b/docs/tutorials/sparse/train.md
@@ -314,7 +314,7 @@ assert metric.get()[1] < 1, "Achieved MSE (%f) is larger 
than expected (1.0)" %
 
 ### Training the model with multiple machines or multiple devices
 
-To train a sparse model with multiple machines, you need to call `prepare` 
before `forward`, or `save_checkpoint`.
+Distributed training with `row_sparse` weights and gradients is supported in 
MXNet, which significantly reduces communication cost for large models. To 
train a sparse model with multiple machines, you need to call `prepare` before 
`forward`, or `save_checkpoint`.
 Please refer to the example in 
[mxnet/example/sparse/linear_classification](https://github.com/apache/incubator-mxnet/tree/master/example/sparse/linear_classification)
 for more details.
 
diff --git a/python/mxnet/contrib/onnx/mx2onnx/export_model.py 
b/python/mxnet/contrib/onnx/mx2onnx/export_model.py
index 0dbfdc1..33292bf 100644
--- a/python/mxnet/contrib/onnx/mx2onnx/export_model.py
+++ b/python/mxnet/contrib/onnx/mx2onnx/export_model.py
@@ -18,7 +18,7 @@
 # coding: utf-8
 #pylint: disable-msg=too-many-arguments
 
-"""export function"""
+"""Exports an MXNet model to the ONNX model format"""
 from __future__ import absolute_import
 from __future__ import division
 from __future__ import print_function
diff --git a/python/mxnet/contrib/onnx/onnx2mx/import_model.py 
b/python/mxnet/contrib/onnx/onnx2mx/import_model.py
index 4e4d786..e190c3b 100644
--- a/python/mxnet/contrib/onnx/onnx2mx/import_model.py
+++ b/python/mxnet/contrib/onnx/onnx2mx/import_model.py
@@ -16,7 +16,7 @@
 # under the License.
 
 # coding: utf-8
-"""import function"""
+"""Functions for importing ONNX models to MXNet and for checking metadata"""
 # pylint: disable=no-member
 
 from .import_onnx import GraphProto
@@ -72,6 +72,7 @@ def get_model_metadata(model_file):
             'output_tensor_data' : <list of tuples representing the shape of 
the output
                                     of the model>
         }
+
     """
     graph = GraphProto()
     try:
diff --git a/tests/nightly/straight_dope/test_notebooks_single_gpu.py 
b/tests/nightly/straight_dope/test_notebooks_single_gpu.py
index a60498c..a6437cd 100644
--- a/tests/nightly/straight_dope/test_notebooks_single_gpu.py
+++ b/tests/nightly/straight_dope/test_notebooks_single_gpu.py
@@ -40,6 +40,7 @@ NOTEBOOKS_WHITELIST = [
     'chapter07_distributed-learning/multiple-gpus-scratch',
     'chapter07_distributed-learning/multiple-gpus-gluon',
     'chapter07_distributed-learning/training-with-multiple-machines',
+    'chapter08_computer-vision/visual-question-answer', # > 10 mins.
     'chapter11_recommender-systems/intro-recommender-systems',  # Early draft, 
non-working.
     'chapter12_time-series/intro-forecasting-gluon',
     'chapter12_time-series/intro-forecasting-2-gluon',
@@ -228,9 +229,6 @@ class StraightDopeSingleGpuTests(unittest.TestCase):
     def test_fine_tuning(self):
         assert _test_notebook('chapter08_computer-vision/fine-tuning')
 
-    def test_visual_question_answer(self):
-        assert 
_test_notebook('chapter08_computer-vision/visual-question-answer')
-
     # Chapter 9
 
     def test_tree_lstm(self):
diff --git a/tests/tutorials/test_tutorials.py 
b/tests/tutorials/test_tutorials.py
index 2c87682..503df01 100644
--- a/tests/tutorials/test_tutorials.py
+++ b/tests/tutorials/test_tutorials.py
@@ -183,3 +183,6 @@ def test_vision_large_scale_classification():
 
 def test_vision_cnn_visualization():
     assert _test_tutorial_nb('vision/cnn_visualization')
+
+def test_control_flow():
+    assert _test_tutorial_nb('control_flow/ControlFlowTutorial')
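
Note on the control flow tutorial added above: it gives reference pseudocode for `foreach` but not for `while_loop` or `cond`. The following is a minimal plain-Python sketch of the semantics the tutorial describes, not the MXNet implementation. The names `while_loop_ref`, `cond_ref`, `cond_fn` and `pad_value` are hypothetical, Python lists and numbers stand in for NDArrays, and the zero padding is an assumption (the tutorial only says the padded entries hold an arbitrary value).

```python
def while_loop_ref(cond_fn, body, loop_vars, max_iterations, pad_value=0):
    """Plain-Python stand-in for the while_loop semantics in the tutorial."""
    states = list(loop_vars)
    outputs = []
    steps = 0
    # Run body while the condition holds, but never past max_iterations.
    while steps < max_iterations and cond_fn(*states):
        out, states = body(*states)
        outputs.append(out)
        steps += 1
    # The output always has max_iterations entries; unused slots are padded.
    # The pad value is an assumption: the tutorial calls it arbitrary.
    outputs.extend([pad_value] * (max_iterations - steps))
    return outputs, states


def cond_ref(pred, then_func, else_func):
    """Plain-Python stand-in for cond: call exactly one of the two branches."""
    return then_func() if pred else else_func()


# Mirror of Example 5: sum only the first 4 elements of a 5-element array.
data = [0, 1, 2, 3, 4]


def sum_body(state, i):
    s = state + data[i]
    return s, [s, i + 1]


def sum_cond(state, i):
    return i < 4


outs, final_states = while_loop_ref(sum_cond, sum_body, [0, 0], max_iterations=5)
print(outs)          # [0, 1, 3, 6, 0]
print(final_states)  # [6, 4]
print(cond_ref(final_states[1] < 4, lambda: "run", lambda: "skip"))  # skip
```

The sketch reproduces the shape of the Example 5 output: four partial sums followed by one padded slot, with the loop variables ending at the sum 6 and the counter 4.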
