This is an automated email from the ASF dual-hosted git repository.

niketanpansare pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/systemml.git
commit 2251f4031e745635ba308af12851e2a5ffa7255d
Author: Niketan Pansare <npan...@us.ibm.com>
AuthorDate: Tue Mar 19 12:30:01 2019 -0700

    [SYSTEMML-540] Improve the performance of the GPU lstm backward operator by passing the state

    - The lstm builtin function is extended to return the state:
      [out, c, state] = lstm(X, W, b, out0, c0, return_sequences)
    - The lstm_backward builtin function is extended to accept the state:
      [dX, dW, db, dout0, dc0] = lstm_backward(X, W, b, out0, c0, given_sequences, dout, dc, state)
    - Updated the DML documentation to reflect this change.
    - Updated the release documentation.

    Closes #856.
---
 dml-language-reference.md | 21 +++++++++++----------
 release-process.md        | 25 +++++++++++--------------
 2 files changed, 22 insertions(+), 24 deletions(-)

diff --git a/dml-language-reference.md b/dml-language-reference.md
index 6f1c854..f64b6ea 100644
--- a/dml-language-reference.md
+++ b/dml-language-reference.md
@@ -1521,16 +1521,17 @@ The images are assumed to be stored in NCHW format, where N = batch size, C = #channels, H = height, W = width.
 Hence, the images are internally represented as a matrix with dimension (N, C * H * W).

-| Function name | Input matrices | Dimension of first input matrix | Dimension of second input matrix (if applicable) | Dimension of (first) output matrix | Input Parameters | Notes [...]
-|---------------------------------------------|--------------------------|-----------------------------------------------------------|-----------------------------------------------------------|---------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------ [...]
-| conv2d | input, filter | [batch_size X num_channels* height_image* width_image] | [num_filters X num_channels* height_filter* width_filter] | [batch_size X num_channels_out* height_out* width_out] | stride=[stride_h, stride_w], padding=[pad_h, pad_w], input_shape=[batch_size, num_channels, height_image, width_image], filter_shape=[num_filters, num_channels, height_filter, width_filter] | Performs 2D [...]
-| conv2d_backward_filter | input, dout | [batch_size X num_channels* height_image* width_image] | [batch_size X num_channels_out* height_out* width_out] | [num_filters X num_channels* height_filter* width_filter] | stride=[stride_h, stride_w], padding=[pad_h, pad_w], input_shape=[batch_size, num_channels, height_image, width_image], filter_shape=[num_filters, num_channels, height_filter, width_filter] | Computes th [...]
-| conv2d_backward_data | filter, dout | [num_filters X num_channels* height_filter* width_filter] | [batch_size X num_channels_out* height_out* width_out] | [batch_size X num_channels* height_image* width_image] | stride=[stride_h, stride_w], padding=[pad_h, pad_w], input_shape=[batch_size, num_channels, height_image, width_image], filter_shape=[num_filters, num_channels, height_filter, width_filter] | Computes th [...]
-| max_pool, avg_pool | input | [batch_size X num_channels* height_image* width_image] | | [batch_size X num_channels* height_out* width_out] | stride=[stride_h, stride_w], padding=[pad_h, pad_w], input_shape=[batch_size, num_channels, height_image, width_image], pool_size=[height_pool, width_pool] | Performs ma [...]
-| max_pool_backward, avg_pool_backward | input, dout | [batch_size X num_channels* height_image* width_image] | [batch_size X num_channels* height_out* width_out] | [batch_size X num_channels* height_image* width_image] | stride=[stride_h, stride_w], padding=[pad_h, pad_w], input_shape=[batch_size, num_channels, height_image, width_image], pool_size=[height_pool, width_pool] | Computes th [...]
-| bias_add | input, bias | [batch_size X num_channels* height_image* width_image] | [num_channels X 1] | [batch_size X num_channels* height_image* width_image] | | Adds the bi [...]
-| bias_multiply | input, bias | [batch_size X num_channels* height_image* width_image] | [num_channels X 1] | [batch_size X num_channels* height_image* width_image] | | Multiplies [...]
-| lstm | X, W, bias, out0, c0 | [batch_size X seq_length*num_features] | [num_features+hidden_size X 4*hidden_size] | [batch_size X seq_length*hidden_size] if return_sequences else [batch_size X hidden_size] | return_sequences | Perform com [...]
+| Function name | Input matrices | Dimension of first input matrix | Dimension of second input matrix (if applicable) | Dimension of (first) output matrix | Input Parameters [...]
+|---------------------------------------------|-----------------------------------------------------|-----------------------------------------------------------|-----------------------------------------------------------|---------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- [...]
+| conv2d | input, filter | [batch_size X num_channels* height_image* width_image] | [num_filters X num_channels* height_filter* width_filter] | [batch_size X num_channels_out* height_out* width_out] | stride=[stride_h, stride_w], padding=[pad_h, pad_w], input_shape=[batch_size, num_channels, height_image, width_image], filter_shape=[num_filters, num_channels, height_filter, [...]
+| conv2d_backward_filter | input, dout | [batch_size X num_channels* height_image* width_image] | [batch_size X num_channels_out* height_out* width_out] | [num_filters X num_channels* height_filter* width_filter] | stride=[stride_h, stride_w], padding=[pad_h, pad_w], input_shape=[batch_size, num_channels, height_image, width_image], filter_shape=[num_filters, num_channels, height_filter, [...]
+| conv2d_backward_data | filter, dout | [num_filters X num_channels* height_filter* width_filter] | [batch_size X num_channels_out* height_out* width_out] | [batch_size X num_channels* height_image* width_image] | stride=[stride_h, stride_w], padding=[pad_h, pad_w], input_shape=[batch_size, num_channels, height_image, width_image], filter_shape=[num_filters, num_channels, height_filter, [...]
+| max_pool, avg_pool | input | [batch_size X num_channels* height_image* width_image] | | [batch_size X num_channels* height_out* width_out] | stride=[stride_h, stride_w], padding=[pad_h, pad_w], input_shape=[batch_size, num_channels, height_image, width_image], pool_size=[height_pool, width_pool] [...]
+| max_pool_backward, avg_pool_backward | input, dout | [batch_size X num_channels* height_image* width_image] | [batch_size X num_channels* height_out* width_out] | [batch_size X num_channels* height_image* width_image] | stride=[stride_h, stride_w], padding=[pad_h, pad_w], input_shape=[batch_size, num_channels, height_image, width_image], pool_size=[height_pool, width_pool] [...]
+| bias_add | input, bias | [batch_size X num_channels* height_image* width_image] | [num_channels X 1] | [batch_size X num_channels* height_image* width_image] | [...]
+| bias_multiply | input, bias | [batch_size X num_channels* height_image* width_image] | [num_channels X 1] | [batch_size X num_channels* height_image* width_image] | [...]
+| lstm | X, W, bias, out0, c0 | [N X T*D] | [D+M X 4M] | [N X T*M] if given_sequences is true else [ N X M ] | return_sequences [...]
+| lstm_backward | X, W, b, out0, c0, given_sequences, dout, dc, state | [N X T*M] if given_sequences is true else [ N X M] | [N X M] | [N X T*D] | return_sequences [...]

 Note: the builtin functions `batch_norm2d` and `batch_norm2d_backward` are deprecated and will be removed in the next release. The `lstm` builtin function is in experimental phase and is only supported for the GPU backend.
diff --git a/release-process.md b/release-process.md
index 3798ec7..dec6b15 100644
--- a/release-process.md
+++ b/release-process.md
@@ -255,22 +255,19 @@ this OS X example.

 ## Python Tests

-For Spark 1.*, the Python tests at (`src/main/python/tests`) can be executed in the following manner:
+Compile SystemML distribution:

-    PYSPARK_PYTHON=python3 pyspark --driver-class-path SystemML.jar test_matrix_agg_fn.py
-    PYSPARK_PYTHON=python3 pyspark --driver-class-path SystemML.jar test_matrix_binary_op.py
-    PYSPARK_PYTHON=python3 pyspark --driver-class-path SystemML.jar test_mlcontext.py
-    PYSPARK_PYTHON=python3 pyspark --driver-class-path SystemML.jar test_mllearn_df.py
-    PYSPARK_PYTHON=python3 pyspark --driver-class-path SystemML.jar test_mllearn_numpy.py
+    mvn package -P distribution
+    cd src/main/python/tests/

-For Spark 2.*, pyspark can't be used to run the Python tests, so they can be executed using
-spark-submit:
+For Spark 2.*, the Python tests at (`src/main/python/tests`) can be executed in the following manner:

-    spark-submit --driver-class-path SystemML.jar test_matrix_agg_fn.py
-    spark-submit --driver-class-path SystemML.jar test_matrix_binary_op.py
-    spark-submit --driver-class-path SystemML.jar test_mlcontext.py
-    spark-submit --driver-class-path SystemML.jar test_mllearn_df.py
-    spark-submit --driver-class-path SystemML.jar test_mllearn_numpy.py
+    PYSPARK_PYTHON=python3 spark-submit --driver-class-path ../../../../target/SystemML.jar,../../../../target/systemml-*-SNAPSHOT-extra.jar test_matrix_agg_fn.py
+    PYSPARK_PYTHON=python3 spark-submit --driver-class-path ../../../../target/SystemML.jar,../../../../target/systemml-*-SNAPSHOT-extra.jar test_matrix_binary_op.py
+    PYSPARK_PYTHON=python3 spark-submit --driver-class-path ../../../../target/SystemML.jar,../../../../target/systemml-*-SNAPSHOT-extra.jar test_mlcontext.py
+    PYSPARK_PYTHON=python3 spark-submit --driver-class-path ../../../../target/SystemML.jar,../../../../target/systemml-*-SNAPSHOT-extra.jar test_mllearn_df.py
+    PYSPARK_PYTHON=python3 spark-submit --driver-class-path ../../../../target/SystemML.jar,../../../../target/systemml-*-SNAPSHOT-extra.jar test_mllearn_numpy.py
+    PYSPARK_PYTHON=python3 spark-submit --driver-class-path ../../../../target/SystemML.jar,../../../../target/systemml-*-SNAPSHOT-extra.jar test_nn_numpy.py

 ## Check LICENSE and NOTICE Files
@@ -385,7 +382,7 @@ file and remove all the `@Ignore` annotations from all the tests. Then run the N

     # Run other GPU Unit Tests
     rm result.txt
-    for t in AggregateUnaryOpTests BinaryOpTests MatrixMatrixElementWiseOpTests RightIndexingTests AppendTest MatrixMultiplicationOpTest ReorgOpTests ScalarMatrixElementwiseOpTests UnaryOpTests
+    for t in AggregateUnaryOpTests BinaryOpTests MatrixMatrixElementWiseOpTests RightIndexingTests AppendTest MatrixMultiplicationOpTest ReorgOpTests ScalarMatrixElementwiseOpTests UnaryOpTests LstmTest LstmCPUTest
     do
         mvn -Dit.test="org.apache.sysml.test.gpu."$t verify -PgpuTests &> tmp.txt
         SUCCESS=`grep "BUILD SUCCESS" tmp.txt`
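
For readers of the updated reference, the extended signatures described in this commit can be sketched in DML roughly as follows. This is a non-authoritative sketch, not part of the commit: the bias shape `[1 X 4M]` and the `rand`/`matrix` initializers are assumptions, the dimension names follow the table above (N = batch size, T = sequence length, D = number of features, M = hidden size), and the `lstm` builtin is experimental and GPU-only.

```dml
# Hypothetical usage sketch of the extended lstm / lstm_backward builtins
N = 32; T = 10; D = 64; M = 128
X    = rand(rows=N, cols=T*D)        # input sequences, one row per example
W    = rand(rows=D+M, cols=4*M)      # stacked input/recurrent weights
b    = matrix(0, rows=1, cols=4*M)   # assumed bias shape
out0 = matrix(0, rows=N, cols=M)     # initial hidden state
c0   = matrix(0, rows=N, cols=M)     # initial cell state

# Forward pass now also returns an opaque `state` output
[out, c, state] = lstm(X, W, b, out0, c0, TRUE)

# Backward pass accepts `state`, avoiding recomputation of the forward pass
dout = rand(rows=N, cols=T*M)        # gradient w.r.t. out (return_sequences = TRUE)
dc   = rand(rows=N, cols=M)          # gradient w.r.t. final cell state
[dX, dW, db, dout0, dc0] = lstm_backward(X, W, b, out0, c0, TRUE, dout, dc, state)
```

Passing `state` from the forward call to `lstm_backward` is what this commit's performance improvement relies on: the backward operator can reuse the saved intermediate activations instead of rerunning the forward computation.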