[ https://issues.apache.org/jira/browse/SYSTEMML-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15847664#comment-15847664 ]
Niketan Pansare commented on SYSTEMML-1140:
-------------------------------------------

Sorry, I forgot to update this JIRA with the series of improvements related to this PR:

1. Many CP convolution operators now have sparse support (except im2col). However, since CuDNN does not have a sparse equivalent, we only support dense convolution on GPU.
2. Fused operators such as relu_maxpooling and relu_backward have been added to reduce the conversion overhead of sparsity-introducing operators such as relu. In fact, the performance of relu_maxpooling is exactly the same as that of maxpooling in CP, making relu a no-op in the fused implementation :)

[~mboehm7] I used Mike's LeNet script with the MNIST dataset as an example. Please see https://github.com/apache/incubator-systemml/blob/master/scripts/staging/SystemML-NN/examples/Example%20-%20MNIST%20LeNet.ipynb ...

Here are the cache statistics from a sample run after adding the above-mentioned fused operators (date: Jan 13th, 2017):
Cache hits (Mem, WB, FS, HDFS): 1096424/0/0/2.
Cache writes (WB, FS, HDFS): 603950/15/8.
Cache times (ACQr/m, RLS, EXP): 3.659/0.456/273.799/1.275 sec.

I have seen anywhere between 250 and 500 seconds spent in cache times. You can also use Mike's Breast Cancer Project as an example workload.

> Sparse/Caching performance bugs related to deep learning scripts
> ----------------------------------------------------------------
>
>                 Key: SYSTEMML-1140
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1140
>             Project: SystemML
>          Issue Type: Bug
>    Affects Versions: SystemML 1.0
>            Reporter: Niketan Pansare
>            Priority: Blocker
>
> We have identified two performance bugs that frequently occur in deep learning scripts.
> First, we repeatedly perform unnecessary conversion to sparse format. Also, operations such as matrix multiplication (including BLAS and CuBLAS) are optimized for dense.
>
> Second, even with a large memory budget, we sometimes spend almost 20-30% of the time in caching.
> [~mboehm7] [~reinwald] [~mwdus...@us.ibm.com] I am labeling this bug as a blocker for SystemML 1.0. Please feel free to assign this issue to yourself.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
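A brief note on why the relu_maxpooling fusion mentioned in the comment can treat relu as a no-op. For any pooling window w, max_i(max(0, w_i)) == max(0, max_i(w_i)), so a fused operator can run plain maxpooling and clamp each pooled value once, never materializing (or converting to sparse) the relu output. The NumPy sketch below is illustrative only and is not the SystemML implementation; `maxpool2x2` is a hypothetical helper for a single 4x4 channel with a 2x2 stride-2 pooling.

```python
import numpy as np

def maxpool2x2(a):
    # Non-overlapping 2x2 max pooling on a 4x4 array:
    # split rows/cols into 2x2 blocks and take the max of each block.
    return a.reshape(2, 2, 2, 2).max(axis=(1, 3))

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 4))

# Unfused: relu produces an intermediate (possibly sparse) matrix,
# then maxpooling reads it back.
unfused = maxpool2x2(np.maximum(x, 0))

# Fused view: maxpool the raw input, then clamp the few pooled
# values at 0 -- relu costs nothing extra.
fused = np.maximum(maxpool2x2(x), 0)

assert np.allclose(unfused, fused)
```

This identity holds because max is monotone, which is exactly what lets the fused operator skip the sparsity-introducing intermediate.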