[jira] [Commented] (SYSTEMML-1140) Sparse/Caching performance bugs related to deep learning scripts

2017-01-31 Thread Mike Dusenberry (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15847738#comment-15847738
 ] 

Mike Dusenberry commented on SYSTEMML-1140:
---

Awesome, thanks [~mboehm7].  I agree with [~niketanpansare] that this will 
greatly help with the breast cancer project as well.

> Sparse/Caching performance bugs related to deep learning scripts
> 
>
> Key: SYSTEMML-1140
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1140
> Project: SystemML
>  Issue Type: Bug
>Affects Versions: SystemML 1.0
>Reporter: Niketan Pansare
>Priority: Blocker
>
> We have identified two performance bugs that frequently occur in deep 
> learning scripts.
> First, we repeatedly perform unnecessary conversions to sparse format, even 
> though operations such as matrix multiplication (including BLAS and cuBLAS) 
> are optimized for the dense case.
>
> Second, even with a large memory budget, we sometimes spend almost 20-30% of 
> the execution time in caching.
> [~mboehm7] [~reinwald] [~mwdus...@us.ibm.com] I am labeling this bug as a 
> blocker for SystemML 1.0. Please feel free to assign this issue to yourself.
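To illustrate the first bug: a sparsity-introducing op like relu typically leaves a matrix only about half zeros, so round-tripping through a sparse format before a matmul adds conversion overhead without a payoff. A minimal NumPy/SciPy sketch of the pattern (not SystemML internals; all names here are illustrative):

```python
# Sketch (not SystemML code): relu leaves a matrix only ~50% sparse,
# so converting it to a sparse format before a matmul adds overhead while
# the dense path goes straight to optimized BLAS.
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
X = rng.standard_normal((256, 128))
W = rng.standard_normal((128, 64))

H = np.maximum(X, 0.0)                     # relu output: roughly half zeros
out_via_sparse = sparse.csr_matrix(H) @ W  # unnecessary dense -> CSR round trip
out_dense = H @ W                          # stays dense, hits BLAS directly

assert np.allclose(out_via_sparse, out_dense)
print(f"relu output sparsity: {1 - np.count_nonzero(H) / H.size:.2f}")
```

Both paths produce the same result; the sparse detour only pays when the matrix is far sparser than a typical relu output.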



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SYSTEMML-1140) Sparse/Caching performance bugs related to deep learning scripts

2017-01-31 Thread Niketan Pansare (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15847691#comment-15847691
 ] 

Niketan Pansare commented on SYSTEMML-1140:
---

Thanks [~mboehm7]. That will also help speed up the breast cancer use case :)



[jira] [Commented] (SYSTEMML-1140) Sparse/Caching performance bugs related to deep learning scripts

2017-01-31 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15847687#comment-15847687
 ] 

Matthias Boehm commented on SYSTEMML-1140:
--

OK, thanks. Since it's at the algorithm level, it will take a while, but I'll 
look into it next week.



[jira] [Commented] (SYSTEMML-1140) Sparse/Caching performance bugs related to deep learning scripts

2017-01-31 Thread Niketan Pansare (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15847664#comment-15847664
 ] 

Niketan Pansare commented on SYSTEMML-1140:
---

Sorry, I forgot to update this JIRA with the series of improvements related to 
this PR:
1. Many CP convolution operators now have sparse support (except im2col). 
However, since cuDNN does not have a sparse equivalent, we only support dense 
convolution on GPU.
2. Fused operators such as relu_maxpooling and relu_backward have been added to 
reduce the conversion overhead of sparsity-introducing operators such as relu. 
In fact, the performance of relu_maxpooling is exactly the same as that of 
maxpooling in CP, making relu a no-op in the fused implementation :)
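For intuition on why relu becomes a no-op when fused with maxpooling: relu is monotone, so it commutes with the max over each pooling window, and the fused operator never has to materialize relu's sparsity-introducing output. A minimal NumPy sketch (the pooling helper is hypothetical, not SystemML code):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def maxpool2x2(x):
    """Naive 2x2 max pooling with stride 2 over an (H, W) array."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.random.default_rng(42).standard_normal((8, 8))

# relu is monotone, so it commutes with the max over each pooling window:
# a fused relu_maxpooling can pool first and apply relu essentially for free.
assert np.allclose(maxpool2x2(relu(x)), relu(maxpool2x2(x)))
```

This identity is why the fused operator's cost matches plain maxpooling.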

[~mboehm7] I used Mike's LeNet script with the MNIST dataset as an example. 
Please see 
https://github.com/apache/incubator-systemml/blob/master/scripts/staging/SystemML-NN/examples/Example%20-%20MNIST%20LeNet.ipynb
 ... Here are the cache statistics from a sample run after adding the 
above-mentioned fused operators (date: Jan 13th, 2017):

Cache hits (Mem, WB, FS, HDFS): 1096424/0/0/2.
Cache writes (WB, FS, HDFS): 603950/15/8.
Cache times (ACQr/m, RLS, EXP): 3.659/0.456/273.799/1.275 sec.

I have seen anywhere between 250 and 500 seconds spent in cache times.

You can also use Mike's Breast Cancer Project as an example workload.



[jira] [Commented] (SYSTEMML-1140) Sparse/Caching performance bugs related to deep learning scripts

2017-01-06 Thread Mike Dusenberry (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15805317#comment-15805317
 ] 

Mike Dusenberry commented on SYSTEMML-1140:
---

Copying from GitHub [PR 329 | 
https://github.com/apache/incubator-systemml/pull/329]:

I think that, given the newer and expanded range of ML workloads now being run 
on SystemML, we should go back through the system and challenge previous 
assumptions, such as the sparsity threshold. We've already started this by 
breaking previous assumptions of tall-skinny matrices, for example. A big issue 
we're seeing now is models that contain several computational transformations 
("layers") rather than just a couple.

Let's pause [...] and focus on [...] a more general approach of reevaluating 
previous assumptions in the system. Sparse support, caching, and native BLAS 
come to mind immediately.
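One concrete assumption worth revisiting is the nonzero-ratio cutoff that decides when a block is stored sparse versus dense. A minimal sketch of such a heuristic (the function name and the 0.4 threshold are illustrative stand-ins, not necessarily SystemML's actual constant):

```python
# Illustrative only: the 0.4 cutoff below is a stand-in for the kind of
# sparsity threshold being questioned, not a confirmed SystemML constant.
def prefers_sparse(nnz: int, rows: int, cols: int, threshold: float = 0.4) -> bool:
    """Choose a sparse representation only when the nonzero ratio is below threshold."""
    return nnz / (rows * cols) < threshold

# A relu output with ~50% nonzeros stays dense under this rule,
# while a 5%-dense matrix would go sparse.
assert not prefers_sparse(nnz=50_000, rows=1000, cols=100)
assert prefers_sparse(nnz=5_000, rows=1000, cols=100)
```

Deep-learning workloads sit right around such cutoffs (relu outputs hover near 50% nonzeros), which is why the threshold deserves reevaluation.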
