[jira] [Commented] (SYSTEMML-1595) Missing Block Sizes For PersistentWrites & TransientWrites

2017-05-09 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16003892#comment-16003892
 ] 

Matthias Boehm commented on SYSTEMML-1595:
--

yes, I already modified the persistent-to-transient write rewrite and I'm 
currently adding some additional tests.

> Missing Block Sizes For PersistentWrites & TransientWrites
> --
>
> Key: SYSTEMML-1595
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1595
> Project: SystemML
>  Issue Type: Bug
>Reporter: Mike Dusenberry
> Attachments: scenario1.dml
>
>
> In the attached script, the resulting PersistentWrites for {{doutc1_agg}} & 
> {{dWc1_agg}} end up having unknown block sizes, despite the input DAGs for 
> those variables having known block sizes.  Due to this, when we use MLContext 
> and mark those variables as outputs, the PersistentWrites will be rewritten 
> to TransientWrites, and the block sizes will remain unknown.
> To run:
> {code}
> spark-submit $SYSTEMML_HOME/target/SystemML.jar -f scenario1.dml -explain 
> recompile_hops
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SYSTEMML-1595) Missing Block Sizes For PersistentWrites & TransientWrites

2017-05-09 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16003809#comment-16003809
 ] 

Matthias Boehm commented on SYSTEMML-1595:
--

As it turns out, the "unknown" block sizes originate from the default format 
"text", which is not a blocked representation. By specifying {{format="binary"}} 
in the write statements, the block sizes are set correctly. 
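For illustration, a minimal DML sketch of that workaround (the variable names follow the attached script; the output paths are just placeholders):

{code}
# with the default format "text" (not blocked), downstream block sizes stay unknown;
# requesting the blocked binary format lets the writes carry proper block sizes
write(doutc1_agg, "doutc1_agg.bin", format="binary");
write(dWc1_agg, "dWc1_agg.bin", format="binary");
{code}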

However, every transient write should have proper block sizes, and we need to 
make sure they are set correctly when rewriting persistent writes to transient 
writes.

> Missing Block Sizes For PersistentWrites & TransientWrites
> --
>
> Key: SYSTEMML-1595
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1595
> Project: SystemML
>  Issue Type: Bug
>Reporter: Mike Dusenberry
> Attachments: scenario1.dml
>
>
> In the attached script, the resulting PersistentWrites for {{doutc1_agg}} & 
> {{dWc1_agg}} end up having unknown block sizes, despite the input DAGs for 
> those variables having known block sizes.  Due to this, when we use MLContext 
> and mark those variables as outputs, the PersistentWrites will be rewritten 
> to TransientWrites, and the block sizes will remain unknown.
> To run:
> {code}
> spark-submit $SYSTEMML_HOME/target/SystemML.jar -f scenario1.dml -explain 
> recompile_hops
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SYSTEMML-1561) Improve constant folding during compilation

2017-05-09 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16003628#comment-16003628
 ] 

Matthias Boehm commented on SYSTEMML-1561:
--

and it's great to see that the recompilation times are still in a reasonable 
range: 5978 DAGs in 3.2s, i.e., roughly 0.5ms per DAG - generally, we try to keep 
the recompilation of an average DAG at around 1ms. 

> Improve constant folding during compilation
> ---
>
> Key: SYSTEMML-1561
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1561
> Project: SystemML
>  Issue Type: Improvement
>Reporter: Mike Dusenberry
>Assignee: Mike Dusenberry
> Fix For: SystemML 1.0
>
> Attachments: scenario1_plan.txt, scenario1.py, scenario2_plan.txt, 
> scenario2.py
>
>
> In our `nn` library, our convolution and pooling layers have to pass around 
> the spatial dimensions (height and width) of the images that are stretched 
> out into rows of the input/output matrices.  These output dimensions are 
> computed within the forward functions of the above layers as small scalar 
> equations.  From a mathematical standpoint, these sizes can be determined at 
> compile time, and it is nice to have these size equations in DML (vs. hiding 
> them inside the engine within built-in functions).  However, we do not 
> currently evaluate these expressions during compilation, and thus we are left 
> with unknown sizes even during recompilation.  This naturally leads to maximum 
> (worst-case) memory estimates and thus often leads to unnecessary distributed runtime ops 
> rather than simple CP ones.
> I have two related scenarios for which this is a problem.  They both involve 
> the {{Houtc1}} & {{Woutc1}} values that are returned from a 
> `conv2d::forward(...)` function.  These represent the spatial dimensions of 
> the volume associated with each of the rows of the output {{outc1}} of the function, and 
> the third dimension is {{F1}}.  Thus, {{outc1}} has a number of columns equal 
> to {{F1*Houtc1*Woutc1}}.
> In the first scenario ({{scenario1.py}}), a random matrix {{doutc1}} is 
> created that should have the same dimensions as {{outc1}}.  For the columns, 
> if I use {{cols=ncol(outc1)}} in this rand statement, the size will be 
> propagated and CP ops will be compiled and run.  If I instead use 
> {{cols=F1*Houtc1*Woutc1}}, the size will forever be unknown, even during 
> recompilation, and thus Spark ops will be compiled and run.  I have included 
> the recompile hops plan ({{scenario1_plan.txt}}).
> In the second scenario ({{scenario2.py}}), a {{max_pool2d::forward(...)}} 
> function is inserted after the {{conv2d::forward(...)}} function that 
> requires the {{Houtc1}} and {{Woutc1}} variables to be supplied as arguments. 
>  Since those latter variables are not executed during compilation time, the 
> max pooling sizes remain unknown, even during recompilation, and thus Spark 
> ops will be compiled and run.  I have included the recompile hops plan 
> ({{scenario2_plan.txt}}).
> We should either improve or fix our constant folding rewrites so that these 
> scenarios are fixed, as they are necessary for performant deep learning 
> applications.  Note too that this issue will be present in other non-deep 
> learning scenarios as well.
> Mailing list thread: 
> https://www.mail-archive.com/dev@systemml.incubator.apache.org/msg01657.html



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SYSTEMML-1561) Improve constant folding during compilation

2017-05-09 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16003618#comment-16003618
 ] 

Matthias Boehm commented on SYSTEMML-1561:
--

that's awesome - just one question: do we understand what reduced the number of 
cache writes to HDFS (export) from 2100 to 8?

> Improve constant folding during compilation
> ---
>
> Key: SYSTEMML-1561
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1561
> Project: SystemML
>  Issue Type: Improvement
>Reporter: Mike Dusenberry
>Assignee: Mike Dusenberry
> Fix For: SystemML 1.0
>
> Attachments: scenario1_plan.txt, scenario1.py, scenario2_plan.txt, 
> scenario2.py
>
>
> In our `nn` library, our convolution and pooling layers have to pass around 
> the spatial dimensions (height and width) of the images that are stretched 
> out into rows of the input/output matrices.  These output dimensions are 
> computed within the forward functions of the above layers as small scalar 
> equations.  From a mathematical standpoint, these sizes can be determined at 
> compile time, and it is nice to have these size equations in DML (vs. hiding 
> them inside the engine within built-in functions).  However, we do not 
> currently evaluate these expressions during compilation, and thus we are left 
> with unknown sizes even during recompilation.  This naturally leads to maximum 
> (worst-case) memory estimates and thus often leads to unnecessary distributed runtime ops 
> rather than simple CP ones.
> I have two related scenarios for which this is a problem.  They both involve 
> the {{Houtc1}} & {{Woutc1}} values that are returned from a 
> `conv2d::forward(...)` function.  These represent the spatial dimensions of 
> the volume associated with each of the rows of the output {{outc1}} of the function, and 
> the third dimension is {{F1}}.  Thus, {{outc1}} has a number of columns equal 
> to {{F1*Houtc1*Woutc1}}.
> In the first scenario ({{scenario1.py}}), a random matrix {{doutc1}} is 
> created that should have the same dimensions as {{outc1}}.  For the columns, 
> if I use {{cols=ncol(outc1)}} in this rand statement, the size will be 
> propagated and CP ops will be compiled and run.  If I instead use 
> {{cols=F1*Houtc1*Woutc1}}, the size will forever be unknown, even during 
> recompilation, and thus Spark ops will be compiled and run.  I have included 
> the recompile hops plan ({{scenario1_plan.txt}}).
> In the second scenario ({{scenario2.py}}), a {{max_pool2d::forward(...)}} 
> function is inserted after the {{conv2d::forward(...)}} function that 
> requires the {{Houtc1}} and {{Woutc1}} variables to be supplied as arguments. 
>  Since those latter variables are not executed during compilation time, the 
> max pooling sizes remain unknown, even during recompilation, and thus Spark 
> ops will be compiled and run.  I have included the recompile hops plan 
> ({{scenario2_plan.txt}}).
> We should either improve or fix our constant folding rewrites so that these 
> scenarios are fixed, as they are necessary for performant deep learning 
> applications.  Note too that this issue will be present in other non-deep 
> learning scenarios as well.
> Mailing list thread: 
> https://www.mail-archive.com/dev@systemml.incubator.apache.org/msg01657.html



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (SYSTEMML-1593) Performance issues rexpand to ultra-sparse matrix

2017-05-08 Thread Matthias Boehm (JIRA)
Matthias Boehm created SYSTEMML-1593:


 Summary: Performance issues rexpand to ultra-sparse matrix
 Key: SYSTEMML-1593
 URL: https://issues.apache.org/jira/browse/SYSTEMML-1593
 Project: SystemML
  Issue Type: Bug
Reporter: Matthias Boehm


For a detailed description see 
https://www.mail-archive.com/dev@systemml.incubator.apache.org/msg01741.html

The issue is caused by (1) wrong input partitioning (a small vector input to a huge 
output leverages only a small degree of parallelism), and (2) an unnecessary 
shuffle. 
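As a self-contained illustration of the scenario (sizes are just examples): expanding a short integer vector into a one-hot matrix via {{table}} is a pattern that typically maps to such a rexpand with an ultra-sparse output.

{code}
# small integer vector, e.g., ids in [1, 10^6]
v = round(rand(rows=100000, cols=1, min=1, max=1000000));
# one-hot / permutation-like expansion into a 1e5 x ~1e6 ultra-sparse matrix
P = table(seq(1, nrow(v)), v);
{code}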



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (SYSTEMML-1591) Improve efficiency sparse-unsafe cellwise operations

2017-05-07 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm closed SYSTEMML-1591.


> Improve efficiency sparse-unsafe cellwise operations
> 
>
> Key: SYSTEMML-1591
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1591
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 1.0
>
>
> For sparse-unsafe cellwise operations, we currently iterate over all cells 
> and use binary search to access the individual values. This is unnecessarily 
> inefficient and should be reworked in favor of a sequential scan with gap 
> handling, which would also allow us to consolidate the different code paths 
> for sparse-safe and -unsafe operations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (SYSTEMML-1591) Improve efficiency sparse-unsafe cellwise operations

2017-05-07 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm resolved SYSTEMML-1591.
--
   Resolution: Done
 Assignee: Matthias Boehm
Fix Version/s: (was: SystemML 0.14)
   SystemML 1.0

> Improve efficiency sparse-unsafe cellwise operations
> 
>
> Key: SYSTEMML-1591
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1591
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 1.0
>
>
> For sparse-unsafe cellwise operations, we currently iterate over all cells 
> and use binary search to access the individual values. This is unnecessarily 
> inefficient and should be reworked in favor of a sequential scan with gap 
> handling, which would also allow us to consolidate the different code paths 
> for sparse-safe and -unsafe operations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (SYSTEMML-1590) Codegen crashes for unsupported row aggregates

2017-05-07 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm resolved SYSTEMML-1590.
--
   Resolution: Fixed
 Assignee: Matthias Boehm
Fix Version/s: SystemML 1.0

> Codegen crashes for unsupported row aggregates
> --
>
> Key: SYSTEMML-1590
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1590
> Project: SystemML
>  Issue Type: Bug
>Affects Versions: SystemML 0.14
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 1.0
>
>
> For row aggregate patterns with unsupported aggregation functions such as 
> {{rowIndexMax(X)}}, codegen currently crashes with exceptions as follows:
> {code}
> Caused by: java.lang.RuntimeException: 8 ua(maxindexR)
>   at 
> org.apache.sysml.hops.codegen.template.TemplateRow.rConstructCplan(TemplateRow.java:300)
>   at 
> org.apache.sysml.hops.codegen.template.TemplateRow.rConstructCplan(TemplateRow.java:157)
>   at 
> org.apache.sysml.hops.codegen.template.TemplateRow.rConstructCplan(TemplateRow.java:157)
>   at 
> org.apache.sysml.hops.codegen.template.TemplateRow.constructCplan(TemplateRow.java:124)
>   at 
> org.apache.sysml.hops.codegen.SpoofCompiler.rConstructCPlans(SpoofCompiler.java:561)
>   at 
> org.apache.sysml.hops.codegen.SpoofCompiler.rConstructCPlans(SpoofCompiler.java:573)
>   at 
> org.apache.sysml.hops.codegen.SpoofCompiler.constructCPlans(SpoofCompiler.java:477)
>   at 
> org.apache.sysml.hops.codegen.SpoofCompiler.optimize(SpoofCompiler.java:346)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (SYSTEMML-1590) Codegen crashes for unsupported row aggregates

2017-05-07 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm closed SYSTEMML-1590.


> Codegen crashes for unsupported row aggregates
> --
>
> Key: SYSTEMML-1590
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1590
> Project: SystemML
>  Issue Type: Bug
>Affects Versions: SystemML 0.14
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 1.0
>
>
> For row aggregate patterns with unsupported aggregation functions such as 
> {{rowIndexMax(X)}}, codegen currently crashes with exceptions as follows:
> {code}
> Caused by: java.lang.RuntimeException: 8 ua(maxindexR)
>   at 
> org.apache.sysml.hops.codegen.template.TemplateRow.rConstructCplan(TemplateRow.java:300)
>   at 
> org.apache.sysml.hops.codegen.template.TemplateRow.rConstructCplan(TemplateRow.java:157)
>   at 
> org.apache.sysml.hops.codegen.template.TemplateRow.rConstructCplan(TemplateRow.java:157)
>   at 
> org.apache.sysml.hops.codegen.template.TemplateRow.constructCplan(TemplateRow.java:124)
>   at 
> org.apache.sysml.hops.codegen.SpoofCompiler.rConstructCPlans(SpoofCompiler.java:561)
>   at 
> org.apache.sysml.hops.codegen.SpoofCompiler.rConstructCPlans(SpoofCompiler.java:573)
>   at 
> org.apache.sysml.hops.codegen.SpoofCompiler.constructCPlans(SpoofCompiler.java:477)
>   at 
> org.apache.sysml.hops.codegen.SpoofCompiler.optimize(SpoofCompiler.java:346)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (SYSTEMML-1592) Improve handling of sparse outputs and sideway inputs

2017-05-07 Thread Matthias Boehm (JIRA)
Matthias Boehm created SYSTEMML-1592:


 Summary: Improve handling of sparse outputs and sideway inputs
 Key: SYSTEMML-1592
 URL: https://issues.apache.org/jira/browse/SYSTEMML-1592
 Project: SystemML
  Issue Type: Sub-task
Reporter: Matthias Boehm






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (SYSTEMML-1591) Improve efficiency sparse-unsafe cellwise operations

2017-05-07 Thread Matthias Boehm (JIRA)
Matthias Boehm created SYSTEMML-1591:


 Summary: Improve efficiency sparse-unsafe cellwise operations
 Key: SYSTEMML-1591
 URL: https://issues.apache.org/jira/browse/SYSTEMML-1591
 Project: SystemML
  Issue Type: Sub-task
Reporter: Matthias Boehm


For sparse-unsafe cellwise operations, we currently iterate over all cells and 
use binary search to access the individual values. This is unnecessarily 
inefficient and should be reworked in favor of a sequential scan with gap 
handling, which would also allow us to consolidate the different code paths for 
sparse-safe and -unsafe operations.
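For context, a cellwise operation is sparse-unsafe when applying it to a zero cell yields a non-zero result, so all cells (not only the stored non-zeros) must be processed; a minimal DML illustration (matrix size and sparsity are arbitrary):

{code}
X = rand(rows=1000, cols=1000, sparsity=0.01);
A = X * 2;   # sparse-safe:   0 * 2 = 0, only non-zero cells matter
B = X + 1;   # sparse-unsafe: 0 + 1 != 0, every cell (incl. zeros) contributes
{code}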



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (SYSTEMML-1590) Codegen crashes for unsupported row aggregates

2017-05-07 Thread Matthias Boehm (JIRA)
Matthias Boehm created SYSTEMML-1590:


 Summary: Codegen crashes for unsupported row aggregates
 Key: SYSTEMML-1590
 URL: https://issues.apache.org/jira/browse/SYSTEMML-1590
 Project: SystemML
  Issue Type: Bug
Affects Versions: SystemML 0.14
Reporter: Matthias Boehm


For row aggregate patterns with unsupported aggregation functions such as 
{{rowIndexMax(X)}}, codegen currently crashes with exceptions as follows:

{code}
Caused by: java.lang.RuntimeException: 8 ua(maxindexR)
at 
org.apache.sysml.hops.codegen.template.TemplateRow.rConstructCplan(TemplateRow.java:300)
at 
org.apache.sysml.hops.codegen.template.TemplateRow.rConstructCplan(TemplateRow.java:157)
at 
org.apache.sysml.hops.codegen.template.TemplateRow.rConstructCplan(TemplateRow.java:157)
at 
org.apache.sysml.hops.codegen.template.TemplateRow.constructCplan(TemplateRow.java:124)
at 
org.apache.sysml.hops.codegen.SpoofCompiler.rConstructCPlans(SpoofCompiler.java:561)
at 
org.apache.sysml.hops.codegen.SpoofCompiler.rConstructCPlans(SpoofCompiler.java:573)
at 
org.apache.sysml.hops.codegen.SpoofCompiler.constructCPlans(SpoofCompiler.java:477)
at 
org.apache.sysml.hops.codegen.SpoofCompiler.optimize(SpoofCompiler.java:346)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SYSTEMML-1527) Use top-level algorithm scripts for application tests

2017-05-07 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16000196#comment-16000196
 ] 

Matthias Boehm commented on SYSTEMML-1527:
--

yes, you could simply (1) delete ./src/test/scripts/applications/glm/GLM.dml, 
(2) modify test.integration.applications.GLMTest to point to 
{{fullDMLScriptName = "scripts/algorithms/GLM.dml"}} if the script type is dml, 
and (3) modify the R script if needed. Other algorithms might also require 
changing the input parameters from positional arguments to named arguments.

I would recommend first handling all dml scripts and disregarding pydml for now. 
Furthermore, you might want to think about a nice abstraction such as 
{{AutomatedTestBase.getScript()}} to ensure consistency across the algorithms. 
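As a small DML illustration of the positional-vs-named argument point (file and variable names are placeholders):

{code}
# old replicated test scripts: positional arguments, passed via -args
X = read($1);
y = read($2);
# top-level algorithm scripts: named arguments, passed via -nvargs X=... Y=...
X = read($X);
y = read($Y);
{code}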

> Use top-level algorithm scripts for application tests
> -
>
> Key: SYSTEMML-1527
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1527
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Test
>Reporter: Matthias Boehm
>
> There are numerous dml and pydml application tests that aim to test our 
> existing algorithms. However, these tests use replicated (and mostly 
> outdated) scripts. This task aims to remove the duplicated dml and pydml 
> scripts and to refer directly to the existing algorithm scripts. This also 
> includes the update of R comparison scripts.
> See SYSTEMML-1363 for examples.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (SYSTEMML-1589) conv2d_bias_add fails w/ NPE on lenet with random data

2017-05-07 Thread Matthias Boehm (JIRA)
Matthias Boehm created SYSTEMML-1589:


 Summary: conv2d_bias_add fails w/ NPE on lenet with random data
 Key: SYSTEMML-1589
 URL: https://issues.apache.org/jira/browse/SYSTEMML-1589
 Project: SystemML
  Issue Type: Bug
Reporter: Matthias Boehm
Assignee: Niketan Pansare


The lenet dml script fails with a null pointer exception for random multi-class 
data, generated with:
{code}
X_full = rand(rows=6,cols=784);
y_full = round(rand(rows=nrow(X_full), cols=1, min=1, max=10));
{code}

The detailed stacktrace is as follows:
{code}
Caused by: java.lang.NullPointerException
at 
org.apache.sysml.runtime.matrix.data.LibMatrixDNN.getRowInDenseFormat(LibMatrixDNN.java:1355)
at 
org.apache.sysml.runtime.matrix.data.LibMatrixDNN.doIm2colSparse(LibMatrixDNN.java:1382)
at 
org.apache.sysml.runtime.matrix.data.LibMatrixDNN.doIm2col(LibMatrixDNN.java:1421)
at 
org.apache.sysml.runtime.matrix.data.LibMatrixDNN.doLoopedIm2ColConv2d(LibMatrixDNN.java:406)
at 
org.apache.sysml.runtime.matrix.data.LibMatrixDNN.access$400(LibMatrixDNN.java:51)
at 
org.apache.sysml.runtime.matrix.data.LibMatrixDNN$ConvTask.call(LibMatrixDNN.java:1143)
at 
org.apache.sysml.runtime.matrix.data.LibMatrixDNN$ConvTask.call(LibMatrixDNN.java:1076)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1588) Perftest: Datagen descriptive statistics fails on parfor result merge

2017-05-07 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1588:
-
Affects Version/s: SystemML 0.14

> Perftest: Datagen descriptive statistics fails on parfor result merge
> -
>
> Key: SYSTEMML-1588
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1588
> Project: SystemML
>  Issue Type: Bug
>Affects Versions: SystemML 0.14
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 1.0
>
>
> The perftest data generation script for descriptive statistics fails for the 
> 80GB scenario with the following stacktrace
> {code}
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: 
> hdfs://:/user/biuser/scratch_space/_p123456789_1.2.34.56/_t16/temp863_830
>   at 
> org.apache.sysml.runtime.controlprogram.parfor.ResultMergeRemoteSpark.executeParallelMerge(ResultMergeRemoteSpark.java:116)
>   at 
> org.apache.sysml.runtime.controlprogram.parfor.ResultMergeRemoteSpark.executeSerialMerge(ResultMergeRemoteSpark.java:71)
>   at 
> org.apache.sysml.runtime.controlprogram.ParForProgramBlock.consolidateAndCheckResults(ParForProgramBlock.java:1756)
>   at 
> org.apache.sysml.runtime.controlprogram.ParForProgramBlock.executeLocalParFor(ParForProgramBlock.java:818)
>   at 
> org.apache.sysml.runtime.controlprogram.ParForProgramBlock.execute(ParForProgramBlock.java:639)
>   ... 14 more
> {code}
> This issue is caused by incorrect export handling of inputs which only works 
> for existing inputs on hdfs or dirty in-memory objects but not for pending 
> rdd operations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (SYSTEMML-1588) Perftest: Datagen descriptive statistics fails on parfor result merge

2017-05-07 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm reassigned SYSTEMML-1588:


Assignee: Matthias Boehm

> Perftest: Datagen descriptive statistics fails on parfor result merge
> -
>
> Key: SYSTEMML-1588
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1588
> Project: SystemML
>  Issue Type: Bug
>Affects Versions: SystemML 0.14
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 1.0
>
>
> The perftest data generation script for descriptive statistics fails for the 
> 80GB scenario with the following stacktrace
> {code}
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: 
> hdfs://:/user/biuser/scratch_space/_p123456789_1.2.34.56/_t16/temp863_830
>   at 
> org.apache.sysml.runtime.controlprogram.parfor.ResultMergeRemoteSpark.executeParallelMerge(ResultMergeRemoteSpark.java:116)
>   at 
> org.apache.sysml.runtime.controlprogram.parfor.ResultMergeRemoteSpark.executeSerialMerge(ResultMergeRemoteSpark.java:71)
>   at 
> org.apache.sysml.runtime.controlprogram.ParForProgramBlock.consolidateAndCheckResults(ParForProgramBlock.java:1756)
>   at 
> org.apache.sysml.runtime.controlprogram.ParForProgramBlock.executeLocalParFor(ParForProgramBlock.java:818)
>   at 
> org.apache.sysml.runtime.controlprogram.ParForProgramBlock.execute(ParForProgramBlock.java:639)
>   ... 14 more
> {code}
> This issue is caused by incorrect export handling of inputs which only works 
> for existing inputs on hdfs or dirty in-memory objects but not for pending 
> rdd operations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (SYSTEMML-1588) Perftest: Datagen descriptive statistics fails on parfor result merge

2017-05-07 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm closed SYSTEMML-1588.


> Perftest: Datagen descriptive statistics fails on parfor result merge
> -
>
> Key: SYSTEMML-1588
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1588
> Project: SystemML
>  Issue Type: Bug
>Affects Versions: SystemML 0.14
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 1.0
>
>
> The perftest data generation script for descriptive statistics fails for the 
> 80GB scenario with the following stacktrace
> {code}
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: 
> hdfs://:/user/biuser/scratch_space/_p123456789_1.2.34.56/_t16/temp863_830
>   at 
> org.apache.sysml.runtime.controlprogram.parfor.ResultMergeRemoteSpark.executeParallelMerge(ResultMergeRemoteSpark.java:116)
>   at 
> org.apache.sysml.runtime.controlprogram.parfor.ResultMergeRemoteSpark.executeSerialMerge(ResultMergeRemoteSpark.java:71)
>   at 
> org.apache.sysml.runtime.controlprogram.ParForProgramBlock.consolidateAndCheckResults(ParForProgramBlock.java:1756)
>   at 
> org.apache.sysml.runtime.controlprogram.ParForProgramBlock.executeLocalParFor(ParForProgramBlock.java:818)
>   at 
> org.apache.sysml.runtime.controlprogram.ParForProgramBlock.execute(ParForProgramBlock.java:639)
>   ... 14 more
> {code}
> This issue is caused by incorrect export handling of inputs which only works 
> for existing inputs on hdfs or dirty in-memory objects but not for pending 
> rdd operations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (SYSTEMML-1588) Perftest: Datagen descriptive statistics fails on parfor result merge

2017-05-07 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm resolved SYSTEMML-1588.
--
   Resolution: Fixed
Fix Version/s: SystemML 1.0

> Perftest: Datagen descriptive statistics fails on parfor result merge
> -
>
> Key: SYSTEMML-1588
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1588
> Project: SystemML
>  Issue Type: Bug
>Affects Versions: SystemML 0.14
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 1.0
>
>
> The perftest data generation script for descriptive statistics fails for the 
> 80GB scenario with the following stacktrace
> {code}
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: 
> hdfs://:/user/biuser/scratch_space/_p123456789_1.2.34.56/_t16/temp863_830
>   at 
> org.apache.sysml.runtime.controlprogram.parfor.ResultMergeRemoteSpark.executeParallelMerge(ResultMergeRemoteSpark.java:116)
>   at 
> org.apache.sysml.runtime.controlprogram.parfor.ResultMergeRemoteSpark.executeSerialMerge(ResultMergeRemoteSpark.java:71)
>   at 
> org.apache.sysml.runtime.controlprogram.ParForProgramBlock.consolidateAndCheckResults(ParForProgramBlock.java:1756)
>   at 
> org.apache.sysml.runtime.controlprogram.ParForProgramBlock.executeLocalParFor(ParForProgramBlock.java:818)
>   at 
> org.apache.sysml.runtime.controlprogram.ParForProgramBlock.execute(ParForProgramBlock.java:639)
>   ... 14 more
> {code}
> This issue is caused by incorrect export handling of inputs which only works 
> for existing inputs on hdfs or dirty in-memory objects but not for pending 
> rdd operations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SYSTEMML-1588) Perftest: Datagen descriptive statistics fails on parfor result merge

2017-05-06 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999702#comment-15999702
 ] 

Matthias Boehm commented on SYSTEMML-1588:
--

cc [~nakul02]

> Perftest: Datagen descriptive statistics fails on parfor result merge
> -
>
> Key: SYSTEMML-1588
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1588
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>
> The perftest data generation script for descriptive statistics fails for the 
> 80GB scenario with the following stacktrace
> {code}
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: 
> hdfs://:/user/biuser/scratch_space/_p123456789_1.2.34.56/_t16/temp863_830
>   at 
> org.apache.sysml.runtime.controlprogram.parfor.ResultMergeRemoteSpark.executeParallelMerge(ResultMergeRemoteSpark.java:116)
>   at 
> org.apache.sysml.runtime.controlprogram.parfor.ResultMergeRemoteSpark.executeSerialMerge(ResultMergeRemoteSpark.java:71)
>   at 
> org.apache.sysml.runtime.controlprogram.ParForProgramBlock.consolidateAndCheckResults(ParForProgramBlock.java:1756)
>   at 
> org.apache.sysml.runtime.controlprogram.ParForProgramBlock.executeLocalParFor(ParForProgramBlock.java:818)
>   at 
> org.apache.sysml.runtime.controlprogram.ParForProgramBlock.execute(ParForProgramBlock.java:639)
>   ... 14 more
> {code}
> This issue is caused by incorrect export handling of inputs which only works 
> for existing inputs on hdfs or dirty in-memory objects but not for pending 
> rdd operations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1588) Perftest: Datagen descriptive statistics fails on parfor result merge

2017-05-06 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1588:
-
Description: 
The perftest data generation script for descriptive statistics fails for the 
80GB scenario with the following stacktrace

{code}
Caused by: org.apache.sysml.runtime.DMLRuntimeException: 
org.apache.sysml.runtime.DMLRuntimeException: 
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: 
hdfs://:/user/biuser/scratch_space/_p123456789_1.2.34.56/_t16/temp863_830
at 
org.apache.sysml.runtime.controlprogram.parfor.ResultMergeRemoteSpark.executeParallelMerge(ResultMergeRemoteSpark.java:116)
at 
org.apache.sysml.runtime.controlprogram.parfor.ResultMergeRemoteSpark.executeSerialMerge(ResultMergeRemoteSpark.java:71)
at 
org.apache.sysml.runtime.controlprogram.ParForProgramBlock.consolidateAndCheckResults(ParForProgramBlock.java:1756)
at 
org.apache.sysml.runtime.controlprogram.ParForProgramBlock.executeLocalParFor(ParForProgramBlock.java:818)
at 
org.apache.sysml.runtime.controlprogram.ParForProgramBlock.execute(ParForProgramBlock.java:639)
... 14 more
{code}

This issue is caused by incorrect export handling of inputs, which only works 
for existing inputs on HDFS or dirty in-memory objects but not for pending RDD 
operations.

  was:
The perftest data generation script for descriptive statistics fails for the 
80GB scenario with the following stacktrace

{code}
Caused by: org.apache.sysml.runtime.DMLRuntimeException: 
org.apache.sysml.runtime.DMLRuntimeException: 
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: 
hdfs://:/user/biuser/scratch_space/_p123456789_1.2.34.56/_t16/temp863_830
at 
org.apache.sysml.runtime.controlprogram.parfor.ResultMergeRemoteSpark.executeParallelMerge(ResultMergeRemoteSpark.java:116)
at 
org.apache.sysml.runtime.controlprogram.parfor.ResultMergeRemoteSpark.executeSerialMerge(ResultMergeRemoteSpark.java:71)
at 
org.apache.sysml.runtime.controlprogram.ParForProgramBlock.consolidateAndCheckResults(ParForProgramBlock.java:1756)
at 
org.apache.sysml.runtime.controlprogram.ParForProgramBlock.executeLocalParFor(ParForProgramBlock.java:818)
at 
org.apache.sysml.runtime.controlprogram.ParForProgramBlock.execute(ParForProgramBlock.java:639)
... 14 more
{code}


> Perftest: Datagen descriptive statistics fails on parfor result merge
> -
>
> Key: SYSTEMML-1588
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1588
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>
> The perftest data generation script for descriptive statistics fails for the 
> 80GB scenario with the following stacktrace
> {code}
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: 
> hdfs://:/user/biuser/scratch_space/_p123456789_1.2.34.56/_t16/temp863_830
>   at 
> org.apache.sysml.runtime.controlprogram.parfor.ResultMergeRemoteSpark.executeParallelMerge(ResultMergeRemoteSpark.java:116)
>   at 
> org.apache.sysml.runtime.controlprogram.parfor.ResultMergeRemoteSpark.executeSerialMerge(ResultMergeRemoteSpark.java:71)
>   at 
> org.apache.sysml.runtime.controlprogram.ParForProgramBlock.consolidateAndCheckResults(ParForProgramBlock.java:1756)
>   at 
> org.apache.sysml.runtime.controlprogram.ParForProgramBlock.executeLocalParFor(ParForProgramBlock.java:818)
>   at 
> org.apache.sysml.runtime.controlprogram.ParForProgramBlock.execute(ParForProgramBlock.java:639)
>   ... 14 more
> {code}
> This issue is caused by incorrect export handling of inputs which only works 
> for existing inputs on hdfs or dirty in-memory objects but not for pending 
> rdd operations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (SYSTEMML-1588) Perftest: Datagen descriptive statistics fails on parfor result merge

2017-05-06 Thread Matthias Boehm (JIRA)
Matthias Boehm created SYSTEMML-1588:


 Summary: Perftest: Datagen descriptive statistics fails on parfor 
result merge
 Key: SYSTEMML-1588
 URL: https://issues.apache.org/jira/browse/SYSTEMML-1588
 Project: SystemML
  Issue Type: Bug
Reporter: Matthias Boehm


The perftest data generation script for descriptive statistics fails for the 
80GB scenario with the following stacktrace

{code}
Caused by: org.apache.sysml.runtime.DMLRuntimeException: 
org.apache.sysml.runtime.DMLRuntimeException: 
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: 
hdfs://:/user/biuser/scratch_space/_p123456789_1.2.34.56/_t16/temp863_830
at 
org.apache.sysml.runtime.controlprogram.parfor.ResultMergeRemoteSpark.executeParallelMerge(ResultMergeRemoteSpark.java:116)
at 
org.apache.sysml.runtime.controlprogram.parfor.ResultMergeRemoteSpark.executeSerialMerge(ResultMergeRemoteSpark.java:71)
at 
org.apache.sysml.runtime.controlprogram.ParForProgramBlock.consolidateAndCheckResults(ParForProgramBlock.java:1756)
at 
org.apache.sysml.runtime.controlprogram.ParForProgramBlock.executeLocalParFor(ParForProgramBlock.java:818)
at 
org.apache.sysml.runtime.controlprogram.ParForProgramBlock.execute(ParForProgramBlock.java:639)
... 14 more
{code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (SYSTEMML-1395) Wrong parfor spark partition output info w/ unknown sparsity

2017-05-06 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm closed SYSTEMML-1395.


> Wrong parfor spark partition output info w/ unknown sparsity
> 
>
> Key: SYSTEMML-1395
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1395
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (SYSTEMML-1395) Wrong parfor spark partition output info w/ unknown sparsity

2017-05-06 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm resolved SYSTEMML-1395.
--
   Resolution: Fixed
 Assignee: Matthias Boehm
Fix Version/s: SystemML 0.14

> Wrong parfor spark partition output info w/ unknown sparsity
> 
>
> Key: SYSTEMML-1395
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1395
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (SYSTEMML-1587) Performance ultra-sparse matrix reads

2017-05-06 Thread Matthias Boehm (JIRA)
Matthias Boehm created SYSTEMML-1587:


 Summary: Performance ultra-sparse matrix reads
 Key: SYSTEMML-1587
 URL: https://issues.apache.org/jira/browse/SYSTEMML-1587
 Project: SystemML
  Issue Type: Task
Reporter: Matthias Boehm


We use the MCSR (modified compressed sparse row) format by default for sparse 
and ultra-sparse matrices because it allows for efficient incremental 
construction, including multi-threaded operations. However, even with 
SYSTEMML-1548, the MCSR format is still too inefficient in its memory consumption, 
leading to unnecessary garbage collection overhead. 

This task aims to read ultra-sparse matrices (e.g., permutation matrices) into 
CSR format. Since CSR does not allow for efficient incremental construction 
(with multiple unordered input streams), the approach is to use thread-local 
COO representations and finally merge them into a CSR representation. The 
temporary memory requirements are not problematic because size(CSR) + size(COO) 
< size(MCSR) for ultra-sparse matrices, and the COO representation can be 
partitioned across threads.
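As a rough, illustrative back-of-envelope (assuming typical JVM overheads, not exact numbers): for an n x n permutation matrix with one non-zero per row, MCSR keeps a separate sparse-row object with its own index and value arrays per row, i.e., several object/array headers plus 12 bytes of payload, easily 60+ bytes per row; in contrast, CSR needs about 16 bytes per non-zero plus a 4-byte row pointer, and a COO triple of two 4-byte indexes and an 8-byte value is about 16 bytes per non-zero, so size(CSR) + size(COO) stays well below size(MCSR) in this regime.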

Note that this change should be done in a consistent manner for all matrix 
readers (single-threaded/multi-threaded, all formats).




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (SYSTEMML-1561) Improve constant folding during compilation

2017-05-03 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996123#comment-15996123
 ] 

Matthias Boehm edited comment on SYSTEMML-1561 at 5/4/17 3:36 AM:
--

sounds great - a second chance would be useful for many other scenarios too. 
The 2x runtime improvement is a bit surprising though because very similar 
rewrites would be performed during dynamic recompilation (except constant 
folding, which is covered by size expression evaluation over sub dags of scalar 
operations with symbol table inputs) and dynamic recompilation itself was not 
the bottleneck. I would be very interested to know where this is coming from - 
maybe some cascade of other rewrites or a fused operator? You can set 
{{ProgramRewriter.LDEBUG = true}} to see the applied simplification rewrites 
along with line numbers where they originate from. 

For your PR, if you want to ensure that future compiler modifications preserve 
this behavior, please add a test into {{functions.recompile}} or 
{{functions.misc}}, similar to other size-dependent rewrites - the easiest way 
is to construct a case, where without size propagation we would compile/execute 
distributed operations and simply compare the number of compiled/executed Spark 
instructions with expected values.


was (Author: mboehm7):
sounds great - a second chance would be useful for many other scenarios too. 
The 2x runtime improvement is a bit surprising though because very similar 
rewrites would be performed during dynamic recompilation (except constant 
folding, which is covered by size expression over sub dags of scalar operations 
with symbol table inputs) and dynamic recompilation itself was not the 
bottleneck. I would be very interested to know where this is coming from - maybe 
some cascade of other rewrites or a fused operator? You can set 
{{ProgramRewriter.LDEBUG = true}} to see the applied simplification rewrites 
along with line numbers where they originate from. 

For your PR, if you want to ensure that future compiler modifications preserve 
this behavior, please add a test into {{functions.recompile}} or 
{{functions.misc}}, similar to other size-dependent rewrites - the easiest way 
is to construct a case, where without size propagation we would compile/execute 
distributed operations and simply compare the number of compiled/executed Spark 
instructions with expected values.

> Improve constant folding during compilation
> ---
>
> Key: SYSTEMML-1561
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1561
> Project: SystemML
>  Issue Type: Improvement
>Reporter: Mike Dusenberry
> Fix For: SystemML 1.0
>
> Attachments: scenario1_plan.txt, scenario1.py, scenario2_plan.txt, 
> scenario2.py
>
>
> In our `nn` library, our convolution and pooling layers have to pass around 
> the spatial dimensions (height and width) of the images that are stretched 
> out into rows of the input/output matrices.  These output dimensions are 
> computed within the forward functions of the above layers as small scalar 
> equations.  From a mathematical standpoint, these sizes can be determined at 
> compile time, and it is nice to have these size equations in DML (vs. hiding 
> them inside the engine within built-in functions).  However, we do not 
> currently evaluate these expressions during compilation, and thus we are left 
> with unknown sizes even during recompilation.  This naturally leads to maximum 
> (worst-case) memory estimates and thus often leads to unnecessary distributed runtime ops 
> rather than simple CP ones.
> I have two related scenarios for which this is a problem.  They both involve 
> the {{Houtc1}} & {{Woutc1}} values that are returned from a 
> `conv2d::forward(...)` function.  These represent the spatial dimensions of 
> the volume associated with each of the rows of the output {{outc1}} of the function, and 
> the third dimension is {{F1}}.  Thus, {{outc1}} has a number of columns equal 
> to {{F1*Houtc1*Woutc1}}.
> In the first scenario ({{scenario1.py}}), a random matrix {{doutc1}} is 
> created that should have the same dimensions as {{outc1}}.  For the columns, 
> if I use {{cols=ncol(outc1)}} in this rand statement, the size will be 
> propagated and CP ops will be compiled and run.  If I instead use 
> {{cols=F1*Houtc1*Woutc1}}, the size will forever be unknown, even during 
> recompilation, and thus Spark ops will be compiled and run.  I have included 
> the recompile hops plan ({{scenario1_plan.txt}}).
> In the second scenario ({{scenario2.py}}), a {{max_pool2d::forward(...)}} 
> function is inserted after the {{conv2d::forward(...)}} function that 
> requires the {{Houtc1}} and {{Woutc1}} variables to be supplied as arguments. 
>  Since those latter variables are not executed during compilation time, the 
> max pooling sizes remain unknown, even during recompilation, and thus Spark 
> ops will be compiled and run.

[jira] [Commented] (SYSTEMML-1561) Improve constant folding during compilation

2017-05-03 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996123#comment-15996123
 ] 

Matthias Boehm commented on SYSTEMML-1561:
--

sounds great - a second chance would be useful for many other scenarios too. 
The 2x runtime improvement is a bit surprising though because very similar 
rewrites would be performed during dynamic recompilation (except constant 
folding, which is covered by size expression over sub dags of scalar operations 
with symbol table inputs) and dynamic recompilation itself was not the 
bottleneck. I would be very interested to know where this is coming from - maybe 
some cascade of other rewrites or a fused operator? You can set 
{{ProgramRewriter.LDEBUG = true}} to see the applied simplification rewrites 
along with line numbers where they originate from. 

For your PR, if you want to ensure that future compiler modifications preserve 
this behavior, please add a test into {{functions.recompile}} or 
{{functions.misc}}, similar to other size-dependent rewrites - the easiest way 
is to construct a case, where without size propagation we would compile/execute 
distributed operations and simply compare the number of compiled/executed Spark 
instructions with expected values.
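A minimal DML sketch of such a test case, mirroring scenario 1 from the issue description (function and variable names are illustrative):

{code}
dims = function(int C) return (int F1, int Hout, int Wout) {
  F1 = 32; Hout = C; Wout = C;
}
[F1, Houtc1, Woutc1] = dims(28);
outc1 = rand(rows=64, cols=F1*Houtc1*Woutc1);
# cols known from the hop graph -> CP ops downstream
doutc1 = rand(rows=nrow(outc1), cols=ncol(outc1));
# cols derived from function-returned scalars -> unknown without scalar
# propagation / constant folding, hence Spark ops during (re)compilation
doutc2 = rand(rows=nrow(outc1), cols=F1*Houtc1*Woutc1);
{code}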

> Improve constant folding during compilation
> ---
>
> Key: SYSTEMML-1561
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1561
> Project: SystemML
>  Issue Type: Improvement
>Reporter: Mike Dusenberry
> Fix For: SystemML 1.0
>
> Attachments: scenario1_plan.txt, scenario1.py, scenario2_plan.txt, 
> scenario2.py
>
>
> In our `nn` library, our convolution and pooling layers have to pass around 
> the spatial dimensions (height and width) of the images that are stretched 
> out into rows of the input/output matrices.  These output dimensions are 
> computed within the forward functions of the above layers as small scalar 
> equations.  From a mathematical standpoint, these sizes can be determined at 
> compile time, and it is nice to have these size equations in DML (vs. hiding 
> them inside the engine within built-in functions).  However, we do not 
> currently evaluate these expressions during compilation, and thus we are left 
> with unknown sizes even during recompilation.  This naturally leads to maximum 
> (worst-case) memory estimates and thus often leads to unnecessary distributed runtime ops 
> rather than simple CP ones.
> I have two related scenarios for which this is a problem.  They both involve 
> the {{Houtc1}} & {{Woutc1}} values that are returned from a 
> `conv2d::forward(...)` function.  These represent the spatial dimensions of 
> the volume associated with each of the rows of the output {{outc1}} of the function, and 
> the third dimension is {{F1}}.  Thus, {{outc1}} has a number of columns equal 
> to {{F1*Houtc1*Woutc1}}.
> In the first scenario ({{scenario1.py}}), a random matrix {{doutc1}} is 
> created that should have the same dimensions as {{outc1}}.  For the columns, 
> if I use {{cols=ncol(outc1)}} in this rand statement, the size will be 
> propagated and CP ops will be compiled and run.  If I instead use 
> {{cols=F1*Houtc1*Woutc1}}, the size will forever be unknown, even during 
> recompilation, and thus Spark ops will be compiled and run.  I have included 
> the recompile hops plan ({{scenario1_plan.txt}}).
> In the second scenario ({{scenario2.py}}), a {{max_pool2d::forward(...)}} 
> function is inserted after the {{conv2d::forward(...)}} function that 
> requires the {{Houtc1}} and {{Woutc1}} variables to be supplied as arguments. 
>  Since those latter variables are not executed during compilation time, the 
> max pooling sizes remain unknown, even during recompilation, and thus Spark 
> ops will be compiled and run.  I have included the recompile hops plan 
> ({{scenario2_plan.txt}}).
> We should either improve or fix our constant folding rewrites so that these 
> scenarios are fixed, as they are necessary for performant deep learning 
> applications.  Note too that this issue will be present in other non-deep 
> learning scenarios as well.
> Mailing list thread: 
> https://www.mail-archive.com/dev@systemml.incubator.apache.org/msg01657.html



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (SYSTEMML-1574) Missing serialized caching for ultra sparse matrices

2017-05-03 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm closed SYSTEMML-1574.


> Missing serialized caching for ultra sparse matrices
> 
>
> Key: SYSTEMML-1574
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1574
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 1.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (SYSTEMML-1574) Missing serialized caching for ultra sparse matrices

2017-05-03 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm resolved SYSTEMML-1574.
--
   Resolution: Fixed
 Assignee: Matthias Boehm
Fix Version/s: SystemML 1.0

> Missing serialized caching for ultra sparse matrices
> 
>
> Key: SYSTEMML-1574
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1574
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 1.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (SYSTEMML-1574) Missing serialized caching for ultra sparse matrices

2017-05-03 Thread Matthias Boehm (JIRA)
Matthias Boehm created SYSTEMML-1574:


 Summary: Missing serialized caching for ultra sparse matrices
 Key: SYSTEMML-1574
 URL: https://issues.apache.org/jira/browse/SYSTEMML-1574
 Project: SystemML
  Issue Type: Bug
Reporter: Matthias Boehm






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SYSTEMML-1567) Remove conditionals from nn layers

2017-04-30 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15990537#comment-15990537
 ] 

Matthias Boehm commented on SYSTEMML-1567:
--

btw, this sel+ operator has been rendered obsolete and should be removed 
(SYSTEMML-1570)

> Remove conditionals from nn layers
> --
>
> Key: SYSTEMML-1567
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1567
> Project: SystemML
>  Issue Type: Improvement
>  Components: APIs
>Affects Versions: SystemML 1.0
>Reporter: Niketan Pansare
>
> Conditionals in nn layers introduce transient read/write variables that 
> disable fused operators such as CP relu_maxpooling_backward and hence lead to 
> redundant execution of the sparsity-introducing sel+ operator. This operator causes 
> unnecessary dense-to-sparse-to-dense conversion and becomes the heavy hitter 
> after the native BLAS change. Note: some fused operators such as CP 
> relu_maxpooling are still applied because there is no conditional in between 
> those layers.
> Without conditionals in dropout layer: 
> https://github.com/apache/incubator-systemml/blob/master/scripts/nn/layers/dropout.dml#L49-L53
>  
> {code}
> Iter:2000.0, training loss:0.003149394810197065, training accuracy:100.0
> Iter:2000.0, validation loss:191.9888157354513, validation accuracy:96.875
> SystemML Statistics:
> Total elapsed time: 416.609 sec.
> Total compilation time: 0.000 sec.
> Total execution time:   416.609 sec.
> Number of compiled Spark inst:  69.
> Number of executed Spark inst:  2.
> Native mkl calls (LibMatrixMult/LibMatrixDNN):  4270/10553.
> Cache hits (Mem, WB, FS, HDFS): 277973/0/0/0.
> Cache writes (WB, FS, HDFS): 143616/0/0.
> Cache times (ACQr/m, RLS, EXP): 0.101/0.080/1.988/0.000 sec.
> HOP DAGs recompiled (PRED, SB): 0/2277.
> HOP DAGs recompile time: 6.146 sec.
> Spark ctx create time (lazy):   0.027 sec.
> Spark trans counts (par,bc,col): 0/0/0.
> Spark trans times (par,bc,col): 0.000/0.000/0.000 secs.
> Total JIT compile time: 37.746 sec.
> Total JVM GC count: 3949.
> Total JVM GC time:  56.609 sec.
> Heavy hitter instructions (name, time, count):
> -- 1)   conv2d_bias_add 48.984 sec  4514
> -- 2)   conv2d_backward_filter  47.780 sec  4026
> -- 3)   -*  38.246 sec  16104
> -- 4)   +*  35.902 sec  8052
> -- 5)   +   34.227 sec  30566
> -- 6)   ba+*  30.643 sec  12566
> -- 7)   relu_maxpooling_backward  29.678 sec  4026
> -- 8)   conv2d_backward_data  28.520 sec  2013
> -- 9)   *   26.825 sec  35275
> -- 10)  relu_backward   24.842 sec  6039
> {code}
> With conditional, we add sel+ to the heavy hitter:
> {code}
> -- 1)   sel+  55.054 sec  6283
> {code}
> [~mwdus...@us.ibm.com] Since you created the layers, I think you should 
> decide how best to restructure the DML. My recommendation would be to create 
> two layers in case of conditionals.
> [~mboehm7] [~reinwald]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (SYSTEMML-1570) Remove fused sel+ operator

2017-04-30 Thread Matthias Boehm (JIRA)
Matthias Boehm created SYSTEMML-1570:


 Summary: Remove fused sel+ operator
 Key: SYSTEMML-1570
 URL: https://issues.apache.org/jira/browse/SYSTEMML-1570
 Project: SystemML
  Issue Type: Task
Reporter: Matthias Boehm
 Fix For: SystemML 1.0


The fused operator sel+ (select positive values) is applied for patterns like 
(X>0)*X and max(X,0) in order to eliminate unnecessary intermediates. It stems 
from a time when max was sparse-unsafe and hence inefficient over sparse data. 
However, we meanwhile mark scalar operators as conditionally sparse-safe 
depending on the given scalar constant c, which applies for max if c<=0. Hence, 
this sel+ operator is now completely obsolete and should be removed.

Furthermore, we should also generalize the related rewrite to cover the selection 
of negative values, i.e., rewrite (X<0)*X to min(X,0).
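For reference, the patterns in question as plain DML (X is an arbitrary matrix):

{code}
Y1 = (X > 0) * X;   # selection of positive values, currently fused into sel+
Y2 = max(X, 0);     # equivalent, and sparse-safe since max(0, 0) = 0
Z1 = (X < 0) * X;   # proposed generalization: selection of negative values
Z2 = min(X, 0);     # equivalent rewrite target
{code}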



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SYSTEMML-1567) Remove conditionals from nn layers

2017-04-30 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15990491#comment-15990491
 ] 

Matthias Boehm commented on SYSTEMML-1567:
--

I'd recommend closing this issue and rather using it to improve the compiler, 
because this branch should be removed automatically once we propagate scalars 
into functions (related to SYSTEMML-1554 and SYSTEMML-1561) and once we do a 
subsequent second pass of function inlining.

> Remove conditionals from nn layers
> --
>
> Key: SYSTEMML-1567
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1567
> Project: SystemML
>  Issue Type: Improvement
>  Components: APIs
>Affects Versions: SystemML 1.0
>Reporter: Niketan Pansare
>
> Conditionals in nn layers introduce transient read/write variables that 
> disable fused operators such as CP relu_maxpooling_backward and hence lead to 
> redundant execution of the sparsity-introducing sel+ operator. This operator causes 
> unnecessary dense-to-sparse-to-dense conversion and becomes the heavy hitter 
> after the native BLAS change. Note: some fused operators such as CP 
> relu_maxpooling are still applied because there is no conditional in between 
> those layers.
> Without conditionals in dropout layer: 
> https://github.com/apache/incubator-systemml/blob/master/scripts/nn/layers/dropout.dml#L49-L53
>  
> {code}
> Iter:2000.0, training loss:0.003149394810197065, training accuracy:100.0
> Iter:2000.0, validation loss:191.9888157354513, validation accuracy:96.875
> SystemML Statistics:
> Total elapsed time: 416.609 sec.
> Total compilation time: 0.000 sec.
> Total execution time:   416.609 sec.
> Number of compiled Spark inst:  69.
> Number of executed Spark inst:  2.
> Native mkl calls (LibMatrixMult/LibMatrixDNN):  4270/10553.
> Cache hits (Mem, WB, FS, HDFS): 277973/0/0/0.
> Cache writes (WB, FS, HDFS): 143616/0/0.
> Cache times (ACQr/m, RLS, EXP): 0.101/0.080/1.988/0.000 sec.
> HOP DAGs recompiled (PRED, SB): 0/2277.
> HOP DAGs recompile time:6.146 sec.
> Spark ctx create time (lazy):   0.027 sec.
> Spark trans counts (par,bc,col):0/0/0.
> Spark trans times (par,bc,col): 0.000/0.000/0.000 secs.
> Total JIT compile time: 37.746 sec.
> Total JVM GC count: 3949.
> Total JVM GC time:  56.609 sec.
> Heavy hitter instructions (name, time, count):
> -- 1)   conv2d_bias_add 48.984 sec  4514
> -- 2)   conv2d_backward_filter  47.780 sec  4026
> -- 3)   -*  38.246 sec  16104
> -- 4)   +*  35.902 sec  8052
> -- 5)   +   34.227 sec  30566
> -- 6)   ba+*30.643 sec  12566
> -- 7)   relu_maxpooling_backward29.678 sec  4026
> -- 8)   conv2d_backward_data28.520 sec  2013
> -- 9)   *   26.825 sec  35275
> -- 10)  relu_backward   24.842 sec  6039
> {code}
> With conditional, we add sel+ to the heavy hitter:
> {code}
> -- 1)   sel+55.054 sec  6283
> {code}
> [~mwdus...@us.ibm.com] Since you created the layers, I think you should 
> decide how best to restructure the DML. My recommendation would be to create 
> two layers in case of conditionals.
> [~mboehm7] [~reinwald]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SYSTEMML-1561) Improve constant folding during compilation

2017-04-27 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988262#comment-15988262
 ] 

Matthias Boehm commented on SYSTEMML-1561:
--

so my guess is that this issue right here will be resolved once we rework 
scalar propagation into functions and across the entire program.

> Improve constant folding during compilation
> ---
>
> Key: SYSTEMML-1561
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1561
> Project: SystemML
>  Issue Type: Improvement
>Reporter: Mike Dusenberry
> Attachments: scenario1_plan.txt, scenario1.py, scenario2_plan.txt, 
> scenario2.py
>
>
> In our `nn` library, our convolution and pooling layers have to pass around 
> the spatial dimensions (height and width) of the images that are stretched 
> out into rows of the input/output matrices.  These output dimensions are 
> computed within the forward functions of the above layers as small scalar 
> equations.  From a mathematical standpoint, these sizes can be determined at 
> compile time, and it is nice to have these size equations in DML (v.s. hiding 
> them inside the engine within built-in functions).  However, we do not 
> currently evaluate these expressions during compilation, and thus we are left 
> with unknown sizes even during recompilation.  This naturally leads to max 
> memory estimates and thus often leads to unnecessary distributed runtime ops 
> rather than simple CP ones.
> I have two related scenarios for which this is a problem.  They both involve 
> the {{Houtc1}} & {{Woutc1}} values that are returned from a 
> `conv2d::forward(...)` function.  These represent the spatial dimensions of 
> the volume with each of the rows of the output {{outc1}} of the function, and 
> the third dimension is {{F1}}.  Thus, {{outc1}} has a number of columns equal 
> to {{F1*Houtc1*Woutc1}}.
> In the first scenario ({{scenario1.py}}), a random matrix {{doutc1}} is 
> created that should have the same dimensions as {{outc1}}.  For the columns, 
> if I use {{cols=ncol(outc1)}} in this rand statement, the size will be 
> propagated and CP ops will be compiled and run (see the sketch after this 
> description).  If I instead use 
> {{cols=F1*Houtc1*Woutc1}}, the size will forever be unknown, even during 
> recompilation, and thus Spark ops will be compiled and run.  I have included 
> the recompile hops plan ({{scenario1_plan.txt}}).
> In the second scenario ({{scenario2.py}}), a {{max_pool2d::forward(...)}} 
> function is inserted after the {{conv2d::forward(...)}} function that 
> requires the {{Houtc1}} and {{Woutc1}} variables to be supplied as arguments. 
>  Since those latter variables are not executed during compilation time, the 
> max pooling sizes remain unknown, even during recompilation, and thus Spark 
> ops will be compiled and run.  I have included the recompile hops plan 
> ({{scenario2_plan.txt}}).
> We should either improve or fix our constant folding rewrites so that these 
> scenarios are fixed, as they are necessary for performant deep learning 
> applications.  Note too that this issue will be present in other non-deep 
> learning scenarios as well.
> Mailing list thread: 
> https://www.mail-archive.com/dev@systemml.incubator.apache.org/msg01657.html
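
For reference, a minimal DML sketch of the two rand variants from the first 
scenario; the placeholder scalars are hypothetical and, in the real script, 
F1, Houtc1, and Woutc1 are scalar outputs of conv2d::forward (which is what 
keeps them unknown during compilation):

{code}
# hypothetical placeholder sizes
N = 64; F1 = 32; Houtc1 = 28; Woutc1 = 28;
outc1 = matrix(0, rows=N, cols=F1*Houtc1*Woutc1);

# size propagates: the column count is tied directly to outc1
doutc1 = rand(rows=N, cols=ncol(outc1));

# size stays unknown when the scalars stem from a function call and are not
# folded/propagated during (re)compilation
doutc1 = rand(rows=N, cols=F1*Houtc1*Woutc1);
{code}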



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SYSTEMML-1561) Improve constant folding during compilation

2017-04-27 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988258#comment-15988258
 ] 

Matthias Boehm commented on SYSTEMML-1561:
--

ok, just to clarify: even the recompile explain output does not show the 
worst-case size estimates and computed size expressions, as they are only 
transiently inferred and used for memory estimates. For example, given a right 
indexing B = A[x:y, z] with unknown scalars x, y, and z, we would still use a 
worst-case estimate of nrow(A) x 1 for B. Anyway, if you've seen that the 
extended scalar propagation solves it, then that's fine, but it likely only 
works because this forward function is inlined.
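
A self-contained DML sketch of the right-indexing case mentioned above 
(hypothetical data; x and y are made runtime-dependent so they stay unknown 
during compilation):

{code}
A = rand(rows=1000, cols=10, min=0, max=1);
x = as.integer(as.scalar(rand(rows=1, cols=1, min=1, max=500)));
y = x + 99;
z = 3;

# with x and y unknown at compile time, the recompiler still uses a
# worst-case size estimate of nrow(A) x 1 for B
B = A[x:y, z];
print(nrow(B));
{code}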

> Improve constant folding during compilation
> ---
>
> Key: SYSTEMML-1561
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1561
> Project: SystemML
>  Issue Type: Improvement
>Reporter: Mike Dusenberry
> Attachments: scenario1_plan.txt, scenario1.py, scenario2_plan.txt, 
> scenario2.py
>
>
> In our `nn` library, our convolution and pooling layers have to pass around 
> the spatial dimensions (height and width) of the images that are stretched 
> out into rows of the input/output matrices.  These output dimensions are 
> computed within the forward functions of the above layers as small scalar 
> equations.  From a mathematical standpoint, these sizes can be determined at 
> compile time, and it is nice to have these size equations in DML (v.s. hiding 
> them inside the engine within built-in functions).  However, we do not 
> currently evaluate these expressions during compilation, and thus we are left 
> with unknown sizes even during recompilation.  This naturally leads to max 
> memory estimates and thus often leads to unnecessary distributed runtime ops 
> rather than simple CP ones.
> I have two related scenarios for which this is a problem.  They both involve 
> the {{Houtc1}} & {{Woutc1}} values that are returned from a 
> `conv2d::forward(...)` function.  These represent the spatial dimensions of 
> the volume with each of the rows of the output {{outc1}} of the function, and 
> the third dimension is {{F1}}.  Thus, {{outc1}} has a number of columns equal 
> to {{F1*Houtc1*Woutc1}}.
> In the first scenario ({{scenario1.py}}), a random matrix {{doutc1}} is 
> created that should have the same dimensions as {{outc1}}.  For the columns, 
> if I use {{cols=ncol(outc1)}} in this rand statement, the size will be 
> propagated and CP ops will be compiled and run.  If I instead use 
> {{cols=F1*Houtc1*Woutc1}}, the size will forever be unknown, even during 
> recompilation, and thus Spark ops will be compiled and run.  I have included 
> the recompile hops plan ({{scenario1_plan.txt}}).
> In the second scenario ({{scenario2.py}}), a {{max_pool2d::forward(...)}} 
> function is inserted after the {{conv2d::forward(...)}} function that 
> requires the {{Houtc1}} and {{Woutc1}} variables to be supplied as arguments. 
>  Since those latter variables are not executed during compilation time, the 
> max pooling sizes remain unknown, even during recompilation, and thus Spark 
> ops will be compiled and run.  I have included the recompile hops plan 
> ({{scenario2_plan.txt}}).
> We should either improve or fix our constant folding rewrites so that these 
> scenarios are fixed, as they are necessary for performant deep learning 
> applications.  Note too that this issue will be present in other non-deep 
> learning scenarios as well.
> Mailing list thread: 
> https://www.mail-archive.com/dev@systemml.incubator.apache.org/msg01657.html



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SYSTEMML-1561) Improve constant folding during compilation

2017-04-27 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15986181#comment-15986181
 ] 

Matthias Boehm commented on SYSTEMML-1561:
--

sorry I don't have a lot of free cycles right now - but I could look into it at 
the end of next week. [~niketanpansare] it would be good if you could have a 
detailed look.

Generally, I think this is probably just a misunderstanding. We perform 
constant folding during initial compilation but not during dynamic 
recompilation, and this scenario seems (without a closer look) to require the 
latter. During dynamic recompilation, we compute worst-case size estimates 
including the evaluation of scalar subtrees, but these estimates are simply not 
exposed in our explain output. However, of course there might be issues. 


> Improve constant folding during compilation
> ---
>
> Key: SYSTEMML-1561
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1561
> Project: SystemML
>  Issue Type: Improvement
>Reporter: Mike Dusenberry
> Attachments: scenario1_plan.txt, scenario1.py, scenario2_plan.txt, 
> scenario2.py
>
>
> In our `nn` library, our convolution and pooling layers have to pass around 
> the spatial dimensions (height and width) of the images that are stretched 
> out into rows of the input/output matrices.  These output dimensions are 
> computed within the forward functions of the above layers as small scalar 
> equations.  From a mathematical standpoint, these sizes can be determined at 
> compile time, and it is nice to have these size equations in DML (v.s. hiding 
> them inside the engine within built-in functions).  However, we do not 
> currently evaluate these expressions during compilation, and thus we are left 
> with unknown sizes even during recompilation.  This naturally leads to max 
> memory estimates and thus often leads to unnecessary distributed runtime ops 
> rather than simple CP ones.
> I have two related scenarios for which this is a problem.  They both involve 
> the {{Houtc1}} & {{Woutc1}} values that are returned from a 
> `conv2d::forward(...)` function.  These represent the spatial dimensions of 
> the volume with each of the rows of the output {{outc1}} of the function, and 
> the third dimension is {{F1}}.  Thus, {{outc1}} has a number of columns equal 
> to {{F1*Houtc1*Woutc1}}.
> In the first scenario ({{scenario1.py}}), a random matrix {{doutc1}} is 
> created that should have the same dimensions as {{outc1}}.  For the columns, 
> if I use {{cols=ncol(outc1)}} in this rand statement, the size will be 
> propagated and CP ops will be compiled and run.  If I instead use 
> {{cols=F1*Houtc1*Woutc1}}, the size will forever be unknown, even during 
> recompilation, and thus Spark ops will be compiled and run.  I have included 
> the recompile hops plan ({{scenario1_plan.txt}}).
> In the second scenario ({{scenario2.py}}), a {{max_pool2d::forward(...)}} 
> function is inserted after the {{conv2d::forward(...)}} function that 
> requires the {{Houtc1}} and {{Woutc1}} variables to be supplied as arguments. 
>  Since those latter variables are not executed during compilation time, the 
> max pooling sizes remain unknown, even during recompilation, and thus Spark 
> ops will be compiled and run.  I have included the recompile hops plan 
> ({{scenario2_plan.txt}}).
> We should either improve or fix our constant folding rewrites so that these 
> scenarios are fixed, as they are necessary for performant deep learning 
> applications.  Note too that this issue will be present in other non-deep 
> learning scenarios as well.
> Mailing list thread: 
> https://www.mail-archive.com/dev@systemml.incubator.apache.org/msg01657.html



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SYSTEMML-1466) Update `convnet.dml` to use distributed SGD.

2017-04-25 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15983880#comment-15983880
 ] 

Matthias Boehm commented on SYSTEMML-1466:
--

just to summarize: the particular issue here was caused by missing status 
handling for repeated rdd exports. It's now fixed in master along with some 
related mlcontext tests.

> Update `convnet.dml` to use distributed SGD.
> 
>
> Key: SYSTEMML-1466
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1466
> Project: SystemML
>  Issue Type: Sub-task
>Reporter: Mike Dusenberry
>Assignee: Mike Dusenberry
> Attachments: convnet_distrib_sgd.dml, run_convnet_distrib_sgd.py, 
> run_convnet_distrib_sgd-stats.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (SYSTEMML-1466) Update `convnet.dml` to use distributed SGD.

2017-04-25 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15983880#comment-15983880
 ] 

Matthias Boehm edited comment on SYSTEMML-1466 at 4/26/17 12:22 AM:


just to summarize: the particular issue here was caused by missing status 
handling for repeated rdd exports. It's now fixed in master along with some 
related mlcontext tests.


was (Author: mboehm7):
just to summarize: the particular issue here was caused by missing status 
handling for repeated rdd exports. It's now fixed in master along with same 
related mlcontext tests.

> Update `convnet.dml` to use distributed SGD.
> 
>
> Key: SYSTEMML-1466
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1466
> Project: SystemML
>  Issue Type: Sub-task
>Reporter: Mike Dusenberry
>Assignee: Mike Dusenberry
> Attachments: convnet_distrib_sgd.dml, run_convnet_distrib_sgd.py, 
> run_convnet_distrib_sgd-stats.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SYSTEMML-1466) Update `convnet.dml` to use distributed SGD.

2017-04-25 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15983580#comment-15983580
 ] 

Matthias Boehm commented on SYSTEMML-1466:
--

https://www.mail-archive.com/dev@systemml.incubator.apache.org/msg01665.html - 
let's collect all the information first; right now I only have a guess.

> Update `convnet.dml` to use distributed SGD.
> 
>
> Key: SYSTEMML-1466
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1466
> Project: SystemML
>  Issue Type: Sub-task
>Reporter: Mike Dusenberry
>Assignee: Mike Dusenberry
> Attachments: convnet_distrib_sgd.dml, run_convnet_distrib_sgd.py, 
> run_convnet_distrib_sgd-stats.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SYSTEMML-1466) Update `convnet.dml` to use distributed SGD.

2017-04-25 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15983432#comment-15983432
 ] 

Matthias Boehm commented on SYSTEMML-1466:
--

ok, after a closer look I found two special cases that might lead to the 
described issue, and I'll push a fix soon. In addition, however, I think there 
might be a general issue with the cleanup semantics of MLContext - to rule out 
this issue, you can remove /tmp/systemml by hand if the patch does not solve 
the problem.

> Update `convnet.dml` to use distributed SGD.
> 
>
> Key: SYSTEMML-1466
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1466
> Project: SystemML
>  Issue Type: Sub-task
>Reporter: Mike Dusenberry
>Assignee: Mike Dusenberry
> Attachments: convnet_distrib_sgd.dml, run_convnet_distrib_sgd.py, 
> run_convnet_distrib_sgd-stats.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (SYSTEMML-1560) Cache-conscious compressed tsmm operations

2017-04-24 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm closed SYSTEMML-1560.


> Cache-conscious compressed tsmm operations
> --
>
> Key: SYSTEMML-1560
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1560
> Project: SystemML
>  Issue Type: Task
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 1.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (SYSTEMML-1560) Cache-conscious compressed tsmm operations

2017-04-24 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm resolved SYSTEMML-1560.
--
   Resolution: Done
 Assignee: Matthias Boehm
Fix Version/s: SystemML 1.0

> Cache-conscious compressed tsmm operations
> --
>
> Key: SYSTEMML-1560
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1560
> Project: SystemML
>  Issue Type: Task
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 1.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (SYSTEMML-1560) Cache-conscious compressed tsmm operations

2017-04-24 Thread Matthias Boehm (JIRA)
Matthias Boehm created SYSTEMML-1560:


 Summary: Cache-conscious compressed tsmm operations
 Key: SYSTEMML-1560
 URL: https://issues.apache.org/jira/browse/SYSTEMML-1560
 Project: SystemML
  Issue Type: Task
Reporter: Matthias Boehm






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (SYSTEMML-1558) Order statistics over compressed matrix blocks

2017-04-24 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm reassigned SYSTEMML-1558:


Assignee: Matthias Boehm

> Order statistics over compressed matrix blocks
> --
>
> Key: SYSTEMML-1558
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1558
> Project: SystemML
>  Issue Type: Task
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (SYSTEMML-1558) Order statistics over compressed matrix blocks

2017-04-24 Thread Matthias Boehm (JIRA)
Matthias Boehm created SYSTEMML-1558:


 Summary: Order statistics over compressed matrix blocks
 Key: SYSTEMML-1558
 URL: https://issues.apache.org/jira/browse/SYSTEMML-1558
 Project: SystemML
  Issue Type: Task
Reporter: Matthias Boehm






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SYSTEMML-1556) CLI argument parsing issues w/ escaped quotes

2017-04-23 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980776#comment-15980776
 ] 

Matthias Boehm commented on SYSTEMML-1556:
--

thanks [~nakul02] - I already fixed it for the failing tests but are there any 
plans to fix the general issue?

> CLI argument parsing issues w/ escaped quotes
> -
>
> Key: SYSTEMML-1556
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1556
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>
> Our previous parsing of command line arguments was able to handle escaped 
> quotes such as {{Log=\" \"}} whereas the new CLI parsing cuts it off as 
> {{LOG="}} leading to test failures of the codegen algorithm tests on windows.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SYSTEMML-1556) CLI argument parsing issues w/ escaped quotes

2017-04-23 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980312#comment-15980312
 ] 

Matthias Boehm commented on SYSTEMML-1556:
--

cc [~nakul02] [~deron] any ideas?

> CLI argument parsing issues w/ escaped quotes
> -
>
> Key: SYSTEMML-1556
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1556
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>
> Our previous parsing of command line arguments was able to handle escaped 
> quotes such as {{Log=\" \"}} whereas the new CLI parsing cuts it off as 
> {{LOG="}} leading to test failures of the codegen algorithm tests on windows.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1556) CLI argument parsing issues w/ escaped quotes

2017-04-23 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1556:
-
Description: Our previous parsing of command line arguments was able to 
handle escaped quotes such as {{Log=\" \"}} whereas the new CLI parsing cuts it 
off as {{LOG="}} leading to test failures of the codegen algorithm tests on 
windows.  (was: Our previous parsing of command line arguments was able to 
handle escaped quotes such as {{Log=\" \"}} whereas the new CLI parsing cuts it 
off as {{LOG="}} leading to test failures of the codegen experiments on 
windows.)

> CLI argument parsing issues w/ escaped quotes
> -
>
> Key: SYSTEMML-1556
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1556
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>
> Our previous parsing of command line arguments was able to handle escaped 
> quotes such as {{Log=\" \"}} whereas the new CLI parsing cuts it off as 
> {{LOG="}} leading to test failures of the codegen algorithm tests on windows.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (SYSTEMML-1556) CLI argument parsing issues w/ escaped quotes

2017-04-23 Thread Matthias Boehm (JIRA)
Matthias Boehm created SYSTEMML-1556:


 Summary: CLI argument parsing issues w/ escaped quotes
 Key: SYSTEMML-1556
 URL: https://issues.apache.org/jira/browse/SYSTEMML-1556
 Project: SystemML
  Issue Type: Bug
Reporter: Matthias Boehm


Our previous parsing of command line arguments was able to handle escaped 
quotes such as {{Log=\" \"}} whereas the new CLI parsing cuts it off as 
{{LOG="}} leading to test failures of the codegen experiments on windows.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1555) Decouple literal replacement from in-place recompilation

2017-04-22 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1555:
-
Description: 
The current literal replacement framework contains basic scalar literal 
replacement as well as the replacement of small matrix operations with their 
literal results. If this framework is invoked with temporary matrix objects 
created during size propagation, any matrix operation would obviously fail. So 
far, this created no problems because literal replacement was tied to 
recompilations that are not in-place, i.e., recompilations that create a deep 
copy of the hop dag, which in turn only happens for single-dag recompilations.

This task aims to decouple the literal replacement from in-place recompilations 
in order to increase the literal replacement potential and allow for a more 
flexible use of this literal replacement framework.

  was:
The current literal replacement framework contains basic scalar literal 
replacement as well as the replacement of small matrix operations with their 
literal results. If this framework is invoked with temporary matrix objects 
created during size propagation any matrix operation would obviously fail. So 
far, this created no problems because literal replacement was tied to 
recompilation that are not in-place, i.e., recompilations that create a deep 
copy of the hop dag, which in turn only happens for single-dag recompilations.

This task aims to decouple the literal replacement from in-place recompilations 
in order to increase the literal replacement potential and allow for a more 
flexible use of this literal replacement framework.


> Decouple literal replacement from in-place recompilation
> 
>
> Key: SYSTEMML-1555
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1555
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler
>Reporter: Matthias Boehm
> Fix For: SystemML 1.0
>
>
> The current literal replacement framework contains basic scalar literal 
> replacement as well as the replacement of small matrix operations with their 
> literal results. If this framework is invoked with temporary matrix objects 
> created during size propagation, any matrix operation would obviously fail. So 
> far, this created no problems because literal replacement was tied to 
> recompilations that are not in-place, i.e., recompilations that create a deep 
> copy of the hop dag, which in turn only happens for single-dag recompilations.
> This task aims to decouple the literal replacement from in-place 
> recompilations in order to increase the literal replacement potential and 
> allow for a more flexible use of this literal replacement framework.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (SYSTEMML-1555) Decouple literal replacement from in-place recompilation

2017-04-22 Thread Matthias Boehm (JIRA)
Matthias Boehm created SYSTEMML-1555:


 Summary: Decouple literal replacement from in-place recompilation
 Key: SYSTEMML-1555
 URL: https://issues.apache.org/jira/browse/SYSTEMML-1555
 Project: SystemML
  Issue Type: Sub-task
Reporter: Matthias Boehm


The current literal replacement framework contains basic scalar literal 
replacement as well as the replacement of small matrix operations with their 
literal results. If this framework is invoked with temporary matrix objects 
created during size propagation, any matrix operation would obviously fail. So 
far, this created no problems because literal replacement was tied to 
recompilation that are not in-place, i.e., recompilations that create a deep 
copy of the hop dag, which in turn only happens for single-dag recompilations.

This task aims to decouple the literal replacement from in-place recompilations 
in order to increase the literal replacement potential and allow for a more 
flexible use of this literal replacement framework.
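
For illustration, a small DML example of the kinds of replacements covered by 
the framework (hypothetical values):

{code}
n = 1000;
X = rand(rows=n, cols=n, min=0, max=1);

# scalar literal replacement: transient reads of n can be replaced by the
# literal 1000, making the size of Y known during recompilation
Y = matrix(0, rows=n, cols=1);

# replacement of a small matrix operation by its literal result, e.g., the
# scalar cast of a 1x1 right indexing
c = as.scalar(X[1, 1]);
print(c);
{code}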



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (SYSTEMML-1548) Performance ultra-sparse matrix read

2017-04-22 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm resolved SYSTEMML-1548.
--
Resolution: Done

> Performance ultra-sparse matrix read
> 
>
> Key: SYSTEMML-1548
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1548
> Project: SystemML
>  Issue Type: Task
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 1.0
>
>
> Reading ultra-sparse matrices shows poor performance for certain data sizes 
> and memory configurations due to garbage collection overheads.
> In detail, this task covers two scenarios that will be addressed 
> independently:
> 1) Large heap: In case of large heaps, the problem is temporarily 
> deserialized sparse blocks which are not reused due to inefficient reset, 
> leading to lots of garbage and hence high cost for full garbage collection. 
> This will be addressed by using our CSR sparse blocks for ultra-sparse blocks 
> because CSR has a smaller memory footprint and allows for efficient reset.
> 2) Small heap: In case of a small heap, not the temporary blocks but the 
> memory overhead of the target sparse matrix becomes the bottleneck. This is 
> due to a relatively large memory overhead per sparse row which is not 
> amortized if a row has just one or very few non-zeros. This will be addressed 
> via a modification of the MCSR representation for ultra-sparse matrices. Note 
> that we cannot use CSR or COO here because we want to support efficient 
> multi-threaded incremental construction and subsequent operations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (SYSTEMML-1548) Performance ultra-sparse matrix read

2017-04-22 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm closed SYSTEMML-1548.


> Performance ultra-sparse matrix read
> 
>
> Key: SYSTEMML-1548
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1548
> Project: SystemML
>  Issue Type: Task
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 1.0
>
>
> Reading ultra-sparse matrices shows poor performance for certain data sizes 
> and memory configurations due to garbage collection overheads.
> In detail, this task covers two scenarios that will be addressed 
> independently:
> 1) Large heap: In case of large heaps, the problem is temporarily 
> deserialized sparse blocks which are not reused due to inefficient reset, 
> leading to lots of garbage and hence high cost for full garbage collection. 
> This will be addressed by using our CSR sparse blocks for ultra-sparse blocks 
> because CSR has a smaller memory footprint and allows for efficient reset.
> 2) Small heap: In case of a small heap, not the temporary blocks but the 
> memory overhead of the target sparse matrix becomes the bottleneck. This is 
> due to a relatively large memory overhead per sparse row which is not 
> amortized if a row has just one or very few non-zeros. This will be addressed 
> via a modification of the MCSR representation for ultra-sparse matrices. Note 
> that we cannot use CSR or COO here because we want to support efficient 
> multi-threaded incremental construction and subsequent operations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SYSTEMML-1466) Update `convnet.dml` to use distributed SGD.

2017-04-21 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15979766#comment-15979766
 ] 

Matthias Boehm commented on SYSTEMML-1466:
--

well, this looks as if the matrix object metadata is set incorrectly in 
mlcontext or wherever this intermediate is created. The export only writes out 
dirty in-memory matrices or results of pending rdd operations, both of which 
are guaranteed not to exist on hdfs. I can have a look early next week once I'm 
done with a couple of other issues.

> Update `convnet.dml` to use distributed SGD.
> 
>
> Key: SYSTEMML-1466
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1466
> Project: SystemML
>  Issue Type: Sub-task
>Reporter: Mike Dusenberry
>Assignee: Mike Dusenberry
> Attachments: convnet_distrib_sgd.dml, run_convnet_distrib_sgd.py, 
> run_convnet_distrib_sgd-stats.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SYSTEMML-1554) IPA Scalar Transient Read Replacement

2017-04-21 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15979623#comment-15979623
 ] 

Matthias Boehm commented on SYSTEMML-1554:
--

thanks [~mwdus...@us.ibm.com] for bringing this up - I wanted to do this for a 
long time (SYSTEMML-427) but until now it never got high priority. 

> IPA Scalar Transient Read Replacement
> -
>
> Key: SYSTEMML-1554
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1554
> Project: SystemML
>  Issue Type: Improvement
>Reporter: Mike Dusenberry
> Attachments: convnet_distrib_sgd.dml, parfor_oom_convnet_plan.txt, 
> parfor_oom_convnet.py, parfor_oom_plan.txt, parfor_oom.py
>
>
> Currently, during IPA we collect all variables (scalars & matrices) eligible 
> for propagation across blocks (i.e. not updated in block), and then propagate 
> only the matrix sizes across the blocks.  It seems plausible that we 
> could also replace all eligible scalar transient reads with literals based on 
> the variables that have already been collected.  The benefit is that many ops 
> will be able to determine their respective output sizes during regular 
> compilation, instead of having to wait until dynamic recompilation, and thus 
> we can reduce the pressure on dynamic recompilation.
> Are there drawbacks to this approach?  The use case is that I was seeing a 
> large number of memory warnings while training a convolutional net due to the 
> sizes being unknown during regular compilation, yet the engine only having CP 
> versions of the ops.  Additionally, I was running into actual heap space OOM 
> errors for situations that should not run out of memory, and thus I started 
> exploring.
> I've attached an example script and the explain plan (hops & runtime) w/ and 
> w/o the IPA scalar replacement.
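
A minimal DML sketch of the idea (hypothetical sizes and loop bound):

{code}
F1 = 32;       # scalar, not updated in the subsequent block
epochs = 10;

for (i in 1:epochs) {
  # if the transient read of F1 is replaced by the literal 32 during IPA,
  # the output size of W is already known during initial compilation
  W = rand(rows=F1, cols=F1*9, min=0, max=1);
}
{code}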



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (SYSTEMML-1551) Multi-thread rowwise rexpand

2017-04-21 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm resolved SYSTEMML-1551.
--
Resolution: Done
  Assignee: Matthias Boehm

> Multi-thread rowwise rexpand
> 
>
> Key: SYSTEMML-1551
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1551
> Project: SystemML
>  Issue Type: Sub-task
>  Components: APIs, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 1.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (SYSTEMML-1550) Missing worst-cased estimates on valid dimension/size checks

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm resolved SYSTEMML-1550.
--
   Resolution: Fixed
 Assignee: Matthias Boehm
Fix Version/s: SystemML 1.0

> Missing worst-cased estimates on valid dimension/size checks
> 
>
> Key: SYSTEMML-1550
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1550
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 1.0
>
>
> In addition to our traditional memory estimates (based on exact or worst-case 
> dimensions and sparsity), operator selection also performs a check for valid 
> integer dimensions and maximum dense size. However, this dimension/size check 
> does not take worst-case estimates into account. Therefore, we currently 
> compile unnecessary distributed operations for operations that produce 
> ultra-sparse outputs without exactly known sparsity such as rexpand. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (SYSTEMML-1550) Missing worst-cased estimates on valid dimension/size checks

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm closed SYSTEMML-1550.


> Missing worst-cased estimates on valid dimension/size checks
> 
>
> Key: SYSTEMML-1550
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1550
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 1.0
>
>
> In addition to our traditional memory estimates (based on exact or worst-case 
> dimensions and sparsity), operator selection also performs a check for valid 
> integer dimensions and maximum dense size. However, this dimension/size check 
> does not take worst-case estimates into account. Therefore, we currently 
> compile unnecessary distributed operations for operations that produce 
> ultra-sparse outputs without exactly known sparsity such as rexpand. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (SYSTEMML-1548) Performance ultra-sparse matrix read

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm reassigned SYSTEMML-1548:


Assignee: Matthias Boehm

> Performance ultra-sparse matrix read
> 
>
> Key: SYSTEMML-1548
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1548
> Project: SystemML
>  Issue Type: Task
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 1.0
>
>
> Reading ultra-sparse matrices shows poor performance for certain data sizes 
> and memory configurations due to garbage collection overheads.
> In detail, this task covers two scenarios that will be addressed 
> independently:
> 1) Large heap: In case of large heaps, the problem is temporarily 
> deserialized sparse blocks which are not reused due to inefficient reset, 
> leading to lots of garbage and hence high cost for full garbage collection. 
> This will be addressed by using our CSR sparse blocks for ultra-sparse blocks 
> because CSR has a smaller memory footprint and allows for efficient reset.
> 2) Small heap: In case of a small heap, not the temporary blocks but the 
> memory overhead of the target sparse matrix becomes the bottleneck. This is 
> due to a relatively large memory overhead per sparse row which is not 
> amortized if a row has just one or very few non-zeros. This will be addressed 
> via a modification of the MCSR representation for ultra-sparse matrices. Note 
> that we cannot use CSR or COO here because we want to support efficient 
> multi-threaded incremental construction and subsequent operations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1548) Performance ultra-sparse matrix read

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1548:
-
Fix Version/s: SystemML 1.0

> Performance ultra-sparse matrix read
> 
>
> Key: SYSTEMML-1548
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1548
> Project: SystemML
>  Issue Type: Task
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 1.0
>
>
> Reading ultra-sparse matrices shows poor performance for certain data sizes 
> and memory configurations due to garbage collection overheads.
> In detail, this task covers two scenarios that will be addressed 
> independently:
> 1) Large heap: In case of large heaps, the problem is temporarily 
> deserialized sparse blocks which are not reused due to inefficient reset, 
> leading to lots of garbage and hence high cost for full garbage collection. 
> This will be addressed by using our CSR sparse blocks for ultra-sparse blocks 
> because CSR has a smaller memory footprint and allows for efficient reset.
> 2) Small heap: In case of a small heap, not the temporary blocks but the 
> memory overhead of the target sparse matrix becomes the bottleneck. This is 
> due to a relatively large memory overhead per sparse row which is not 
> amortized if a row has just one or very few non-zeros. This will be addressed 
> via a modification of the MCSR representation for ultra-sparse matrices. Note 
> that we cannot use CSR or COO here because we want to support efficient 
> multi-threaded incremental construction and subsequent operations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (SYSTEMML-1551) Multi-thread rowwise rexpand

2017-04-20 Thread Matthias Boehm (JIRA)
Matthias Boehm created SYSTEMML-1551:


 Summary: Multi-thread rowwise rexpand
 Key: SYSTEMML-1551
 URL: https://issues.apache.org/jira/browse/SYSTEMML-1551
 Project: SystemML
  Issue Type: Sub-task
Reporter: Matthias Boehm






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1391) Drop java 6 and 7 support

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1391:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Drop java 6 and 7 support
> -
>
> Key: SYSTEMML-1391
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1391
> Project: SystemML
>  Issue Type: Sub-task
>  Components: APIs, Compiler, Runtime
>Reporter: Matthias Boehm
>Assignee: Deron Eriksson
> Fix For: SystemML 0.14
>
>
> This task drops Java 6 and 7 support by changing the build to Java 8 
> compliance level, which allows us to modernize the code base going forward.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1302) Removal parfor perftesttool and cost estimator

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1302:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Removal parfor perftesttool and cost estimator
> --
>
> Key: SYSTEMML-1302
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1302
> Project: SystemML
>  Issue Type: Sub-task
>  Components: APIs, Compiler, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1430) Robust memory handling broadcasts

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1430:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Robust memory handling broadcasts
> -
>
> Key: SYSTEMML-1430
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1430
> Project: SystemML
>  Issue Type: Sub-task
>  Components: APIs, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>
> This task aims to make the memory handling of re-used and temporary broadcast 
> variables more robust in order to avoid unnecessary OOMs. 
> 1) Explicitly destroy temporary broadcast variables because we would 
> otherwise never clean them up and Spark's ContextCleaner seems to be a best 
> effort daemon that triggers every 30min or on garbage collection (which might 
> be too late if a large object is allocated)
> 2) Keep track of currently softly reachable (re-used) broadcasts in order to 
> take their size into account when deciding on guarded collect.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1390) Avoid unnecessary caching of parfor spark datapartition-execute input

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1390:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Avoid unnecessary caching of parfor spark datapartition-execute input
> -
>
> Key: SYSTEMML-1390
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1390
> Project: SystemML
>  Issue Type: Sub-task
>  Components: APIs, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>
> This task aims to avoid unnecessary input caching for parfor spark 
> datapartition-execute jobs (with grouping) in order to reduce the memory 
> pressure and thus garbage collection overhead during shuffle and subsequent 
> execution. We only apply this for the general case with grouping and if the 
> input is a persisted rdd which has not been cached yet.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1429) Handle selected replication factor in parfor spark data partitioning

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1429:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Handle selected replication factor in parfor spark data partitioning
> 
>
> Key: SYSTEMML-1429
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1429
> Project: SystemML
>  Issue Type: Sub-task
>  Components: APIs, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1404) Robust ultra-sparse matrix binary block conversion

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1404:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Robust ultra-sparse matrix binary block conversion
> --
>
> Key: SYSTEMML-1404
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1404
> Project: SystemML
>  Issue Type: Sub-task
>  Components: APIs, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1350) Performance parfor spark datapartition-execute

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1350:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Performance parfor spark datapartition-execute
> --
>
> Key: SYSTEMML-1350
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1350
> Project: SystemML
>  Issue Type: Sub-task
>  Components: APIs, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>
> Our fused parfor spark datapartition-execute job - as used for large 
> scenarios of univariate statistics - exhibits some unnecessary runtime 
> overheads. In detail, the potential improvements include:
> 1) Incremental nnz maintenance on partition collect
> 2) Reuse of dense partitions per task (avoid reallocation)
> 3) Explicitly control the number of output partitions (avoid OOMs, reduce 
> memory pressure)
> 4) Avoid unnecessary rdd export on parfor data partitioning
> The points (3) and (4) also apply to the parfor spark datapartition job.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1317) Cleanup unnecessary deep copies on rdd aggregation/merge

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1317:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Cleanup unnecessary deep copies on rdd aggregation/merge
> 
>
> Key: SYSTEMML-1317
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1317
> Project: SystemML
>  Issue Type: Sub-task
>  Components: APIs, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1367) Parfor datapartition-execute w/o shuffling for special cases

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1367:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Parfor datapartition-execute w/o shuffling for special cases
> 
>
> Key: SYSTEMML-1367
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1367
> Project: SystemML
>  Issue Type: Sub-task
>  Components: APIs, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1378) Native dataset support in parfor spark datapartition-execute

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1378:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Native dataset support in parfor spark datapartition-execute
> 
>
> Key: SYSTEMML-1378
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1378
> Project: SystemML
>  Issue Type: Sub-task
>  Components: APIs, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>
> This task aims for a deeper integration of Spark Datasets into SystemML. 
> Consider the following example scenario, invoked through MLContext with X 
> being a DataSet:
> {code}
> X = read(...)
> parfor( i in 1:nrow(X) ) {
> Xi = X[i, ]
> v[i, 1] = ... some computation over Xi
> }
> {code}
> Currently, we would convert the input dataset to binary block (1st shuffle) 
> at API level and subsequently pass it into SystemML. For large data, we would 
> then compile a single parfor data-partition execute job that slices row 
> fragments, collects row fragments into partitions (2nd shuffle), and finally 
> executes the parfor body per partition. 
> Native dataset support would allow us to avoid these two shuffles and compute 
> the entire parfor in a data-local manner. In detail, this involves the 
> following extensions:
> * API level: Keep lineage of input dataset leveraging our existing lineage 
> mechanism in {{MatrixObject}}
> * Parfor datapartition-execute: SYSTEMML-1367 already introduced the 
> data-local processing for special cases (if ncol<=blocksize). Given the 
> lineage, we can simply probe the input to datapartition-execute and, for row 
> partitioning, directly use the dataset instead of the reblocked matrix rdd in 
> a data-local manner. This does not just avoid the 2nd shuffle but due to lazy 
> evaluation also the 1st shuffle if no operation other than parfor accesses X 
> (except zipwithindex if no ids are passed in, as this transformation triggers 
> computation)
> * Cleanup: Prevent cleanup (unpersist) of lineage objects of type dataset as 
> they are passed from outside.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1315) Robust memory handling rdd operations w/ large outputs

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1315:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Robust memory handling rdd operations w/ large outputs
> --
>
> Key: SYSTEMML-1315
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1315
> Project: SystemML
>  Issue Type: Sub-task
>  Components: APIs, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1311) New libsvm-binary data converters

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1311:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> New libsvm-binary data converters
> -
>
> Key: SYSTEMML-1311
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1311
> Project: SystemML
>  Issue Type: Sub-task
>  Components: APIs, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1314) Robust memory handling parallelized rdds

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1314:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Robust memory handling parallelized rdds
> 
>
> Key: SYSTEMML-1314
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1314
> Project: SystemML
>  Issue Type: Sub-task
>  Components: APIs, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1310) Parfor block partitioning (mini batches)

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1310:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Parfor block partitioning (mini batches)
> 
>
> Key: SYSTEMML-1310
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1310
> Project: SystemML
>  Issue Type: Sub-task
>  Components: APIs, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>
> This task aims to add basic support for block partitioning of fixed size 
> batches of rows or columns in the parfor optimizer and runtime. For now, we 
> will only apply this for the spark execution backend. The basic supported 
> form looks as follows, where we require that batches are aligned with the 
> systemml blocksize.
> {code}
> N = 20;
> parfor(bi in 1 : nrow(X)/N) {
>Xbi = X[((bi-1)*N+1):(bi*N), ];
>R[bi, 1] = sum(Xbi);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1309) Parfor spark buffer pool handling

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1309:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Parfor spark buffer pool handling
> -
>
> Key: SYSTEMML-1309
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1309
> Project: SystemML
>  Issue Type: Sub-task
>  Components: APIs, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>
> In contrast to parfor mr jobs, where every task has its own, process-local 
> buffer pool, on spark with multi-threaded executors, multiple tasks share a 
> common buffer pool. This is advantageous because common inputs are just read 
> once. However, it also requires a synchronized buffer pool initialization and 
> cleanup per executor. Especially the cleanup (e.g., of created cache 
> directories) is tricky because spark does not provide an executor close call. 
> Hence, our approach is to use a robust version of deleteOnExit that is 
> independent of the exit code and also removes remaining files that are 
> unknown during delete registration.  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1454) Improved literal replacement nrow/ncol

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1454:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Improved literal replacement nrow/ncol
> --
>
> Key: SYSTEMML-1454
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1454
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1336) Improve parfor exec type selection (w/ potential data partitioning)

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1336:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Improve parfor exec type selection (w/ potential data partitioning)
> ---
>
> Key: SYSTEMML-1336
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1336
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>
> This task aims to address suboptimal parfor optimizer choices for 
> partitionable scenarios with large driver memory, such as the column-wise 
> access pattern sketched after the statistics below. Currently, we only apply 
> partitioning if the right indexing operation does not fit in the memory of the 
> driver or remote tasks. The execution type selection is then unaware of 
> potential partitioning and does not revert this decision - this is 
> problematic because the large input likely exceeds the memory budget of 
> remote tasks, ultimately causing the optimizer to fall back to a local parfor 
> with a very small degree of parallelism k.
> On our perftest 8GB Univariate stats scenario (with 20GB driver, i.e., 14GB 
> memory budget), this led to a local parfor with k=1 and thus unnecessarily 
> high execution time.
> {code}
> Total elapsed time:   781.233 sec.
> Total compilation time:   2.059 sec.
> Total execution time: 779.175 sec.
> Number of compiled Spark inst:0.
> Number of executed Spark inst:0.
> Cache hits (Mem, WB, FS, HDFS):   27904/0/0/2.
> Cache writes (WB, FS, HDFS):  3134/0/1.
> Cache times (ACQr/m, RLS, EXP):   9.200/0.022/0.301/0.300 sec.
> HOP DAGs recompiled (PRED, SB):   0/100.
> HOP DAGs recompile time:  0.238 sec.
> Spark ctx create time (lazy): 0.000 sec.
> Spark trans counts (par,bc,col):0/0/0.
> Spark trans times (par,bc,col):   0.000/0.000/0.000 secs.
> ParFor loops optimized:   1.
> ParFor optimize time: 1.985 sec.
> ParFor initialize time:   0.007 sec.
> ParFor result merge time: 0.003 sec.
> ParFor total update in-place: 0/0/13900
> Total JIT compile time:   13.542 sec.
> Total JVM GC count:   29.
> Total JVM GC time:3.49 sec.
> Heavy hitter instructions (name, time, count):
> -- 1) cm  479.000 sec 2700
> -- 2) qsort   228.928 sec 900
> -- 3) qpick   20.598 sec  1800
> -- 4) rangeReIndex16.051 sec  2999
> -- 5) uamean  12.867 sec  900
> -- 6) uacmax  9.870 sec   1
> -- 7) ctable  3.158 sec   100
> -- 8) uamin   2.589 sec   1000
> -- 9) uamax   2.560 sec   1101
> -- 10)write   0.300 sec   1
> {code}
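> To make the partitionable case concrete, here is a simplified sketch of the 
> access pattern in a Univariate-stats-like script (variable names, sizes, and the 
> per-column statistic are illustrative, not taken from the actual perftest script); 
> each iteration only reads a single column via right indexing, which is exactly 
> where column-wise data partitioning would enable a remote parfor despite the 
> large overall input:
> {code}
> X = rand(rows=100000, cols=100);      # stand-in for the large 8GB input
> R = matrix(0, rows=1, cols=ncol(X));
> parfor(i in 1:ncol(X)) {
>   Xi = X[, i];                        # candidate for column-wise partitioning
>   R[1, i] = sum(Xi^2);                # some per-column statistic
> }
> print(sum(R));
> {code}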



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1326) Remove redundancy in rewrites (consolidate in HopRewriteUtils)

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1326:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Remove redundancy in rewrites (consolidate in HopRewriteUtils)
> --
>
> Key: SYSTEMML-1326
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1326
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>
> This task aims to remove redundancy across existing static and dynamic 
> rewrites in order to improve maintainability and remove potential for subtle 
> bugs such as missing size propagation, missing line numbers, or missing 
> blocksize configurations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1327) Remove redundancy instruction generation (across backends)

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1327:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Remove redundancy instruction generation (across backends)
> --
>
> Key: SYSTEMML-1327
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1327
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1369) Loop vectorization rewrite for indexed copies

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1369:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Loop vectorization rewrite for indexed copies
> -
>
> Key: SYSTEMML-1369
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1369
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>
> For complex models, users often have to resort to a task-parallel parfor 
> implementation because the computation is not easily expressed in vectorized 
> form. In this context, we often encounter indexed copies that could have been 
> easily vectorized. For example, the snippet below stems from a recent benchmark 
> comparison against SystemML that implemented Hidden Markov Model learning:
> {code}
> parfor (i in 1:ncol(labels)) {
>   topics[id, i] = labels[1, i]
> }
> {code}
> This task aims to automatically rewrite such loops into vectorized form - for 
> the above example into {{topics[id, 1:ncol(labels)] = labels[1, 
> 1:ncol(labels)]}}, where the unnecessary right indexing is subsequently 
> removed by follow-up HOP rewrites.
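> For clarity, a minimal self-contained sketch of the rewrite result for the 
> snippet above (matrix sizes and the value of id are hypothetical):
> {code}
> labels = rand(rows=1, cols=10);
> topics = matrix(0, rows=5, cols=10);
> id = 3;
> # vectorized form produced by the rewrite:
> topics[id, 1:ncol(labels)] = labels[1, 1:ncol(labels)];
> # the unnecessary right indexing on labels is then removed by follow-up
> # HOP rewrites, e.g., simplifying labels[1, 1:ncol(labels)] to labels[1, ]
> {code}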



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1456) Inconsistent generated code compared to existing builtin functions

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1456:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Inconsistent generated code compared to existing builtin functions
> --
>
> Key: SYSTEMML-1456
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1456
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1458) Missing common subexpression elimination on codegen cplan construction

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1458:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Missing common subexpression elimination on codegen cplan construction
> --
>
> Key: SYSTEMML-1458
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1458
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1459) Wrong codegen materialization decisions due to dangling hop parents

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1459:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Wrong codegen materialization decisions due to dangling hop parents
> ---
>
> Key: SYSTEMML-1459
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1459
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>
> The cost-based codegen plan selector uses hops with multiple consumers as 
> candidates for decisions on materialization points. In end-to-end algorithm 
> scenarios, we encountered side effects from specific dynamic simplification 
> rewrites which did not properly clean up parent-child pointers of removed hops.
> Additionally, the rewrite fuseBinarySubDAGToUnaryOperation showed issues due 
> to multiple applied patterns, where the first pattern modified the DAG, but 
> the second pattern matched against the original DAG structure; hence, 
> together these rewrites created an incorrectly modified output DAG. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1525) Wrong size propagation of row and multi-agg codegen templates

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1525:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Wrong size propagation of row and multi-agg codegen templates
> -
>
> Key: SYSTEMML-1525
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1525
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1526) Missing codegen plan cleanup for multi-aggregate templates

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1526:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Missing codegen plan cleanup for multi-aggregate templates
> --
>
> Key: SYSTEMML-1526
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1526
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>
> The codegen plan cleanup removes unnecessary row/column indexing operations 
> over the main input as well as entire plans with right indexing over the main 
> input. However, so far this did not apply to multi-aggregate templates where 
> the indexing operations themselves can be the output nodes.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (SYSTEMML-1537) Codegen cost model missing various operations

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm closed SYSTEMML-1537.


> Codegen cost model missing various operations
> -
>
> Key: SYSTEMML-1537
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1537
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 1.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1515) Generalize cell template for sideways row vectors inputs

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1515:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Generalize cell template for sideways row vectors inputs
> 
>
> Key: SYSTEMML-1515
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1515
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1513) Create additional common unary and binary row vector operations

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1513:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Create additional common unary and binary row vector operations
> ---
>
> Key: SYSTEMML-1513
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1513
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>
> This task aims to add support for the following common unary and binary row 
> vector operations:
> * Unary: abs, round, ceil, floor, sign, pow2, mult2, sqrt 
> * Binary: min, max, plus, pow
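> As an illustration (not part of the original issue), a DML expression whose 
> row-wise fusion would exercise several of these primitives - abs, sqrt, and 
> round as unary vector operations, max and plus as binary ones:
> {code}
> X = rand(rows=1000, cols=100);
> # fused row-wise, this avoids separate intermediates for abs(X), sqrt(.),
> # round(X), and max(., 0)
> R = rowSums(sqrt(abs(X)) + max(round(X), 0));
> print(sum(R));
> {code}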



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (SYSTEMML-1537) Codegen cost model missing various operations

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm resolved SYSTEMML-1537.
--
   Resolution: Fixed
 Assignee: Matthias Boehm
Fix Version/s: SystemML 1.0

> Codegen cost model missing various operations
> -
>
> Key: SYSTEMML-1537
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1537
> Project: SystemML
>  Issue Type: Bug
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 1.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1512) Performance spark rowwise codegen instructions

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1512:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Performance spark rowwise codegen instructions
> --
>
> Key: SYSTEMML-1512
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1512
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1508) Support log and exp row vector operations

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1508:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Support log and exp row vector operations
> -
>
> Key: SYSTEMML-1508
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1508
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1448) Support rowMins/rowMaxs in row aggregate template

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1448:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Support rowMins/rowMaxs in row aggregate template
> -
>
> Key: SYSTEMML-1448
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1448
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1472) Avoid dense allocation of empty sideway inputs

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1472:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Avoid dense allocation of empty sideway inputs
> --
>
> Key: SYSTEMML-1472
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1472
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1507) Additional meta operator template: SpoofRowwise

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1507:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Additional meta operator template: SpoofRowwise
> ---
>
> Key: SYSTEMML-1507
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1507
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>
> This task aims to add further codegen templates besides the existing 
> SpoofCellwise, SpoofRowAggregate, SpoofOuterProduct, and SpoofMultiAggregate.
> Rowwise templates: In addition to our row aggregate template (which supports 
> row-wise column aggregations), we want to be able to fuse arbitrary row-wise 
> operations with and without aggregation in order to avoid unnecessary 
> intermediates for expressions like B = (A <= rowMins(A)); C = B / rowSums(B), 
> which should become a single operator instead of four scans, two matrix 
> intermediates, and two vector intermediates. Similar to our existing 
> SpoofCellwise template (which supports no aggregation, row aggregation, and 
> full aggregation), this SpoofRowwise template should support no aggregation, 
> row aggregation, and column aggregation.
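> For reference, the running example as a small self-contained DML script (the 
> rand dimensions are arbitrary); with the rowwise template, the statements 
> producing B and C should compile into a single fused operator:
> {code}
> A = rand(rows=1000, cols=100);
> B = (A <= rowMins(A));
> C = B / rowSums(B);
> print(sum(C));
> {code}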



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1478) Support cross-partition multi-aggregates with partial shared reads

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1478:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Support cross-partition multi-aggregates with partial shared reads
> --
>
> Key: SYSTEMML-1478
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1478
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1441) Vector primitives for row comparisons

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1441:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Vector primitives for row comparisons
> -
>
> Key: SYSTEMML-1441
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1441
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>
> This task aims to add compiler and runtime support for the following vector 
> primitives:
> * vectEqualAdd / vectEqualScalar
> * vectNonEqualAdd / vectNonEqualScalar
> * vectLessAdd / vectLessScalar
> * vectLessEqualAdd / vectLessEqualScalar
> * vectGreaterAdd / vectGreaterScalar
> * vectGreaterEqualAdd / vectGreaterEqualScalar
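> As a hedged illustration of the DML patterns these primitives target (an 
> arbitrary example, not from the issue): a row-wise comparison against a scalar 
> followed by a row aggregate, which a row template can evaluate per row without 
> materializing the full comparison result:
> {code}
> X = rand(rows=1000, cols=100);
> # counts per row how many entries exceed a threshold
> cnt = rowSums(X > 0.5);
> print(sum(cnt));
> {code}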



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1447) Support 'replace' in row aggregate and cell templates

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1447:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Support 'replace' in row aggregate and cell templates
> -
>
> Key: SYSTEMML-1447
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1447
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1446) Multi-threaded compilation of fused operators

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1446:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Multi-threaded compilation of fused operators
> -
>
> Key: SYSTEMML-1446
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1446
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1439) Fuse row aggregate w/ colvector output into cell template

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1439:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Fuse row aggregate w/ colvector output into cell template
> -
>
> Key: SYSTEMML-1439
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1439
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>
> 1) vector-matrix w/ cell template input
> 2) matrix-vector w/ cell template output



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (SYSTEMML-1438) Support min/max/sumsq in cell templates w/ aggregation

2017-04-20 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1438:
-
Fix Version/s: (was: SystemML 1.0)
   SystemML 0.14

> Support min/max/sumsq in cell templates w/ aggregation
> --
>
> Key: SYSTEMML-1438
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1438
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.14
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

