[jira] [Commented] (SYSTEMML-1242) Perftest: OutOfMemoryError in MultiLogReg for 80g sparse

2017-02-14 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15865441#comment-15865441
 ] 

Matthias Boehm commented on SYSTEMML-1242:
--

OK, with the changes in SYSTEMML-1261 the script now runs through, but we 
should implement the other two items as well.
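
As a rough sanity check on the OutOfMemoryError quoted below (the trace ends in allocateDenseBlock), the following sketch estimates how much memory a 10M x 1K matrix of doubles needs in dense form versus a CSR-like sparse form. The function names, the 0.01 sparsity, and the byte counts per entry are assumptions for illustration only; this is not SystemML's internal memory estimator.

```python
# Back-of-envelope size estimates (assumed layouts, object headers ignored).
BYTES_PER_DOUBLE = 8
BYTES_PER_INT = 4

def dense_bytes(rows, cols):
    """Bytes for a dense row-major array of doubles."""
    return rows * cols * BYTES_PER_DOUBLE

def csr_bytes(rows, cols, sparsity):
    """Rough CSR size: one value and one column index per non-zero, plus row pointers."""
    nnz = round(rows * cols * sparsity)
    return nnz * (BYTES_PER_DOUBLE + BYTES_PER_INT) + (rows + 1) * BYTES_PER_INT

rows, cols = 10_000_000, 1_000                # the 10M_1K scenario from the report
print(dense_bytes(rows, cols) / 1e9, "GB dense")
print(csr_bytes(rows, cols, 0.01) / 1e9, "GB sparse (hypothetical sparsity 0.01)")
```

Under these assumptions the dense form comes to 80 GB, which would match the "80g" label in the issue title and would not fit in a typical driver heap, so any code path that allocates a dense block for a matrix of this shape is a plausible trigger for the error.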

> Perftest: OutOfMemoryError in MultiLogReg for 80g sparse
> 
>
> Key: SYSTEMML-1242
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1242
> Project: SystemML
> Issue Type: Bug
> Components: Test
> Affects Versions: SystemML 0.13
> Environment: spark 2.1.0
> Reporter: Imran Younus
> Fix For: SystemML 0.13
>
> Attachments: sparkDML.sh
>
>
> When running the {{runMultiLogReg.sh}} script, {{MultiLogReg.dml}} ends with 
> an OutOfMemoryError for the 10M_1K sparse data case with {{icpt = 1}}. Here 
> is the end of the log file:
> {code}
> 17/02/04 17:20:33 INFO api.DMLScript: SystemML Statistics:
> Total elapsed time:               697.694 sec.
> Total compilation time:           2.543 sec.
> Total execution time:             695.151 sec.
> Number of compiled Spark inst:    73.
> Number of executed Spark inst:    16.
> Cache hits (Mem, WB, FS, HDFS):   46/9/1/7.
> Cache writes (WB, FS, HDFS):      27/1/1.
> Cache times (ACQr/m, RLS, EXP):   281.541/0.003/131.589/48.737 sec.
> HOP DAGs recompiled (PRED, SB):   0/15.
> HOP DAGs recompile time:          0.067 sec.
> Spark ctx create time (lazy):     31.078 sec.
> Spark trans counts (par,bc,col):  5/4/0.
> Spark trans times (par,bc,col):   46.748/0.392/0.000 secs.
> Total JIT compile time:           151.254 sec.
> Total JVM GC count:               144.
> Total JVM GC time:                220.671 sec.
> Heavy hitter instructions (name, time, count):
> -- 1)  ba+*          144.194 sec  3
> -- 2)  rand          109.939 sec  9
> -- 3)  uark+         105.011 sec  2
> -- 4)  r'            100.933 sec  3
> -- 5)  sp_/          80.387 sec   1
> -- 6)  sp_mapmm      45.491 sec   2
> -- 7)  sp_tak+*      40.655 sec   1
> -- 8)  append        9.480 sec    1
> -- 9)  rangeReIndex  7.347 sec    2
> -- 10) sp_-          6.392 sec    3
> 17/02/04 17:20:33 INFO api.DMLScript: END DML run 02/04/2017 17:20:33
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>   at org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlock(MatrixBlock.java:363)
>   at org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlock(MatrixBlock.java:339)
>   at org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlockUnsafe(MatrixBlock.java:408)
>   at org.apache.sysml.runtime.io.MatrixReader.createOutputMatrixBlock(MatrixReader.java:107)
>   at org.apache.sysml.runtime.io.ReaderBinaryBlockParallel.readMatrixFromHDFS(ReaderBinaryBlockParallel.java:59)
>   at org.apache.sysml.runtime.util.DataConverter.readMatrixFromHDFS(DataConverter.java:203)
>   at org.apache.sysml.runtime.util.DataConverter.readMatrixFromHDFS(DataConverter.java:168)
>   at org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromHDFS(MatrixObject.java:425)
>   at org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromHDFS(MatrixObject.java:60)
>   at org.apache.sysml.runtime.controlprogram.caching.CacheableData.readBlobFromHDFS(CacheableData.java:920)
>   at org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromRDD(MatrixObject.java:478)
>   at org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromRDD(MatrixObject.java:60)
>   at org.apache.sysml.runtime.controlprogram.caching.CacheableData.acquireRead(CacheableData.java:411)
>   at org.apache.sysml.runtime.controlprogram.context.ExecutionContext.getMatrixInput(ExecutionContext.java:209)
>   at org.apache.sysml.runtime.instructions.cp.AggregateBinaryCPInstruction.processInstruction(AggregateBinaryCPInstruction.java:74)
>   at org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:290)
>   at org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
>   at org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
>   at org.apache.sysml.runtime.controlprogram.IfProgramBlock.execute(IfProgramBlock.java:139)
>   at org.apache.sysml.runtime.controlprogram.WhileProgramBlock.execute(WhileProgramBlock.java:165)
>   at org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
>   at org.apache.sysml.api.DMLScript.execute(DMLScript.java:684)
>   at org.apache.sysml.api.DMLScript.executeScript(DMLScript.jav

[jira] [Commented] (SYSTEMML-1250) Binary artifact missing antlr-runtime and wink-json4j classes

2017-02-14 Thread Glenn Weidner (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866231#comment-15866231
 ] 

Glenn Weidner commented on SYSTEMML-1250:
-

I'm unable to reproduce the missing-classes behavior when building locally with 
the distribution profile enabled.  Also, the latest artifacts (e.g., 
systemml-0.13.0-incubating-SNAPSHOT-bin.tgz, 
systemml-0.13.0-incubating-SNAPSHOT.jar) produced by the nightly build at 
https://sparktc.ibmcloud.com/repo/latest/ include the antlr-runtime and 
wink-json4j classes inside the systemml-0.13.0-incubating-SNAPSHOT.jar that is 
bundled in systemml-0.13.0-incubating-SNAPSHOT-bin.tgz.

> Binary artifact missing antlr-runtime and wink-json4j classes
> -
>
> Key: SYSTEMML-1250
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1250
> Project: SystemML
> Issue Type: Bug
> Components: Build
> Reporter: Glenn Weidner
>
> The -bin artifacts (both 0.11 and 0.12) are missing the org/antlr/v4/runtime 
> and org/apache/wink/json4j classes.  Since the -bin has a lib folder, the 
> corresponding jars can be included there.  For comparison, these classes are 
> included in systemml-0.12.0-incubating.jar at 
> https://repository.apache.org/content/repositories/releases/org/apache/systemml/systemml/0.12.0-incubating/,
>  and although there is a jar by that same name inside the -bin artifact, it 
> does not include the classes.  Similar content was observed for 
> systemml-0.11.0-incubating.
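
One way to check a given jar for the two packages named above is to list its entries and look for class files under each package prefix. The sketch below is only an illustration using Python's standard zipfile module; the helper name missing_packages is made up here and is not part of the SystemML build.

```python
import zipfile

# Package prefixes named in the report.
REQUIRED_PREFIXES = ("org/antlr/v4/runtime/", "org/apache/wink/json4j/")

def missing_packages(jar_path, prefixes=REQUIRED_PREFIXES):
    """Return the prefixes that have no .class entry in the given jar."""
    with zipfile.ZipFile(jar_path) as jar:
        names = jar.namelist()
    return [p for p in prefixes
            if not any(n.startswith(p) and n.endswith(".class") for n in names)]
```

Calling missing_packages on the jar extracted from the -bin artifact and on the jar from the release repository should make the difference described in the report visible: an empty list means both packages are present.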



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SYSTEMML-540) Deep Learning

2017-02-14 Thread Xin Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866412#comment-15866412
 ] 

Xin Wu commented on SYSTEMML-540:
-

[~mwdus...@us.ibm.com] Is this Epic for a post-0.13 version? Thanks!

> Deep Learning
> -
>
> Key: SYSTEMML-540
> URL: https://issues.apache.org/jira/browse/SYSTEMML-540
> Project: SystemML
> Issue Type: Epic
> Reporter: Mike Dusenberry
> Assignee: Mike Dusenberry
>
> This epic covers the addition of deep learning to SystemML, including:
> * Core DML layer abstractions for deep (convolutional, recurrent) neural 
> nets, with simple forward/backward API: affine, convolution (start with 2D), 
> max-pooling, non-linearities (relu, sigmoid, softmax), dropout, loss 
> functions.
> * Modularized DML optimizers: (mini-batch, stochastic) gradient descent (w/ 
> momentum, etc.).
> * Additional DML language support as necessary (tensors, built-in functions 
> such as convolution, function pointers, list structures, etc.).
> * Integration with other deep learning frameworks (Caffe, Torch, Theano, 
> TensorFlow, etc.) via automatic DML code generation.
> * etc.
> ---
> *Plan*:
> \[*DONE*\] Phase 1:  *MVPs*
> * Create mathematically correct DML deep learning library for running basic 
> feed-forward and convolutional neural nets on a singlenode.
> * Create mathematically correct built-in operators for convolution and max 
> pooling for singlenode operation.
> \[*CURRENT*\] Phase 2:  *Singlenode*
> * Improve performance of DML deep learning library in singlenode operation.
> * Expand DML deep learning library to include additional commonly-used 
> layers, such as RNNs and LSTMs, as well as additional optimizers.
> * Improve built-in operators for convolution and max pooling to be highly 
> performant in singlenode operation.
> * Implement performant GPU acceleration for built-in operators (and 
> end-to-end deep learning algorithms) in singlenode operation.
> * Add general engine improvements to improve bottlenecks, such as 
> left-indexing within DML-bodied functions.
> * Add end-to-end deep learning algorithm examples, such as a "LeNet" 
> convolutional neural net.
> Phase 3: *Distributed*
> * Expand deep learning support to include *distributed operations* with large 
> models.  This includes improvements to the DML deep learning library, the 
> built-in operators, the GPU acceleration, and general engine improvements.
> Phase 4: *APIs/Wrappers*
> * Explore integration with Caffe, creating a SystemML interpreter for Caffe 
> model definitions.
> * Explore integration with Keras, creating a SystemML backend for Keras.
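
To make the "simple forward/backward API" and the momentum optimizer mentioned in the epic concrete, here is a minimal sketch of an affine layer and one SGD-with-momentum step. It is written in plain Python rather than DML, and the function names and signatures are assumptions for illustration; they are not taken from the SystemML deep learning library.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def affine_forward(X, W, b):
    """out = X W + b, with X: N x D, W: D x M, b: length M."""
    return [[v + bj for v, bj in zip(row, b)] for row in matmul(X, W)]

def affine_backward(dout, X, W):
    """Gradients w.r.t. X, W and b, given dout = dLoss/dout (N x M)."""
    dX = matmul(dout, transpose(W))
    dW = matmul(transpose(X), dout)
    db = [sum(col) for col in zip(*dout)]
    return dX, dW, db

def sgd_momentum_step(param, grad, velocity, lr=0.1, mu=0.9):
    """v = mu*v - lr*grad; param = param + v (elementwise on 2D lists)."""
    new_v = [[mu * v - lr * g for v, g in zip(vr, gr)]
             for vr, gr in zip(velocity, grad)]
    new_p = [[p + v for p, v in zip(pr, vr)] for pr, vr in zip(param, new_v)]
    return new_p, new_v
```

Each layer exposes only a forward and a backward function, so a network is a chain of forward calls followed by the backward calls in reverse order, with the optimizer applied to each parameter gradient.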





[jira] [Updated] (SYSTEMML-540) Deep Learning

2017-02-14 Thread Mike Dusenberry (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Dusenberry updated SYSTEMML-540:
-
Fix Version/s: SystemML 0.10
               SystemML 0.13
               SystemML 0.12
               SystemML 1.0
               SystemML 0.11

> Deep Learning
> -
>
> Key: SYSTEMML-540
> URL: https://issues.apache.org/jira/browse/SYSTEMML-540
> Project: SystemML
> Issue Type: Epic
> Reporter: Mike Dusenberry
> Assignee: Mike Dusenberry
> Fix For: SystemML 0.10, SystemML 0.11, SystemML 1.0, SystemML 0.12, SystemML 0.13
>
>


[jira] [Resolved] (SYSTEMML-1261) Fix transitive Spark execution type selection for ba+*

2017-02-14 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm resolved SYSTEMML-1261.
--
Resolution: Fixed

> Fix transitive Spark execution type selection for ba+*
> --
>
> Key: SYSTEMML-1261
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1261
> Project: SystemML
> Issue Type: Sub-task
> Components: Test
> Reporter: Matthias Boehm
> Assignee: Matthias Boehm
> Fix For: SystemML 0.13
>
>






[jira] [Commented] (SYSTEMML-540) Deep Learning

2017-02-14 Thread Mike Dusenberry (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866464#comment-15866464
 ] 

Mike Dusenberry commented on SYSTEMML-540:
--

[~xwu0226] This Epic has officially spanned versions 0.10, 0.11, and 0.12, will 
be worked on for the upcoming 0.13 release, and will continue into 1.0.  I've 
updated the {{Fix Version}} accordingly.

> Deep Learning
> -
>
> Key: SYSTEMML-540
> URL: https://issues.apache.org/jira/browse/SYSTEMML-540
> Project: SystemML
> Issue Type: Epic
> Affects Versions: SystemML 0.10, SystemML 0.11, SystemML 1.0, SystemML 0.12, SystemML 0.13
> Reporter: Mike Dusenberry
> Assignee: Mike Dusenberry
> Fix For: SystemML 0.10, SystemML 0.11, SystemML 1.0, SystemML 0.12, SystemML 0.13
>
>


[jira] [Updated] (SYSTEMML-540) Deep Learning

2017-02-14 Thread Mike Dusenberry (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Dusenberry updated SYSTEMML-540:
-
Affects Version/s: SystemML 0.13
                   SystemML 0.12
                   SystemML 1.0
                   SystemML 0.11
                   SystemML 0.10

> Deep Learning
> -
>
> Key: SYSTEMML-540
> URL: https://issues.apache.org/jira/browse/SYSTEMML-540
> Project: SystemML
> Issue Type: Epic
> Affects Versions: SystemML 0.10, SystemML 0.11, SystemML 1.0, SystemML 0.12, SystemML 0.13
> Reporter: Mike Dusenberry
> Assignee: Mike Dusenberry
> Fix For: SystemML 0.10, SystemML 0.11, SystemML 1.0, SystemML 0.12, SystemML 0.13
>
>


[jira] [Updated] (SYSTEMML-540) Deep Learning

2017-02-14 Thread Mike Dusenberry (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Dusenberry updated SYSTEMML-540:
-
Component/s: Runtime
             Parser
             Compiler
             Algorithms

> Deep Learning
> -
>
> Key: SYSTEMML-540
> URL: https://issues.apache.org/jira/browse/SYSTEMML-540
> Project: SystemML
> Issue Type: Epic
> Components: Algorithms, Compiler, Parser, Runtime
> Affects Versions: SystemML 0.10, SystemML 0.11, SystemML 1.0, SystemML 0.12, SystemML 0.13
> Reporter: Mike Dusenberry
> Assignee: Mike Dusenberry
> Fix For: SystemML 0.10, SystemML 0.11, SystemML 1.0, SystemML 0.12, SystemML 0.13
>
>


[jira] [Commented] (SYSTEMML-1242) Perftest: OutOfMemoryError in MultiLogReg for 80g sparse

2017-02-14 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866480#comment-15866480
 ] 

Matthias Boehm commented on SYSTEMML-1242:
--

[~gweidner] Do you still have the logs from the runs for our 0.12 release? 
Looking over the results from SYSTEMML-1217, it appears that the tests for 
intercept 1 and 2 failed there too.

> Perftest: OutOfMemoryError in MultiLogReg for 80g sparse
> 
>
> Key: SYSTEMML-1242
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1242
> Project: SystemML
> Issue Type: Bug
> Components: Test
> Affects Versions: SystemML 0.13
> Environment: spark 2.1.0
> Reporter: Imran Younus
> Fix For: SystemML 0.13
>
> Attachments: sparkDML.sh
>
>

[jira] [Assigned] (SYSTEMML-446) Phase 1: Exploit GPU BLAS libraries (integration)

2017-02-14 Thread Niketan Pansare (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niketan Pansare reassigned SYSTEMML-446:


Assignee: Niketan Pansare  (was: Tanuj Kr Aasawat)

> Phase 1: Exploit GPU BLAS libraries (integration)
> -
>
> Key: SYSTEMML-446
> URL: https://issues.apache.org/jira/browse/SYSTEMML-446
> Project: SystemML
> Issue Type: Task
> Components: Compiler, Runtime
> Reporter: Matthias Boehm
> Assignee: Niketan Pansare
> Fix For: SystemML 0.11
>
>






[jira] [Assigned] (SYSTEMML-940) Add support for cusparse geam

2017-02-14 Thread Niketan Pansare (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niketan Pansare reassigned SYSTEMML-940:


Assignee: Nakul Jindal

> Add support for cusparse geam 
> --
>
> Key: SYSTEMML-940
> URL: https://issues.apache.org/jira/browse/SYSTEMML-940
> Project: SystemML
> Issue Type: Sub-task
> Components: Compiler, Runtime
> Reporter: Niketan Pansare
> Assignee: Nakul Jindal
> Fix For: SystemML 0.11
>
>
> Sparse dgeam functionality is required for LibMatrixCUDA's transpose(), 
> matScalarElementwiseMultiplyDivide(), and cellwiseMatMatAddSubtract().
> [~nakul02]
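
For readers unfamiliar with geam: in the dense BLAS-like form it computes C = alpha*op(A) + beta*op(B), where op is either the identity or a transpose. The sketch below is a plain-Python reference of those dense semantics only. How the three LibMatrixCUDA methods would map onto it is an assumption made here for illustration, and the sparse cuSPARSE routine (its signature and which op() variants it supports) is not shown.

```python
def geam(alpha, A, beta, B, trans_a=False, trans_b=False):
    """Reference C = alpha*op(A) + beta*op(B) on dense lists of rows."""
    op_a = [list(c) for c in zip(*A)] if trans_a else A
    op_b = [list(c) for c in zip(*B)] if trans_b else B
    return [[alpha * a + beta * b for a, b in zip(ra, rb)]
            for ra, rb in zip(op_a, op_b)]

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[10.0, 20.0], [30.0, 40.0]]

# Assumed mappings: each operation is one geam call with suitable scalars.
transposed = geam(1.0, A, 0.0, A, trans_a=True, trans_b=True)  # transpose
scaled = geam(2.5, A, 0.0, A)                                  # matrix-scalar multiply
difference = geam(1.0, A, -1.0, B)                             # cell-wise subtract
```

The point of the sketch is that a single primitive with two scalars covers transpose, scaling, and cell-wise add/subtract, which is presumably why one sub-task is listed as a prerequisite for all three methods.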





[jira] [Commented] (SYSTEMML-931) Error while allocating CSRPointer

2017-02-14 Thread Niketan Pansare (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866487#comment-15866487
 ] 

Niketan Pansare commented on SYSTEMML-931:
--

[~nakul02] I believe this issue has been fixed. Can you please confirm?

> Error while allocating CSRPointer
> -
>
> Key: SYSTEMML-931
> URL: https://issues.apache.org/jira/browse/SYSTEMML-931
> Project: SystemML
> Issue Type: Sub-task
> Reporter: Niketan Pansare
>
> org.apache.sysml.test.integration.functions.reorg.FullTransposeTest's 
> testTransposeRowVectorSparseSP() and testTransposeMatrixSparseSP() test cases 
> are failing while allocating CSRPointer.
> Caused by: jcuda.CudaException: cudaErrorMemoryAllocation
>   at jcuda.runtime.JCuda.checkResult(JCuda.java:437)
>   at jcuda.runtime.JCuda.cudaMalloc(JCuda.java:3811)
>   at org.apache.sysml.runtime.instructions.gpu.context.JCudaObject$CSRPointer.allocateEmpty(JCudaObject.java:156)
>   at org.apache.sysml.runtime.instructions.gpu.context.JCudaObject.allocateMemoryOnDevice(JCudaObject.java:464)
> [~nakul02]





[jira] [Commented] (SYSTEMML-1250) Binary artifact missing antlr-runtime and wink-json4j classes

2017-02-14 Thread Felix Schüler (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866488#comment-15866488
 ] 

Felix Schüler commented on SYSTEMML-1250:
-

This is a major issue for anyone wanting to try our standalone/bin release, and 
we should probably release a 0.12.1 with those classes included ASAP. Currently, 
the release is not usable without manually adding those dependencies.

> Binary artifact missing antlr-runtime and wink-json4j classes
> -
>
> Key: SYSTEMML-1250
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1250
> Project: SystemML
> Issue Type: Bug
> Components: Build
> Reporter: Glenn Weidner
>


[jira] [Assigned] (SYSTEMML-941) Add support for cusparse axpy

2017-02-14 Thread Niketan Pansare (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niketan Pansare reassigned SYSTEMML-941:


Assignee: Nakul Jindal

> Add support for cusparse axpy
> -
>
> Key: SYSTEMML-941
> URL: https://issues.apache.org/jira/browse/SYSTEMML-941
> Project: SystemML
> Issue Type: Sub-task
> Components: Compiler, Runtime
> Reporter: Niketan Pansare
> Assignee: Nakul Jindal
> Fix For: SystemML 0.11
>
>
> See LibMatrixCUDA's vectorScalarMultiply()
> [~nakul02]
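
For reference, axpy is the BLAS operation y = alpha*x + y; in the sparse variant, x is stored as parallel index and value arrays and only those positions of y are touched. The following plain-Python sketch shows that semantics only. The function name and argument layout are assumptions for illustration and do not reproduce the cuSPARSE signature or the LibMatrixCUDA code.

```python
def sparse_axpy(alpha, x_idx, x_val, y):
    """Return y + alpha*x, where sparse x is given as index/value lists."""
    out = list(y)                      # leave the caller's vector untouched
    for i, v in zip(x_idx, x_val):
        out[i] += alpha * v            # only non-zero positions of x contribute
    return out
```

Starting from an all-zero y, the same call yields alpha*x, which is one way a vector-scalar multiply could be expressed in terms of this primitive.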





[jira] [Commented] (SYSTEMML-941) Add support for cusparse axpy

2017-02-14 Thread Niketan Pansare (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866491#comment-15866491
 ] 

Niketan Pansare commented on SYSTEMML-941:
--

[~nakul02] Are you OK with closing this JIRA as resolved?

> Add support for cusparse axpy
> -
>
> Key: SYSTEMML-941
> URL: https://issues.apache.org/jira/browse/SYSTEMML-941
> Project: SystemML
> Issue Type: Sub-task
> Components: Compiler, Runtime
> Reporter: Niketan Pansare
> Assignee: Nakul Jindal
> Fix For: SystemML 0.11
>
>


[jira] [Commented] (SYSTEMML-940) Add support for cusparse geam

2017-02-14 Thread Niketan Pansare (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866490#comment-15866490
 ] 

Niketan Pansare commented on SYSTEMML-940:
--

[~nakul02] Are you OK with closing this JIRA as resolved?

> Add support for cusparse geam 
> --
>
> Key: SYSTEMML-940
> URL: https://issues.apache.org/jira/browse/SYSTEMML-940
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Niketan Pansare
>Assignee: Nakul Jindal
> Fix For: SystemML 0.11
>
>
> Sparse dgeam functionality is required for LibMatrixCUDA's transpose(), 
> matScalarElementwiseMultiplyDivide(), and cellwiseMatMatAddSubtract().
> [~nakul02]
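
To illustrate why a single geam routine can serve all three methods, here is a dense CPU sketch of the BLAS-style operation C = alpha * op(A) + beta * op(B), where op optionally transposes its argument. It is a sketch under assumptions: the names are hypothetical, the matrices are dense row-major arrays, and the sparse GPU case would differ in storage and in how transposition is done.

```java
// Illustrative dense CPU reference; GeamSketch is a hypothetical name and does
// not reflect how LibMatrixCUDA or cuSPARSE store or process sparse matrices.
final class GeamSketch {
    /** Returns alpha * op(A) + beta * op(B); op transposes when its flag is true. */
    static double[][] geam(double alpha, double[][] a, boolean transA,
                           double beta, double[][] b, boolean transB) {
        int rows = transA ? a[0].length : a.length;
        int cols = transA ? a.length : a[0].length;
        int bRows = transB ? b[0].length : b.length;
        int bCols = transB ? b.length : b[0].length;
        if (rows != bRows || cols != bCols) {
            throw new IllegalArgumentException("op(A) and op(B) must have equal shape");
        }
        double[][] c = new double[rows][cols];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                double av = transA ? a[j][i] : a[i][j];
                double bv = transB ? b[j][i] : b[i][j];
                c[i][j] = alpha * av + beta * bv;
            }
        }
        return c;
    }

    /** Transpose as a special case: 1 * A^T + 0 * A^T. */
    static double[][] transpose(double[][] a) {
        return geam(1.0, a, true, 0.0, a, true);
    }

    /** Scalar multiply as a special case: s * A + 0 * A. */
    static double[][] scale(double s, double[][] a) {
        return geam(s, a, false, 0.0, a, false);
    }

    /** Cell-wise subtraction as a special case: 1 * A + (-1) * B. */
    static double[][] subtract(double[][] a, double[][] b) {
        return geam(1.0, a, false, -1.0, b, false);
    }
}
```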





[jira] [Commented] (SYSTEMML-940) Add support for cusparse geam

2017-02-14 Thread Nakul Jindal (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866493#comment-15866493
 ] 

Nakul Jindal commented on SYSTEMML-940:
---

Yes, please go ahead.

> Add support for cusparse geam 
> --
>
> Key: SYSTEMML-940
> URL: https://issues.apache.org/jira/browse/SYSTEMML-940
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Niketan Pansare
>Assignee: Nakul Jindal
> Fix For: SystemML 0.11
>
>
> Sparse dgeam functionality is required for LibMatrixCUDA's transpose(), 
> matScalarElementwiseMultiplyDivide(), and cellwiseMatMatAddSubtract().
> [~nakul02]





[jira] [Assigned] (SYSTEMML-935) Improve the performance of sparse TSMM either by using/implement sparse dsyrk

2017-02-14 Thread Niketan Pansare (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niketan Pansare reassigned SYSTEMML-935:


Assignee: Nakul Jindal

> Improve the performance of sparse TSMM either by using/implement sparse dsyrk
> -
>
> Key: SYSTEMML-935
> URL: https://issues.apache.org/jira/browse/SYSTEMML-935
> Project: SystemML
>  Issue Type: Sub-task
>Reporter: Niketan Pansare
>Assignee: Nakul Jindal
>
> Implement this either by adding a custom kernel or by using the cuSPARSE API.
> See org.apache.sysml.runtime.matrix.data.LibMatrixCUDA's matmultTSMM() 
> method. Please move this to Phase 2 if a custom kernel is required.
> [~nakul02]
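
As background for the dsyrk reference in the title: a transpose-self matrix multiply such as t(X) %*% X is symmetric, so a syrk-style routine only needs to compute one triangle of the result. The dense CPU sketch below shows that idea, assuming the left variant t(X) %*% X; the names are hypothetical, and a sparse GPU kernel would look quite different.

```java
// Illustrative dense CPU reference; TsmmSketch is a hypothetical name,
// not the LibMatrixCUDA implementation.
final class TsmmSketch {
    /**
     * Returns t(X) %*% X for a dense row-major X with n rows and m columns.
     * Only the upper triangle is computed; each value is mirrored to the
     * lower triangle because the result is symmetric.
     */
    static double[][] tsmm(double[][] x) {
        int n = x.length;
        int m = x[0].length;
        double[][] c = new double[m][m];
        for (int i = 0; i < m; i++) {
            for (int j = i; j < m; j++) {
                double sum = 0.0;
                for (int k = 0; k < n; k++) {
                    sum += x[k][i] * x[k][j];
                }
                c[i][j] = sum;
                c[j][i] = sum; // mirror into the lower triangle
            }
        }
        return c;
    }
}
```

Computing one triangle roughly halves the multiply-add work compared with a general matrix multiply of the same shape.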





[jira] [Commented] (SYSTEMML-941) Add support for cusparse axpy

2017-02-14 Thread Nakul Jindal (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866497#comment-15866497
 ] 

Nakul Jindal commented on SYSTEMML-941:
---

Yes

> Add support for cusparse axpy
> -
>
> Key: SYSTEMML-941
> URL: https://issues.apache.org/jira/browse/SYSTEMML-941
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Niketan Pansare
>Assignee: Nakul Jindal
> Fix For: SystemML 0.11
>
>
> See LibMatrixCUDA's vectorScalarMultiply()
> [~nakul02]





[jira] [Assigned] (SYSTEMML-931) Error while allocating CSRPointer

2017-02-14 Thread Nakul Jindal (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nakul Jindal reassigned SYSTEMML-931:
-

Assignee: Nakul Jindal

> Error while allocating CSRPointer
> -
>
> Key: SYSTEMML-931
> URL: https://issues.apache.org/jira/browse/SYSTEMML-931
> Project: SystemML
>  Issue Type: Sub-task
>Reporter: Niketan Pansare
>Assignee: Nakul Jindal
>
> org.apache.sysml.test.integration.functions.reorg.FullTransposeTest's 
> testTransposeRowVectorSparseSP() and testTransposeMatrixSparseSP() test cases 
> are failing while allocating CSRPointer.
> Caused by: jcuda.CudaException: cudaErrorMemoryAllocation
>   at jcuda.runtime.JCuda.checkResult(JCuda.java:437)
>   at jcuda.runtime.JCuda.cudaMalloc(JCuda.java:3811)
>   at 
> org.apache.sysml.runtime.instructions.gpu.context.JCudaObject$CSRPointer.allocateEmpty(JCudaObject.java:156)
>   at 
> org.apache.sysml.runtime.instructions.gpu.context.JCudaObject.allocateMemoryOnDevice(JCudaObject.java:464)
> [~nakul02]





[jira] [Assigned] (SYSTEMML-1039) Add support for aggregate unary operations on GPU

2017-02-14 Thread Nakul Jindal (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nakul Jindal reassigned SYSTEMML-1039:
--

Assignee: Nakul Jindal

> Add support for aggregate unary operations on GPU
> -
>
> Key: SYSTEMML-1039
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1039
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
> Environment: 1. Update UnaryOp to enable GPU for these instructions.
> 2. Add AggregateUnaryGPUInstruction and invoke kernels corresponding to 
> https://github.com/apache/incubator-systemml/blob/master/src/main/java/org/apache/sysml/runtime/matrix/data/MatrixBlock.java#L4451
> Note that nrow, ncol and length should always remain CP instructions. Also, 
> we can break this into subtasks for sparse if necessary.
> The opcodes related to this task are: uak+, uark+, uack+, uasqk+, uarsqk+, 
> uacsqk+, uamean, uarmean, uacmean, uavar, uarvar, uacvar, uamax, uarmax, 
> uarimax, uacmax, uamin, uarmin, uarimin, uacmin, ua+, uar+, uac+, ua*, 
> uatrace and uaktrace. The corresponding operations (with or without Kahan 
> summation) are sum, trace, rowsum, colsum, sum of squares, row sum of 
> squares, col sum of squares, cumsum, cumprod, cummin, cummax, min, max, 
> min_index, max_index, mean, variance and product.
> [~nakul02] This is a mega-task. Hopefully we can generalize our kernel 
> implementation to make this easier.
>Reporter: Niketan Pansare
>Assignee: Nakul Jindal
>
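
The "with/without kahan" note above refers to Kahan (compensated) summation. As a reference point for what a row-sum kernel with compensation computes, here is a minimal CPU sketch; the class name is hypothetical and the code is not taken from MatrixBlock or any GPU kernel.

```java
// Illustrative CPU reference; KahanRowSumSketch is a hypothetical name.
final class KahanRowSumSketch {
    /** Returns one Kahan-compensated sum per row of the dense matrix m. */
    static double[] rowSums(double[][] m) {
        double[] out = new double[m.length];
        for (int i = 0; i < m.length; i++) {
            double sum = 0.0;
            double corr = 0.0; // running estimate of the lost low-order bits
            for (double v : m[i]) {
                double y = v - corr;   // apply the correction to the next term
                double t = sum + y;    // low-order bits of y may be lost here
                corr = (t - sum) - y;  // recover what was lost
                sum = t;
            }
            out[i] = sum;
        }
        return out;
    }
}
```

The compensation term matters when many small values are added to a large running sum, where a plain loop would silently drop them.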






[jira] [Commented] (SYSTEMML-939) LibMatrixCUDA's vectorScalarMultiply() produces incorrect results.

2017-02-14 Thread Niketan Pansare (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866496#comment-15866496
 ] 

Niketan Pansare commented on SYSTEMML-939:
--

[~nakul02] I believe this issue has been fixed. Can you please confirm?

> LibMatrixCUDA's vectorScalarMultiply() produces incorrect results.
> --
>
> Key: SYSTEMML-939
> URL: https://issues.apache.org/jira/browse/SYSTEMML-939
> Project: SystemML
>  Issue Type: Sub-task
>Reporter: Niketan Pansare
>
> Please use org.apache.sysml.test.integration.functions.aggregate.MinTest's 
> testGeneral() to test this, and uncomment line 79 in 
> MatrixScalarArithmeticGPUInstruction once it is tested.
> [~tanuj]





[jira] [Closed] (SYSTEMML-1228) SparkListener signatures have changed from Spark 2.0 to 2.1 and fail project build

2017-02-14 Thread Felix Schüler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felix Schüler closed SYSTEMML-1228.
---
   Resolution: Fixed
 Assignee: Niketan Pansare  (was: Felix Schüler)
Fix Version/s: SystemML 0.13

Resolved by https://github.com/apache/incubator-systemml/pull/376

> SparkListener signatures have changed from Spark 2.0 to 2.1 and fail project 
> build
> --
>
> Key: SYSTEMML-1228
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1228
> Project: SystemML
>  Issue Type: Bug
>Reporter: Felix Schüler
>Assignee: Niketan Pansare
> Fix For: SystemML 0.13
>
>
> The method signatures in SparkListener have changed and break our compilation. 
> Our code uses methods from a Spark-internal class and is therefore not 
> backward compatible from Spark 2.1 to 2.0.





[jira] [Assigned] (SYSTEMML-1242) Perftest: OutOfMemoryError in MultiLogReg for 80g sparse

2017-02-14 Thread Imran Younus (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Imran Younus reassigned SYSTEMML-1242:
--

Assignee: Matthias Boehm

> Perftest: OutOfMemoryError in MultiLogReg for 80g sparse
> 
>
> Key: SYSTEMML-1242
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1242
> Project: SystemML
>  Issue Type: Bug
>  Components: Test
>Affects Versions: SystemML 0.13
> Environment: spark 2.1.0
>Reporter: Imran Younus
>Assignee: Matthias Boehm
> Fix For: SystemML 0.13
>
> Attachments: sparkDML.sh
>
>
> When running the {{runMultiLogReg.sh}} script, {{MultiLogReg.dml}} ends with 
> an OutOfMemoryError for the case of 10M_1K sparse data and {{icpt = 1}}. Here 
> is the end of the log file:
> {code}
> 17/02/04 17:20:33 INFO api.DMLScript: SystemML Statistics:
> Total elapsed time:              697.694 sec.
> Total compilation time:          2.543 sec.
> Total execution time:            695.151 sec.
> Number of compiled Spark inst:   73.
> Number of executed Spark inst:   16.
> Cache hits (Mem, WB, FS, HDFS):  46/9/1/7.
> Cache writes (WB, FS, HDFS):     27/1/1.
> Cache times (ACQr/m, RLS, EXP):  281.541/0.003/131.589/48.737 sec.
> HOP DAGs recompiled (PRED, SB):  0/15.
> HOP DAGs recompile time:         0.067 sec.
> Spark ctx create time (lazy):    31.078 sec.
> Spark trans counts (par,bc,col): 5/4/0.
> Spark trans times (par,bc,col):  46.748/0.392/0.000 secs.
> Total JIT compile time:          151.254 sec.
> Total JVM GC count:              144.
> Total JVM GC time:               220.671 sec.
> Heavy hitter instructions (name, time, count):
> -- 1)  ba+*          144.194 sec  3
> -- 2)  rand          109.939 sec  9
> -- 3)  uark+         105.011 sec  2
> -- 4)  r'            100.933 sec  3
> -- 5)  sp_/          80.387 sec   1
> -- 6)  sp_mapmm      45.491 sec   2
> -- 7)  sp_tak+*      40.655 sec   1
> -- 8)  append        9.480 sec    1
> -- 9)  rangeReIndex  7.347 sec    2
> -- 10) sp_-          6.392 sec    3
> 17/02/04 17:20:33 INFO api.DMLScript: END DML run 02/04/2017 17:20:33
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlock(MatrixBlock.java:363)
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlock(MatrixBlock.java:339)
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlockUnsafe(MatrixBlock.java:408)
>   at 
> org.apache.sysml.runtime.io.MatrixReader.createOutputMatrixBlock(MatrixReader.java:107)
>   at 
> org.apache.sysml.runtime.io.ReaderBinaryBlockParallel.readMatrixFromHDFS(ReaderBinaryBlockParallel.java:59)
>   at 
> org.apache.sysml.runtime.util.DataConverter.readMatrixFromHDFS(DataConverter.java:203)
>   at 
> org.apache.sysml.runtime.util.DataConverter.readMatrixFromHDFS(DataConverter.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromHDFS(MatrixObject.java:425)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromHDFS(MatrixObject.java:60)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.readBlobFromHDFS(CacheableData.java:920)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromRDD(MatrixObject.java:478)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromRDD(MatrixObject.java:60)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.acquireRead(CacheableData.java:411)
>   at 
> org.apache.sysml.runtime.controlprogram.context.ExecutionContext.getMatrixInput(ExecutionContext.java:209)
>   at 
> org.apache.sysml.runtime.instructions.cp.AggregateBinaryCPInstruction.processInstruction(AggregateBinaryCPInstruction.java:74)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:290)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.IfProgramBlock.execute(IfProgramBlock.java:139)
>   at 
> org.apache.sysml.runtime.controlprogram.WhileProgramBlock.execute(WhileProgramBlock.java:165)
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
>   at org.apache.sysml.api.DMLScript.execute(DMLScript.java:684)
>   at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:360)
>   at org.apache.sysml.api.DMLScript.main(DMLScript.java:221)
>   at sun.reflect.NativeMe

[jira] [Assigned] (SYSTEMML-1243) Perftest: OutOfMemoryError in stratstats.dml for 800MB case

2017-02-14 Thread Imran Younus (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Imran Younus reassigned SYSTEMML-1243:
--

Assignee: Matthias Boehm

> Perftest: OutOfMemoryError in stratstats.dml for 800MB case
> ---
>
> Key: SYSTEMML-1243
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1243
> Project: SystemML
>  Issue Type: Bug
>  Components: Test
>Affects Versions: SystemML 0.13
> Environment: spark 2.1.0
>Reporter: Imran Younus
>Assignee: Matthias Boehm
> Fix For: SystemML 0.13
>
> Attachments: sparkDML.sh
>
>
> When running the {{runAllStats.sh}} script, {{stratstats.dml}} ends with an 
> OutOfMemoryError for the 100k_1k data set. Here is the end of the log file:
> {code}
> 17/02/06 16:09:25 INFO api.DMLScript: SystemML Statistics:
> Total elapsed time:              1435.880 sec.
> Total compilation time:          2.433 sec.
> Total execution time:            1433.447 sec.
> Number of compiled Spark inst:   190.
> Number of executed Spark inst:   3.
> Cache hits (Mem, WB, FS, HDFS):  72343/3/4/7.
> Cache writes (WB, FS, HDFS):     10419/5/0.
> Cache times (ACQr/m, RLS, EXP):  387.598/0.039/277.658/0.000 sec.
> HOP DAGs recompiled (PRED, SB):  0/107.
> HOP DAGs recompile time:         0.207 sec.
> Functions recompiled:            3.
> Functions recompile time:        0.026 sec.
> Spark ctx create time (lazy):    36.537 sec.
> Spark trans counts (par,bc,col): 3/3/0.
> Spark trans times (par,bc,col):  0.404/0.147/0.000 secs.
> Total JIT compile time:          63.262 sec.
> Total JVM GC count:              57.
> Total JVM GC time:               34.538 sec.
> Heavy hitter instructions (name, time, count):
> -- 1)  wdivmm          1078.568 sec  5
> -- 2)  ba+*            286.854 sec   22
> -- 3)  sp_mapmm        37.244 sec    3
> -- 4)  fStat_tailprob  2.071 sec     3
> -- 5)  rangeReIndex    1.608 sec     30601
> -- 6)  ==              0.974 sec     11
> -- 7)  ^2              0.793 sec     13
> -- 8)  cdf             0.603 sec     10200
> -- 9)  replace         0.349 sec     10
> -- 10) r'              0.278 sec     106
> 17/02/06 16:09:25 INFO api.DMLScript: END DML run 02/06/2017 16:09:25
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlock(MatrixBlock.java:363)
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlock(MatrixBlock.java:339)
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseOrSparseBlock(MatrixBlock.java:346)
>   at 
> org.apache.sysml.runtime.matrix.data.LibMatrixMult.matrixMultWDivMM(LibMatrixMult.java:752)
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.quaternaryOperations(MatrixBlock.java:5475)
>   at 
> org.apache.sysml.runtime.instructions.cp.QuaternaryCPInstruction.processInstruction(QuaternaryCPInstruction.java:128)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:290)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
>   at org.apache.sysml.api.DMLScript.execute(DMLScript.java:684)
>   at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:360)
>   at org.apache.sysml.api.DMLScript.main(DMLScript.java:221)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
>   at 
> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
>   at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
>   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
>   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> 17/02/06 16:09:27 INFO util.ShutdownHookManager: Shutdown hook called
> 17/02/06 16:09:27 INFO util.ShutdownHookManager: Deleting directory 
> /tmp/spark-6ca71fe9-1f44-4aa9-b57f-83e8ea0b1a33
> {code}





[jira] [Assigned] (SYSTEMML-1238) Python test failing for LinearRegCG

2017-02-14 Thread Imran Younus (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Imran Younus reassigned SYSTEMML-1238:
--

Assignee: Niketan Pansare

> Python test failing for LinearRegCG
> ---
>
> Key: SYSTEMML-1238
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1238
> Project: SystemML
>  Issue Type: Bug
>  Components: Algorithms, APIs
>Affects Versions: SystemML 0.13
>Reporter: Imran Younus
>Assignee: Niketan Pansare
> Attachments: python_LinearReg_test_spark.1.6.log, 
> python_LinearReg_test_spark.2.1.log
>
>
> [~deron] discovered that one of the Python tests ({{test_mllearn_df.py}}) 
> was failing with Spark 2.1.0 because the test score from linear regression 
> was very low ({{~ 0.24}}). I did some investigation, and it turns out that 
> the model parameters computed by the DML script are incorrect. In 
> SystemML 0.12, the values of the betas from the linear regression model are 
> {{\[152.919, 938.237\]}}. This is what we expect from the normal equation (I 
> also verified this with sklearn). But the betas from SystemML 0.13 
> (with Spark 2.1.0) come out to be {{\[153.146, 458.489\]}}. These are not 
> correct, and therefore the test score is much lower than expected. The data 
> going into the DML script is correct: I printed out the values of {{X}} and 
> {{Y}} in DML and didn't see any issue there.
> Attached are the log files for the two tests (SystemML 0.12 and 0.13) 
> with the explain flag.
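
For readers who want to cross-check betas independently, the normal-equation solution for a model with one intercept and one slope has a simple closed form. The sketch below assumes that two-parameter setup (two betas are reported above, but their order and the exact model are not stated in the issue); the class name is hypothetical and the code is unrelated to the LinearRegCG script.

```java
// Illustrative closed-form least squares for y ~ intercept + slope * x;
// NormalEquationSketch is a hypothetical name.
final class NormalEquationSketch {
    /** Returns {intercept, slope} minimizing the sum of squared residuals. */
    static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need at least two paired observations");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double sxy = 0.0; // sum of (x - meanX) * (y - meanY)
        double sxx = 0.0; // sum of (x - meanX)^2
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            sxy += dx * (y[i] - meanY);
            sxx += dx * dx;
        }
        double slope = sxy / sxx;
        return new double[]{meanY - slope * meanX, slope};
    }
}
```

A conjugate-gradient solver that has converged should agree with this closed form up to its tolerance, so a large gap in the slope points at the solver or its inputs rather than at rounding.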





[jira] [Commented] (SYSTEMML-1250) Binary artifact missing antlr-runtime and wink-json4j classes

2017-02-14 Thread Glenn Weidner (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866537#comment-15866537
 ] 

Glenn Weidner commented on SYSTEMML-1250:
-

Note that the 0.12 artifacts in the Maven repository do not have this issue:
https://repository.apache.org/content/repositories/releases/org/apache/systemml/systemml/0.12.0-incubating/

> Binary artifact missing antlr-runtime and wink-json4j classes
> -
>
> Key: SYSTEMML-1250
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1250
> Project: SystemML
>  Issue Type: Bug
>  Components: Build
>Reporter: Glenn Weidner
>
> The -bin artifacts (both 0.11 and 0.12) are missing the org/antlr/v4/runtime 
> and org/apache/wink/json4j classes.  Since the -bin artifact has a lib folder, 
> the corresponding jars can be included there.  For comparison, these classes 
> are included in systemml-0.12.0-incubating.jar at 
> https://repository.apache.org/content/repositories/releases/org/apache/systemml/systemml/0.12.0-incubating/,
>  and although there is a jar by that same name inside the -bin artifact, it 
> does not include the classes.  The same was observed for 
> systemml-0.11.0-incubating.





[jira] [Commented] (SYSTEMML-1250) Binary artifact missing antlr-runtime and wink-json4j classes

2017-02-14 Thread Felix Schüler (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866550#comment-15866550
 ] 

Felix Schüler commented on SYSTEMML-1250:
-

Interesting. Could we link to those? Or add a link with a note to download it 
from there?

> Binary artifact missing antlr-runtime and wink-json4j classes
> -
>
> Key: SYSTEMML-1250
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1250
> Project: SystemML
>  Issue Type: Bug
>  Components: Build
>Reporter: Glenn Weidner
>
> The -bin artifacts (both 0.11 and 0.12) are missing the org/antlr/v4/runtime 
> and org/apache/wink/json4j classes.  Since the -bin artifact has a lib folder, 
> the corresponding jars can be included there.  For comparison, these classes 
> are included in systemml-0.12.0-incubating.jar at 
> https://repository.apache.org/content/repositories/releases/org/apache/systemml/systemml/0.12.0-incubating/,
>  and although there is a jar by that same name inside the -bin artifact, it 
> does not include the classes.  The same was observed for 
> systemml-0.11.0-incubating.





[jira] [Created] (SYSTEMML-1264) Improve error message for undefined functions

2017-02-14 Thread Felix Schüler (JIRA)

Felix Schüler created SYSTEMML-1264:
---

 Summary: Improve error message for undefined functions
 Key: SYSTEMML-1264
 URL: https://issues.apache.org/jira/browse/SYSTEMML-1264
 Project: SystemML
  Issue Type: Bug
Reporter: Felix Schüler
Priority: Minor


When a script calls a function that isn't defined, we currently report a parse 
issue that suggests assigning the result to a temporary variable, instead of 
throwing the LanguageException for the undefined function.

Consider the example script

{code}
a = 5
println("a is: " + a)
{code}

This will throw the following parse issue:

{code}
17/02/14 12:11:08 INFO api.DMLScript: BEGIN DML run 02/14/2017 12:11:08
17/02/14 12:11:08 ERROR api.DMLScript: scripts/parseIssue.dml line 2:0 function 
call needs to have lvalue (Quickfix: change it to 'tmpVar = println(...)')

--
The following parse issue was encountered:
scripts/parseIssue.dml [line 2:0] [Validation error] -> println("a is: " + a)
   function call needs to have lvalue (Quickfix: change it to 'tmpVar = 
println(...)')
{code}

After applying that quickfix (adding tmpVar = ... in front of the call), the 
correct error is thrown:

{code}
17/02/14 12:12:55 INFO api.DMLScript: BEGIN DML run 02/14/2017 12:12:55
17/02/14 12:12:55 ERROR api.DMLScript: Failed to execute DML script.
org.apache.sysml.parser.LanguageException: ERROR: scripts/parseIssue.dml -- 
line 2, column 0 -- function println is undefined in namespace .defaultNS
at 
org.apache.sysml.parser.StatementBlock.isMergeableFunctionCallBlock(StatementBlock.java:187)
at 
org.apache.sysml.parser.StatementBlock.mergeFunctionCalls(StatementBlock.java:315)
at 
org.apache.sysml.parser.DMLTranslator.liveVariableAnalysis(DMLTranslator.java:209)
at org.apache.sysml.api.DMLScript.execute(DMLScript.java:599)
at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:360)
at org.apache.sysml.api.DMLScript.main(DMLScript.java:221)
Exception in thread "main" org.apache.sysml.api.DMLException: 
org.apache.sysml.parser.LanguageException: ERROR: scripts/parseIssue.dml -- 
line 2, column 0 -- function println is undefined in namespace .defaultNS
at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:374)
at org.apache.sysml.api.DMLScript.main(DMLScript.java:221)
Caused by: org.apache.sysml.parser.LanguageException: ERROR: 
scripts/parseIssue.dml -- line 2, column 0 -- function println is undefined in 
namespace .defaultNS
at 
org.apache.sysml.parser.StatementBlock.isMergeableFunctionCallBlock(StatementBlock.java:187)
at 
org.apache.sysml.parser.StatementBlock.mergeFunctionCalls(StatementBlock.java:315)
at 
org.apache.sysml.parser.DMLTranslator.liveVariableAnalysis(DMLTranslator.java:209)
at org.apache.sysml.api.DMLScript.execute(DMLScript.java:599)
at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:360)
... 1 more
{code}
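
One way to get the more helpful message first is to check whether the function exists before checking whether the call has an lvalue. The sketch below only illustrates that ordering; the class, method, and parameters are hypothetical and do not correspond to the actual SystemML parser or validator, and the message texts are copied from the logs above.

```java
import java.util.Set;

// Illustrative ordering of two validation checks; FunctionCallCheckSketch and
// its parameters are hypothetical, not SystemML parser code.
final class FunctionCallCheckSketch {
    /**
     * Returns an error message for a function-call statement, or null if the
     * statement passes both checks. The undefined-function check runs first so
     * that it is not masked by the missing-lvalue message.
     */
    static String validate(String functionName, boolean hasLvalue,
                           Set<String> definedFunctions) {
        if (!definedFunctions.contains(functionName)) {
            return "function " + functionName
                    + " is undefined in namespace .defaultNS";
        }
        if (!hasLvalue) {
            return "function call needs to have lvalue";
        }
        return null;
    }
}
```

With this ordering, the example script would report the undefined println directly instead of first asking for a temporary variable.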






[jira] [Closed] (SYSTEMML-941) Add support for cusparse axpy

2017-02-14 Thread Niketan Pansare (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niketan Pansare closed SYSTEMML-941.

   Resolution: Fixed
Fix Version/s: (was: SystemML 0.11)
   SystemML 0.12

> Add support for cusparse axpy
> -
>
> Key: SYSTEMML-941
> URL: https://issues.apache.org/jira/browse/SYSTEMML-941
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Niketan Pansare
>Assignee: Nakul Jindal
> Fix For: SystemML 0.12
>
>
> See LibMatrixCUDA's vectorScalarMultiply()
> [~nakul02]





[jira] [Closed] (SYSTEMML-940) Add support for cusparse geam

2017-02-14 Thread Niketan Pansare (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niketan Pansare closed SYSTEMML-940.

   Resolution: Fixed
Fix Version/s: (was: SystemML 0.11)
   SystemML 0.12

> Add support for cusparse geam 
> --
>
> Key: SYSTEMML-940
> URL: https://issues.apache.org/jira/browse/SYSTEMML-940
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Niketan Pansare
>Assignee: Nakul Jindal
> Fix For: SystemML 0.12
>
>
> Sparse dgeam functionality is required for LibMatrixCUDA's transpose(), 
> matScalarElementwiseMultiplyDivide(), and cellwiseMatMatAddSubtract().
> [~nakul02]





[jira] [Commented] (SYSTEMML-935) Improve the performance of sparse TSMM either by using/implement sparse dsyrk

2017-02-14 Thread Niketan Pansare (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866611#comment-15866611
 ] 

Niketan Pansare commented on SYSTEMML-935:
--

[~nakul02] Can you please confirm whether this JIRA can be closed?

> Improve the performance of sparse TSMM either by using/implement sparse dsyrk
> -
>
> Key: SYSTEMML-935
> URL: https://issues.apache.org/jira/browse/SYSTEMML-935
> Project: SystemML
>  Issue Type: Sub-task
>Reporter: Niketan Pansare
>Assignee: Nakul Jindal
>
> Implement this either by adding a custom kernel or by using the cuSPARSE API.
> See org.apache.sysml.runtime.matrix.data.LibMatrixCUDA's matmultTSMM() 
> method. Please move this to Phase 2 if a custom kernel is required.
> [~nakul02]





[jira] [Commented] (SYSTEMML-938) Make sparse memory estimation robust by handling unknown nnz.

2017-02-14 Thread Niketan Pansare (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866615#comment-15866615
 ] 

Niketan Pansare commented on SYSTEMML-938:
--

[~nakul02] Can you confirm whether this issue is resolved?

> Make sparse memory estimation robust by handling unknown nnz.
> -
>
> Key: SYSTEMML-938
> URL: https://issues.apache.org/jira/browse/SYSTEMML-938
> Project: SystemML
>  Issue Type: Sub-task
>Reporter: Niketan Pansare
>
> What happens in CSRPointer.estimateSize(mat.getNnz(), mat.getNumRows()) if 
> nnz is unknown?
> [~nakul02]
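
One possible shape for a more robust estimate is sketched below. It makes several assumptions that are not confirmed by the issue: that an unknown nnz is signalled by a negative value, that CSR storage uses 8-byte values with 4-byte column indices and row pointers, and that the worst case (a fully dense matrix) is an acceptable fallback. The class is hypothetical, and its three-argument signature differs from the two-argument CSRPointer.estimateSize call quoted above.

```java
// Illustrative estimate only; CsrSizeSketch is a hypothetical name and the
// byte sizes and negative-nnz convention are assumptions.
final class CsrSizeSketch {
    /**
     * Estimates the bytes needed for a CSR matrix. If nnz is negative
     * (treated here as "unknown"), falls back to the worst case rows * cols.
     * Overflow raises ArithmeticException instead of wrapping around.
     */
    static long estimateSize(long nnz, long rows, long cols) {
        long effectiveNnz = (nnz < 0) ? Math.multiplyExact(rows, cols) : nnz;
        long valueBytes = Math.multiplyExact(effectiveNnz, 8L);    // double values
        long colIndexBytes = Math.multiplyExact(effectiveNnz, 4L); // int column indices
        long rowPtrBytes = Math.multiplyExact(rows + 1, 4L);       // int row pointers
        return Math.addExact(Math.addExact(valueBytes, colIndexBytes), rowPtrBytes);
    }
}
```

Falling back to the dense worst case is conservative; it can overestimate heavily for very sparse inputs, so a caller might prefer to defer the decision until nnz is known.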





[jira] [Assigned] (SYSTEMML-731) Conduct initial performance experiments for mat mult

2017-02-14 Thread Niketan Pansare (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niketan Pansare reassigned SYSTEMML-731:


Assignee: Nakul Jindal  (was: Niketan Pansare)

> Conduct initial performance experiments for mat mult
> 
>
> Key: SYSTEMML-731
> URL: https://issues.apache.org/jira/browse/SYSTEMML-731
> Project: SystemML
>  Issue Type: Sub-task
>Reporter: Niketan Pansare
>Assignee: Nakul Jindal
> Fix For: SystemML 0.12
>
>
> Before the PR https://github.com/apache/incubator-systemml/pull/165 gets 
> merged, initial performance experiments need to be conducted for dense-dense 
> mat mult.
> [~nakul02] [~mboehm7]





[jira] [Closed] (SYSTEMML-731) Conduct initial performance experiments for mat mult

2017-02-14 Thread Niketan Pansare (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niketan Pansare closed SYSTEMML-731.

   Resolution: Fixed
Fix Version/s: SystemML 0.12

> Conduct initial performance experiments for mat mult
> 
>
> Key: SYSTEMML-731
> URL: https://issues.apache.org/jira/browse/SYSTEMML-731
> Project: SystemML
>  Issue Type: Sub-task
>Reporter: Niketan Pansare
>Assignee: Nakul Jindal
> Fix For: SystemML 0.12
>
>
> Before the PR https://github.com/apache/incubator-systemml/pull/165 gets 
> merged, initial performance experiments need to be conducted for dense-dense 
> mat mult.
> [~nakul02] [~mboehm7]





[jira] [Assigned] (SYSTEMML-938) Make sparse memory estimation robust by handling unknown nnz.

2017-02-14 Thread Niketan Pansare (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niketan Pansare reassigned SYSTEMML-938:


Assignee: Nakul Jindal

> Make sparse memory estimation robust by handling unknown nnz.
> -
>
> Key: SYSTEMML-938
> URL: https://issues.apache.org/jira/browse/SYSTEMML-938
> Project: SystemML
>  Issue Type: Sub-task
>Reporter: Niketan Pansare
>Assignee: Nakul Jindal
>
> What happens in CSRPointer.estimateSize(mat.getNnz(), mat.getNumRows()) if 
> nnz is unknown?
> [~nakul02]





[jira] [Commented] (SYSTEMML-943) Create documentation explaining setup/usage for the GPU backend

2017-02-14 Thread Niketan Pansare (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866621#comment-15866621
 ] 

Niketan Pansare commented on SYSTEMML-943:
--

The documentation 
https://github.com/apache/incubator-systemml/blob/master/docs/devdocs/gpu-backend.md
 needs to be moved to the doc folder after the GPU backend is marked as stable.

> Create documentation explaining setup/usage for the GPU backend
> ---
>
> Key: SYSTEMML-943
> URL: https://issues.apache.org/jira/browse/SYSTEMML-943
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Niketan Pansare
> Fix For: SystemML 0.11
>
>






[jira] [Updated] (SYSTEMML-943) Create documentation explaining setup/usage for the GPU backend

2017-02-14 Thread Niketan Pansare (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niketan Pansare updated SYSTEMML-943:
-
Fix Version/s: (was: SystemML 0.11)
   SystemML 1.0

> Create documentation explaining setup/usage for the GPU backend
> ---
>
> Key: SYSTEMML-943
> URL: https://issues.apache.org/jira/browse/SYSTEMML-943
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler, Runtime
>Reporter: Niketan Pansare
> Fix For: SystemML 1.0
>
>






[jira] [Assigned] (SYSTEMML-943) Create documentation explaining setup/usage for the GPU backend

2017-02-14 Thread Niketan Pansare (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niketan Pansare reassigned SYSTEMML-943:


Assignee: Niketan Pansare

> Create documentation explaining setup/usage for the GPU backend
> ---
>
> Key: SYSTEMML-943
> URL: https://issues.apache.org/jira/browse/SYSTEMML-943
> Project: SystemML
> Issue Type: Sub-task
> Components: Compiler, Runtime
> Reporter: Niketan Pansare
> Assignee: Niketan Pansare
> Fix For: SystemML 1.0
>
>






[jira] [Assigned] (SYSTEMML-942) Enable setting GPU from MLContext (and related APIs)

2017-02-14 Thread Niketan Pansare (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niketan Pansare reassigned SYSTEMML-942:


Assignee: Nakul Jindal

> Enable setting GPU from MLContext (and related APIs)
> 
>
> Key: SYSTEMML-942
> URL: https://issues.apache.org/jira/browse/SYSTEMML-942
> Project: SystemML
> Issue Type: Sub-task
> Components: Compiler, Runtime
> Reporter: Niketan Pansare
> Assignee: Nakul Jindal
> Fix For: SystemML 0.11
>
>
> On setting GPU, MLContext needs to invoke:
> DMLScript.USE_ACCELERATOR = true;
> GPUContext.createGPUContext(); // Set GPU memory budget
> and upon exit, it should ec.destroyGPUContext();
> This logic should be enabled via Java/Python MLContext, MLPipeline/MLLearn 
> wrappers.
> [~deron] [~mwdus...@us.ibm.com]
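
The lifecycle described above (set the accelerator flag, create the GPU context, destroy the context on exit) resembles a scoped resource. The following is only a minimal Python sketch of that pattern. `FakeRuntime`, `gpu_enabled`, and all of their members are hypothetical stand-ins introduced for illustration; they are not SystemML or MLContext APIs.

```python
from contextlib import contextmanager


class FakeRuntime:
    """Hypothetical stand-in for the runtime state MLContext would touch."""

    def __init__(self):
        self.use_accelerator = False     # plays the role of DMLScript.USE_ACCELERATOR
        self.gpu_context_alive = False   # plays the role of the GPU context

    def create_gpu_context(self):
        self.gpu_context_alive = True

    def destroy_gpu_context(self):
        self.gpu_context_alive = False


@contextmanager
def gpu_enabled(runtime):
    """Enable the GPU for the duration of a block and always clean up."""
    runtime.use_accelerator = True
    runtime.create_gpu_context()
    try:
        yield runtime
    finally:
        # Runs on normal exit and on exceptions, mirroring "upon exit".
        runtime.destroy_gpu_context()
```

With this shape, a wrapper can run a script inside `with gpu_enabled(runtime):` and the context is released even if the script fails.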





[jira] [Updated] (SYSTEMML-1033) Add LU and QR functionality to GPU backend

2017-02-14 Thread Niketan Pansare (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niketan Pansare updated SYSTEMML-1033:
--
Fix Version/s: (was: SystemML 0.11)
   SystemML 1.0

> Add LU and QR functionality to GPU backend
> --
>
> Key: SYSTEMML-1033
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1033
> Project: SystemML
> Issue Type: Sub-task
> Components: Runtime
> Reporter: Niketan Pansare
> Fix For: SystemML 1.0
>
>
> For LU: See JCublas2's cublasDgetrfBatched method
> For QR: See JCublas2's cublasDgeqrfBatched method
> The key changes required:
> 1. Add GPU backend in 
> https://github.com/apache/incubator-systemml/blob/master/src/main/java/org/apache/sysml/hops/FunctionOp.java#L239
> 2. Add MultiReturnBuiltinGPUInstruction that invokes above functions either 
> directly or through LibMatrixCUDA.
> [~nakul02] Do you want to take a pass at this?





[jira] [Updated] (SYSTEMML-942) Enable setting GPU from MLContext (and related APIs)

2017-02-14 Thread Niketan Pansare (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niketan Pansare updated SYSTEMML-942:
-
Fix Version/s: (was: SystemML 0.11)
   SystemML 1.0

> Enable setting GPU from MLContext (and related APIs)
> 
>
> Key: SYSTEMML-942
> URL: https://issues.apache.org/jira/browse/SYSTEMML-942
> Project: SystemML
> Issue Type: Sub-task
> Components: Compiler, Runtime
> Reporter: Niketan Pansare
> Assignee: Nakul Jindal
> Fix For: SystemML 1.0
>
>
> On setting GPU, MLContext needs to invoke:
> DMLScript.USE_ACCELERATOR = true;
> GPUContext.createGPUContext(); // Set GPU memory budget
> and upon exit, it should ec.destroyGPUContext();
> This logic should be enabled via Java/Python MLContext, MLPipeline/MLLearn 
> wrappers.
> [~deron] [~mwdus...@us.ibm.com]





[jira] [Assigned] (SYSTEMML-1033) Add LU and QR functionality to GPU backend

2017-02-14 Thread Niketan Pansare (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niketan Pansare reassigned SYSTEMML-1033:
-

Assignee: Nakul Jindal

> Add LU and QR functionality to GPU backend
> --
>
> Key: SYSTEMML-1033
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1033
> Project: SystemML
> Issue Type: Sub-task
> Components: Runtime
> Reporter: Niketan Pansare
> Assignee: Nakul Jindal
> Fix For: SystemML 1.0
>
>
> For LU: See JCublas2's cublasDgetrfBatched method
> For QR: See JCublas2's cublasDgeqrfBatched method
> The key changes required:
> 1. Add GPU backend in 
> https://github.com/apache/incubator-systemml/blob/master/src/main/java/org/apache/sysml/hops/FunctionOp.java#L239
> 2. Add MultiReturnBuiltinGPUInstruction that invokes above functions either 
> directly or through LibMatrixCUDA.
> [~nakul02] Do you want to take a pass at this?





[jira] [Updated] (SYSTEMML-446) Phase 1: Exploit GPU BLAS libraries (integration)

2017-02-14 Thread Niketan Pansare (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niketan Pansare updated SYSTEMML-446:
-
Fix Version/s: (was: SystemML 0.11)
   SystemML 0.13

> Phase 1: Exploit GPU BLAS libraries (integration)
> -
>
> Key: SYSTEMML-446
> URL: https://issues.apache.org/jira/browse/SYSTEMML-446
> Project: SystemML
> Issue Type: Task
> Components: Compiler, Runtime
> Reporter: Matthias Boehm
> Assignee: Niketan Pansare
> Fix For: SystemML 0.13
>
>






[jira] [Updated] (SYSTEMML-1034) Implement solve builtin function using cublas kernels

2017-02-14 Thread Niketan Pansare (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niketan Pansare updated SYSTEMML-1034:
--
Fix Version/s: (was: SystemML 0.11)
   SystemML 1.0

> Implement solve builtin function using cublas kernels
> -
>
> Key: SYSTEMML-1034
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1034
> Project: SystemML
> Issue Type: Sub-task
> Components: Runtime
> Reporter: Niketan Pansare
> Fix For: SystemML 1.0
>
>
> 1. Extend BinaryOp to enable GPU for solve
> 2. Add MatrixMatrixBuiltinGPUInstruction and use JCuBlas2's 
> cublasDtrsmBatched and cublasDgeqrfBatched (or cublasDgetrfBatched) methods.
> For reference implementation, see 
> https://github.com/apache/incubator-systemml/blob/master/src/main/java/org/apache/sysml/runtime/matrix/data/LibCommonsMath.java#L97
> [~nakul02]
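
To make the numerical recipe behind these routines concrete, here is a plain Python sketch of solving a single system with an LU factorization (the getrf-style step) followed by triangular solves (the trsm-style step). It is an illustration only: it runs on the CPU, is not batched, does not call JCuBlas, and does not cover the QR (geqrf) alternative mentioned above. The name `lu_solve` is hypothetical.

```python
def lu_solve(a, b):
    """Solve a x = b for a square matrix a (list of rows) and a vector b."""
    n = len(a)
    lu = [row[:] for row in a]      # work on a copy of the input
    perm = list(range(n))           # row permutation produced by pivoting

    # LU factorization with partial pivoting (getrf-style step).
    for k in range(n):
        pivot = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[pivot][k] == 0.0:
            raise ValueError("matrix is singular")
        lu[k], lu[pivot] = lu[pivot], lu[k]
        perm[k], perm[pivot] = perm[pivot], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]                   # multiplier, stored in L
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]    # update trailing block

    # Forward substitution with the unit lower-triangular factor L.
    y = [0.0] * n
    for i in range(n):
        y[i] = b[perm[i]] - sum(lu[i][j] * y[j] for j in range(i))

    # Back substitution with the upper-triangular factor U.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(lu[i][j] * x[j] for j in range(i + 1, n))) / lu[i][i]
    return x
```

For example, `lu_solve([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0])` solves the system 2x + y = 3, x + 3y = 5.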





[jira] [Resolved] (SYSTEMML-1253) Apply wdivmm-basic only for non-sparse factors

2017-02-14 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm resolved SYSTEMML-1253.
--
   Resolution: Fixed
Fix Version/s: SystemML 0.13

> Apply wdivmm-basic only for non-sparse factors
> --
>
> Key: SYSTEMML-1253
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1253
> Project: SystemML
> Issue Type: Sub-task
> Reporter: Matthias Boehm
> Assignee: Matthias Boehm
> Fix For: SystemML 0.13
>
>






[jira] [Created] (SYSTEMML-1265) ArrayIndexOutOfBoundsException in interQuartileMean(X, W)

2017-02-14 Thread JIRA
Felix Schüler created SYSTEMML-1265:
---

Summary: ArrayIndexOutOfBoundsException in interQuartileMean(X, W)
Key: SYSTEMML-1265
URL: https://issues.apache.org/jira/browse/SYSTEMML-1265
Project: SystemML
Issue Type: Bug
Reporter: Felix Schüler
Priority: Minor


The interQuartileMean(X, W) function seems to be broken. The following script 
throws an {code} ArrayIndexOutOfBoundsException {code}

{code}
A = matrix("0.0 1.0 2.0 1.0 0.0", rows=5, cols=1)
W = matrix("0.0 0.5 1.0 0.5 0.0", rows=5, cols=1)
a = interQuartileMean(A)
b = interQuartileMean(A, W)
print("a is: " + a + " and b is: " + b)
{code}

Error:
{code}
17/02/14 13:56:09 INFO api.DMLScript: BEGIN DML run 02/14/2017 13:56:09
17/02/14 13:56:10 INFO api.DMLScript: SystemML Statistics:
Total execution time:   0.368 sec.
Number of executed MR Jobs: 0.

17/02/14 13:56:10 INFO api.DMLScript: END DML run 02/14/2017 13:56:10
17/02/14 13:56:10 ERROR api.DMLScript: Failed to execute DML script.
org.apache.sysml.runtime.DMLRuntimeException: 
org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
block generated from statement block between lines 1 and 5 -- Error evaluating 
instruction: CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
at 
org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:130)
at org.apache.sysml.api.DMLScript.execute(DMLScript.java:684)
at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:360)
at org.apache.sysml.api.DMLScript.main(DMLScript.java:221)
Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error 
in program block generated from statement block between lines 1 and 5 -- Error 
evaluating instruction: 
CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:320)
at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
at 
org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
... 3 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 9
at 
org.apache.sysml.runtime.matrix.data.MatrixBlock.quickGetValue(MatrixBlock.java:576)
at 
org.apache.sysml.runtime.matrix.data.MatrixBlock.interQuartileMean(MatrixBlock.java:4693)
at 
org.apache.sysml.runtime.instructions.cp.QuantilePickCPInstruction.processInstruction(QuantilePickCPInstruction.java:177)
at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:290)
... 6 more
Exception in thread "main" org.apache.sysml.api.DMLException: 
org.apache.sysml.runtime.DMLRuntimeException: 
org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
block generated from statement block between lines 1 and 5 -- Error evaluating 
instruction: CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:374)
at org.apache.sysml.api.DMLScript.main(DMLScript.java:221)
Caused by: org.apache.sysml.runtime.DMLRuntimeException: 
org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
block generated from statement block between lines 1 and 5 -- Error evaluating 
instruction: CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
at 
org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:130)
at org.apache.sysml.api.DMLScript.execute(DMLScript.java:684)
at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:360)
... 1 more
Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error 
in program block generated from statement block between lines 1 and 5 -- Error 
evaluating instruction: 
CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:320)
at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
at 
org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
... 3 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 9
at 
org.apache.sysml.runtime.matrix.data.MatrixBlock.quickGetValue(MatrixBlock.java:576)
at 
org.apache.sysml.runtime.matrix.data.MatrixBlock.interQuartileMean(MatrixBlock.java:4693)
at 
org.apache.sysml.runtime.instructions.cp.QuantilePickCPInstruction.processInstruction(QuantilePickCPInstruction.java:177)
at 
org.apache.sysml.runtime.controlprogram.ProgramBloc

[jira] [Updated] (SYSTEMML-1265) ArrayIndexOutOfBoundsException in interQuartileMean(X, W)

2017-02-14 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Felix Schüler updated SYSTEMML-1265:

Description: 
The interQuartileMean(X, W) function seems to be broken. The following script 
throws an {code} ArrayIndexOutOfBoundsException {code}

{code}
A = matrix("0.0 1.0 2.0 1.0 0.0", rows=5, cols=1)
W = matrix("0.0 0.5 1.0 0.5 0.0", rows=5, cols=1)

b = interQuartileMean(A, W)
print("b is: " + b)
{code}

Error:
{code}
17/02/14 13:56:09 INFO api.DMLScript: BEGIN DML run 02/14/2017 13:56:09
17/02/14 13:56:10 INFO api.DMLScript: SystemML Statistics:
Total execution time:   0.368 sec.
Number of executed MR Jobs: 0.

17/02/14 13:56:10 INFO api.DMLScript: END DML run 02/14/2017 13:56:10
17/02/14 13:56:10 ERROR api.DMLScript: Failed to execute DML script.
org.apache.sysml.runtime.DMLRuntimeException: 
org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
block generated from statement block between lines 1 and 5 -- Error evaluating 
instruction: CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
at 
org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:130)
at org.apache.sysml.api.DMLScript.execute(DMLScript.java:684)
at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:360)
at org.apache.sysml.api.DMLScript.main(DMLScript.java:221)
Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error 
in program block generated from statement block between lines 1 and 5 -- Error 
evaluating instruction: 
CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:320)
at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
at 
org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
... 3 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 9
at 
org.apache.sysml.runtime.matrix.data.MatrixBlock.quickGetValue(MatrixBlock.java:576)
at 
org.apache.sysml.runtime.matrix.data.MatrixBlock.interQuartileMean(MatrixBlock.java:4693)
at 
org.apache.sysml.runtime.instructions.cp.QuantilePickCPInstruction.processInstruction(QuantilePickCPInstruction.java:177)
at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:290)
... 6 more
Exception in thread "main" org.apache.sysml.api.DMLException: 
org.apache.sysml.runtime.DMLRuntimeException: 
org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
block generated from statement block between lines 1 and 5 -- Error evaluating 
instruction: CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:374)
at org.apache.sysml.api.DMLScript.main(DMLScript.java:221)
Caused by: org.apache.sysml.runtime.DMLRuntimeException: 
org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
block generated from statement block between lines 1 and 5 -- Error evaluating 
instruction: CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
at 
org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:130)
at org.apache.sysml.api.DMLScript.execute(DMLScript.java:684)
at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:360)
... 1 more
Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error 
in program block generated from statement block between lines 1 and 5 -- Error 
evaluating instruction: 
CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:320)
at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
at 
org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
... 3 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 9
at 
org.apache.sysml.runtime.matrix.data.MatrixBlock.quickGetValue(MatrixBlock.java:576)
at 
org.apache.sysml.runtime.matrix.data.MatrixBlock.interQuartileMean(MatrixBlock.java:4693)
at 
org.apache.sysml.runtime.instructions.cp.QuantilePickCPInstruction.processInstruction(QuantilePickCPInstruction.java:177)
at 
org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:290)
... 6 more

{code}

  was:
The interQuartileMean(X, W) function seems to be broken. The following script 
throws an {code} ArrayIndexOutOfBoundsException 

[jira] [Resolved] (SYSTEMML-1256) Add download link for previous releases

2017-02-14 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner resolved SYSTEMML-1256.
-
   Resolution: Fixed
Fix Version/s: SystemML 0.13

Fixed with [PR 39|https://github.com/apache/incubator-systemml-website/pull/39].

> Add download link for previous releases
> ---
>
> Key: SYSTEMML-1256
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1256
> Project: SystemML
> Issue Type: Improvement
> Components: Website
> Reporter: Glenn Weidner
> Assignee: Glenn Weidner
> Priority: Minor
> Fix For: SystemML 0.13
>
>
> Add link at bottom of download page for Previous Releases:
> http://archive.apache.org/dist/incubator/systemml/





[jira] [Closed] (SYSTEMML-1256) Add download link for previous releases

2017-02-14 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner closed SYSTEMML-1256.
---

Closing since published to website.

> Add download link for previous releases
> ---
>
> Key: SYSTEMML-1256
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1256
> Project: SystemML
> Issue Type: Improvement
> Components: Website
> Reporter: Glenn Weidner
> Assignee: Glenn Weidner
> Priority: Minor
> Fix For: SystemML 0.13
>
>
> Add link at bottom of download page for Previous Releases:
> http://archive.apache.org/dist/incubator/systemml/





[jira] [Commented] (SYSTEMML-1265) ArrayIndexOutOfBoundsException in interQuartileMean(X, W)

2017-02-14 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866831#comment-15866831
 ] 

Matthias Boehm commented on SYSTEMML-1265:
--

Thanks for reporting this [~fschueler] - it's probably due to the weights, which 
are rather uncommon. I'll have a look later tonight.
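
For readers unfamiliar with weighted quantile functions, the sketch below shows one possible reading of a weighted interquartile mean. It assumes, without confirmation from this thread, that each weight is an integer repetition count, and it uses a simplified trimming rule (drop a quarter of the entries from each end) rather than SystemML's actual algorithm. The function name `weighted_iqm` is hypothetical. Under that assumption, fractional weights such as 0.5 have no obvious meaning, so the sketch rejects them with a clear error instead of indexing out of range.

```python
def weighted_iqm(values, weights):
    """Simplified interquartile mean with integer repetition weights."""
    expanded = []
    for value, weight in zip(values, weights):
        # Assumption: a weight is the number of times a value is repeated.
        if weight < 0 or weight != int(weight):
            raise ValueError(f"unsupported weight: {weight!r}")
        expanded.extend([value] * int(weight))
    if not expanded:
        raise ValueError("weights sum to zero")
    expanded.sort()
    trim = len(expanded) // 4                      # entries dropped from each end
    middle = expanded[trim:len(expanded) - trim]   # never empty for non-empty input
    return sum(middle) / len(middle)
```

Calling it with the values from the report and the weights doubled to integers, `weighted_iqm([0.0, 1.0, 2.0, 1.0, 0.0], [0, 1, 2, 1, 0])`, averages the middle of the sorted expansion `[1.0, 1.0, 2.0, 2.0]`.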

> ArrayIndexOutOfBoundsException in interQuartileMean(X, W)
> -
>
> Key: SYSTEMML-1265
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1265
> Project: SystemML
> Issue Type: Bug
> Reporter: Felix Schüler
> Priority: Minor
>
> The interQuartileMean(X, W) function seems to be broken. The following script 
> throws an {code} ArrayIndexOutOfBoundsException {code}
> {code}
> A = matrix("0.0 1.0 2.0 1.0 0.0", rows=5, cols=1)
> W = matrix("0.0 0.5 1.0 0.5 0.0", rows=5, cols=1)
> b = interQuartileMean(A, W)
> print("b is: " + b)
> {code}
> Error:
> {code}
> 17/02/14 13:56:09 INFO api.DMLScript: BEGIN DML run 02/14/2017 13:56:09
> 17/02/14 13:56:10 INFO api.DMLScript: SystemML Statistics:
> Total execution time: 0.368 sec.
> Number of executed MR Jobs:   0.
> 17/02/14 13:56:10 INFO api.DMLScript: END DML run 02/14/2017 13:56:10
> 17/02/14 13:56:10 ERROR api.DMLScript: Failed to execute DML script.
> org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
> block generated from statement block between lines 1 and 5 -- Error 
> evaluating instruction: 
> CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:130)
>   at org.apache.sysml.api.DMLScript.execute(DMLScript.java:684)
>   at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:360)
>   at org.apache.sysml.api.DMLScript.main(DMLScript.java:221)
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error 
> in program block generated from statement block between lines 1 and 5 -- 
> Error evaluating instruction: 
> CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:320)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
>   ... 3 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 9
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.quickGetValue(MatrixBlock.java:576)
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.interQuartileMean(MatrixBlock.java:4693)
>   at 
> org.apache.sysml.runtime.instructions.cp.QuantilePickCPInstruction.processInstruction(QuantilePickCPInstruction.java:177)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:290)
>   ... 6 more
> Exception in thread "main" org.apache.sysml.api.DMLException: 
> org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
> block generated from statement block between lines 1 and 5 -- Error 
> evaluating instruction: 
> CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
>   at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:374)
>   at org.apache.sysml.api.DMLScript.main(DMLScript.java:221)
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
> block generated from statement block between lines 1 and 5 -- Error 
> evaluating instruction: 
> CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:130)
>   at org.apache.sysml.api.DMLScript.execute(DMLScript.java:684)
>   at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:360)
>   ... 1 more
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error 
> in program block generated from statement block between lines 1 and 5 -- 
> Error evaluating instruction: 
> CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:320)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
>   ... 3 more
> Caused by: java.la

[jira] [Resolved] (SYSTEMML-1240) Cleanup crc files when write scalar

2017-02-14 Thread Deron Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deron Eriksson resolved SYSTEMML-1240.
--
   Resolution: Fixed
Fix Version/s: SystemML 0.13

Fixed by [PR386|https://github.com/apache/incubator-systemml/pull/386].

> Cleanup crc files when write scalar
> ---
>
> Key: SYSTEMML-1240
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1240
> Project: SystemML
> Issue Type: Task
> Components: Runtime
> Reporter: Deron Eriksson
> Assignee: Deron Eriksson
> Priority: Minor
> Fix For: SystemML 0.13
>
>
> When writing a matrix, crc files are cleaned up from the file system. When 
> writing a scalar, crc files are not cleaned up.
> Example:
> {code}
> A = 2.0
> write(A,"ademo.scalar.txt")
> {code}
> Results in:
> .ademo.scalar.txt.crc
> .ademo.scalar.txt.mtd.crc
> ademo.scalar.txt
> ademo.scalar.txt.mtd
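
As an illustration of the cleanup described above, the following Python sketch removes the hidden `.crc` sidecar files for an output file and for its `.mtd` metadata file, using the naming pattern visible in the listing (`.<name>.crc`). The helper `remove_crc_sidecars` is a hypothetical name for this example and is not the change made in SystemML.

```python
import os


def remove_crc_sidecars(path):
    """Delete the hidden .crc files next to `path` and `path + ".mtd"`."""
    removed = []
    directory, name = os.path.split(path)
    for target in (name, name + ".mtd"):
        crc = os.path.join(directory, "." + target + ".crc")
        if os.path.isfile(crc):      # skip sidecars that do not exist
            os.remove(crc)
            removed.append(crc)
    return removed
```

Applied to `ademo.scalar.txt`, this would leave only `ademo.scalar.txt` and `ademo.scalar.txt.mtd` in the directory.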





[jira] [Created] (SYSTEMML-1266) Replace `README.txt` w/ `README.md` In Release Package

2017-02-14 Thread Mike Dusenberry (JIRA)
Mike Dusenberry created SYSTEMML-1266:
-

Summary: Replace `README.txt` w/ `README.md` In Release Package
Key: SYSTEMML-1266
URL: https://issues.apache.org/jira/browse/SYSTEMML-1266
Project: SystemML
Issue Type: Improvement
Reporter: Mike Dusenberry
Priority: Minor


Currently, our release package contains an old README.txt file that only 
describes the standalone mode.  We should replace that with the current 
{{README.md}} file that is in the root of the project.





[jira] [Resolved] (SYSTEMML-1010) Perftest 0.11 release and related improvements

2017-02-14 Thread Mike Dusenberry (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Dusenberry resolved SYSTEMML-1010.
---
Resolution: Fixed

> Perftest 0.11 release and related improvements
> --
>
> Key: SYSTEMML-1010
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1010
> Project: SystemML
> Issue Type: Umbrella
> Affects Versions: SystemML 0.11
> Reporter: Matthias Boehm
> Assignee: Matthias Boehm
> Fix For: SystemML 0.11
>
>






[jira] [Closed] (SYSTEMML-1259) Replace append with cbind for matrices

2017-02-14 Thread Deron Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deron Eriksson closed SYSTEMML-1259.


> Replace append with cbind for matrices
> --
>
> Key: SYSTEMML-1259
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1259
> Project: SystemML
> Issue Type: Task
> Components: Algorithms
> Reporter: Deron Eriksson
> Assignee: Deron Eriksson
> Fix For: SystemML 0.13
>
>
> DML Language Ref specifies: "NOTE: append() has been replaced by cbind(), so 
> its use is discouraged."
> So, cbind() should replace append() in all DML/PyDML scripts when used with 
> matrices. (note that append() can also be used to append strings to other 
> strings)
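
Because append() remains valid for strings, a blind search-and-replace could change string concatenations by mistake. The Python sketch below only lists candidate `append(` calls in a script so that each one can be reviewed by hand. It is a hypothetical helper for illustration, not the tooling used for this change, and it naively treats everything after `#` on a line as a comment, so a `#` inside a string literal would cut the line short.

```python
import re

# Matches append( as a whole function name, so my_append( is not reported.
APPEND_CALL = re.compile(r"(?<![\w.])append\s*\(")


def find_append_calls(script_text):
    """Return (line_number, line) pairs that contain an append( call."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        code = line.split("#", 1)[0]   # ignore trailing # comments
        if APPEND_CALL.search(code):
            hits.append((number, line.strip()))
    return hits
```

Each reported line still needs a manual check of whether its arguments are matrices (switch to cbind()) or strings (leave as append()).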





[jira] [Resolved] (SYSTEMML-1259) Replace append with cbind for matrices

2017-02-14 Thread Deron Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deron Eriksson resolved SYSTEMML-1259.
--
   Resolution: Fixed
Fix Version/s: SystemML 0.13

Fixed by [PR391|https://github.com/apache/incubator-systemml/pull/391].

> Replace append with cbind for matrices
> --
>
> Key: SYSTEMML-1259
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1259
> Project: SystemML
> Issue Type: Task
> Components: Algorithms
> Reporter: Deron Eriksson
> Assignee: Deron Eriksson
> Fix For: SystemML 0.13
>
>
> DML Language Ref specifies: "NOTE: append() has been replaced by cbind(), so 
> its use is discouraged."
> So, cbind() should replace append() in all DML/PyDML scripts when used with 
> matrices. (note that append() can also be used to append strings to other 
> strings)





[jira] [Updated] (SYSTEMML-1217) Perftest 0.12 release and related improvements

2017-02-14 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1217:

Attachment: runMultiLogReg_10M_1k_sparse_k150_v12.out

MultiLogReg log for 0.12.

> Perftest 0.12 release and related improvements
> --
>
> Key: SYSTEMML-1217
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1217
> Project: SystemML
> Issue Type: Umbrella
> Affects Versions: SystemML 0.12
> Reporter: Glenn Weidner
> Assignee: Glenn Weidner
> Attachments: runMultiLogReg_10M_1k_sparse_k150_v12.out, 
> times_All_8g_rc2.txt, times_Binomial_80g_rc1.txt, times_Binomial_80g_rc2.txt, 
> times_Binomial_80g_v11.txt, times_Binomial_8g_rc1.txt, 
> times_Binomial_8g_v11.txt, times_Clustering_80g_rc1.txt, 
> times_Clustering_80g_v11.txt, times_Clustering_8g_rc1.txt, 
> times_Clustering_8g_v11.txt, times_Multinomial_80g_bayes_sparse_rc1.txt, 
> times_Multinomial_80g_mlreg_dense_rc1.txt, 
> times_Multinomial_80g_mlreg_sparse_rc1.txt, 
> times_Multinomial_80g_rc1_subset.txt, 
> times_Multinomial_80g_v11_mlreg_dense_run2.txt, 
> times_Multinomial_80g_v11_mlreg_dense_run3.txt, 
> times_Multinomial_80g_v11_mlreg_dense.txt, 
> times_Multinomial_80g_v11_mlreg_sparse.txt, 
> times_Multinomial_8g_bayes_predict.txt, 
> times_Multinomial_8g_bayes_predict_v11.txt, times_Multinomial_8g_rc1.txt, 
> times_Multinomial_8g_v11.txt, times_Regression_80g_rc1.txt, 
> times_Regression_80g_rc2.txt, times_Regression_80g_v11.txt, 
> times_Regression_8g_rc1.txt, times_Regression_8g_v11.txt, 
> times_Stats_80g_rc1.txt, times_Stats_80g_v11.txt
>
>






[jira] [Comment Edited] (SYSTEMML-1217) Perftest 0.12 release and related improvements

2017-02-14 Thread Glenn Weidner (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866991#comment-15866991
 ] 

Glenn Weidner edited comment on SYSTEMML-1217 at 2/15/17 12:24 AM:
---

MultiLogReg for 0.11 on same cluster.


was (Author: gweidner):
MultLogReg for 0.11 on same cluster.

> Perftest 0.12 release and related improvements
> --
>
> Key: SYSTEMML-1217
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1217
> Project: SystemML
> Issue Type: Umbrella
> Affects Versions: SystemML 0.12
> Reporter: Glenn Weidner
> Assignee: Glenn Weidner
> Attachments: runMultiLogReg_10M_1k_sparse_k150_v11.out, 
> runMultiLogReg_10M_1k_sparse_k150_v12.out, times_All_8g_rc2.txt, 
> times_Binomial_80g_rc1.txt, times_Binomial_80g_rc2.txt, 
> times_Binomial_80g_v11.txt, times_Binomial_8g_rc1.txt, 
> times_Binomial_8g_v11.txt, times_Clustering_80g_rc1.txt, 
> times_Clustering_80g_v11.txt, times_Clustering_8g_rc1.txt, 
> times_Clustering_8g_v11.txt, times_Multinomial_80g_bayes_sparse_rc1.txt, 
> times_Multinomial_80g_mlreg_dense_rc1.txt, 
> times_Multinomial_80g_mlreg_sparse_rc1.txt, 
> times_Multinomial_80g_rc1_subset.txt, 
> times_Multinomial_80g_v11_mlreg_dense_run2.txt, 
> times_Multinomial_80g_v11_mlreg_dense_run3.txt, 
> times_Multinomial_80g_v11_mlreg_dense.txt, 
> times_Multinomial_80g_v11_mlreg_sparse.txt, 
> times_Multinomial_8g_bayes_predict.txt, 
> times_Multinomial_8g_bayes_predict_v11.txt, times_Multinomial_8g_rc1.txt, 
> times_Multinomial_8g_v11.txt, times_Regression_80g_rc1.txt, 
> times_Regression_80g_rc2.txt, times_Regression_80g_v11.txt, 
> times_Regression_8g_rc1.txt, times_Regression_8g_v11.txt, 
> times_Stats_80g_rc1.txt, times_Stats_80g_v11.txt
>
>






[jira] [Comment Edited] (SYSTEMML-1217) Perftest 0.12 release and related improvements

2017-02-14 Thread Glenn Weidner (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866991#comment-15866991
 ] 

Glenn Weidner edited comment on SYSTEMML-1217 at 2/15/17 12:24 AM:
---

MultiLogReg log for 0.11 on same cluster.


was (Author: gweidner):
MultiLogReg for 0.11 on same cluster.

> Perftest 0.12 release and related improvements
> --
>
> Key: SYSTEMML-1217
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1217
> Project: SystemML
> Issue Type: Umbrella
> Affects Versions: SystemML 0.12
> Reporter: Glenn Weidner
> Assignee: Glenn Weidner
> Attachments: runMultiLogReg_10M_1k_sparse_k150_v11.out, 
> runMultiLogReg_10M_1k_sparse_k150_v12.out, times_All_8g_rc2.txt, 
> times_Binomial_80g_rc1.txt, times_Binomial_80g_rc2.txt, 
> times_Binomial_80g_v11.txt, times_Binomial_8g_rc1.txt, 
> times_Binomial_8g_v11.txt, times_Clustering_80g_rc1.txt, 
> times_Clustering_80g_v11.txt, times_Clustering_8g_rc1.txt, 
> times_Clustering_8g_v11.txt, times_Multinomial_80g_bayes_sparse_rc1.txt, 
> times_Multinomial_80g_mlreg_dense_rc1.txt, 
> times_Multinomial_80g_mlreg_sparse_rc1.txt, 
> times_Multinomial_80g_rc1_subset.txt, 
> times_Multinomial_80g_v11_mlreg_dense_run2.txt, 
> times_Multinomial_80g_v11_mlreg_dense_run3.txt, 
> times_Multinomial_80g_v11_mlreg_dense.txt, 
> times_Multinomial_80g_v11_mlreg_sparse.txt, 
> times_Multinomial_8g_bayes_predict.txt, 
> times_Multinomial_8g_bayes_predict_v11.txt, times_Multinomial_8g_rc1.txt, 
> times_Multinomial_8g_v11.txt, times_Regression_80g_rc1.txt, 
> times_Regression_80g_rc2.txt, times_Regression_80g_v11.txt, 
> times_Regression_8g_rc1.txt, times_Regression_8g_v11.txt, 
> times_Stats_80g_rc1.txt, times_Stats_80g_v11.txt
>
>






[jira] [Updated] (SYSTEMML-1217) Perftest 0.12 release and related improvements

2017-02-14 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1217:

Attachment: runMultiLogReg_10M_1k_sparse_k150_v11.out

MultiLogReg log for 0.11 on same cluster.

> Perftest 0.12 release and related improvements
> --
>
> Key: SYSTEMML-1217
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1217
> Project: SystemML
>  Issue Type: Umbrella
>Affects Versions: SystemML 0.12
>Reporter: Glenn Weidner
>Assignee: Glenn Weidner
> Attachments: runMultiLogReg_10M_1k_sparse_k150_v11.out, 
> runMultiLogReg_10M_1k_sparse_k150_v12.out, times_All_8g_rc2.txt, 
> times_Binomial_80g_rc1.txt, times_Binomial_80g_rc2.txt, 
> times_Binomial_80g_v11.txt, times_Binomial_8g_rc1.txt, 
> times_Binomial_8g_v11.txt, times_Clustering_80g_rc1.txt, 
> times_Clustering_80g_v11.txt, times_Clustering_8g_rc1.txt, 
> times_Clustering_8g_v11.txt, times_Multinomial_80g_bayes_sparse_rc1.txt, 
> times_Multinomial_80g_mlreg_dense_rc1.txt, 
> times_Multinomial_80g_mlreg_sparse_rc1.txt, 
> times_Multinomial_80g_rc1_subset.txt, 
> times_Multinomial_80g_v11_mlreg_dense_run2.txt, 
> times_Multinomial_80g_v11_mlreg_dense_run3.txt, 
> times_Multinomial_80g_v11_mlreg_dense.txt, 
> times_Multinomial_80g_v11_mlreg_sparse.txt, 
> times_Multinomial_8g_bayes_predict.txt, 
> times_Multinomial_8g_bayes_predict_v11.txt, times_Multinomial_8g_rc1.txt, 
> times_Multinomial_8g_v11.txt, times_Regression_80g_rc1.txt, 
> times_Regression_80g_rc2.txt, times_Regression_80g_v11.txt, 
> times_Regression_8g_rc1.txt, times_Regression_8g_v11.txt, 
> times_Stats_80g_rc1.txt, times_Stats_80g_v11.txt
>
>






[jira] [Updated] (SYSTEMML-1217) Perftest 0.12 release and related improvements

2017-02-14 Thread Glenn Weidner (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Glenn Weidner updated SYSTEMML-1217:

Attachment: runMultiLogReg.sh

Shell script used when running MultiLogReg for both 0.12 and 0.11.

> Perftest 0.12 release and related improvements
> --
>
> Key: SYSTEMML-1217
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1217
> Project: SystemML
>  Issue Type: Umbrella
>Affects Versions: SystemML 0.12
>Reporter: Glenn Weidner
>Assignee: Glenn Weidner
> Attachments: runMultiLogReg_10M_1k_sparse_k150_v11.out, 
> runMultiLogReg_10M_1k_sparse_k150_v12.out, runMultiLogReg.sh, 
> times_All_8g_rc2.txt, times_Binomial_80g_rc1.txt, times_Binomial_80g_rc2.txt, 
> times_Binomial_80g_v11.txt, times_Binomial_8g_rc1.txt, 
> times_Binomial_8g_v11.txt, times_Clustering_80g_rc1.txt, 
> times_Clustering_80g_v11.txt, times_Clustering_8g_rc1.txt, 
> times_Clustering_8g_v11.txt, times_Multinomial_80g_bayes_sparse_rc1.txt, 
> times_Multinomial_80g_mlreg_dense_rc1.txt, 
> times_Multinomial_80g_mlreg_sparse_rc1.txt, 
> times_Multinomial_80g_rc1_subset.txt, 
> times_Multinomial_80g_v11_mlreg_dense_run2.txt, 
> times_Multinomial_80g_v11_mlreg_dense_run3.txt, 
> times_Multinomial_80g_v11_mlreg_dense.txt, 
> times_Multinomial_80g_v11_mlreg_sparse.txt, 
> times_Multinomial_8g_bayes_predict.txt, 
> times_Multinomial_8g_bayes_predict_v11.txt, times_Multinomial_8g_rc1.txt, 
> times_Multinomial_8g_v11.txt, times_Regression_80g_rc1.txt, 
> times_Regression_80g_rc2.txt, times_Regression_80g_v11.txt, 
> times_Regression_8g_rc1.txt, times_Regression_8g_v11.txt, 
> times_Stats_80g_rc1.txt, times_Stats_80g_v11.txt
>
>






[jira] [Commented] (SYSTEMML-1242) Perftest: OutOfMemoryError in MultiLogReg for 80g sparse

2017-02-14 Thread Glenn Weidner (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866998#comment-15866998
 ] 

Glenn Weidner commented on SYSTEMML-1242:
-

Yes [~mboehm7] - I still have the full logs for 
runMultiLogReg_10M_1k_sparse_k150 for both 0.12 and 0.11 that were used for 
the comparison.  I've attached the MultiLogReg sparse logs for both 0.12 and 
0.11 (same cluster) to [SYSTEMML-1217].  The runMultiLogReg.sh script that 
was used for the logs has also been added to [SYSTEMML-1217].  Additional 
logs for other cases are available if needed.

> Perftest: OutOfMemoryError in MultiLogReg for 80g sparse
> 
>
> Key: SYSTEMML-1242
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1242
> Project: SystemML
>  Issue Type: Bug
>  Components: Test
>Affects Versions: SystemML 0.13
> Environment: spark 2.1.0
>Reporter: Imran Younus
>Assignee: Matthias Boehm
> Fix For: SystemML 0.13
>
> Attachments: sparkDML.sh
>
>

[jira] [Commented] (SYSTEMML-1265) ArrayIndexOutOfBoundsException in interQuartileMean(X, W)

2017-02-14 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866997#comment-15866997
 ] 

Felix Schüler commented on SYSTEMML-1265:
-

Yeah, there are a couple of functions that take weights (quantile, median, 
...). Isn't it the same as doing interQuartileMean(W * X)?
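
One way to check whether the two can differ, as a minimal sketch only. It assumes W is interpreted as non-negative integer frequency counts (an assumption; this thread does not state SystemML's weight semantics) and uses a simplified inter-quartile mean that drops the lowest and highest quarter of the sorted values. `WeightedIqmSketch`, `iqm` and `weightedIqm` are hypothetical names and are not SystemML's implementation:

```scala
object WeightedIqmSketch {
  // Simplified inter-quartile mean: sort, drop the lowest and highest
  // quarter of the values (by count, rounded down), average the rest.
  def iqm(xs: Seq[Double]): Double = {
    val sorted = xs.sorted
    val cut = sorted.length / 4
    val mid = sorted.slice(cut, sorted.length - cut)
    mid.sum / mid.length
  }

  // Weighted variant under the frequency-count assumption:
  // each value x with weight w is treated as w copies of x.
  def weightedIqm(xs: Seq[Double], ws: Seq[Int]): Double = {
    require(xs.length == ws.length, "values and weights must have equal length")
    iqm(xs.zip(ws).flatMap { case (x, w) => Seq.fill(w)(x) })
  }

  // Alternative asked about above: inter-quartile mean of the elementwise product.
  def iqmOfProduct(xs: Seq[Double], ws: Seq[Int]): Double =
    iqm(xs.zip(ws).map { case (x, w) => x * w })
}
```

With values 1, 2, 3, 4 and counts 1, 1, 1, 5, the weighted variant averages 3, 4, 4, 4 (3.75), while the inter-quartile mean of the elementwise product 1, 2, 3, 20 averages 2 and 3 (2.5). Under this reading of the weights, the two are therefore not the same in general.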

> ArrayIndexOutOfBoundsException in interQuartileMean(X, W)
> -
>
> Key: SYSTEMML-1265
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1265
> Project: SystemML
>  Issue Type: Bug
>Reporter: Felix Schüler
>Priority: Minor
>
> The interQuartileMean(X, W) function seems to be broken. The following script 
> throws an {{ArrayIndexOutOfBoundsException}}:
> {code}
> A = matrix("0.0 1.0 2.0 1.0 0.0", rows=5, cols=1)
> W = matrix("0.0 0.5 1.0 0.5 0.0", rows=5, cols=1)
> b = interQuartileMean(A, W)
> print("b is: " + b)
> {code}
> Error:
> {code}
> 17/02/14 13:56:09 INFO api.DMLScript: BEGIN DML run 02/14/2017 13:56:09
> 17/02/14 13:56:10 INFO api.DMLScript: SystemML Statistics:
> Total execution time: 0.368 sec.
> Number of executed MR Jobs:   0.
> 17/02/14 13:56:10 INFO api.DMLScript: END DML run 02/14/2017 13:56:10
> 17/02/14 13:56:10 ERROR api.DMLScript: Failed to execute DML script.
> org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
> block generated from statement block between lines 1 and 5 -- Error 
> evaluating instruction: 
> CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:130)
>   at org.apache.sysml.api.DMLScript.execute(DMLScript.java:684)
>   at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:360)
>   at org.apache.sysml.api.DMLScript.main(DMLScript.java:221)
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error 
> in program block generated from statement block between lines 1 and 5 -- 
> Error evaluating instruction: 
> CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:320)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
>   ... 3 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 9
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.quickGetValue(MatrixBlock.java:576)
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.interQuartileMean(MatrixBlock.java:4693)
>   at 
> org.apache.sysml.runtime.instructions.cp.QuantilePickCPInstruction.processInstruction(QuantilePickCPInstruction.java:177)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:290)
>   ... 6 more
> Exception in thread "main" org.apache.sysml.api.DMLException: 
> org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
> block generated from statement block between lines 1 and 5 -- Error 
> evaluating instruction: 
> CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
>   at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:374)
>   at org.apache.sysml.api.DMLScript.main(DMLScript.java:221)
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
> block generated from statement block between lines 1 and 5 -- Error 
> evaluating instruction: 
> CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:130)
>   at org.apache.sysml.api.DMLScript.execute(DMLScript.java:684)
>   at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:360)
>   ... 1 more
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error 
> in program block generated from statement block between lines 1 and 5 -- 
> Error evaluating instruction: 
> CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:320)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
>   ... 3 more
> Caused by: java.lan

[jira] [Commented] (SYSTEMML-1242) Perftest: OutOfMemoryError in MultiLogReg for 80g sparse

2017-02-14 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15867007#comment-15867007
 ] 

Matthias Boehm commented on SYSTEMML-1242:
--

Thanks [~gweidner] - that explains it. So the same issue also affects 0.11 and 
0.12 (and probably earlier releases too, as we usually tested against something 
like 20 classes).
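
A back-of-the-envelope sketch of why the number of classes could matter here. It assumes (this is not confirmed in the thread) that the failing dense allocation holds one double per row and class, and it reads 10M rows and k=150 classes off the log file name runMultiLogReg_10M_1k_sparse_k150. `DenseSizeSketch`, `denseBytes` and `toGiB` are hypothetical names used only for illustration:

```scala
object DenseSizeSketch {
  // Approximate payload of a dense double matrix: 8 bytes per cell,
  // ignoring JVM object headers and block metadata.
  def denseBytes(rows: Long, cols: Long): Long = rows * cols * 8L

  // Convert bytes to GiB for readability.
  def toGiB(bytes: Long): Double = bytes.toDouble / (1L << 30)

  def main(args: Array[String]): Unit = {
    val rows = 10000000L // assumed from "10M" in the log file name
    // 20 = the usual test setting mentioned above, 150 = assumed from "k150"
    for (k <- Seq(20L, 150L))
      println(f"k=$k%d: ${toGiB(denseBytes(rows, k))}%.1f GiB dense")
  }
}
```

Under these assumptions a rows-by-classes dense matrix grows from roughly 1.5 GiB at 20 classes to roughly 11.2 GiB at 150 classes. Whether that exceeds the driver heap depends on memory settings that are not shown in this thread.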

> Perftest: OutOfMemoryError in MultiLogReg for 80g sparse
> 
>
> Key: SYSTEMML-1242
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1242
> Project: SystemML
>  Issue Type: Bug
>  Components: Test
>Affects Versions: SystemML 0.13
> Environment: spark 2.1.0
>Reporter: Imran Younus
>Assignee: Matthias Boehm
> Fix For: SystemML 0.13
>
> Attachments: sparkDML.sh
>
>

[jira] [Updated] (SYSTEMML-1242) Perftest: OutOfMemoryError in MultiLogReg for 80g sparse

2017-02-14 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-1242:
-
Affects Version/s: (was: SystemML 0.13)
   SystemML 0.11
   SystemML 0.12

> Perftest: OutOfMemoryError in MultiLogReg for 80g sparse
> 
>
> Key: SYSTEMML-1242
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1242
> Project: SystemML
>  Issue Type: Bug
>  Components: Test
>Affects Versions: SystemML 0.11, SystemML 0.12
> Environment: spark 2.1.0
>Reporter: Imran Younus
>Assignee: Matthias Boehm
> Fix For: SystemML 0.13
>
> Attachments: sparkDML.sh
>
>
> When running the {{runMultiLogReg.sh}} script, {{MultiLogReg.dml}} ends with 
> an OutOfMemoryError for the case of 10M_1K sparse data and {{icpt = 1}}. Here 
> is the end of the log file:
> {code}
> 17/02/04 17:20:33 INFO api.DMLScript: SystemML Statistics:
> Total elapsed time:               697.694 sec.
> Total compilation time:           2.543 sec.
> Total execution time:             695.151 sec.
> Number of compiled Spark inst:    73.
> Number of executed Spark inst:    16.
> Cache hits (Mem, WB, FS, HDFS):   46/9/1/7.
> Cache writes (WB, FS, HDFS):      27/1/1.
> Cache times (ACQr/m, RLS, EXP):   281.541/0.003/131.589/48.737 sec.
> HOP DAGs recompiled (PRED, SB):   0/15.
> HOP DAGs recompile time:          0.067 sec.
> Spark ctx create time (lazy):     31.078 sec.
> Spark trans counts (par,bc,col):  5/4/0.
> Spark trans times (par,bc,col):   46.748/0.392/0.000 secs.
> Total JIT compile time:           151.254 sec.
> Total JVM GC count:               144.
> Total JVM GC time:                220.671 sec.
> Heavy hitter instructions (name, time, count):
> -- 1)  ba+*          144.194 sec  3
> -- 2)  rand          109.939 sec  9
> -- 3)  uark+         105.011 sec  2
> -- 4)  r'            100.933 sec  3
> -- 5)  sp_/          80.387 sec   1
> -- 6)  sp_mapmm      45.491 sec   2
> -- 7)  sp_tak+*      40.655 sec   1
> -- 8)  append        9.480 sec    1
> -- 9)  rangeReIndex  7.347 sec    2
> -- 10) sp_-          6.392 sec    3
> 17/02/04 17:20:33 INFO api.DMLScript: END DML run 02/04/2017 17:20:33
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlock(MatrixBlock.java:363)
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlock(MatrixBlock.java:339)
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlockUnsafe(MatrixBlock.java:408)
>   at 
> org.apache.sysml.runtime.io.MatrixReader.createOutputMatrixBlock(MatrixReader.java:107)
>   at 
> org.apache.sysml.runtime.io.ReaderBinaryBlockParallel.readMatrixFromHDFS(ReaderBinaryBlockParallel.java:59)
>   at 
> org.apache.sysml.runtime.util.DataConverter.readMatrixFromHDFS(DataConverter.java:203)
>   at 
> org.apache.sysml.runtime.util.DataConverter.readMatrixFromHDFS(DataConverter.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromHDFS(MatrixObject.java:425)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromHDFS(MatrixObject.java:60)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.readBlobFromHDFS(CacheableData.java:920)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromRDD(MatrixObject.java:478)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromRDD(MatrixObject.java:60)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.acquireRead(CacheableData.java:411)
>   at 
> org.apache.sysml.runtime.controlprogram.context.ExecutionContext.getMatrixInput(ExecutionContext.java:209)
>   at 
> org.apache.sysml.runtime.instructions.cp.AggregateBinaryCPInstruction.processInstruction(AggregateBinaryCPInstruction.java:74)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:290)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.IfProgramBlock.execute(IfProgramBlock.java:139)
>   at 
> org.apache.sysml.runtime.controlprogram.WhileProgramBlock.execute(WhileProgramBlock.java:165)
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
>   at org.apache.sysml.api.DMLScript.execute(DMLScript.java:684)
>   at org.apache.sysml.api.DMLScript.executeScript(DMLScript.jav

[jira] [Created] (SYSTEMML-1267) Input DataFrame CodeGenerator Error calculating stats warning

2017-02-14 Thread Deron Eriksson (JIRA)
Deron Eriksson created SYSTEMML-1267:


 Summary: Input DataFrame CodeGenerator Error calculating stats 
warning
 Key: SYSTEMML-1267
 URL: https://issues.apache.org/jira/browse/SYSTEMML-1267
 Project: SystemML
  Issue Type: Bug
  Components: APIs, Runtime
Affects Versions: SystemML 0.13
Reporter: Deron Eriksson


This occurs for Spark 2.1.0 (spark-2.1.0-bin-hadoop2.7) but not for Spark 2.0.2 
on my machine.
This occurs for matrix 1000x1000 but not for matrix 100x100 using Spark 2.1.0.

The following DataFrame input code results in a "CodeGenerator: Error 
calculating stats of compiled class" warning.
{code}
$ spark-shell --executor-memory 4G --driver-memory 4G --jars target/SystemML.jar

import org.apache.sysml.api.mlcontext._
import org.apache.sysml.api.mlcontext.ScriptFactory._
val ml = new MLContext(sc)
import org.apache.spark.sql._
import org.apache.spark.sql.types.{StructType,StructField,DoubleType}
import scala.util.Random

// Build a numRows x numCols DataFrame of random doubles.
val numRows = 1000
val numCols = 1000
val data = sc.parallelize(0 to numRows-1).map { _ =>
  Row.fromSeq(Seq.fill(numCols)(Random.nextDouble)) }
val schema = StructType((0 to numCols-1).map { i =>
  StructField("C" + i, DoubleType, true) } )
val df = spark.createDataFrame(data, schema)

// DML script computing min, max and mean of the input matrix.
val minMaxMean =
"""
minOut = min(Xin)
maxOut = max(Xin)
meanOut = mean(Xin)
"""

// Bind the DataFrame as matrix input "Xin"; the warning is printed at this step.
val mm = new MatrixMetadata(numRows, numCols)
val minMaxMeanScript = dml(minMaxMean).in("Xin", df, mm).out("minOut", "maxOut", "meanOut")
{code}

Results in:
{code}
scala> val minMaxMeanScript = dml(minMaxMean).in("Xin", df, mm).out("minOut", "maxOut", "meanOut")
[Stage 0:>  (0 + 8) / 8]
17/02/14 17:37:45 WARN CodeGenerator: Error calculating stats of compiled class.
java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readFully(DataInputStream.java:169)
at org.codehaus.janino.util.ClassFile.loadAttribute(ClassFile.java:1509)
at org.codehaus.janino.util.ClassFile.loadAttributes(ClassFile.java:644)
at org.codehaus.janino.util.ClassFile.loadFields(ClassFile.java:623)
at org.codehaus.janino.util.ClassFile.<init>(ClassFile.java:280)
at 
org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anonfun$recordCompilationStats$1.apply(CodeGenerator.scala:967)
at 
org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anonfun$recordCompilationStats$1.apply(CodeGenerator.scala:964)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at 
org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.recordCompilationStats(CodeGenerator.scala:964)
at 
org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.org$apache$spark$sql$catalyst$expressions$codegen$CodeGenerator$$doCompile(CodeGenerator.scala:936)
at 
org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:998)
at 
org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:995)
at 
org.spark_project.guava.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
at 
org.spark_project.guava.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
at 
org.spark_project.guava.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
at 
org.spark_project.guava.cache.LocalCache$Segment.get(LocalCache.java:2257)
at org.spark_project.guava.cache.LocalCache.get(LocalCache.java:4000)
at 
org.spark_project.guava.cache.LocalCache.getOrLoad(LocalCache.java:4004)
at 
org.spark_project.guava.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
at 
org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.compile(CodeGenerator.scala:890)
at 
org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection$.create(GenerateUnsafeProjection.scala:405)
at 
org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection$.create(GenerateUnsafeProjection.scala:359)
at 
org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection$.create(GenerateUnsafeProjection.scala:32)
at 
org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:874)
at 
org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.extractProjection$lzycompute(ExpressionEncoder.scala:266)
at 
org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.extractProjection(ExpressionEncoder.scala:266)
at 
org.apache.spark.sql.catalyst.encoders.Expre

[jira] [Updated] (SYSTEMML-1267) Input DataFrame CodeGenerator Error calculating stats warning

2017-02-14 Thread Deron Eriksson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deron Eriksson updated SYSTEMML-1267:
-
Description: 
This occurs for Spark 2.1.0 (spark-2.1.0-bin-hadoop2.7) but not for Spark 2.0.2 
on my machine.
This occurs for matrix 1000x1000 but not for matrix 100x100 using Spark 2.1.0.

The following DataFrame input code results in a "CodeGenerator: Error 
calculating stats of compiled class" warning.
{code}
$ spark-shell --executor-memory 4G --driver-memory 4G --jars target/SystemML.jar

import org.apache.sysml.api.mlcontext._
import org.apache.sysml.api.mlcontext.ScriptFactory._
val ml = new MLContext(sc)
import org.apache.spark.sql._
import org.apache.spark.sql.types.{StructType,StructField,DoubleType}
import scala.util.Random

// Build a numRows x numCols DataFrame of random doubles.
val numRows = 1000
val numCols = 1000
val data = sc.parallelize(0 to numRows-1).map { _ =>
  Row.fromSeq(Seq.fill(numCols)(Random.nextDouble)) }
val schema = StructType((0 to numCols-1).map { i =>
  StructField("C" + i, DoubleType, true) } )
val df = spark.createDataFrame(data, schema)

// DML script computing min, max and mean of the input matrix.
val minMaxMean =
"""
minOut = min(Xin)
maxOut = max(Xin)
meanOut = mean(Xin)
"""

// Bind the DataFrame as matrix input "Xin"; the warning is printed at this step.
val mm = new MatrixMetadata(numRows, numCols)
val minMaxMeanScript = dml(minMaxMean).in("Xin", df, mm).out("minOut", "maxOut", "meanOut")
{code}

Results in:
{code}
scala> val minMaxMeanScript = dml(minMaxMean).in("Xin", df, mm).out("minOut", "maxOut", "meanOut")
[Stage 0:>  (0 + 8) / 8]
17/02/14 17:37:45 WARN CodeGenerator: Error calculating stats of compiled class.
java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readFully(DataInputStream.java:169)
at org.codehaus.janino.util.ClassFile.loadAttribute(ClassFile.java:1509)
at org.codehaus.janino.util.ClassFile.loadAttributes(ClassFile.java:644)
at org.codehaus.janino.util.ClassFile.loadFields(ClassFile.java:623)
at org.codehaus.janino.util.ClassFile.<init>(ClassFile.java:280)
at 
org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anonfun$recordCompilationStats$1.apply(CodeGenerator.scala:967)
at 
org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anonfun$recordCompilationStats$1.apply(CodeGenerator.scala:964)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at 
org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.recordCompilationStats(CodeGenerator.scala:964)
at 
org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.org$apache$spark$sql$catalyst$expressions$codegen$CodeGenerator$$doCompile(CodeGenerator.scala:936)
at 
org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:998)
at 
org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:995)
at 
org.spark_project.guava.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
at 
org.spark_project.guava.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
at 
org.spark_project.guava.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
at 
org.spark_project.guava.cache.LocalCache$Segment.get(LocalCache.java:2257)
at org.spark_project.guava.cache.LocalCache.get(LocalCache.java:4000)
at 
org.spark_project.guava.cache.LocalCache.getOrLoad(LocalCache.java:4004)
at 
org.spark_project.guava.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
at 
org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.compile(CodeGenerator.scala:890)
at 
org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection$.create(GenerateUnsafeProjection.scala:405)
at 
org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection$.create(GenerateUnsafeProjection.scala:359)
at 
org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection$.create(GenerateUnsafeProjection.scala:32)
at 
org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:874)
at 
org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.extractProjection$lzycompute(ExpressionEncoder.scala:266)
at 
org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.extractProjection(ExpressionEncoder.scala:266)
at 
org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.toRow(ExpressionEncoder.scala:290)
at 
org.apache.spark.sql.SparkSession$$anonfun$3.apply(SparkSession.scala:547)
at 
org.apache.spark.sql.SparkSession$$anonfun$3.apply(SparkSession.scala:5

[jira] [Commented] (SYSTEMML-1267) Input DataFrame CodeGenerator Error calculating stats warning

2017-02-14 Thread Deron Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15867085#comment-15867085
 ] 

Deron Eriksson commented on SYSTEMML-1267:
--

cc [~mboehm7] [~niketanpansare]


> Input DataFrame CodeGenerator Error calculating stats warning
> -
>
> Key: SYSTEMML-1267
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1267
> Project: SystemML
>  Issue Type: Bug
>  Components: APIs, Runtime
>Affects Versions: SystemML 0.13
>Reporter: Deron Eriksson
>
> This occurs for Spark 2.1.0 (spark-2.1.0-bin-hadoop2.7) but not for Spark 
> 2.0.2 on my machine.
> This occurs for matrix 1000x1000 but not for matrix 100x100 using Spark 2.1.0.
> The following DataFrame input code results in a "CodeGenerator: Error 
> calculating stats of compiled class" warning.
> {code}
> $ spark-shell --executor-memory 4G --driver-memory 4G --jars target/SystemML.jar
> import org.apache.sysml.api.mlcontext._
> import org.apache.sysml.api.mlcontext.ScriptFactory._
> val ml = new MLContext(sc)
> import org.apache.spark.sql._
> import org.apache.spark.sql.types.{StructType,StructField,DoubleType}
> import scala.util.Random
> val numRows = 1000
> val numCols = 1000
> val data = sc.parallelize(0 to numRows-1).map { _ => Row.fromSeq(Seq.fill(numCols)(Random.nextDouble)) }
> val schema = StructType((0 to numCols-1).map { i => StructField("C" + i, DoubleType, true) } )
> val df = spark.createDataFrame(data, schema)
> val minMaxMean =
> """
> minOut = min(Xin)
> maxOut = max(Xin)
> meanOut = mean(Xin)
> """
> val mm = new MatrixMetadata(numRows, numCols)
> val minMaxMeanScript = dml(minMaxMean).in("Xin", df, mm).out("minOut", "maxOut", "meanOut")
> {code}
> Results in:
> {code}
> scala> val minMaxMeanScript = dml(minMaxMean).in("Xin", df, mm).out("minOut", "maxOut", "meanOut")
> [Stage 0:>  (0 + 8) / 8]
> 17/02/14 17:37:45 WARN CodeGenerator: Error calculating stats of compiled class.
> java.io.EOFException
>   at java.io.DataInputStream.readFully(DataInputStream.java:197)
>   at java.io.DataInputStream.readFully(DataInputStream.java:169)
>   at org.codehaus.janino.util.ClassFile.loadAttribute(ClassFile.java:1509)
>   at org.codehaus.janino.util.ClassFile.loadAttributes(ClassFile.java:644)
>   at org.codehaus.janino.util.ClassFile.loadFields(ClassFile.java:623)
>   at org.codehaus.janino.util.ClassFile.<init>(ClassFile.java:280)
>   at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anonfun$recordCompilationStats$1.apply(CodeGenerator.scala:967)
>   at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anonfun$recordCompilationStats$1.apply(CodeGenerator.scala:964)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:893)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.recordCompilationStats(CodeGenerator.scala:964)
>   at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.org$apache$spark$sql$catalyst$expressions$codegen$CodeGenerator$$doCompile(CodeGenerator.scala:936)
>   at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:998)
>   at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:995)
>   at org.spark_project.guava.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
>   at org.spark_project.guava.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
>   at org.spark_project.guava.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
>   at org.spark_project.guava.cache.LocalCache$Segment.get(LocalCache.java:2257)
>   at org.spark_project.guava.cache.LocalCache.get(LocalCache.java:4000)
>   at org.spark_project.guava.cache.LocalCache.getOrLoad(LocalCache.java:4004)
>   at org.spark_project.guava.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
>   at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.compile(CodeGenerator.scala:890)
>   at org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection$.create(GenerateUnsafeProjection.scala:405)
>   at org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection$.create(GenerateUnsafeProjection.scala:359)
>   at org.apache.spark.sql.catalyst.expressions.codegen.GenerateUnsafeProjection$.create(GenerateUnsafeProjection.scala:32)
>   at org.apache.spark

[jira] [Commented] (SYSTEMML-1265) ArrayIndexOutOfBoundsException in interQuartileMean(X, W)

2017-02-14 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15867104#comment-15867104
 ] 

Matthias Boehm commented on SYSTEMML-1265:
--

Yes, and we actually have tests for all of them. As it turned out, the problem 
was invalid input data - we expect the weights to be integers, so the two 0.5 
values are actually invalid. So far we only checked that the total sum of the 
weights is an integer, but in your case the two 0.5 values add up to an 
integer and therefore passed this check. I'll add more robust error handling. 

Before updating the documentation, we should also think about making all 
weighted operations consistent - there are some operations like cov and cm that 
allow arbitrary weights, whereas quantile and iqm require integer weights.
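
To illustrate the difference between checking only the total sum and checking 
every weight, here is a minimal Python sketch. It is not SystemML's Java code: 
the function names are hypothetical, and the additional non-negativity 
requirement is an assumption that is not stated above.

```python
def sum_only_check(weights):
    """Weaker check: accept the weights if their total is an integer."""
    total = sum(weights)
    return float(total).is_integer()


def per_weight_check(weights):
    """Stricter check: every single weight must be a non-negative integer.

    Non-negativity is an assumption made for this sketch.
    """
    return all(w >= 0 and float(w).is_integer() for w in weights)


# W from the script in the issue description
weights = [0.0, 0.5, 1.0, 0.5, 0.0]

print("sum-only check passes:  ", sum_only_check(weights))
print("per-weight check passes:", per_weight_check(weights))
```

With the weights from the report, the total is 2.0, so a sum-only check 
accepts the input, whereas a per-weight check rejects it because of the two 
0.5 entries.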

> ArrayIndexOutOfBoundsException in interQuartileMean(X, W)
> -
>
> Key: SYSTEMML-1265
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1265
> Project: SystemML
>  Issue Type: Bug
>Reporter: Felix Schüler
>Priority: Minor
>
> The interQuartileMean(X, W) function seems to be broken. The following script 
> throws an {code} ArrayIndexOutOfBoundsException {code}
> {code}
> A = matrix("0.0 1.0 2.0 1.0 0.0", rows=5, cols=1)
> W = matrix("0.0 0.5 1.0 0.5 0.0", rows=5, cols=1)
> b = interQuartileMean(A, W)
> print("b is: " + b)
> {code}
> Error:
> {code}
> 17/02/14 13:56:09 INFO api.DMLScript: BEGIN DML run 02/14/2017 13:56:09
> 17/02/14 13:56:10 INFO api.DMLScript: SystemML Statistics:
> Total execution time: 0.368 sec.
> Number of executed MR Jobs:   0.
> 17/02/14 13:56:10 INFO api.DMLScript: END DML run 02/14/2017 13:56:10
> 17/02/14 13:56:10 ERROR api.DMLScript: Failed to execute DML script.
> org.apache.sysml.runtime.DMLRuntimeException: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program block generated from statement block between lines 1 and 5 -- Error evaluating instruction: CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
>   at org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:130)
>   at org.apache.sysml.api.DMLScript.execute(DMLScript.java:684)
>   at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:360)
>   at org.apache.sysml.api.DMLScript.main(DMLScript.java:221)
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program block generated from statement block between lines 1 and 5 -- Error evaluating instruction: CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
>   at org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:320)
>   at org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
>   at org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
>   at org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
>   ... 3 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 9
>   at org.apache.sysml.runtime.matrix.data.MatrixBlock.quickGetValue(MatrixBlock.java:576)
>   at org.apache.sysml.runtime.matrix.data.MatrixBlock.interQuartileMean(MatrixBlock.java:4693)
>   at org.apache.sysml.runtime.instructions.cp.QuantilePickCPInstruction.processInstruction(QuantilePickCPInstruction.java:177)
>   at org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:290)
>   ... 6 more
> Exception in thread "main" org.apache.sysml.api.DMLException: org.apache.sysml.runtime.DMLRuntimeException: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program block generated from statement block between lines 1 and 5 -- Error evaluating instruction: CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
>   at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:374)
>   at org.apache.sysml.api.DMLScript.main(DMLScript.java:221)
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program block generated from statement block between lines 1 and 5 -- Error evaluating instruction: CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
>   at org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:130)
>   at org.apache.sysml.api.DMLScript.execute(DMLScript.java:684)
>   at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:360)
>   ... 1 more
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program block generated from statement block between lines 1 and 5 -- Error evaluating instruction: CP°qpick°_mVar4·MATRIX·DOUBLE°_Var6·SCALAR·DOUBLE°IQM°true
>   at org.apache.sysml.r

[jira] [Commented] (SYSTEMML-1267) Input DataFrame CodeGenerator Error calculating stats warning

2017-02-14 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15867127#comment-15867127
 ] 

Matthias Boehm commented on SYSTEMML-1267:
--

Thanks for creating the issue. Yes, I've seen this in our build logs too, but 
don't have a good idea yet where it might come from. Anyway, I'll have a look 
once the high-priority issues are done.


[jira] [Assigned] (SYSTEMML-1244) FrameReader with CSV format have issues due to double quotes in some cases

2017-02-14 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm reassigned SYSTEMML-1244:


Assignee: Matthias Boehm

> FrameReader with CSV format have issues due to double quotes in some cases
> --
>
> Key: SYSTEMML-1244
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1244
> Project: SystemML
>  Issue Type: Bug
>Reporter: Arvind Surve
>Assignee: Matthias Boehm
> Fix For: SystemML 0.13
>
>
> This is an example of the input data.
> It has three columns with TAB as the field separator.
> "20news-bydate-train/alt.atheism/49960" 88.0
> "20news-bydate-train/alt.atheism/49960" "#" 1.0
> A couple of observations so far:
>   1. A double quote is considered part of the input.
>   2. The next double quote is considered the end of the input field.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (SYSTEMML-1244) FrameReader with CSV format have issues due to double quotes in some cases

2017-02-14 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm resolved SYSTEMML-1244.
--
   Resolution: Fixed
Fix Version/s: SystemML 0.13



[jira] [Commented] (SYSTEMML-1244) FrameReader with CSV format have issues due to double quotes in some cases

2017-02-14 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15867204#comment-15867204
 ] 

Matthias Boehm commented on SYSTEMML-1244:
--

Just to clarify - there were two issues: (1) tokens that are a concatenation of 
quoted tokens (according to RFC 4180) and non-quoted tokens were split after the 
last quote, and (2) frame metadata was parsed incorrectly. 

We have now made the related split and count functionality more robust with 
regard to these special cases without sacrificing performance for the common 
case without quotes. 

[~acs_s] would you mind closing your related PR?
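
For illustration, a minimal Python sketch of a quote-aware split with a fast 
path for lines without quotes. This is only a sketch under assumptions, not the 
actual SystemML implementation: the name split_csv is hypothetical, quotes are 
kept in the returned tokens, and a delimiter ends a token only when it occurs 
outside of quotes.

```python
def split_csv(line, delim=","):
    """Split a line on delim, ignoring delimiters inside double quotes."""
    if not delim:
        raise ValueError("delimiter must not be empty")

    # Fast path for the common case: no quotes, plain split.
    if '"' not in line:
        return line.split(delim)

    tokens = []
    start = 0          # start offset of the current token
    in_quotes = False  # True while inside a quoted section
    i = 0
    while i < len(line):
        if line[i] == '"':
            # Toggle the quote state; an escaped quote ("") toggles twice.
            in_quotes = not in_quotes
            i += 1
        elif not in_quotes and line.startswith(delim, i):
            # A token ends only at a delimiter outside of quotes,
            # not directly after a closing quote.
            tokens.append(line[start:i])
            i += len(delim)
            start = i
        else:
            i += 1
    tokens.append(line[start:])
    return tokens


print(split_csv('a,b,c'))
print(split_csv('"a,b"c,d'))
print(split_csv('"x/y"\t"#"\t1.0', delim="\t"))
```

The second call shows the case of a quoted token followed directly by a 
non-quoted part: the token is kept whole instead of being cut after the 
closing quote.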



[jira] [Commented] (SYSTEMML-1258) cbind appends strings

2017-02-14 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15867223#comment-15867223
 ] 

Matthias Boehm commented on SYSTEMML-1258:
--

Yes, this is because we literally map the append to cbind even before 
validation. 

The only difference between the string append and string concatenation (+ 
operator) is that the string append adds a newline. However, since we now 
support escape characters, I would actually be in favor of completely removing 
this string append. It would require changing a number of scripts, but I 
think it would actually look much cleaner.
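
As a small illustration in Python (a model of the behavior described above and 
of the output quoted in the issue, not SystemML code; string_append is a 
hypothetical name):

```python
def string_append(a, b):
    """Model of the string append: like concatenation, plus a newline between."""
    return a + "\n" + b


# Same text via plain concatenation with an explicit escape character.
via_append = string_append("makes", "sense")
via_concat = "makes" + "\n" + "sense"

print(via_append)
print(via_append == via_concat)
```

With escape characters available, the concatenation form expresses the same 
result without a separate string append.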

> cbind appends strings
> -
>
> Key: SYSTEMML-1258
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1258
> Project: SystemML
>  Issue Type: Bug
>  Components: Parser, Runtime
>Reporter: Deron Eriksson
>
> The append() method is documented to allow (1) cbind() (column append) 
> functionality for matrices and (2) appending strings to other strings.
> However, cbind() also seems to allow strings to be appended to other strings. 
> Since cbind() is a matrix function, this seems incorrect.
> {code}
> makes = "makes";
> makes_sense = append(makes, "sense");
> print(makes_sense);
> doesnt_make = "doesn't make";
> doesnt_make_sense = cbind(doesnt_make, "sense");
> print(doesnt_make_sense);
> {code}
> output:
> {code}
> makes
> sense
> doesn't make
> sense
> {code}





[jira] [Created] (SYSTEMML-1268) Replace deprecated Accumulator with AccumulatorV2

2017-02-14 Thread Matthias Boehm (JIRA)
Matthias Boehm created SYSTEMML-1268:


 Summary: Replace deprecated Accumulator with AccumulatorV2
 Key: SYSTEMML-1268
 URL: https://issues.apache.org/jira/browse/SYSTEMML-1268
 Project: SystemML
  Issue Type: Task
Reporter: Matthias Boehm
Assignee: Matthias Boehm








[jira] [Created] (SYSTEMML-1269) Performance sparse matrix collect

2017-02-14 Thread Matthias Boehm (JIRA)
Matthias Boehm created SYSTEMML-1269:


 Summary: Performance sparse matrix collect
 Key: SYSTEMML-1269
 URL: https://issues.apache.org/jira/browse/SYSTEMML-1269
 Project: SystemML
  Issue Type: Sub-task
Reporter: Matthias Boehm


In the case of sparse matrices, our RDD collect functionality exhibits 
unnecessary overhead for (1) repeated sparse row allocations (because the known 
number of non-zeros, which affects the growth rate, is not exploited), and 
(2) deep copies of sparse rows (even in cases like a single column block, where 
this is not required).
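
To illustrate point (1), here is a toy Python sketch that counts how often a 
sparse row has to be reallocated with and without a known number of non-zeros. 
It is not SystemML's SparseRow: the class, the initial capacity of 4, and the 
doubling growth strategy are assumptions made for illustration.

```python
class SparseRow:
    """Toy sparse row: parallel index/value arrays with explicit capacity."""

    def __init__(self, capacity=4):
        self.capacity = max(capacity, 1)
        self.indexes = [0] * self.capacity
        self.values = [0.0] * self.capacity
        self.size = 0
        self.reallocations = 0  # number of grow-and-copy steps

    def append(self, col, val):
        if self.size == self.capacity:
            # Grow by doubling (assumed strategy) and copy the old arrays.
            new_capacity = self.capacity * 2
            extra = new_capacity - self.capacity
            self.indexes = self.indexes + [0] * extra
            self.values = self.values + [0.0] * extra
            self.capacity = new_capacity
            self.reallocations += 1
        self.indexes[self.size] = col
        self.values[self.size] = val
        self.size += 1


def collect_row(cells, nnz_hint=None):
    """Build a row from (col, val) pairs, optionally pre-sized by nnz_hint."""
    row = SparseRow(nnz_hint) if nnz_hint else SparseRow()
    for col, val in cells:
        row.append(col, val)
    return row


cells = [(j, 1.0) for j in range(100)]  # 100 non-zeros in one row
print("reallocations without hint:", collect_row(cells).reallocations)
print("reallocations with hint:   ", collect_row(cells, nnz_hint=100).reallocations)
```

In this toy model, pre-sizing the row with the known number of non-zeros 
avoids all intermediate allocations and copies.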


