[jira] [Closed] (SYSTEMML-1261) Fix transitive Spark execution type selection for ba+*

2017-02-17 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm closed SYSTEMML-1261.


> Fix transitive Spark execution type selection for ba+*
> --
>
> Key: SYSTEMML-1261
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1261
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Test
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.13
>
>






[jira] [Closed] (SYSTEMML-1242) Perftest: OutOfMemoryError in MultiLogReg for 80g sparse

2017-02-17 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm closed SYSTEMML-1242.


> Perftest: OutOfMemoryError in MultiLogReg for 80g sparse
> 
>
> Key: SYSTEMML-1242
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1242
> Project: SystemML
>  Issue Type: Bug
>  Components: Test
>Affects Versions: SystemML 0.11, SystemML 0.12
> Environment: spark 2.1.0
>Reporter: Imran Younus
>Assignee: Matthias Boehm
> Fix For: SystemML 0.13
>
> Attachments: sparkDML.sh
>
>
> When running the {{runMultiLogReg.sh}} script, {{MultiLogReg.dml}} ends with an
> OutOfMemory error for the case of 10M_1K sparse data and {{icpt = 1}}. Here
> is the end of the log file:
> {code}
> 17/02/04 17:20:33 INFO api.DMLScript: SystemML Statistics:
> Total elapsed time:   697.694 sec.
> Total compilation time:   2.543 sec.
> Total execution time: 695.151 sec.
> Number of compiled Spark inst:    73.
> Number of executed Spark inst:    16.
> Cache hits (Mem, WB, FS, HDFS):   46/9/1/7.
> Cache writes (WB, FS, HDFS):  27/1/1.
> Cache times (ACQr/m, RLS, EXP):   281.541/0.003/131.589/48.737 sec.
> HOP DAGs recompiled (PRED, SB):   0/15.
> HOP DAGs recompile time:  0.067 sec.
> Spark ctx create time (lazy): 31.078 sec.
> Spark trans counts (par,bc,col):  5/4/0.
> Spark trans times (par,bc,col):   46.748/0.392/0.000 secs.
> Total JIT compile time:   151.254 sec.
> Total JVM GC count:   144.
> Total JVM GC time:    220.671 sec.
> Heavy hitter instructions (name, time, count):
> -- 1) ba+*            144.194 sec    3
> -- 2) rand            109.939 sec    9
> -- 3) uark+           105.011 sec    2
> -- 4) r'              100.933 sec    3
> -- 5) sp_/             80.387 sec    1
> -- 6) sp_mapmm         45.491 sec    2
> -- 7) sp_tak+*         40.655 sec    1
> -- 8) append            9.480 sec    1
> -- 9) rangeReIndex      7.347 sec    2
> -- 10) sp_-             6.392 sec    3
> 17/02/04 17:20:33 INFO api.DMLScript: END DML run 02/04/2017 17:20:33
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlock(MatrixBlock.java:363)
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlock(MatrixBlock.java:339)
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlockUnsafe(MatrixBlock.java:408)
>   at 
> org.apache.sysml.runtime.io.MatrixReader.createOutputMatrixBlock(MatrixReader.java:107)
>   at 
> org.apache.sysml.runtime.io.ReaderBinaryBlockParallel.readMatrixFromHDFS(ReaderBinaryBlockParallel.java:59)
>   at 
> org.apache.sysml.runtime.util.DataConverter.readMatrixFromHDFS(DataConverter.java:203)
>   at 
> org.apache.sysml.runtime.util.DataConverter.readMatrixFromHDFS(DataConverter.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromHDFS(MatrixObject.java:425)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromHDFS(MatrixObject.java:60)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.readBlobFromHDFS(CacheableData.java:920)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromRDD(MatrixObject.java:478)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromRDD(MatrixObject.java:60)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.acquireRead(CacheableData.java:411)
>   at 
> org.apache.sysml.runtime.controlprogram.context.ExecutionContext.getMatrixInput(ExecutionContext.java:209)
>   at 
> org.apache.sysml.runtime.instructions.cp.AggregateBinaryCPInstruction.processInstruction(AggregateBinaryCPInstruction.java:74)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:290)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.IfProgramBlock.execute(IfProgramBlock.java:139)
>   at 
> org.apache.sysml.runtime.controlprogram.WhileProgramBlock.execute(WhileProgramBlock.java:165)
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
>   at org.apache.sysml.api.DMLScript.execute(DMLScript.java:684)
>   at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:360)
>   at org.apache.sysml.api.DMLScript.main(DMLScript.java:221)
>   at 

[jira] [Resolved] (SYSTEMML-1242) Perftest: OutOfMemoryError in MultiLogReg for 80g sparse

2017-02-17 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm resolved SYSTEMML-1242.
--
Resolution: Fixed

I'm closing this issue because the new transitive execution type selection fixed the
specific OOM; the other robustness features will be addressed in separate JIRAs.

> Perftest: OutOfMemoryError in MultiLogReg for 80g sparse
> 
>
> Key: SYSTEMML-1242
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1242
> Project: SystemML
>  Issue Type: Bug
>  Components: Test
>Affects Versions: SystemML 0.11, SystemML 0.12
> Environment: spark 2.1.0
>Reporter: Imran Younus
>Assignee: Matthias Boehm
> Fix For: SystemML 0.13
>
> Attachments: sparkDML.sh
>
>
> When running the {{runMultiLogReg.sh}} script, {{MultiLogReg.dml}} ends with an
> OutOfMemory error for the case of 10M_1K sparse data and {{icpt = 1}}. Here
> is the end of the log file:
> {code}
> 17/02/04 17:20:33 INFO api.DMLScript: SystemML Statistics:
> Total elapsed time:   697.694 sec.
> Total compilation time:   2.543 sec.
> Total execution time: 695.151 sec.
> Number of compiled Spark inst:    73.
> Number of executed Spark inst:    16.
> Cache hits (Mem, WB, FS, HDFS):   46/9/1/7.
> Cache writes (WB, FS, HDFS):  27/1/1.
> Cache times (ACQr/m, RLS, EXP):   281.541/0.003/131.589/48.737 sec.
> HOP DAGs recompiled (PRED, SB):   0/15.
> HOP DAGs recompile time:  0.067 sec.
> Spark ctx create time (lazy): 31.078 sec.
> Spark trans counts (par,bc,col):  5/4/0.
> Spark trans times (par,bc,col):   46.748/0.392/0.000 secs.
> Total JIT compile time:   151.254 sec.
> Total JVM GC count:   144.
> Total JVM GC time:    220.671 sec.
> Heavy hitter instructions (name, time, count):
> -- 1) ba+*            144.194 sec    3
> -- 2) rand            109.939 sec    9
> -- 3) uark+           105.011 sec    2
> -- 4) r'              100.933 sec    3
> -- 5) sp_/             80.387 sec    1
> -- 6) sp_mapmm         45.491 sec    2
> -- 7) sp_tak+*         40.655 sec    1
> -- 8) append            9.480 sec    1
> -- 9) rangeReIndex      7.347 sec    2
> -- 10) sp_-             6.392 sec    3
> 17/02/04 17:20:33 INFO api.DMLScript: END DML run 02/04/2017 17:20:33
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlock(MatrixBlock.java:363)
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlock(MatrixBlock.java:339)
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlockUnsafe(MatrixBlock.java:408)
>   at 
> org.apache.sysml.runtime.io.MatrixReader.createOutputMatrixBlock(MatrixReader.java:107)
>   at 
> org.apache.sysml.runtime.io.ReaderBinaryBlockParallel.readMatrixFromHDFS(ReaderBinaryBlockParallel.java:59)
>   at 
> org.apache.sysml.runtime.util.DataConverter.readMatrixFromHDFS(DataConverter.java:203)
>   at 
> org.apache.sysml.runtime.util.DataConverter.readMatrixFromHDFS(DataConverter.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromHDFS(MatrixObject.java:425)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromHDFS(MatrixObject.java:60)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.readBlobFromHDFS(CacheableData.java:920)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromRDD(MatrixObject.java:478)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromRDD(MatrixObject.java:60)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.acquireRead(CacheableData.java:411)
>   at 
> org.apache.sysml.runtime.controlprogram.context.ExecutionContext.getMatrixInput(ExecutionContext.java:209)
>   at 
> org.apache.sysml.runtime.instructions.cp.AggregateBinaryCPInstruction.processInstruction(AggregateBinaryCPInstruction.java:74)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:290)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.IfProgramBlock.execute(IfProgramBlock.java:139)
>   at 
> org.apache.sysml.runtime.controlprogram.WhileProgramBlock.execute(WhileProgramBlock.java:165)
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
>   at org.apache.sysml.api.DMLScript.execute(DMLScript.java:684)
> 

[jira] [Commented] (SYSTEMML-1242) Perftest: OutOfMemoryError in MultiLogReg for 80g sparse

2017-02-17 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873013#comment-15873013
 ] 

Matthias Boehm commented on SYSTEMML-1242:
--

Sounds good - thanks for confirming, [~iyounus].

> Perftest: OutOfMemoryError in MultiLogReg for 80g sparse
> 
>
> Key: SYSTEMML-1242
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1242
> Project: SystemML
>  Issue Type: Bug
>  Components: Test
>Affects Versions: SystemML 0.11, SystemML 0.12
> Environment: spark 2.1.0
>Reporter: Imran Younus
>Assignee: Matthias Boehm
> Fix For: SystemML 0.13
>
> Attachments: sparkDML.sh
>
>
> When running the {{runMultiLogReg.sh}} script, {{MultiLogReg.dml}} ends with an
> OutOfMemory error for the case of 10M_1K sparse data and {{icpt = 1}}. Here
> is the end of the log file:
> {code}
> 17/02/04 17:20:33 INFO api.DMLScript: SystemML Statistics:
> Total elapsed time:   697.694 sec.
> Total compilation time:   2.543 sec.
> Total execution time: 695.151 sec.
> Number of compiled Spark inst:    73.
> Number of executed Spark inst:    16.
> Cache hits (Mem, WB, FS, HDFS):   46/9/1/7.
> Cache writes (WB, FS, HDFS):  27/1/1.
> Cache times (ACQr/m, RLS, EXP):   281.541/0.003/131.589/48.737 sec.
> HOP DAGs recompiled (PRED, SB):   0/15.
> HOP DAGs recompile time:  0.067 sec.
> Spark ctx create time (lazy): 31.078 sec.
> Spark trans counts (par,bc,col):  5/4/0.
> Spark trans times (par,bc,col):   46.748/0.392/0.000 secs.
> Total JIT compile time:   151.254 sec.
> Total JVM GC count:   144.
> Total JVM GC time:    220.671 sec.
> Heavy hitter instructions (name, time, count):
> -- 1) ba+*            144.194 sec    3
> -- 2) rand            109.939 sec    9
> -- 3) uark+           105.011 sec    2
> -- 4) r'              100.933 sec    3
> -- 5) sp_/             80.387 sec    1
> -- 6) sp_mapmm         45.491 sec    2
> -- 7) sp_tak+*         40.655 sec    1
> -- 8) append            9.480 sec    1
> -- 9) rangeReIndex      7.347 sec    2
> -- 10) sp_-             6.392 sec    3
> 17/02/04 17:20:33 INFO api.DMLScript: END DML run 02/04/2017 17:20:33
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlock(MatrixBlock.java:363)
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlock(MatrixBlock.java:339)
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlockUnsafe(MatrixBlock.java:408)
>   at 
> org.apache.sysml.runtime.io.MatrixReader.createOutputMatrixBlock(MatrixReader.java:107)
>   at 
> org.apache.sysml.runtime.io.ReaderBinaryBlockParallel.readMatrixFromHDFS(ReaderBinaryBlockParallel.java:59)
>   at 
> org.apache.sysml.runtime.util.DataConverter.readMatrixFromHDFS(DataConverter.java:203)
>   at 
> org.apache.sysml.runtime.util.DataConverter.readMatrixFromHDFS(DataConverter.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromHDFS(MatrixObject.java:425)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromHDFS(MatrixObject.java:60)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.readBlobFromHDFS(CacheableData.java:920)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromRDD(MatrixObject.java:478)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromRDD(MatrixObject.java:60)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.acquireRead(CacheableData.java:411)
>   at 
> org.apache.sysml.runtime.controlprogram.context.ExecutionContext.getMatrixInput(ExecutionContext.java:209)
>   at 
> org.apache.sysml.runtime.instructions.cp.AggregateBinaryCPInstruction.processInstruction(AggregateBinaryCPInstruction.java:74)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:290)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.IfProgramBlock.execute(IfProgramBlock.java:139)
>   at 
> org.apache.sysml.runtime.controlprogram.WhileProgramBlock.execute(WhileProgramBlock.java:165)
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
>   at org.apache.sysml.api.DMLScript.execute(DMLScript.java:684)
>   at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:360)
>   at 

[jira] [Commented] (SYSTEMML-1211) Verify dependencies for Spark 2

2017-02-17 Thread Deron Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872854#comment-15872854
 ] 

Deron Eriksson commented on SYSTEMML-1211:
--

Thank you [~gweidner]! 
[PR400|https://github.com/apache/incubator-systemml/pull/400] addressed the 
Windows Hadoop 2.6.0 issue.


> Verify dependencies for Spark 2
> ---
>
> Key: SYSTEMML-1211
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1211
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Build
>Reporter: Deron Eriksson
>Assignee: Deron Eriksson
>
> With the migration to Spark 2, we should verify that the artifact assemblies 
> are properly handling all dependencies.
> Also, we should verify that the artifact licenses properly include all
> dependencies following the Spark 2 migration.





[jira] [Updated] (SYSTEMML-1283) Out of memory error

2017-02-17 Thread Brendan Dwyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brendan Dwyer updated SYSTEMML-1283:

Description: 
Possibly related to [SYSTEMML-1281]

When a matrix X containing ~13,000 rows and ~30 unique values is passed into
the following DML script, it errors out on my laptop but passes on my 5-node
cluster.

{code}
  # encode dml function for one-hot encoding
  encode_onehot = function(matrix[double] X) return(matrix[double] Y) {
  N = nrow(X)
  Y = table(seq(1, N, 1), X)
  }
  # a dummy read, which allows SystemML to attach variables
  X = read("") 
  
  col_idx = $onehot_index
  
  nc = ncol(X)
  if (col_idx < 1 | col_idx > nc) {
  stop("one hot index out of range")
  }
  Y = matrix(0, rows=1, cols=1)
  oneHot = encode_onehot(X[,col_idx:col_idx])
  if (col_idx == 1) {
  if (col_idx < nc) {
  X_tmp = X[, col_idx+1:nc]
  Y = append(oneHot, X_tmp)
  } else {
  Y = oneHot
  }
  } else if (1 < col_idx & col_idx < nc) {
  Y = append(append(X[,1:col_idx-1], oneHot), X[, col_idx+1:nc])
  } else { # col_idx == nc
  Y = append(X[,1:col_idx-1], oneHot)
  }
  # a dummy write, which allows SystemML to attach variables
  write(Y, "") 
{code}


Error:
{code}
17/02/17 16:57:35 ERROR Executor: Exception in task 0.0 in stage 63.0 (TID 1739)
java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.lang.Double.valueOf(Double.java:519)
at 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply_853$(Unknown
 Source)
at 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply(Unknown
 Source)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at org.apache.spark.util.Utils$$anon$4.next(Utils.scala:1778)
at org.apache.spark.util.Utils$$anon$4.next(Utils.scala:1772)
at 
scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:31)
at 
org.apache.sysml.runtime.instructions.spark.utils.FrameRDDConverterUtils$DataFrameToBinaryBlockFunction.call(FrameRDDConverterUtils.java:748)
at 
org.apache.sysml.runtime.instructions.spark.utils.FrameRDDConverterUtils$DataFrameToBinaryBlockFunction.call(FrameRDDConverterUtils.java:715)
at 
org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:186)
at 
org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:186)
at 
org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:796)
at 
org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:796)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
17/02/17 16:57:35 ERROR TaskSetManager: Task 0 in stage 63.0 failed 1 times; 
aborting job
17/02/17 16:57:36 ERROR SparkUncaughtExceptionHandler: Uncaught exception in 
thread Thread[Executor task launch worker-20,5,main]
java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.lang.Double.valueOf(Double.java:519)
at 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply_853$(Unknown
 Source)
at 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.apply(Unknown
 Source)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at org.apache.spark.util.Utils$$anon$4.next(Utils.scala:1778)
at org.apache.spark.util.Utils$$anon$4.next(Utils.scala:1772)
at 
scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:31)
at 
org.apache.sysml.runtime.instructions.spark.utils.FrameRDDConverterUtils$DataFrameToBinaryBlockFunction.call(FrameRDDConverterUtils.java:748)
at 

[jira] [Commented] (SYSTEMML-1281) OOM Error On Binary Write

2017-02-17 Thread Mike Dusenberry (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872842#comment-15872842
 ] 

Mike Dusenberry commented on SYSTEMML-1281:
---

Thanks for attempting to reproduce this.  Just for clarity, in my case, I have 
100GB executors with 48 cores, thus a single executor per machine.  Also, are 
you using a DataFrame with a {{Vector}} column?  I'm attempting the write now 
with a constrained number of cores (24).  Perhaps you have a different setup?  
As for writing with Spark, it's not currently possible to save a DataFrame with 
a {{Vector}} column to CSV format (Parquet works though).

As for the assumption, yes, it's not hardcoded in the system, but it's been an
implicit bias that has caused many problems and should be eradicated. :)

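As an aside (an illustration added here, not part of the original comment), a minimal
PySpark sketch of the {{Vector}}-column limitation mentioned above; the column names
and paths are made up:

{code}
from pyspark.sql import SparkSession
from pyspark.ml.linalg import Vectors

spark = SparkSession.builder.getOrCreate()

# Tiny stand-in for the (__INDEX, sample) DataFrame discussed in this issue.
df = spark.createDataFrame(
    [(1, Vectors.dense([0.1, 0.2])), (2, Vectors.dense([0.3, 0.4]))],
    ["__INDEX", "sample"])

# Parquet preserves the vector (VectorUDT) column.
df.write.mode("overwrite").parquet("data/sample_vectors.parquet")

# The CSV data source cannot serialize the vector column, so this would fail:
# df.write.csv("data/sample_vectors.csv")
{code}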

> OOM Error On Binary Write
> -
>
> Key: SYSTEMML-1281
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1281
> Project: SystemML
>  Issue Type: Bug
>Affects Versions: SystemML 0.13
>Reporter: Mike Dusenberry
>Priority: Blocker
>
> I'm running into the following heap space OOM error while attempting to save 
> a large Spark DataFrame to a SystemML binary format via DML {{write}} 
> statements.
> Script:
> {code}
> tr_sample_filename = os.path.join("data", "train_{}{}.parquet".format(size, 
> "_grayscale" if grayscale else ""))
> val_sample_filename = os.path.join("data", "val_{}{}.parquet".format(size, 
> "_grayscale" if grayscale else ""))
> train_df = sqlContext.read.load(tr_sample_filename)
> val_df = sqlContext.read.load(val_sample_filename)
> train_df, val_df
> # Note: Must use the row index column, or X may not
> # necessarily correspond correctly to Y
> X_df = train_df.select("__INDEX", "sample")
> X_val_df = val_df.select("__INDEX", "sample")
> y_df = train_df.select("__INDEX", "tumor_score")
> y_val_df = val_df.select("__INDEX", "tumor_score")
> X_df, X_val_df, y_df, y_val_df
> script = """
> # Scale images to [-1,1]
> X = X / 255
> X_val = X_val / 255
> X = X * 2 - 1
> X_val = X_val * 2 - 1
> # One-hot encode the labels
> num_tumor_classes = 3
> n = nrow(y)
> n_val = nrow(y_val)
> Y = table(seq(1, n), y, n, num_tumor_classes)
> Y_val = table(seq(1, n_val), y_val, n_val, num_tumor_classes)
> """
> outputs = ("X", "X_val", "Y", "Y_val")
> script = dml(script).input(X=X_df, X_val=X_val_df, y=y_df, 
> y_val=y_val_df).output(*outputs)
> X, X_val, Y, Y_val = ml.execute(script).get(*outputs)
> X, X_val, Y, Y_val
> script = """
> write(X, "data/systemml/X_"+size+"_"+c+"_binary", format="binary")
> write(Y, "data/systemml/Y_"+size+"_"+c+"_binary", format="binary")
> write(X_val, "data/systemml/X_val_"+size+"_"+c+"_binary", format="binary")
> write(Y_val, "data/systemml/Y_val_"+size+"_"+c+"_binary", format="binary")
> """
> script = dml(script).input(X=X, X_val=X_val, Y=Y, Y_val=Y_val, size=size, c=c)
> ml.execute(script)
> {code}
> General error:
> {code}
> Caused by: org.apache.sysml.api.mlcontext.MLContextException: Exception 
> occurred while executing runtime program
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.executeRuntimeProgram(ScriptExecutor.java:371)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.execute(ScriptExecutor.java:292)
>   at org.apache.sysml.api.mlcontext.MLContext.execute(MLContext.java:293)
>   ... 12 more
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
> block generated from statement block between lines 1 and 11 -- Error 
> evaluating instruction: CP°mvvar°X°¶_Var49¶°binaryblock
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:130)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.executeRuntimeProgram(ScriptExecutor.java:369)
>   ... 14 more
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error 
> in program block generated from statement block between lines 1 and 11 -- 
> Error evaluating instruction: CP°mvvar°X°¶_Var49¶°binaryblock
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:320)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
>   ... 15 more
> Caused by: org.apache.sysml.runtime.controlprogram.caching.CacheException: 
> Move to data/systemml/X_256_3_binary failed.
>   at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.moveData(CacheableData.java:1329)
>   at 
> org.apache.sysml.runtime.instructions.cp.VariableCPInstruction.processMoveInstruction(VariableCPInstruction.java:706)
>   at 

[jira] [Commented] (SYSTEMML-1281) OOM Error On Binary Write

2017-02-17 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872824#comment-15872824
 ] 

Matthias Boehm commented on SYSTEMML-1281:
--

Ok, I just tried to reproduce this error with csv-binaryblock (and
dataset-binaryblock) conversions of dimension 100,000 x 200,000, dense, but both
work fine for me. Could it be that there is some side effect (in terms of
memory consumption) of the data being in parquet format? Let's either (1) write
it out to csv with spark, or (2) configure spark with more head room for user
space and write it to binary. Once this is done, I'd like to have a look at the
data set.

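As a rough illustration of option (2) above (added here; the sizes and the 0.4
value are placeholders, not recommendations), one could launch the session with
more user-space head room by lowering Spark's unified memory fraction:

{code}
from pyspark.sql import SparkSession

# Launch-time settings only; spark.memory.fraction defaults to 0.6 in Spark 2.1,
# and lowering it leaves more heap for user data structures such as the matrix
# blocks built during the dataset-binary block conversion.
spark = (SparkSession.builder
         .config("spark.executor.memory", "100g")   # example size
         .config("spark.driver.memory", "20g")      # example size
         .config("spark.memory.fraction", "0.4")
         .getOrCreate())
{code}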

Btw, there is no such assumption of ~1000 columns; we aim at the general case
of a wide range of matrix shapes (that's one of the reasons why we have square
blocks), but of course we optimized for the typically encountered shapes of
tall and skinny matrices. So, yes, there is room for improving the support of
those kinds of wide matrices.

> OOM Error On Binary Write
> -
>
> Key: SYSTEMML-1281
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1281
> Project: SystemML
>  Issue Type: Bug
>Affects Versions: SystemML 0.13
>Reporter: Mike Dusenberry
>Priority: Blocker
>
> I'm running into the following heap space OOM error while attempting to save 
> a large Spark DataFrame to a SystemML binary format via DML {{write}} 
> statements.
> Script:
> {code}
> tr_sample_filename = os.path.join("data", "train_{}{}.parquet".format(size, 
> "_grayscale" if grayscale else ""))
> val_sample_filename = os.path.join("data", "val_{}{}.parquet".format(size, 
> "_grayscale" if grayscale else ""))
> train_df = sqlContext.read.load(tr_sample_filename)
> val_df = sqlContext.read.load(val_sample_filename)
> train_df, val_df
> # Note: Must use the row index column, or X may not
> # necessarily correspond correctly to Y
> X_df = train_df.select("__INDEX", "sample")
> X_val_df = val_df.select("__INDEX", "sample")
> y_df = train_df.select("__INDEX", "tumor_score")
> y_val_df = val_df.select("__INDEX", "tumor_score")
> X_df, X_val_df, y_df, y_val_df
> script = """
> # Scale images to [-1,1]
> X = X / 255
> X_val = X_val / 255
> X = X * 2 - 1
> X_val = X_val * 2 - 1
> # One-hot encode the labels
> num_tumor_classes = 3
> n = nrow(y)
> n_val = nrow(y_val)
> Y = table(seq(1, n), y, n, num_tumor_classes)
> Y_val = table(seq(1, n_val), y_val, n_val, num_tumor_classes)
> """
> outputs = ("X", "X_val", "Y", "Y_val")
> script = dml(script).input(X=X_df, X_val=X_val_df, y=y_df, 
> y_val=y_val_df).output(*outputs)
> X, X_val, Y, Y_val = ml.execute(script).get(*outputs)
> X, X_val, Y, Y_val
> script = """
> write(X, "data/systemml/X_"+size+"_"+c+"_binary", format="binary")
> write(Y, "data/systemml/Y_"+size+"_"+c+"_binary", format="binary")
> write(X_val, "data/systemml/X_val_"+size+"_"+c+"_binary", format="binary")
> write(Y_val, "data/systemml/Y_val_"+size+"_"+c+"_binary", format="binary")
> """
> script = dml(script).input(X=X, X_val=X_val, Y=Y, Y_val=Y_val, size=size, c=c)
> ml.execute(script)
> {code}
> General error:
> {code}
> Caused by: org.apache.sysml.api.mlcontext.MLContextException: Exception 
> occurred while executing runtime program
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.executeRuntimeProgram(ScriptExecutor.java:371)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.execute(ScriptExecutor.java:292)
>   at org.apache.sysml.api.mlcontext.MLContext.execute(MLContext.java:293)
>   ... 12 more
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
> block generated from statement block between lines 1 and 11 -- Error 
> evaluating instruction: CP°mvvar°X°¶_Var49¶°binaryblock
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:130)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.executeRuntimeProgram(ScriptExecutor.java:369)
>   ... 14 more
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error 
> in program block generated from statement block between lines 1 and 11 -- 
> Error evaluating instruction: CP°mvvar°X°¶_Var49¶°binaryblock
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:320)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
>   ... 15 more
> Caused by: org.apache.sysml.runtime.controlprogram.caching.CacheException: 
> Move to data/systemml/X_256_3_binary failed.
>   at 
> 

[jira] [Commented] (SYSTEMML-1281) OOM Error On Binary Write

2017-02-17 Thread Mike Dusenberry (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872784#comment-15872784
 ] 

Mike Dusenberry commented on SYSTEMML-1281:
---

Writing directly to CSV from the DataFrame inputs also failed.

{code}
script = """
write(X, "data/systemml/X_"+size+"_"+c+".csv", format="csv")
write(Y, "data/systemml/Y_"+size+"_"+c+".csv", format="csv")
write(X_val, "data/systemml/X_val_"+size+"_"+c+".csv", format="csv")
write(Y_val, "data/systemml/Y_val_"+size+"_"+c+".csv", format="csv")
"""
script = dml(script).input(X=X_df, X_val=X_val_df, Y=y_df, Y_val=y_val_df, 
size=size, c=c)
ml.execute(script)
{code}

> OOM Error On Binary Write
> -
>
> Key: SYSTEMML-1281
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1281
> Project: SystemML
>  Issue Type: Bug
>Affects Versions: SystemML 0.13
>Reporter: Mike Dusenberry
>Priority: Blocker
>
> I'm running into the following heap space OOM error while attempting to save 
> a large Spark DataFrame to a SystemML binary format via DML {{write}} 
> statements.
> Script:
> {code}
> tr_sample_filename = os.path.join("data", "train_{}{}.parquet".format(size, 
> "_grayscale" if grayscale else ""))
> val_sample_filename = os.path.join("data", "val_{}{}.parquet".format(size, 
> "_grayscale" if grayscale else ""))
> train_df = sqlContext.read.load(tr_sample_filename)
> val_df = sqlContext.read.load(val_sample_filename)
> train_df, val_df
> # Note: Must use the row index column, or X may not
> # necessarily correspond correctly to Y
> X_df = train_df.select("__INDEX", "sample")
> X_val_df = val_df.select("__INDEX", "sample")
> y_df = train_df.select("__INDEX", "tumor_score")
> y_val_df = val_df.select("__INDEX", "tumor_score")
> X_df, X_val_df, y_df, y_val_df
> script = """
> # Scale images to [-1,1]
> X = X / 255
> X_val = X_val / 255
> X = X * 2 - 1
> X_val = X_val * 2 - 1
> # One-hot encode the labels
> num_tumor_classes = 3
> n = nrow(y)
> n_val = nrow(y_val)
> Y = table(seq(1, n), y, n, num_tumor_classes)
> Y_val = table(seq(1, n_val), y_val, n_val, num_tumor_classes)
> """
> outputs = ("X", "X_val", "Y", "Y_val")
> script = dml(script).input(X=X_df, X_val=X_val_df, y=y_df, 
> y_val=y_val_df).output(*outputs)
> X, X_val, Y, Y_val = ml.execute(script).get(*outputs)
> X, X_val, Y, Y_val
> script = """
> write(X, "data/systemml/X_"+size+"_"+c+"_binary", format="binary")
> write(Y, "data/systemml/Y_"+size+"_"+c+"_binary", format="binary")
> write(X_val, "data/systemml/X_val_"+size+"_"+c+"_binary", format="binary")
> write(Y_val, "data/systemml/Y_val_"+size+"_"+c+"_binary", format="binary")
> """
> script = dml(script).input(X=X, X_val=X_val, Y=Y, Y_val=Y_val, size=size, c=c)
> ml.execute(script)
> {code}
> General error:
> {code}
> Caused by: org.apache.sysml.api.mlcontext.MLContextException: Exception 
> occurred while executing runtime program
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.executeRuntimeProgram(ScriptExecutor.java:371)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.execute(ScriptExecutor.java:292)
>   at org.apache.sysml.api.mlcontext.MLContext.execute(MLContext.java:293)
>   ... 12 more
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
> block generated from statement block between lines 1 and 11 -- Error 
> evaluating instruction: CP°mvvar°X°¶_Var49¶°binaryblock
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:130)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.executeRuntimeProgram(ScriptExecutor.java:369)
>   ... 14 more
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error 
> in program block generated from statement block between lines 1 and 11 -- 
> Error evaluating instruction: CP°mvvar°X°¶_Var49¶°binaryblock
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:320)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
>   ... 15 more
> Caused by: org.apache.sysml.runtime.controlprogram.caching.CacheException: 
> Move to data/systemml/X_256_3_binary failed.
>   at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.moveData(CacheableData.java:1329)
>   at 
> org.apache.sysml.runtime.instructions.cp.VariableCPInstruction.processMoveInstruction(VariableCPInstruction.java:706)
>   at 
> org.apache.sysml.runtime.instructions.cp.VariableCPInstruction.processInstruction(VariableCPInstruction.java:511)
>   at 
> 

[jira] [Commented] (SYSTEMML-1238) Python test failing for LinearRegCG

2017-02-17 Thread Niketan Pansare (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872777#comment-15872777
 ] 

Niketan Pansare commented on SYSTEMML-1238:
---

Thanks Imran :)

> Python test failing for LinearRegCG
> ---
>
> Key: SYSTEMML-1238
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1238
> Project: SystemML
>  Issue Type: Bug
>  Components: Algorithms, APIs
>Affects Versions: SystemML 0.13
>Reporter: Imran Younus
>Assignee: Niketan Pansare
> Fix For: SystemML 0.13
>
> Attachments: python_LinearReg_test_spark.1.6.log, 
> python_LinearReg_test_spark.2.1.log
>
>
> [~deron] discovered that one of the Python tests ({{test_mllearn_df.py}})
> with Spark 2.1.0 was failing because the test score from linear regression
> was very low ({{~ 0.24}}). I did some investigation and it turns out that
> the model parameters computed by the DML script are incorrect. In
> SystemML 0.12, the values of the betas from the linear regression model are
> {{\[152.919, 938.237\]}}. This is what we expect from the normal equation. (I
> also tested this with sklearn.) But the values of the betas from SystemML 0.13
> (with Spark 2.1.0) come out to be {{\[153.146, 458.489\]}}. These are not
> correct and therefore the test score is much lower than expected. The data
> going into the DML script is correct. I printed out the values of {{X}} and {{Y}}
> in DML and I didn't see any issue there.
> Attached are the log files for the two different tests (SystemML 0.12 and 0.13)
> with the explain flag.





[jira] [Commented] (SYSTEMML-1242) Perftest: OutOfMemoryError in MultiLogReg for 80g sparse

2017-02-17 Thread Imran Younus (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872752#comment-15872752
 ] 

Imran Younus commented on SYSTEMML-1242:


[~mboehm7] I ran this test again after your fix, and it completed successfully.
Should we close this JIRA now?


> Perftest: OutOfMemoryError in MultiLogReg for 80g sparse
> 
>
> Key: SYSTEMML-1242
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1242
> Project: SystemML
>  Issue Type: Bug
>  Components: Test
>Affects Versions: SystemML 0.11, SystemML 0.12
> Environment: spark 2.1.0
>Reporter: Imran Younus
>Assignee: Matthias Boehm
> Fix For: SystemML 0.13
>
> Attachments: sparkDML.sh
>
>
> When running the {{runMultiLogReg.sh}} script, {{MultiLogReg.dml}} ends with an
> OutOfMemory error for the case of 10M_1K sparse data and {{icpt = 1}}. Here
> is the end of the log file:
> {code}
> 17/02/04 17:20:33 INFO api.DMLScript: SystemML Statistics:
> Total elapsed time:   697.694 sec.
> Total compilation time:   2.543 sec.
> Total execution time: 695.151 sec.
> Number of compiled Spark inst:    73.
> Number of executed Spark inst:    16.
> Cache hits (Mem, WB, FS, HDFS):   46/9/1/7.
> Cache writes (WB, FS, HDFS):  27/1/1.
> Cache times (ACQr/m, RLS, EXP):   281.541/0.003/131.589/48.737 sec.
> HOP DAGs recompiled (PRED, SB):   0/15.
> HOP DAGs recompile time:  0.067 sec.
> Spark ctx create time (lazy): 31.078 sec.
> Spark trans counts (par,bc,col):  5/4/0.
> Spark trans times (par,bc,col):   46.748/0.392/0.000 secs.
> Total JIT compile time:   151.254 sec.
> Total JVM GC count:   144.
> Total JVM GC time:    220.671 sec.
> Heavy hitter instructions (name, time, count):
> -- 1) ba+*            144.194 sec    3
> -- 2) rand            109.939 sec    9
> -- 3) uark+           105.011 sec    2
> -- 4) r'              100.933 sec    3
> -- 5) sp_/             80.387 sec    1
> -- 6) sp_mapmm         45.491 sec    2
> -- 7) sp_tak+*         40.655 sec    1
> -- 8) append            9.480 sec    1
> -- 9) rangeReIndex      7.347 sec    2
> -- 10) sp_-             6.392 sec    3
> 17/02/04 17:20:33 INFO api.DMLScript: END DML run 02/04/2017 17:20:33
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlock(MatrixBlock.java:363)
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlock(MatrixBlock.java:339)
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlockUnsafe(MatrixBlock.java:408)
>   at 
> org.apache.sysml.runtime.io.MatrixReader.createOutputMatrixBlock(MatrixReader.java:107)
>   at 
> org.apache.sysml.runtime.io.ReaderBinaryBlockParallel.readMatrixFromHDFS(ReaderBinaryBlockParallel.java:59)
>   at 
> org.apache.sysml.runtime.util.DataConverter.readMatrixFromHDFS(DataConverter.java:203)
>   at 
> org.apache.sysml.runtime.util.DataConverter.readMatrixFromHDFS(DataConverter.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromHDFS(MatrixObject.java:425)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromHDFS(MatrixObject.java:60)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.readBlobFromHDFS(CacheableData.java:920)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromRDD(MatrixObject.java:478)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.MatrixObject.readBlobFromRDD(MatrixObject.java:60)
>   at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.acquireRead(CacheableData.java:411)
>   at 
> org.apache.sysml.runtime.controlprogram.context.ExecutionContext.getMatrixInput(ExecutionContext.java:209)
>   at 
> org.apache.sysml.runtime.instructions.cp.AggregateBinaryCPInstruction.processInstruction(AggregateBinaryCPInstruction.java:74)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:290)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.IfProgramBlock.execute(IfProgramBlock.java:139)
>   at 
> org.apache.sysml.runtime.controlprogram.WhileProgramBlock.execute(WhileProgramBlock.java:165)
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
>   at org.apache.sysml.api.DMLScript.execute(DMLScript.java:684)
>   at 

[jira] [Commented] (SYSTEMML-1281) OOM Error On Binary Write

2017-02-17 Thread Mike Dusenberry (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872746#comment-15872746
 ] 

Mike Dusenberry commented on SYSTEMML-1281:
---

I was attempting to write after the matrix transformations.  In general, this
is yet another case that underscores the need to improve our engine by removing
the assumption of ~1000-column matrices and instead assuming matrices with any
number of rows/columns.  Challenging, of course, but important if we want to
truly support scalable ML.

> OOM Error On Binary Write
> -
>
> Key: SYSTEMML-1281
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1281
> Project: SystemML
>  Issue Type: Bug
>Affects Versions: SystemML 0.13
>Reporter: Mike Dusenberry
>Priority: Blocker
>
> I'm running into the following heap space OOM error while attempting to save 
> a large Spark DataFrame to a SystemML binary format via DML {{write}} 
> statements.
> Script:
> {code}
> tr_sample_filename = os.path.join("data", "train_{}{}.parquet".format(size, 
> "_grayscale" if grayscale else ""))
> val_sample_filename = os.path.join("data", "val_{}{}.parquet".format(size, 
> "_grayscale" if grayscale else ""))
> train_df = sqlContext.read.load(tr_sample_filename)
> val_df = sqlContext.read.load(val_sample_filename)
> train_df, val_df
> # Note: Must use the row index column, or X may not
> # necessarily correspond correctly to Y
> X_df = train_df.select("__INDEX", "sample")
> X_val_df = val_df.select("__INDEX", "sample")
> y_df = train_df.select("__INDEX", "tumor_score")
> y_val_df = val_df.select("__INDEX", "tumor_score")
> X_df, X_val_df, y_df, y_val_df
> script = """
> # Scale images to [-1,1]
> X = X / 255
> X_val = X_val / 255
> X = X * 2 - 1
> X_val = X_val * 2 - 1
> # One-hot encode the labels
> num_tumor_classes = 3
> n = nrow(y)
> n_val = nrow(y_val)
> Y = table(seq(1, n), y, n, num_tumor_classes)
> Y_val = table(seq(1, n_val), y_val, n_val, num_tumor_classes)
> """
> outputs = ("X", "X_val", "Y", "Y_val")
> script = dml(script).input(X=X_df, X_val=X_val_df, y=y_df, 
> y_val=y_val_df).output(*outputs)
> X, X_val, Y, Y_val = ml.execute(script).get(*outputs)
> X, X_val, Y, Y_val
> script = """
> write(X, "data/systemml/X_"+size+"_"+c+"_binary", format="binary")
> write(Y, "data/systemml/Y_"+size+"_"+c+"_binary", format="binary")
> write(X_val, "data/systemml/X_val_"+size+"_"+c+"_binary", format="binary")
> write(Y_val, "data/systemml/Y_val_"+size+"_"+c+"_binary", format="binary")
> """
> script = dml(script).input(X=X, X_val=X_val, Y=Y, Y_val=Y_val, size=size, c=c)
> ml.execute(script)
> {code}
> General error:
> {code}
> Caused by: org.apache.sysml.api.mlcontext.MLContextException: Exception 
> occurred while executing runtime program
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.executeRuntimeProgram(ScriptExecutor.java:371)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.execute(ScriptExecutor.java:292)
>   at org.apache.sysml.api.mlcontext.MLContext.execute(MLContext.java:293)
>   ... 12 more
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
> block generated from statement block between lines 1 and 11 -- Error 
> evaluating instruction: CP°mvvar°X°¶_Var49¶°binaryblock
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:130)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.executeRuntimeProgram(ScriptExecutor.java:369)
>   ... 14 more
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error 
> in program block generated from statement block between lines 1 and 11 -- 
> Error evaluating instruction: CP°mvvar°X°¶_Var49¶°binaryblock
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:320)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
>   ... 15 more
> Caused by: org.apache.sysml.runtime.controlprogram.caching.CacheException: 
> Move to data/systemml/X_256_3_binary failed.
>   at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.moveData(CacheableData.java:1329)
>   at 
> org.apache.sysml.runtime.instructions.cp.VariableCPInstruction.processMoveInstruction(VariableCPInstruction.java:706)
>   at 
> org.apache.sysml.runtime.instructions.cp.VariableCPInstruction.processInstruction(VariableCPInstruction.java:511)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:290)
>   ... 18 more
> Caused by: 

[jira] [Resolved] (SYSTEMML-1277) DataFrames With `mllib.Vector` Columns Are No Longer Converted to Matrices.

2017-02-17 Thread Mike Dusenberry (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Dusenberry resolved SYSTEMML-1277.
---
   Resolution: Fixed
Fix Version/s: SystemML 0.13

This fixed my real-world case.  Thanks, [~deron]!

> DataFrames With `mllib.Vector` Columns Are No Longer Converted to Matrices.
> ---
>
> Key: SYSTEMML-1277
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1277
> Project: SystemML
>  Issue Type: Bug
>Affects Versions: SystemML 0.13
>Reporter: Mike Dusenberry
>Assignee: Deron Eriksson
>Priority: Blocker
> Fix For: SystemML 0.13
>
>
> Recently, we made the switch from the old {{mllib.Vector}} to the new 
> {{ml.Vector}} type.  Unfortunately, this leaves us with the issue of no 
> longer recognizing DataFrames with {{mllib.Vector}} columns during 
> conversion, and thus, we (1) do not correctly convert to SystemML {{Matrix}} 
> objects, (2) instead fall back on conversion to {{Frame}} objects, and then 
> (3) fail completely when the ensuing DML script expects to operate on
> matrices.
> Given a Spark {{DataFrame}} {{X_df}} of type {{DataFrame\[__INDEX: int, 
> sample: vector\]}}, where {{vector}} is of type {{mllib.Vector}}, the 
> following script will now fail (did not previously):
> {code}
> script = """
> # Scale images to [-1,1]
> X = X / 255
> X = X * 2 - 1
> """
> outputs = ("X")
> script = dml(script).input(X=X_df).output(*outputs)
> X = ml.execute(script).get(*outputs)
> X
> {code}
> {code}
> Caused by: org.apache.sysml.api.mlcontext.MLContextException: Exception 
> occurred while validating script
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.validateScript(ScriptExecutor.java:487)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.execute(ScriptExecutor.java:280)
>   at org.apache.sysml.api.mlcontext.MLContext.execute(MLContext.java:293)
>   ... 12 more
> Caused by: org.apache.sysml.parser.LanguageException: Invalid Parameters : 
> ERROR: null -- line 4, column 4 -- Invalid Datatypes for operation FRAME 
> SCALAR
>   at 
> org.apache.sysml.parser.Expression.raiseValidateError(Expression.java:549)
>   at 
> org.apache.sysml.parser.Expression.computeDataType(Expression.java:415)
>   at 
> org.apache.sysml.parser.Expression.computeDataType(Expression.java:386)
>   at 
> org.apache.sysml.parser.BinaryExpression.validateExpression(BinaryExpression.java:130)
>   at 
> org.apache.sysml.parser.StatementBlock.validate(StatementBlock.java:567)
>   at 
> org.apache.sysml.parser.DMLTranslator.validateParseTree(DMLTranslator.java:140)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.validateScript(ScriptExecutor.java:485)
>   ... 14 more
> {code}





[jira] [Closed] (SYSTEMML-1277) DataFrames With `mllib.Vector` Columns Are No Longer Converted to Matrices.

2017-02-17 Thread Mike Dusenberry (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Dusenberry closed SYSTEMML-1277.
-

> DataFrames With `mllib.Vector` Columns Are No Longer Converted to Matrices.
> ---
>
> Key: SYSTEMML-1277
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1277
> Project: SystemML
>  Issue Type: Bug
>Affects Versions: SystemML 0.13
>Reporter: Mike Dusenberry
>Assignee: Deron Eriksson
>Priority: Blocker
> Fix For: SystemML 0.13
>
>
> Recently, we made the switch from the old {{mllib.Vector}} to the new 
> {{ml.Vector}} type.  Unfortunately, this leaves us with the issue of no 
> longer recognizing DataFrames with {{mllib.Vector}} columns during 
> conversion, and thus, we (1) do not correctly convert to SystemML {{Matrix}} 
> objects, (2) instead fall back on conversion to {{Frame}} objects, and then 
> (3) fail completely when the ensuing DML script expects to operate on
> matrices.
> Given a Spark {{DataFrame}} {{X_df}} of type {{DataFrame\[__INDEX: int, 
> sample: vector\]}}, where {{vector}} is of type {{mllib.Vector}}, the 
> following script will now fail (did not previously):
> {code}
> script = """
> # Scale images to [-1,1]
> X = X / 255
> X = X * 2 - 1
> """
> outputs = ("X")
> script = dml(script).input(X=X_df).output(*outputs)
> X = ml.execute(script).get(*outputs)
> X
> {code}
> {code}
> Caused by: org.apache.sysml.api.mlcontext.MLContextException: Exception 
> occurred while validating script
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.validateScript(ScriptExecutor.java:487)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.execute(ScriptExecutor.java:280)
>   at org.apache.sysml.api.mlcontext.MLContext.execute(MLContext.java:293)
>   ... 12 more
> Caused by: org.apache.sysml.parser.LanguageException: Invalid Parameters : 
> ERROR: null -- line 4, column 4 -- Invalid Datatypes for operation FRAME 
> SCALAR
>   at 
> org.apache.sysml.parser.Expression.raiseValidateError(Expression.java:549)
>   at 
> org.apache.sysml.parser.Expression.computeDataType(Expression.java:415)
>   at 
> org.apache.sysml.parser.Expression.computeDataType(Expression.java:386)
>   at 
> org.apache.sysml.parser.BinaryExpression.validateExpression(BinaryExpression.java:130)
>   at 
> org.apache.sysml.parser.StatementBlock.validate(StatementBlock.java:567)
>   at 
> org.apache.sysml.parser.DMLTranslator.validateParseTree(DMLTranslator.java:140)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.validateScript(ScriptExecutor.java:485)
>   ... 14 more
> {code}





[jira] [Resolved] (SYSTEMML-1238) Python test failing for LinearRegCG

2017-02-17 Thread Niketan Pansare (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niketan Pansare resolved SYSTEMML-1238.
---
   Resolution: Fixed
Fix Version/s: SystemML 0.13

Fixed in commit
https://github.com/apache/incubator-systemml/commit/9d0087cbbd250c9b486923555b450602f816cf19
by setting the regularization to 0 (similar to scikit-learn's default).

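For reference, here is an illustration (added here, not part of the original note;
the exact test setup is assumed) of the scikit-learn baseline behind the betas
quoted in the issue description; the classic single-feature diabetes example yields
an intercept and coefficient of roughly 152.9 and 938.2:

{code}
import numpy as np
from sklearn import datasets, linear_model

# Classic scikit-learn example: plain ordinary least squares, i.e. no regularization.
diabetes = datasets.load_diabetes()
X = diabetes.data[:, np.newaxis, 2]   # single feature
y = diabetes.target

regr = linear_model.LinearRegression().fit(X[:-20], y[:-20])
print(regr.intercept_, regr.coef_)    # roughly 152.9 and 938.2
{code}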

> Python test failing for LinearRegCG
> ---
>
> Key: SYSTEMML-1238
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1238
> Project: SystemML
>  Issue Type: Bug
>  Components: Algorithms, APIs
>Affects Versions: SystemML 0.13
>Reporter: Imran Younus
>Assignee: Niketan Pansare
> Fix For: SystemML 0.13
>
> Attachments: python_LinearReg_test_spark.1.6.log, 
> python_LinearReg_test_spark.2.1.log
>
>
> [~deron] discovered that one of the Python tests ({{test_mllearn_df.py}})
> with Spark 2.1.0 was failing because the test score from linear regression
> was very low ({{~ 0.24}}). I did some investigation and it turns out that
> the model parameters computed by the DML script are incorrect. In
> SystemML 0.12, the values of the betas from the linear regression model are
> {{\[152.919, 938.237\]}}. This is what we expect from the normal equation. (I
> also tested this with sklearn.) But the values of the betas from SystemML 0.13
> (with Spark 2.1.0) come out to be {{\[153.146, 458.489\]}}. These are not
> correct and therefore the test score is much lower than expected. The data
> going into the DML script is correct. I printed out the values of {{X}} and {{Y}}
> in DML and I didn't see any issue there.
> Attached are the log files for the two different tests (SystemML 0.12 and 0.13)
> with the explain flag.





[jira] [Resolved] (SYSTEMML-1252) Performance stratstats script

2017-02-17 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm resolved SYSTEMML-1252.
--
   Resolution: Done
Fix Version/s: SystemML 0.13

> Performance stratstats script 
> --
>
> Key: SYSTEMML-1252
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1252
> Project: SystemML
>  Issue Type: Task
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.13
>
>






[jira] [Resolved] (SYSTEMML-1255) New fused operator tack+* in CP and Spark

2017-02-17 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm resolved SYSTEMML-1255.
--
   Resolution: Done
 Assignee: Matthias Boehm
Fix Version/s: SystemML 0.13

> New fused operator tack+* in CP and Spark
> -
>
> Key: SYSTEMML-1255
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1255
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Compiler
>Reporter: Matthias Boehm
>Assignee: Matthias Boehm
> Fix For: SystemML 0.13
>
>
> Similar to the existing tak+* operator, this new tack+* operator fuses two or
> three binary multiply operations and the final column-wise aggregation
> colSums(X*Y*Z) in order to avoid materializing the intermediates, which is
> very expensive compared to the cheap multiply and sum operations.

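For illustration (added here, not part of the original description), a minimal
Python MLContext sketch of the DML pattern this operator targets, assuming a
running SparkSession {{spark}}; the matrix sizes and names are made up:

{code}
from systemml import MLContext, dml

ml = MLContext(spark)
script = dml("""
X = rand(rows=10000, cols=1000)
Y = rand(rows=10000, cols=1000)
Z = rand(rows=10000, cols=1000)
# Without fusion, X*Y*Z materializes two dense intermediates before the
# column-wise aggregation; the fused tack+* computes colSums(X*Y*Z) in one pass.
s = colSums(X * Y * Z)
""").output("s")
s = ml.execute(script).get("s")
{code}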





[jira] [Commented] (SYSTEMML-1281) OOM Error On Binary Write

2017-02-17 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872653#comment-15872653
 ] 

Matthias Boehm commented on SYSTEMML-1281:
--

Did you try to write the ORIGINAL dataset to csv, or only after the matrix
transformations? The latter wouldn't help because we convert to binary block for
these operations anyway, and it is the dataset-binary block conversion that is
failing.

> OOM Error On Binary Write
> -
>
> Key: SYSTEMML-1281
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1281
> Project: SystemML
>  Issue Type: Bug
>Affects Versions: SystemML 0.13
>Reporter: Mike Dusenberry
>Priority: Blocker
>
> I'm running into the following heap space OOM error while attempting to save 
> a large Spark DataFrame to a SystemML binary format via DML {{write}} 
> statements.
> Script:
> {code}
> tr_sample_filename = os.path.join("data", "train_{}{}.parquet".format(size, 
> "_grayscale" if grayscale else ""))
> val_sample_filename = os.path.join("data", "val_{}{}.parquet".format(size, 
> "_grayscale" if grayscale else ""))
> train_df = sqlContext.read.load(tr_sample_filename)
> val_df = sqlContext.read.load(val_sample_filename)
> train_df, val_df
> # Note: Must use the row index column, or X may not
> # necessarily correspond correctly to Y
> X_df = train_df.select("__INDEX", "sample")
> X_val_df = val_df.select("__INDEX", "sample")
> y_df = train_df.select("__INDEX", "tumor_score")
> y_val_df = val_df.select("__INDEX", "tumor_score")
> X_df, X_val_df, y_df, y_val_df
> script = """
> # Scale images to [-1,1]
> X = X / 255
> X_val = X_val / 255
> X = X * 2 - 1
> X_val = X_val * 2 - 1
> # One-hot encode the labels
> num_tumor_classes = 3
> n = nrow(y)
> n_val = nrow(y_val)
> Y = table(seq(1, n), y, n, num_tumor_classes)
> Y_val = table(seq(1, n_val), y_val, n_val, num_tumor_classes)
> """
> outputs = ("X", "X_val", "Y", "Y_val")
> script = dml(script).input(X=X_df, X_val=X_val_df, y=y_df, 
> y_val=y_val_df).output(*outputs)
> X, X_val, Y, Y_val = ml.execute(script).get(*outputs)
> X, X_val, Y, Y_val
> script = """
> write(X, "data/systemml/X_"+size+"_"+c+"_binary", format="binary")
> write(Y, "data/systemml/Y_"+size+"_"+c+"_binary", format="binary")
> write(X_val, "data/systemml/X_val_"+size+"_"+c+"_binary", format="binary")
> write(Y_val, "data/systemml/Y_val_"+size+"_"+c+"_binary", format="binary")
> """
> script = dml(script).input(X=X, X_val=X_val, Y=Y, Y_val=Y_val, size=size, c=c)
> ml.execute(script)
> {code}
> General error:
> {code}
> Caused by: org.apache.sysml.api.mlcontext.MLContextException: Exception 
> occurred while executing runtime program
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.executeRuntimeProgram(ScriptExecutor.java:371)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.execute(ScriptExecutor.java:292)
>   at org.apache.sysml.api.mlcontext.MLContext.execute(MLContext.java:293)
>   ... 12 more
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
> block generated from statement block between lines 1 and 11 -- Error 
> evaluating instruction: CP°mvvar°X°¶_Var49¶°binaryblock
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:130)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.executeRuntimeProgram(ScriptExecutor.java:369)
>   ... 14 more
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error 
> in program block generated from statement block between lines 1 and 11 -- 
> Error evaluating instruction: CP°mvvar°X°¶_Var49¶°binaryblock
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:320)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
>   ... 15 more
> Caused by: org.apache.sysml.runtime.controlprogram.caching.CacheException: 
> Move to data/systemml/X_256_3_binary failed.
>   at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.moveData(CacheableData.java:1329)
>   at 
> org.apache.sysml.runtime.instructions.cp.VariableCPInstruction.processMoveInstruction(VariableCPInstruction.java:706)
>   at 
> org.apache.sysml.runtime.instructions.cp.VariableCPInstruction.processInstruction(VariableCPInstruction.java:511)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:290)
>   ... 18 more
> Caused by: org.apache.sysml.runtime.controlprogram.caching.CacheException: 
> Export to data/systemml/X_256_3_binary failed.
>  

[jira] [Commented] (SYSTEMML-1281) OOM Error On Binary Write

2017-02-17 Thread Mike Dusenberry (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872641#comment-15872641
 ] 

Mike Dusenberry commented on SYSTEMML-1281:
---

Yeah, I tried to write it to CSV using DML, but still ran into the OOM error. I 
can try writing to CSV with Spark directly.
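
A rough sketch of that direct Spark write (hypothetical and untested; it flattens the vector column to a comma-separated string, since the CSV writer does not handle vector columns, and the output path is illustrative):

{code}
# Hypothetical sketch: bypass SystemML and write the training DataFrame to CSV
# with Spark directly; "data/train_csv" is a made-up output path.
from pyspark.sql.functions import col, udf
from pyspark.sql.types import StringType

vec_to_str = udf(lambda v: ",".join(str(x) for x in v.toArray()), StringType())
(train_df
  .select("__INDEX", vec_to_str(col("sample")).alias("sample"))
  .write.mode("overwrite")
  .csv("data/train_csv"))
{code}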

> OOM Error On Binary Write
> -
>
> Key: SYSTEMML-1281
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1281
> Project: SystemML
>  Issue Type: Bug
>Affects Versions: SystemML 0.13
>Reporter: Mike Dusenberry
>Priority: Blocker
>
> I'm running into the following heap space OOM error while attempting to save 
> a large Spark DataFrame to a SystemML binary format via DML {{write}} 
> statements.
> Script:
> {code}
> tr_sample_filename = os.path.join("data", "train_{}{}.parquet".format(size, 
> "_grayscale" if grayscale else ""))
> val_sample_filename = os.path.join("data", "val_{}{}.parquet".format(size, 
> "_grayscale" if grayscale else ""))
> train_df = sqlContext.read.load(tr_sample_filename)
> val_df = sqlContext.read.load(val_sample_filename)
> train_df, val_df
> # Note: Must use the row index column, or X may not
> # necessarily correspond correctly to Y
> X_df = train_df.select("__INDEX", "sample")
> X_val_df = val_df.select("__INDEX", "sample")
> y_df = train_df.select("__INDEX", "tumor_score")
> y_val_df = val_df.select("__INDEX", "tumor_score")
> X_df, X_val_df, y_df, y_val_df
> script = """
> # Scale images to [-1,1]
> X = X / 255
> X_val = X_val / 255
> X = X * 2 - 1
> X_val = X_val * 2 - 1
> # One-hot encode the labels
> num_tumor_classes = 3
> n = nrow(y)
> n_val = nrow(y_val)
> Y = table(seq(1, n), y, n, num_tumor_classes)
> Y_val = table(seq(1, n_val), y_val, n_val, num_tumor_classes)
> """
> outputs = ("X", "X_val", "Y", "Y_val")
> script = dml(script).input(X=X_df, X_val=X_val_df, y=y_df, 
> y_val=y_val_df).output(*outputs)
> X, X_val, Y, Y_val = ml.execute(script).get(*outputs)
> X, X_val, Y, Y_val
> script = """
> write(X, "data/systemml/X_"+size+"_"+c+"_binary", format="binary")
> write(Y, "data/systemml/Y_"+size+"_"+c+"_binary", format="binary")
> write(X_val, "data/systemml/X_val_"+size+"_"+c+"_binary", format="binary")
> write(Y_val, "data/systemml/Y_val_"+size+"_"+c+"_binary", format="binary")
> """
> script = dml(script).input(X=X, X_val=X_val, Y=Y, Y_val=Y_val, size=size, c=c)
> ml.execute(script)
> {code}
> General error:
> {code}
> Caused by: org.apache.sysml.api.mlcontext.MLContextException: Exception 
> occurred while executing runtime program
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.executeRuntimeProgram(ScriptExecutor.java:371)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.execute(ScriptExecutor.java:292)
>   at org.apache.sysml.api.mlcontext.MLContext.execute(MLContext.java:293)
>   ... 12 more
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
> block generated from statement block between lines 1 and 11 -- Error 
> evaluating instruction: CP°mvvar°X°¶_Var49¶°binaryblock
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:130)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.executeRuntimeProgram(ScriptExecutor.java:369)
>   ... 14 more
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error 
> in program block generated from statement block between lines 1 and 11 -- 
> Error evaluating instruction: CP°mvvar°X°¶_Var49¶°binaryblock
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:320)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
>   ... 15 more
> Caused by: org.apache.sysml.runtime.controlprogram.caching.CacheException: 
> Move to data/systemml/X_256_3_binary failed.
>   at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.moveData(CacheableData.java:1329)
>   at 
> org.apache.sysml.runtime.instructions.cp.VariableCPInstruction.processMoveInstruction(VariableCPInstruction.java:706)
>   at 
> org.apache.sysml.runtime.instructions.cp.VariableCPInstruction.processInstruction(VariableCPInstruction.java:511)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:290)
>   ... 18 more
> Caused by: org.apache.sysml.runtime.controlprogram.caching.CacheException: 
> Export to data/systemml/X_256_3_binary failed.
>   at 
> 

[jira] [Commented] (SYSTEMML-1211) Verify dependencies for Spark 2

2017-02-17 Thread Glenn Weidner (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872614#comment-15872614
 ] 

Glenn Weidner commented on SYSTEMML-1211:
-

Yes [~mboehm7] - I can also reproduce the test failures on my system and will 
look into updating the hadoop_bin_windows.

> Verify dependencies for Spark 2
> ---
>
> Key: SYSTEMML-1211
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1211
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Build
>Reporter: Deron Eriksson
>Assignee: Deron Eriksson
>
> With the migration to Spark 2, we should verify that the artifact assemblies 
> are properly handling all dependencies.
> Also, we should verify that the artifact licenses properly include all 
> dependencies following the Spark 2 migration.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SYSTEMML-1211) Verify dependencies for Spark 2

2017-02-17 Thread Deron Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872463#comment-15872463
 ] 

Deron Eriksson commented on SYSTEMML-1211:
--

License for standalone jar artifact updated by 
https://github.com/apache/incubator-systemml/commit/184e02dac008ff1aa524b32455d0cb391d7cb484

> Verify dependencies for Spark 2
> ---
>
> Key: SYSTEMML-1211
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1211
> Project: SystemML
>  Issue Type: Sub-task
>  Components: Build
>Reporter: Deron Eriksson
>Assignee: Deron Eriksson
>
> With the migration to Spark 2, we should verify that the artifact assemblies 
> are properly handling all dependencies.
> Also, we should verify that the artifact licenses properly include all 
> dependencies following the Spark 2 migration.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SYSTEMML-1281) OOM Error On Binary Write

2017-02-17 Thread Mike Dusenberry (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872451#comment-15872451
 ] 

Mike Dusenberry commented on SYSTEMML-1281:
---

cc [~fschueler], [~acs_s], [~nakul02], [~niketanpansare], [~mboehm7], 
[~reinw...@us.ibm.com]

> OOM Error On Binary Write
> -
>
> Key: SYSTEMML-1281
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1281
> Project: SystemML
>  Issue Type: Bug
>Affects Versions: SystemML 0.13
>Reporter: Mike Dusenberry
>Priority: Blocker
>
> I'm running into the following heap space OOM error while attempting to save 
> a large Spark DataFrame to a SystemML binary format via DML {{write}} 
> statements.
> Script:
> {code}
> tr_sample_filename = os.path.join("data", "train_{}{}.parquet".format(size, 
> "_grayscale" if grayscale else ""))
> val_sample_filename = os.path.join("data", "val_{}{}.parquet".format(size, 
> "_grayscale" if grayscale else ""))
> train_df = sqlContext.read.load(tr_sample_filename)
> val_df = sqlContext.read.load(val_sample_filename)
> train_df, val_df
> # Note: Must use the row index column, or X may not
> # necessarily correspond correctly to Y
> X_df = train_df.select("__INDEX", "sample")
> X_val_df = val_df.select("__INDEX", "sample")
> y_df = train_df.select("__INDEX", "tumor_score")
> y_val_df = val_df.select("__INDEX", "tumor_score")
> X_df, X_val_df, y_df, y_val_df
> script = """
> # Scale images to [-1,1]
> X = X / 255
> X_val = X_val / 255
> X = X * 2 - 1
> X_val = X_val * 2 - 1
> # One-hot encode the labels
> num_tumor_classes = 3
> n = nrow(y)
> n_val = nrow(y_val)
> Y = table(seq(1, n), y, n, num_tumor_classes)
> Y_val = table(seq(1, n_val), y_val, n_val, num_tumor_classes)
> """
> outputs = ("X", "X_val", "Y", "Y_val")
> script = dml(script).input(X=X_df, X_val=X_val_df, y=y_df, 
> y_val=y_val_df).output(*outputs)
> X, X_val, Y, Y_val = ml.execute(script).get(*outputs)
> X, X_val, Y, Y_val
> script = """
> write(X, "data/systemml/X_"+size+"_"+c+"_binary", format="binary")
> write(Y, "data/systemml/Y_"+size+"_"+c+"_binary", format="binary")
> write(X_val, "data/systemml/X_val_"+size+"_"+c+"_binary", format="binary")
> write(Y_val, "data/systemml/Y_val_"+size+"_"+c+"_binary", format="binary")
> """
> script = dml(script).input(X=X, X_val=X_val, Y=Y, Y_val=Y_val, size=size, c=c)
> ml.execute(script)
> {code}
> General error:
> {code}
> Caused by: org.apache.sysml.api.mlcontext.MLContextException: Exception 
> occurred while executing runtime program
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.executeRuntimeProgram(ScriptExecutor.java:371)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.execute(ScriptExecutor.java:292)
>   at org.apache.sysml.api.mlcontext.MLContext.execute(MLContext.java:293)
>   ... 12 more
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
> block generated from statement block between lines 1 and 11 -- Error 
> evaluating instruction: CP°mvvar°X°¶_Var49¶°binaryblock
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:130)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.executeRuntimeProgram(ScriptExecutor.java:369)
>   ... 14 more
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error 
> in program block generated from statement block between lines 1 and 11 -- 
> Error evaluating instruction: CP°mvvar°X°¶_Var49¶°binaryblock
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:320)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
>   ... 15 more
> Caused by: org.apache.sysml.runtime.controlprogram.caching.CacheException: 
> Move to data/systemml/X_256_3_binary failed.
>   at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.moveData(CacheableData.java:1329)
>   at 
> org.apache.sysml.runtime.instructions.cp.VariableCPInstruction.processMoveInstruction(VariableCPInstruction.java:706)
>   at 
> org.apache.sysml.runtime.instructions.cp.VariableCPInstruction.processInstruction(VariableCPInstruction.java:511)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:290)
>   ... 18 more
> Caused by: org.apache.sysml.runtime.controlprogram.caching.CacheException: 
> Export to data/systemml/X_256_3_binary failed.
>   at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.exportData(CacheableData.java:800)
>   at 
> 

[jira] [Comment Edited] (SYSTEMML-1281) OOM Error On Binary Write

2017-02-17 Thread Mike Dusenberry (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872447#comment-15872447
 ] 

Mike Dusenberry edited comment on SYSTEMML-1281 at 2/17/17 8:13 PM:


Well, I found it while working on the deep learning breast cancer project.  
However, the specific code is actually not tied specifically to deep learning, 
and affects any code that wishes to write a DataFrame out to SystemML binary 
format.


was (Author: mwdus...@us.ibm.com):
Well, I found it while working on the deep learning breast cancer project.  
However, the specific code is actual not tied specifically to deep learning, 
and affects any code that wishes to write a DataFrame out to SystemML binary 
format.

> OOM Error On Binary Write
> -
>
> Key: SYSTEMML-1281
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1281
> Project: SystemML
>  Issue Type: Bug
>Affects Versions: SystemML 0.13
>Reporter: Mike Dusenberry
>Priority: Blocker
>
> I'm running into the following heap space OOM error while attempting to save 
> a large Spark DataFrame to a SystemML binary format via DML {{write}} 
> statements.
> Script:
> {code}
> tr_sample_filename = os.path.join("data", "train_{}{}.parquet".format(size, 
> "_grayscale" if grayscale else ""))
> val_sample_filename = os.path.join("data", "val_{}{}.parquet".format(size, 
> "_grayscale" if grayscale else ""))
> train_df = sqlContext.read.load(tr_sample_filename)
> val_df = sqlContext.read.load(val_sample_filename)
> train_df, val_df
> # Note: Must use the row index column, or X may not
> # necessarily correspond correctly to Y
> X_df = train_df.select("__INDEX", "sample")
> X_val_df = val_df.select("__INDEX", "sample")
> y_df = train_df.select("__INDEX", "tumor_score")
> y_val_df = val_df.select("__INDEX", "tumor_score")
> X_df, X_val_df, y_df, y_val_df
> script = """
> # Scale images to [-1,1]
> X = X / 255
> X_val = X_val / 255
> X = X * 2 - 1
> X_val = X_val * 2 - 1
> # One-hot encode the labels
> num_tumor_classes = 3
> n = nrow(y)
> n_val = nrow(y_val)
> Y = table(seq(1, n), y, n, num_tumor_classes)
> Y_val = table(seq(1, n_val), y_val, n_val, num_tumor_classes)
> """
> outputs = ("X", "X_val", "Y", "Y_val")
> script = dml(script).input(X=X_df, X_val=X_val_df, y=y_df, 
> y_val=y_val_df).output(*outputs)
> X, X_val, Y, Y_val = ml.execute(script).get(*outputs)
> X, X_val, Y, Y_val
> script = """
> write(X, "data/systemml/X_"+size+"_"+c+"_binary", format="binary")
> write(Y, "data/systemml/Y_"+size+"_"+c+"_binary", format="binary")
> write(X_val, "data/systemml/X_val_"+size+"_"+c+"_binary", format="binary")
> write(Y_val, "data/systemml/Y_val_"+size+"_"+c+"_binary", format="binary")
> """
> script = dml(script).input(X=X, X_val=X_val, Y=Y, Y_val=Y_val, size=size, c=c)
> ml.execute(script)
> {code}
> General error:
> {code}
> Caused by: org.apache.sysml.api.mlcontext.MLContextException: Exception 
> occurred while executing runtime program
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.executeRuntimeProgram(ScriptExecutor.java:371)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.execute(ScriptExecutor.java:292)
>   at org.apache.sysml.api.mlcontext.MLContext.execute(MLContext.java:293)
>   ... 12 more
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
> block generated from statement block between lines 1 and 11 -- Error 
> evaluating instruction: CP°mvvar°X°¶_Var49¶°binaryblock
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:130)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.executeRuntimeProgram(ScriptExecutor.java:369)
>   ... 14 more
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error 
> in program block generated from statement block between lines 1 and 11 -- 
> Error evaluating instruction: CP°mvvar°X°¶_Var49¶°binaryblock
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:320)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
>   ... 15 more
> Caused by: org.apache.sysml.runtime.controlprogram.caching.CacheException: 
> Move to data/systemml/X_256_3_binary failed.
>   at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.moveData(CacheableData.java:1329)
>   at 
> org.apache.sysml.runtime.instructions.cp.VariableCPInstruction.processMoveInstruction(VariableCPInstruction.java:706)
>   at 
> 

[jira] [Commented] (SYSTEMML-1281) OOM Error On Binary Write

2017-02-17 Thread Mike Dusenberry (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872447#comment-15872447
 ] 

Mike Dusenberry commented on SYSTEMML-1281:
---

Well, I found it while working on the deep learning breast cancer project.  
However, the specific code is actually not tied specifically to deep learning, 
and affects any code that wishes to write a DataFrame out to SystemML binary 
format.

> OOM Error On Binary Write
> -
>
> Key: SYSTEMML-1281
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1281
> Project: SystemML
>  Issue Type: Bug
>Affects Versions: SystemML 0.13
>Reporter: Mike Dusenberry
>Priority: Blocker
>
> I'm running into the following heap space OOM error while attempting to save 
> a large Spark DataFrame to a SystemML binary format via DML {{write}} 
> statements.
> Script:
> {code}
> tr_sample_filename = os.path.join("data", "train_{}{}.parquet".format(size, 
> "_grayscale" if grayscale else ""))
> val_sample_filename = os.path.join("data", "val_{}{}.parquet".format(size, 
> "_grayscale" if grayscale else ""))
> train_df = sqlContext.read.load(tr_sample_filename)
> val_df = sqlContext.read.load(val_sample_filename)
> train_df, val_df
> # Note: Must use the row index column, or X may not
> # necessarily correspond correctly to Y
> X_df = train_df.select("__INDEX", "sample")
> X_val_df = val_df.select("__INDEX", "sample")
> y_df = train_df.select("__INDEX", "tumor_score")
> y_val_df = val_df.select("__INDEX", "tumor_score")
> X_df, X_val_df, y_df, y_val_df
> script = """
> # Scale images to [-1,1]
> X = X / 255
> X_val = X_val / 255
> X = X * 2 - 1
> X_val = X_val * 2 - 1
> # One-hot encode the labels
> num_tumor_classes = 3
> n = nrow(y)
> n_val = nrow(y_val)
> Y = table(seq(1, n), y, n, num_tumor_classes)
> Y_val = table(seq(1, n_val), y_val, n_val, num_tumor_classes)
> """
> outputs = ("X", "X_val", "Y", "Y_val")
> script = dml(script).input(X=X_df, X_val=X_val_df, y=y_df, 
> y_val=y_val_df).output(*outputs)
> X, X_val, Y, Y_val = ml.execute(script).get(*outputs)
> X, X_val, Y, Y_val
> script = """
> write(X, "data/systemml/X_"+size+"_"+c+"_binary", format="binary")
> write(Y, "data/systemml/Y_"+size+"_"+c+"_binary", format="binary")
> write(X_val, "data/systemml/X_val_"+size+"_"+c+"_binary", format="binary")
> write(Y_val, "data/systemml/Y_val_"+size+"_"+c+"_binary", format="binary")
> """
> script = dml(script).input(X=X, X_val=X_val, Y=Y, Y_val=Y_val, size=size, c=c)
> ml.execute(script)
> {code}
> General error:
> {code}
> Caused by: org.apache.sysml.api.mlcontext.MLContextException: Exception 
> occurred while executing runtime program
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.executeRuntimeProgram(ScriptExecutor.java:371)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.execute(ScriptExecutor.java:292)
>   at org.apache.sysml.api.mlcontext.MLContext.execute(MLContext.java:293)
>   ... 12 more
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
> block generated from statement block between lines 1 and 11 -- Error 
> evaluating instruction: CP°mvvar°X°¶_Var49¶°binaryblock
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:130)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.executeRuntimeProgram(ScriptExecutor.java:369)
>   ... 14 more
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error 
> in program block generated from statement block between lines 1 and 11 -- 
> Error evaluating instruction: CP°mvvar°X°¶_Var49¶°binaryblock
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:320)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
>   ... 15 more
> Caused by: org.apache.sysml.runtime.controlprogram.caching.CacheException: 
> Move to data/systemml/X_256_3_binary failed.
>   at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.moveData(CacheableData.java:1329)
>   at 
> org.apache.sysml.runtime.instructions.cp.VariableCPInstruction.processMoveInstruction(VariableCPInstruction.java:706)
>   at 
> org.apache.sysml.runtime.instructions.cp.VariableCPInstruction.processInstruction(VariableCPInstruction.java:511)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:290)
>   ... 18 more
> Caused by: org.apache.sysml.runtime.controlprogram.caching.CacheException: 
> Export to data/systemml/X_256_3_binary 

[jira] [Commented] (SYSTEMML-1277) DataFrames With `mllib.Vector` Columns Are No Longer Converted to Matrices.

2017-02-17 Thread Xin Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872364#comment-15872364
 ] 

Xin Wu commented on SYSTEMML-1277:
--

Is this issue also for Deep Learning?

> DataFrames With `mllib.Vector` Columns Are No Longer Converted to Matrices.
> ---
>
> Key: SYSTEMML-1277
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1277
> Project: SystemML
>  Issue Type: Bug
>Affects Versions: SystemML 0.13
>Reporter: Mike Dusenberry
>Assignee: Deron Eriksson
>Priority: Blocker
>
> Recently, we made the switch from the old {{mllib.Vector}} to the new 
> {{ml.Vector}} type.  Unfortunately, this leaves us with the issue of no 
> longer recognizing DataFrames with {{mllib.Vector}} columns during 
> conversion, and thus, we (1) do not correctly convert to SystemML {{Matrix}} 
> objects, (2) instead fall back on conversion to {{Frame}} objects, and then 
> (3) fail completely when the ensuing DML script is expecting to operate on 
> matrices.
> Given a Spark {{DataFrame}} {{X_df}} of type {{DataFrame\[__INDEX: int, 
> sample: vector\]}}, where {{vector}} is of type {{mllib.Vector}}, the 
> following script will now fail (did not previously):
> {code}
> script = """
> # Scale images to [-1,1]
> X = X / 255
> X = X * 2 - 1
> """
> outputs = ("X")
> script = dml(script).input(X=X_df).output(*outputs)
> X = ml.execute(script).get(*outputs)
> X
> {code}
> {code}
> Caused by: org.apache.sysml.api.mlcontext.MLContextException: Exception 
> occurred while validating script
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.validateScript(ScriptExecutor.java:487)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.execute(ScriptExecutor.java:280)
>   at org.apache.sysml.api.mlcontext.MLContext.execute(MLContext.java:293)
>   ... 12 more
> Caused by: org.apache.sysml.parser.LanguageException: Invalid Parameters : 
> ERROR: null -- line 4, column 4 -- Invalid Datatypes for operation FRAME 
> SCALAR
>   at 
> org.apache.sysml.parser.Expression.raiseValidateError(Expression.java:549)
>   at 
> org.apache.sysml.parser.Expression.computeDataType(Expression.java:415)
>   at 
> org.apache.sysml.parser.Expression.computeDataType(Expression.java:386)
>   at 
> org.apache.sysml.parser.BinaryExpression.validateExpression(BinaryExpression.java:130)
>   at 
> org.apache.sysml.parser.StatementBlock.validate(StatementBlock.java:567)
>   at 
> org.apache.sysml.parser.DMLTranslator.validateParseTree(DMLTranslator.java:140)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.validateScript(ScriptExecutor.java:485)
>   ... 14 more
> {code}
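
As a possible interim workaround (an assumption, not something stated in the issue, and untested), the {{mllib.Vector}} column could be converted to the new {{ml.Vector}} type on the Spark side before handing the DataFrame to SystemML, continuing the script above:

{code}
# Hypothetical workaround sketch: convert the old mllib.Vector column ("sample")
# to the new ml.Vector type before passing the DataFrame to SystemML.
from pyspark.mllib.util import MLUtils

X_df_ml = MLUtils.convertVectorColumnsToML(X_df, "sample")
script = dml(script).input(X=X_df_ml).output(*outputs)
X = ml.execute(script).get(*outputs)
{code}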



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SYSTEMML-1281) OOM Error On Binary Write

2017-02-17 Thread Xin Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872358#comment-15872358
 ] 

Xin Wu commented on SYSTEMML-1281:
--

This issue is related to Deep learning, right?

> OOM Error On Binary Write
> -
>
> Key: SYSTEMML-1281
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1281
> Project: SystemML
>  Issue Type: Bug
>Affects Versions: SystemML 0.13
>Reporter: Mike Dusenberry
>Priority: Blocker
>
> I'm running into the following heap space OOM error while attempting to save 
> a large Spark DataFrame to a SystemML binary format via DML {{write}} 
> statements.
> Script:
> {code}
> tr_sample_filename = os.path.join("data", "train_{}{}.parquet".format(size, 
> "_grayscale" if grayscale else ""))
> val_sample_filename = os.path.join("data", "val_{}{}.parquet".format(size, 
> "_grayscale" if grayscale else ""))
> train_df = sqlContext.read.load(tr_sample_filename)
> val_df = sqlContext.read.load(val_sample_filename)
> train_df, val_df
> # Note: Must use the row index column, or X may not
> # necessarily correspond correctly to Y
> X_df = train_df.select("__INDEX", "sample")
> X_val_df = val_df.select("__INDEX", "sample")
> y_df = train_df.select("__INDEX", "tumor_score")
> y_val_df = val_df.select("__INDEX", "tumor_score")
> X_df, X_val_df, y_df, y_val_df
> script = """
> # Scale images to [-1,1]
> X = X / 255
> X_val = X_val / 255
> X = X * 2 - 1
> X_val = X_val * 2 - 1
> # One-hot encode the labels
> num_tumor_classes = 3
> n = nrow(y)
> n_val = nrow(y_val)
> Y = table(seq(1, n), y, n, num_tumor_classes)
> Y_val = table(seq(1, n_val), y_val, n_val, num_tumor_classes)
> """
> outputs = ("X", "X_val", "Y", "Y_val")
> script = dml(script).input(X=X_df, X_val=X_val_df, y=y_df, 
> y_val=y_val_df).output(*outputs)
> X, X_val, Y, Y_val = ml.execute(script).get(*outputs)
> X, X_val, Y, Y_val
> script = """
> write(X, "data/systemml/X_"+size+"_"+c+"_binary", format="binary")
> write(Y, "data/systemml/Y_"+size+"_"+c+"_binary", format="binary")
> write(X_val, "data/systemml/X_val_"+size+"_"+c+"_binary", format="binary")
> write(Y_val, "data/systemml/Y_val_"+size+"_"+c+"_binary", format="binary")
> """
> script = dml(script).input(X=X, X_val=X_val, Y=Y, Y_val=Y_val, size=size, c=c)
> ml.execute(script)
> {code}
> General error:
> {code}
> Caused by: org.apache.sysml.api.mlcontext.MLContextException: Exception 
> occurred while executing runtime program
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.executeRuntimeProgram(ScriptExecutor.java:371)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.execute(ScriptExecutor.java:292)
>   at org.apache.sysml.api.mlcontext.MLContext.execute(MLContext.java:293)
>   ... 12 more
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: 
> org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program 
> block generated from statement block between lines 1 and 11 -- Error 
> evaluating instruction: CP°mvvar°X°¶_Var49¶°binaryblock
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:130)
>   at 
> org.apache.sysml.api.mlcontext.ScriptExecutor.executeRuntimeProgram(ScriptExecutor.java:369)
>   ... 14 more
> Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error 
> in program block generated from statement block between lines 1 and 11 -- 
> Error evaluating instruction: CP°mvvar°X°¶_Var49¶°binaryblock
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:320)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:168)
>   at 
> org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:123)
>   ... 15 more
> Caused by: org.apache.sysml.runtime.controlprogram.caching.CacheException: 
> Move to data/systemml/X_256_3_binary failed.
>   at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.moveData(CacheableData.java:1329)
>   at 
> org.apache.sysml.runtime.instructions.cp.VariableCPInstruction.processMoveInstruction(VariableCPInstruction.java:706)
>   at 
> org.apache.sysml.runtime.instructions.cp.VariableCPInstruction.processInstruction(VariableCPInstruction.java:511)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:290)
>   ... 18 more
> Caused by: org.apache.sysml.runtime.controlprogram.caching.CacheException: 
> Export to data/systemml/X_256_3_binary failed.
>   at 
> org.apache.sysml.runtime.controlprogram.caching.CacheableData.exportData(CacheableData.java:800)
>   at 
> 

[jira] [Created] (SYSTEMML-1282) Determine required avro jars for bin artifacts

2017-02-17 Thread Deron Eriksson (JIRA)
Deron Eriksson created SYSTEMML-1282:


 Summary: Determine required avro jars for bin artifacts
 Key: SYSTEMML-1282
 URL: https://issues.apache.org/jira/browse/SYSTEMML-1282
 Project: SystemML
  Issue Type: Task
  Components: Build
Reporter: Deron Eriksson
Priority: Minor


The current -bin (tgz and zip) artifacts have the following avro jars in them:
{code}
avro-1.7.4.jar
avro-ipc-1.7.7-tests.jar
avro-ipc-1.7.7.jar
avro-mapred-1.7.7-hadoop2.jar
{code}

Determine if avro-ipc-1.7.7-tests.jar, avro-ipc-1.7.7.jar, and 
avro-mapred-1.7.7-hadoop2.jar are needed. If not, exclude them from bin 
artifacts.

If any are needed, determine if a single version (1.7.4 or 1.7.7) should be 
used, and use that version.




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (SYSTEMML-1243) Perftest: OutOfMemoryError in stratstats.dml for 800MB case

2017-02-17 Thread Matthias Boehm (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871492#comment-15871492
 ] 

Matthias Boehm commented on SYSTEMML-1243:
--

Just to clarify, our default 100k scenario with 100 features runs fine. The 
scenario here, however, uses 1000 features, which makes stratstats more 
challenging. I was able to reproduce this OOM, even after the recent changes 
that already reduce memory pressure.

The core problem comes from several matrix multiplications of the following 
form, where we've chosen mapmm (with repartitioning at the runtime level in 
order to overcome Spark's 2GB limitation per partition):
{code}
mapmm: rdd [10 x 1000, nnz=95000819, blocks (1000 x 1000)] 800MB
mapmm: bc [1000 x 100, nnz=100, blocks (1000 x 1000)] 172MB
--> output: 10 x 100
{code}

However, because the RDD has only 100 blocks, this gives us an upper bound on 
the maximum number of input partitions, hindering us from repartitioning this 
RDD to our preferred number of partitions, which in turn causes overly large 
outputs per task (partition).
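
As a back-of-the-envelope check (simple arithmetic, assuming the 100k x 1k input uses the 1,000 x 1,000 blocks shown above), the block count, and hence the partition bound, works out as follows:

{code}
# Rough sketch of the partition bound: a 100,000 x 1,000 input with
# 1,000 x 1,000 blocks yields only 100 blocks, i.e., at most 100 input partitions.
rows, cols, blk = 100000, 1000, 1000
num_blocks = ((rows + blk - 1) // blk) * ((cols + blk - 1) // blk)
print(num_blocks)  # 100
{code}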

I can think of three potential directions going forward:

1) Flip the RDD and broadcast at runtime if we detect that it would be 
beneficial for repartitioning (in this case changing the upper bound by 10x).

2) Alternative matrix multiplication operations: traditionally, we would have 
applied RMM for these scenarios, but replication can similarly lead to large 
task outputs. Alternatively, we could consider enabling pmapmm for production 
use.

3) Extended permutation matrix multiply (pmm): so far, we only support selection 
but not permutation matrices, and we are only able to detect this within a DAG, 
which would not apply here. One option would be to keep track of operations that 
produce such special matrices and flag the corresponding intermediates.

> Perftest: OutOfMemoryError in stratstats.dml for 800MB case
> ---
>
> Key: SYSTEMML-1243
> URL: https://issues.apache.org/jira/browse/SYSTEMML-1243
> Project: SystemML
>  Issue Type: Bug
>  Components: Test
>Affects Versions: SystemML 0.13
> Environment: spark 2.1.0
>Reporter: Imran Younus
>Assignee: Matthias Boehm
> Fix For: SystemML 0.13
>
> Attachments: sparkDML.sh
>
>
> When running the {{runAllStats.sh}} script, {{stratstats.dml}} ends with an 
> OutOfMemory error for the 100k_1k data set. Here is the end of the log file:
> {code}
> 17/02/06 16:09:25 INFO api.DMLScript: SystemML Statistics:
> Total elapsed time:   1435.880 sec.
> Total compilation time:   2.433 sec.
> Total execution time: 1433.447 sec.
> Number of compiled Spark inst:190.
> Number of executed Spark inst:3.
> Cache hits (Mem, WB, FS, HDFS):   72343/3/4/7.
> Cache writes (WB, FS, HDFS):  10419/5/0.
> Cache times (ACQr/m, RLS, EXP):   387.598/0.039/277.658/0.000 sec.
> HOP DAGs recompiled (PRED, SB):   0/107.
> HOP DAGs recompile time:  0.207 sec.
> Functions recompiled: 3.
> Functions recompile time: 0.026 sec.
> Spark ctx create time (lazy): 36.537 sec.
> Spark trans counts (par,bc,col):3/3/0.
> Spark trans times (par,bc,col):   0.404/0.147/0.000 secs.
> Total JIT compile time:   63.262 sec.
> Total JVM GC count:   57.
> Total JVM GC time:34.538 sec.
> Heavy hitter instructions (name, time, count):
> -- 1) wdivmm  1078.568 sec5
> -- 2) ba+*286.854 sec 22
> -- 3) sp_mapmm37.244 sec  3
> -- 4) fStat_tailprob  2.071 sec   3
> -- 5) rangeReIndex1.608 sec   30601
> -- 6) ==  0.974 sec   11
> -- 7) ^2  0.793 sec   13
> -- 8) cdf 0.603 sec   10200
> -- 9) replace 0.349 sec   10
> -- 10)r'  0.278 sec   106
> 17/02/06 16:09:25 INFO api.DMLScript: END DML run 02/06/2017 16:09:25
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlock(MatrixBlock.java:363)
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseBlock(MatrixBlock.java:339)
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.allocateDenseOrSparseBlock(MatrixBlock.java:346)
>   at 
> org.apache.sysml.runtime.matrix.data.LibMatrixMult.matrixMultWDivMM(LibMatrixMult.java:752)
>   at 
> org.apache.sysml.runtime.matrix.data.MatrixBlock.quaternaryOperations(MatrixBlock.java:5475)
>   at 
> org.apache.sysml.runtime.instructions.cp.QuaternaryCPInstruction.processInstruction(QuaternaryCPInstruction.java:128)
>   at 
> org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:290)
>   at 
>