[jira] [Updated] (SYSTEMML-749) Failed nrow call after spark removeEmpty operation
[ https://issues.apache.org/jira/browse/SYSTEMML-749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias Boehm updated SYSTEMML-749:
------------------------------------
Description: In the special case of removeEmpty over a completely empty input matrix, the spark removeEmpty instruction creates an invalid output of dimensions [0 x ncol(in)] or [nrow(in) x 0] which is not supported in SystemML. Accordingly, any subsequent operation would have undefined behavior; in case of meta data operations like nrow or ncol, this actually leads to an explicit error with the following message: "Invalid meta data returned by nrow: 0".

> Failed nrow call after spark removeEmpty operation
> --------------------------------------------------
>
>                 Key: SYSTEMML-749
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-749
>             Project: SystemML
>          Issue Type: Bug
>          Components: Runtime
>    Affects Versions: SystemML 0.10
>            Reporter: Matthias Boehm

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
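The failure mode described above can be reproduced in miniature outside SystemML. The following is only a sketch in plain Python, not SystemML code: the helper names (`remove_empty_rows`, `nrow`, `remove_empty_rows_guarded`) are hypothetical, and padding to a single all-zero row is an assumed guard chosen for illustration, not necessarily the fix adopted for this issue.

```python
from typing import List

Matrix = List[List[float]]


def remove_empty_rows(m: Matrix) -> Matrix:
    """Drop rows that contain only zeros (no guard for an all-empty input)."""
    return [row for row in m if any(v != 0 for v in row)]


def nrow(m: Matrix) -> int:
    """Metadata lookup that rejects 0 rows, mimicking the reported error."""
    if len(m) == 0:
        raise ValueError("Invalid meta data returned by nrow: 0")
    return len(m)


def remove_empty_rows_guarded(m: Matrix, ncol: int) -> Matrix:
    """Hypothetical guard: pad an all-empty result to one all-zero row."""
    out = remove_empty_rows(m)
    return out if out else [[0.0] * ncol]


empty = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
unguarded = remove_empty_rows(empty)           # 0 x 3 result, invalid downstream
guarded = remove_empty_rows_guarded(empty, 3)  # 1 x 3 result, valid downstream
```

Calling `nrow(unguarded)` raises the error, while `nrow(guarded)` succeeds, which is the difference a guard in the instruction would have to provide.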
[jira] [Created] (SYSTEMML-749) Failed nrow call after spark removeEmpty operation
Matthias Boehm created SYSTEMML-749:
---------------------------------------

             Summary: Failed nrow call after spark removeEmpty operation
                 Key: SYSTEMML-749
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-749
             Project: SystemML
          Issue Type: Bug
          Components: Runtime
    Affects Versions: SystemML 0.10
            Reporter: Matthias Boehm
[jira] [Created] (SYSTEMML-748) Built-in support for performance metrics
Manoj Kumar created SYSTEMML-748:
------------------------------------

             Summary: Built-in support for performance metrics
                 Key: SYSTEMML-748
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-748
             Project: SystemML
          Issue Type: Task
          Components: APIs
            Reporter: Manoj Kumar
            Priority: Minor

More often than not, I would like to just predict using a given model and try out different scoring / error metrics independently. It would be great to have, for instance, scripts like "accuracy.dml", which would just take y_true and y_pred and output the accuracy. The same applies to auc, mse, r2_score and friends.

Let me know your thoughts.
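To make the request concrete, here is a rough sketch of what such standalone metric helpers compute, written in plain Python rather than DML. The function names follow the ones mentioned in the request; the signatures and the error handling are assumptions made for illustration, not an existing SystemML API.

```python
from typing import Sequence


def _check(y_true: Sequence[float], y_pred: Sequence[float]) -> int:
    """Validate that both inputs are non-empty and of equal length."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    return len(y_true)


def accuracy(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Fraction of predictions that exactly match the true labels."""
    n = _check(y_true, y_pred)
    return sum(t == p for t, p in zip(y_true, y_pred)) / n


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between true values and predictions."""
    n = _check(y_true, y_pred)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    n = _check(y_true, y_pred)
    mean = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("r2_score is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot
```

Each helper depends only on y_true and y_pred, so it can be applied to the output of any model without rerunning training, which is the independence the request asks for.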
[jira] [Updated] (SYSTEMML-747) Wrong in-memory csv reblock decision w/ unknowns
[ https://issues.apache.org/jira/browse/SYSTEMML-747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias Boehm updated SYSTEMML-747:
------------------------------------
Description: The decision on cp in-memory reblock for text input matrices is made based on the estimated size in memory. For csv reblock, we support persistent reads with unknown dimension sizes. In scenarios with unknown dimensions the memory estimate is always negative, resulting always in in-memory reblocks which either take very long or even run out of memory.

> Wrong in-memory csv reblock decision w/ unknowns
> ------------------------------------------------
>
>                 Key: SYSTEMML-747
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-747
>             Project: SystemML
>          Issue Type: Bug
>          Components: Runtime
>    Affects Versions: SystemML 0.10
>            Reporter: Matthias Boehm
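The reported decision bug can be illustrated with a small sketch. It is plain Python, not SystemML code, and everything in it is an assumption made for illustration: the `UNKNOWN = -1` sentinel, the 8-bytes-per-cell dense estimate, and the function names. It shows only why a negative estimate always passes a plain "fits in the budget" comparison and how an explicit check for unknowns avoids that.

```python
UNKNOWN = -1  # assumed sentinel for a dimension that is not known up front


def estimate_size_bytes(rows: int, cols: int) -> int:
    """Dense estimate at 8 bytes per cell; negative when a dimension is unknown."""
    if rows < 0 or cols < 0:
        return -1
    return rows * cols * 8


def naive_in_memory(rows: int, cols: int, budget: int) -> bool:
    """Buggy decision: a negative estimate is always below the budget."""
    return estimate_size_bytes(rows, cols) < budget


def guarded_in_memory(rows: int, cols: int, budget: int) -> bool:
    """Guarded decision: unknown dimensions never qualify for in-memory reblock."""
    est = estimate_size_bytes(rows, cols)
    return est >= 0 and est < budget


budget = 1024  # hypothetical memory budget in bytes
naive_choice = naive_in_memory(UNKNOWN, UNKNOWN, budget)      # picks in-memory
guarded_choice = guarded_in_memory(UNKNOWN, UNKNOWN, budget)  # rejects in-memory
```

With known dimensions both variants agree; they differ only when a dimension is unknown, which is the csv case described in the report.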
[jira] [Created] (SYSTEMML-747) Wrong in-memory csv reblock decision w/ unknowns
Matthias Boehm created SYSTEMML-747:
---------------------------------------

             Summary: Wrong in-memory csv reblock decision w/ unknowns
                 Key: SYSTEMML-747
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-747
             Project: SystemML
          Issue Type: Bug
          Components: Runtime
    Affects Versions: SystemML 0.10
            Reporter: Matthias Boehm
[jira] [Commented] (SYSTEMML-745) Spark context not initialized error under yarn-cluster for small data
[ https://issues.apache.org/jira/browse/SYSTEMML-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15312698#comment-15312698 ]

Glenn Weidner commented on SYSTEMML-745:
-----------------------------------------

The test was done against Spark 1.6.1. If the issue also occurs in Spark 2.0, then a Spark JIRA/PR may be required.

> Spark context not initialized error under yarn-cluster for small data
> ---------------------------------------------------------------------
>
>                 Key: SYSTEMML-745
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-745
>             Project: SystemML
>          Issue Type: Bug
>          Components: Runtime
>            Reporter: Glenn Weidner
>
> Sample command to reproduce issue:
> ./bin/spark-submit --master yarn-cluster --class org.apache.sysml.api.DMLScript ../systemml/lib/systemml-0.10.0-incubating.jar -s "print('Hello Apache SystemML')"
> Work-around is to add '-exec spark':
> ./bin/spark-submit --master yarn-cluster --class org.apache.sysml.api.DMLScript ../systemml/lib/systemml-0.10.0-incubating.jar -s "print('Hello Apache SystemML')" -exec spark