[jira] [Updated] (SYSTEMML-749) Failed nrow call after spark removeEmpty operation

2016-06-02 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-749:

Description: In the special case of removeEmpty over a completely empty 
input matrix, the spark removeEmpty instruction creates an invalid output of 
dimensions [0 x ncol(in)] or [nrow(in) x 0], which SystemML does not support. 
Accordingly, any subsequent operation has undefined behavior; for metadata 
operations such as nrow or ncol, this leads to an explicit error with the 
following message: "Invalid meta data returned by nrow: 0".

> Failed nrow call after spark removeEmpty operation
> --
>
> Key: SYSTEMML-749
> URL: https://issues.apache.org/jira/browse/SYSTEMML-749
> Project: SystemML
> Issue Type: Bug
> Components: Runtime
> Affects Versions: SystemML 0.10
> Reporter: Matthias Boehm
>
> In the special case of removeEmpty over a completely empty input matrix, the 
> spark removeEmpty instruction creates an invalid output of dimensions [0 x 
> ncol(in)] or [nrow(in) x 0], which SystemML does not support. Accordingly, 
> any subsequent operation has undefined behavior; for metadata operations 
> such as nrow or ncol, this leads to an explicit error with the following 
> message: "Invalid meta data returned by nrow: 0".
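The degenerate case described above can be illustrated with a small sketch. This is plain Python, not SystemML code; the function names are hypothetical, and the guard that falls back to a single zero row is only one possible way to avoid a 0-row result, assumed here for illustration rather than taken from the issue.

```python
def remove_empty_rows(matrix):
    """Drop rows that contain only zeros; may return zero rows."""
    return [row for row in matrix if any(v != 0 for v in row)]


def remove_empty_rows_guarded(matrix, ncol):
    """Hypothetical guard: never return a result with zero rows.

    If every row was empty, fall back to a single all-zero row so that
    the output still has valid dimensions [1 x ncol].
    """
    out = remove_empty_rows(matrix)
    if not out:
        return [[0] * ncol]
    return out


empty_input = [[0, 0, 0, 0] for _ in range(3)]

unguarded = remove_empty_rows(empty_input)
guarded = remove_empty_rows_guarded(empty_input, ncol=4)

# The unguarded result has 0 rows, the shape the issue calls invalid.
print(len(unguarded))
# The guarded result keeps one zero row, so a later row count is >= 1.
print(len(guarded), len(guarded[0]))
```

With the guard in place, a subsequent row count on the result is at least 1 even when the input is completely empty.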



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (SYSTEMML-749) Failed nrow call after spark removeEmpty operation

2016-06-02 Thread Matthias Boehm (JIRA)
Matthias Boehm created SYSTEMML-749:
---

 Summary: Failed nrow call after spark removeEmpty operation
 Key: SYSTEMML-749
 URL: https://issues.apache.org/jira/browse/SYSTEMML-749
 Project: SystemML
  Issue Type: Bug
  Components: Runtime
Affects Versions: SystemML 0.10
Reporter: Matthias Boehm








[jira] [Created] (SYSTEMML-748) Built-in support for performance metrics

2016-06-02 Thread Manoj Kumar (JIRA)
Manoj Kumar created SYSTEMML-748:


 Summary: Built-in support for performance metrics
 Key: SYSTEMML-748
 URL: https://issues.apache.org/jira/browse/SYSTEMML-748
 Project: SystemML
  Issue Type: Task
  Components: APIs
Reporter: Manoj Kumar
Priority: Minor


More often than not, I would like to just predict using a given model and try 
out different scoring / error metrics independently.

It would be great to have, for instance, scripts like "accuracy.dml", which 
would just take y_true and y_pred and output the accuracy. The same applies 
to auc, mse, r2_score and friends.

Let me know your thoughts.
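
As a rough illustration of the interface requested above, the following sketch shows standalone metric functions that take only y_true and y_pred. It is written in plain Python rather than DML, and the function names and error handling are assumptions made for illustration, not an existing SystemML API.

```python
def _check(y_true, y_pred):
    """Reject empty or mismatched inputs before computing a metric."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels exactly."""
    _check(y_true, y_pred)
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error between true and predicted values."""
    _check(y_true, y_pred)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    _check(y_true, y_pred)
    mean = sum(y_true) / len(y_true)
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("r2_score is undefined when y_true is constant")
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return 1 - ss_res / ss_tot


print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
print(mse([1.0, 2.0], [1.0, 4.0]))
```

Each function is independent of how the predictions were produced, which is the point of the request: the same y_pred can be scored with several metrics without re-running the model.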





[jira] [Updated] (SYSTEMML-747) Wrong in-memory csv reblock decision w/ unknowns

2016-06-02 Thread Matthias Boehm (JIRA)

 [ 
https://issues.apache.org/jira/browse/SYSTEMML-747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Boehm updated SYSTEMML-747:

Description: The decision on CP in-memory reblock for text input matrices 
is based on the estimated size in memory. For csv reblock, we support 
persistent reads with unknown dimension sizes. With unknown dimensions, the 
memory estimate is always negative, so the in-memory reblock is always 
chosen, which either takes very long or even runs out of memory.

> Wrong in-memory csv reblock decision w/ unknowns
> 
>
> Key: SYSTEMML-747
> URL: https://issues.apache.org/jira/browse/SYSTEMML-747
> Project: SystemML
> Issue Type: Bug
> Components: Runtime
> Affects Versions: SystemML 0.10
> Reporter: Matthias Boehm
>
> The decision on CP in-memory reblock for text input matrices is based on 
> the estimated size in memory. For csv reblock, we support persistent reads 
> with unknown dimension sizes. With unknown dimensions, the memory estimate 
> is always negative, so the in-memory reblock is always chosen, which either 
> takes very long or even runs out of memory.
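
The failure mode described above can be sketched as follows. This is an illustrative Python model, not SystemML's actual code: the estimator, the -1 sentinel for unknown dimensions, the 8-bytes-per-cell dense estimate, and all function names are assumptions.

```python
UNKNOWN = -1  # assumed sentinel for an unknown dimension


def estimate_size_bytes(rows, cols):
    """Hypothetical dense estimate; negative when a dimension is unknown."""
    if rows < 0 or cols < 0:
        return -1
    return rows * cols * 8


def naive_in_memory(rows, cols, budget_bytes):
    """Buggy decision: a negative estimate always fits the budget."""
    return estimate_size_bytes(rows, cols) < budget_bytes


def guarded_in_memory(rows, cols, budget_bytes):
    """Guarded decision: only trust estimates that are actually known."""
    estimate = estimate_size_bytes(rows, cols)
    return 0 <= estimate < budget_bytes


budget = 1024
# Unknown dimensions: the naive check wrongly picks the in-memory reblock.
print(naive_in_memory(UNKNOWN, UNKNOWN, budget))
# The guarded check rejects the in-memory reblock when the size is unknown.
print(guarded_in_memory(UNKNOWN, UNKNOWN, budget))
```

The only difference between the two checks is the lower bound on the estimate, which treats "unknown" as "do not assume it fits" rather than as a very small size.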





[jira] [Created] (SYSTEMML-747) Wrong in-memory csv reblock decision w/ unknowns

2016-06-02 Thread Matthias Boehm (JIRA)
Matthias Boehm created SYSTEMML-747:
---

 Summary: Wrong in-memory csv reblock decision w/ unknowns
 Key: SYSTEMML-747
 URL: https://issues.apache.org/jira/browse/SYSTEMML-747
 Project: SystemML
  Issue Type: Bug
  Components: Runtime
Affects Versions: SystemML 0.10
Reporter: Matthias Boehm








[jira] [Commented] (SYSTEMML-745) Spark context not initialized error under yarn-cluster for small data

2016-06-02 Thread Glenn Weidner (JIRA)

[ 
https://issues.apache.org/jira/browse/SYSTEMML-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15312698#comment-15312698
 ] 

Glenn Weidner commented on SYSTEMML-745:


The test was run against Spark 1.6.1. If the issue also occurs in Spark 2.0, 
then a Spark JIRA/PR may be required.

> Spark context not initialized error under yarn-cluster for small data
> -
>
> Key: SYSTEMML-745
> URL: https://issues.apache.org/jira/browse/SYSTEMML-745
> Project: SystemML
> Issue Type: Bug
> Components: Runtime
> Reporter: Glenn Weidner
>
> Sample command to reproduce the issue:
> ./bin/spark-submit --master yarn-cluster --class 
> org.apache.sysml.api.DMLScript ../systemml/lib/systemml-0.10.0-incubating.jar 
> -s "print('Hello Apache SystemML')"
> The workaround is to add '-exec spark':
> ./bin/spark-submit --master yarn-cluster --class 
> org.apache.sysml.api.DMLScript ../systemml/lib/systemml-0.10.0-incubating.jar 
> -s "print('Hello Apache SystemML')" -exec spark


