[ https://issues.apache.org/jira/browse/SPARK-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangrui Meng resolved SPARK-1580.
----------------------------------
    Resolution: Fixed
    Fix Version/s: 1.1.0

Issue resolved by pull request 1731
[https://github.com/apache/spark/pull/1731]

> [MLlib] ALS: Estimate communication and computation costs given a partitioner
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-1580
>                 URL: https://issues.apache.org/jira/browse/SPARK-1580
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: Tor Myklebust
>            Assignee: Tor Myklebust
>            Priority: Minor
>             Fix For: 1.1.0
>
>
> It would be nice to be able to estimate the amount of work needed to solve an
> ALS problem. The chief components of this "work" are computation time---time
> spent forming and solving the least squares problems---and communication
> cost---the number of bytes sent across the network. Communication cost
> depends heavily on how the users and products are partitioned.
> We currently do not try to cluster users or products so that fewer feature
> vectors need to be communicated. This is intended as a first step toward
> that end---we ought to be able to tell whether one partitioning is better
> than another.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
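
For illustration only, the kind of estimate the issue describes might be sketched as follows. This is not the code from pull request 1731, and it does not use any Spark API. The function name `estimate_als_costs`, the partitioner callables, and the cost model itself are all assumptions: each feature vector is taken to be sent once to every block that references it, forming the least squares problems is taken to cost about rank^2 operations per rating, and solving each one about rank^3.

```python
from collections import defaultdict


def estimate_als_costs(ratings, user_block, product_block, rank, bytes_per_value=8):
    """Rough per-iteration cost model for blocked ALS (hypothetical, illustrative only).

    ratings       -- list of (user, product) pairs that have a rating
    user_block    -- callable mapping a user id to its block (the "partitioner")
    product_block -- callable mapping a product id to its block
    rank          -- number of latent features per user/product vector
    """
    # For each product, which user blocks need its feature vector, and vice versa.
    product_dests = defaultdict(set)
    user_dests = defaultdict(set)
    for user, product in ratings:
        product_dests[product].add(user_block(user))
        user_dests[user].add(product_block(product))

    # Assumption: a vector is shipped once to every distinct block that uses it.
    vectors_sent = (sum(len(blocks) for blocks in product_dests.values())
                    + sum(len(blocks) for blocks in user_dests.values()))
    comm_bytes = vectors_sent * rank * bytes_per_value

    # Assumption: each rating contributes ~rank^2 work in both half-iterations,
    # and each user/product system costs ~rank^3 to solve.
    n_systems = len(user_dests) + len(product_dests)
    flops = 2 * len(ratings) * rank ** 2 + n_systems * rank ** 3
    return comm_bytes, flops


# Toy data: users 0,1 rate products 0,1; users 2,3 rate products 2,3.
ratings = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2), (2, 3), (3, 2), (3, 3)]

# Two candidate partitionings into 2 blocks.
clustered = lambda i: i // 2   # keeps co-rated ids in the same block
modulo = lambda i: i % 2       # scatters them across blocks

clustered_cost = estimate_als_costs(ratings, clustered, clustered, rank=10)
modulo_cost = estimate_als_costs(ratings, modulo, modulo, rank=10)
print("clustered:", clustered_cost)
print("modulo:   ", modulo_cost)
```

Under this toy model the computation estimate is the same for both partitionings, while the clustered one sends half as many bytes, which is the sort of comparison between partitionings the issue asks for.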