[ https://issues.apache.org/jira/browse/SPARK-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiangrui Meng updated SPARK-1580:
---------------------------------
    Target Version/s: 1.1.0
       Fix Version/s:     (was: 1.1.0)

> ALS: Estimate communication and computation costs given a partitioner
> ---------------------------------------------------------------------
>
>                 Key: SPARK-1580
>                 URL: https://issues.apache.org/jira/browse/SPARK-1580
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: Tor Myklebust
>            Priority: Minor
>
> It would be nice to be able to estimate the amount of work needed to solve an
> ALS problem. The chief components of this "work" are computation time---time
> spent forming and solving the least squares problems---and communication
> cost---the number of bytes sent across the network. Communication cost
> depends heavily on how the users and products are partitioned.
> We currently do not try to cluster users or products so that fewer feature
> vectors need to be communicated. This is intended as a first step toward
> that end---we ought to be able to tell whether one partitioning is better
> than another.
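
To make the comparison in the description concrete, here is a minimal sketch of how such an estimate might be computed. It is plain Python, not the MLlib implementation, and every name in it (`communication_bytes`, `computation_flops`, `user_block`, `product_block`) is hypothetical. It assumes a simplified cost model: in one full ALS iteration each product vector is sent once to every user block containing a user who rated it, and each user vector once to every product block containing a product that user rated. It counts a transfer even when the two blocks might live on the same node, and the flop count is only an order-of-magnitude guess (about rank^2 per rating per half-iteration to form the normal equations, about rank^3 per solve).

```python
from typing import Callable, Iterable, List, Tuple

Rating = Tuple[int, int]  # (user id, product id); the rating value is not needed


def communication_bytes(
    ratings: Iterable[Rating],
    user_block: Callable[[int], int],      # hypothetical partitioner: user id -> block
    product_block: Callable[[int], int],   # hypothetical partitioner: product id -> block
    rank: int,
    bytes_per_value: int = 8,              # assumes double-precision features
) -> int:
    """Estimated bytes of feature vectors shipped in one full ALS iteration."""
    product_to_user_blocks = set()  # distinct (product, destination user block) pairs
    user_to_product_blocks = set()  # distinct (user, destination product block) pairs
    for user, product in ratings:
        product_to_user_blocks.add((product, user_block(user)))
        user_to_product_blocks.add((user, product_block(product)))
    vectors_sent = len(product_to_user_blocks) + len(user_to_product_blocks)
    return vectors_sent * rank * bytes_per_value


def computation_flops(ratings: Iterable[Rating], rank: int) -> int:
    """Rough flop count for one full iteration; independent of the partitioner."""
    rating_list: List[Rating] = list(ratings)
    users = {u for u, _ in rating_list}
    products = {p for _, p in rating_list}
    forming = 2 * len(rating_list) * rank * rank         # both half-iterations
    solving = (len(users) + len(products)) * rank ** 3   # one solve per user and product
    return forming + solving


# Toy data: users 0 and 1 rated product 0, users 2 and 3 rated product 1.
ratings = [(0, 0), (1, 0), (2, 1), (3, 1)]

# Partitioning A scatters users by id modulo 2; B keeps co-rating users together.
scattered = communication_bytes(ratings, lambda u: u % 2, lambda p: p % 2, rank=10)
clustered = communication_bytes(ratings, lambda u: u // 2, lambda p: p % 2, rank=10)

print("scattered:", scattered, "bytes; clustered:", clustered, "bytes")
```

Under this model the clustered partitioning ships fewer feature vectors than the scattered one on the toy data, which is the kind of "is one partitioning better than another" comparison the issue asks for. A real estimator would also need to account for block placement on nodes and for serialization overhead, neither of which is modeled here.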