[ https://issues.apache.org/jira/browse/SPARK-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangrui Meng resolved SPARK-2293.
----------------------------------
       Resolution: Fixed
    Fix Version/s: 1.1.0

Issue resolved by pull request 1250
[https://github.com/apache/spark/pull/1250]

> Replace RDD.zip usage by map with predict inside.
> -------------------------------------------------
>
>                 Key: SPARK-2293
>                 URL: https://issues.apache.org/jira/browse/SPARK-2293
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: Xiangrui Meng
>            Priority: Minor
>             Fix For: 1.1.0
>
>
> In our guide, we use
> {code}
> val prediction = model.predict(test.map(_.features))
> val predictionAndLabel = prediction.zip(test.map(_.label))
> {code}
> This is not efficient because test will be computed twice. We should change
> it to
> {code}
> val predictionAndLabel = test.map(p => (model.predict(p.features), p.label))
> {code}
> It is also nice to add a `predictWith` method to predictive models.
> {code}
> def predictWith[V](RDD[(Vector, V)]): RDD[(Double, V)]
> {code}
> But I'm not sure whether this is a good name. `predictWithValue`?



--
This message was sent by Atlassian JIRA
(v6.2#6252)