SPARK-7879 (https://issues.apache.org/jira/browse/SPARK-7879) seems to
address your use case (running KMeans on a DataFrame and having the results
added as an additional column).
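On the first question, here is a minimal sketch of one way to avoid repeating r.getDouble for every column. It is not part of SPARK-7879; the object name FeatureVectors, the method toVectors, and the featureIdx parameter are hypothetical names for illustration. It assumes the Spark 1.4 API, where DataFrame.map yields an RDD, and that every selected column is a non-null double.

```scala
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.DataFrame

// Hypothetical helper, not an existing Spark API.
object FeatureVectors {
  // Build one dense vector per row from the given column positions.
  // Assumes each selected column holds a non-null Double.
  def toVectors(data: DataFrame, featureIdx: Array[Int]): RDD[Vector] =
    data.map { r => Vectors.dense(featureIdx.map(i => r.getDouble(i))) }
}
```

With the column positions from the question, the call would be FeatureVectors.toVectors(data, Array(0, 3, 4, 5, 6)), and the resulting RDD can be passed to KMeans.train.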
On Wed, Jul 1, 2015 at 5:53 PM, Eric Friedman eric.d.fried...@gmail.com
wrote:
In preparing a DataFrame (spark 1.4) to use with MLlib's kmeans.train
method, is there a cleaner way to create the Vectors than this?
data.map { r => Vectors.dense(r.getDouble(0), r.getDouble(3), r.getDouble(4),
r.getDouble(5), r.getDouble(6)) }
Second, once I train the model and call predict on my