MLlib - Show an element in RDD[(Int, Iterable[Array[Double]])]

danilopds Thu, 05 Feb 2015 12:35:12 -0800

Hi,
I'm learning Spark and testing the Spark MLlib library with the K-means
algorithm.


So,
I created a file "height-weight.txt" like this:
65.0 220.0
73.0 160.0
59.0 110.0
61.0 120.0
...

And the code (executed in spark-shell):
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

val data = sc.textFile("/opt/testAppSpark/data/height-weight.txt")
val parsedData = data.map(s => Vectors.dense(s.split(' ').map(_.toDouble))).cache()
val numCluster = 3
val numIterations = 30
val cluster = KMeans.train(parsedData, numCluster, numIterations)
// Group each parsed point (an Array[Double]) by its predicted cluster index
val groups = data.map{_.split(' ').map(_.toDouble)}.groupBy{point => cluster.predict(Vectors.dense(point))}
groups.collect

When I typed groups.collect, I got output like this:
res29: Array[(Int, Iterable[Array[Double]])] =
Array((0,CompactBuffer([D@12c6123d, [D@9d76c6c, [D@1e0f2b80, [D@75f0efea,
[D@1d172824, [D@5b4c6267, [D@73d08704)), (2,CompactBuffer([D@7f505302,
[D@7279e99a, [D@21d7b82d, [D@597ca3b6, [D@5e02fa0)),
(1,CompactBuffer([D@4156b463, [D@235cf118, [D@2ad870cb, [D@67d53566,
[D@5ea4f0cb, [D@1ebccff8, [D@7df9b28b, [D@1439044a)))

Typing groups on the command line, I see:
res1: org.apache.spark.rdd.RDD[(Int, Iterable[Array[Double]])] =
ShuffledRDD[28] at groupBy at <console>:24

How can I see the results?
Thanks.
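
Note on the output above: entries such as [D@12c6123d are the JVM's default
toString for an Array[Double] (type tag plus hash code), not the values
themselves. One possible way to make the points readable is to convert each
array to a string with mkString before printing. The sketch below is only an
illustration under stated assumptions: the object name ShowGroups, the helper
formatGroups, and the cluster ids given to the sample rows are all hypothetical
and not from the original post. It uses a plain Scala collection with the same
element type as the result of groups.collect, so it runs without Spark.

```scala
object ShowGroups {
  // Hypothetical helper: renders each (clusterId, points) pair as one line,
  // turning every Array[Double] into text like "[65.0, 220.0]".
  def formatGroups(collected: Array[(Int, Iterable[Array[Double]])]): Seq[String] =
    collected.sortBy(_._1).toSeq.map { case (clusterId, points) =>
      val rendered = points.map(_.mkString("[", ", ", "]")).mkString(", ")
      s"cluster $clusterId: $rendered"
    }

  def main(args: Array[String]): Unit = {
    // Stand-in for groups.collect; the cluster ids here are made up.
    val sample: Array[(Int, Iterable[Array[Double]])] = Array(
      (1, Seq(Array(73.0, 160.0))),
      (0, Seq(Array(65.0, 220.0))),
      (2, Seq(Array(59.0, 110.0), Array(61.0, 120.0)))
    )
    formatGroups(sample).foreach(println)
  }
}
```

In spark-shell, the same helper could presumably be applied to the real result
with ShowGroups.formatGroups(groups.collect).foreach(println), assuming the
grouped data is small enough to collect to the driver.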

--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/MLlib-Show-an-element-in-RDD-Int-Iterable-Array-Double-tp21521.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
