I'm a Scala / Spark / GraphX newbie, so I may be missing something obvious.

I have a set of edges that I read into a graph.  For an iterative
community-detection algorithm, I want to assign each vertex to a community
with the name of the vertex.  Intuitively it seems like I should be able to
pull the vertexID out of the VertexRDD and build a new VertexRDD with 2 Int
attributes.  Unfortunately I'm not finding the recipe to unpack the
VertexRDD into the vertexID and attribute pieces.

The code snippet that builds the graph looks like this:

import org.apache.spark._
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD
val G = GraphLoader.edgeListFile(sc,"[[...]]clique_5_2_3.edg")

Poking at G to see what it looks like, I see

scala> :type G.vertices
org.apache.spark.graphx.VertexRDD[Int]

scala> G.vertices.collect()
res1: Array[(org.apache.spark.graphx.VertexId, Int)] = Array((10002,1),
(4,1), (10001,1), (10000,1), (0,1), (1,1), (10003,1), (3,1), (10004,1),
(2,1))

I've tried several ways to pull out just the first element of each tuple
into a new variable, with no success.

scala> var (x: Int) = G.vertices
<console>:21: error: type mismatch;
 found   : org.apache.spark.graphx.VertexRDD[Int]
 required: Int
       var (x: Int) = G.vertices
                        ^

scala> val x: Int = G.vertices._1
<console>:21: error: value _1 is not a member of
org.apache.spark.graphx.VertexRDD[Int]
       val x: Int = G.vertices._1
                               ^
What am I missing? 
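
One direction that might work, offered as a rough sketch rather than a confirmed answer: G.vertices is an RDD of (VertexId, attribute) pairs, so the tuple fields live on each element and not on the RDD itself, and VertexId is an alias for Long rather than Int. The sketch below assumes a local SparkContext and a tiny hand-made edge list standing in for the .edg file; the names VertexIdSketch, sampleGraph, vertexIds and labelWithOwnId are made up for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD

object VertexIdSketch {
  // Hypothetical stand-in for the graph loaded from the edge list file:
  // every vertex gets the default attribute 1, as in the collect() output above.
  def sampleGraph(sc: SparkContext): Graph[Int, Int] = {
    val edges: RDD[(VertexId, VertexId)] =
      sc.parallelize(Seq((0L, 1L), (1L, 2L), (2L, 0L), (10000L, 10001L)))
    Graph.fromEdgeTuples(edges, 1)
  }

  // Pull out only the vertex IDs by mapping over each (id, attribute) pair.
  def vertexIds(g: Graph[Int, Int]): RDD[VertexId] =
    g.vertices.map { case (id, _) => id }

  // Keep the original attribute and add the vertex's own ID as its
  // initial community label. The label is a VertexId (Long), not an Int.
  def labelWithOwnId(g: Graph[Int, Int]): Graph[(Int, VertexId), Int] =
    g.mapVertices((id, attr) => (attr, id))

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("vertex-id-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val g = sampleGraph(sc)
      // collect() order is not guaranteed, so sort before printing.
      println(vertexIds(g).collect().sorted.mkString(", "))
      labelWithOwnId(g).vertices.collect().sortBy(_._1).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

In the shell, where sc and G already exist, the same idea would presumably reduce to G.vertices.map(_._1) for the IDs and G.mapVertices((id, attr) => (attr, id)) for the relabelled graph.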



