Here is another solution (minGraph is the graph from your code; I assume it 
is your original graph):

val graphWithNoOutEdges = minGraph.filter(
  graph => graph.outerJoinVertices(graph.outDegrees) {
    (vId, vData, outDegreesOpt) => outDegreesOpt.getOrElse(0)
  },
  vpred = (vId: VertexId, vOutDegrees: Int) => vOutDegrees == 0
)

val verticesWithNoOutEdges = graphWithNoOutEdges.vertices

Mohammed
Author: Big Data Analytics with 
Spark<http://www.amazon.com/Big-Data-Analytics-Spark-Practitioners/dp/1484209656/>

From: Guillermo Ortiz [mailto:konstt2...@gmail.com]
Sent: Friday, February 26, 2016 5:46 AM
To: Robin East
Cc: user
Subject: Re: Get all vertexes with outDegree equals to 0 with GraphX

Yes, I am not really happy with that "collect".
I looked at the subgraph method and other options but couldn't figure out 
anything easy or direct.

I'm going to try your idea.

2016-02-26 14:16 GMT+01:00 Robin East 
<robin.e...@xense.co.uk<mailto:robin.e...@xense.co.uk>>:
Whilst I can think of other ways to do it I don’t think they would be 
conceptually or syntactically any simpler. GraphX doesn’t have the concept of 
built-in vertex properties which would make this simpler - a vertex in GraphX 
is a Vertex ID (Long) and a bunch of custom attributes that you assign. This 
means you have to find a way of ‘pushing’ the vertex degree into the graph so 
you can do comparisons (cf. a join in a relational database) or, as you have 
done, create a list and filter against that (cf. filtering against a sub-query 
in a relational database).

One thing I would point out is that you probably want to avoid 
finalVertexes.collect() for a large-scale system - this will pull all the 
vertices into the driver and then push them out to the executors again as part 
of the filter operation. A better strategy for large graphs would be:

1. build a graph based on the existing graph where the vertex attribute is the 
vertex degree - the GraphX documentation shows how to do this
2. filter this “degrees” graph to just give you 0 degree vertices
3. use graph.mask passing in the 0-degree graph to get the original graph with 
just 0 degree vertices
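
The three steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the thread: the object name ZeroOutDegreeSketch, the helper zeroOutDegreeSubgraph, the local[*] master and the simplified String vertex attributes are all made up for the example, and "degree" is taken to mean out-degree, as in the original question.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

object ZeroOutDegreeSketch {

  // Hypothetical helper: returns the original graph restricted to the
  // vertices that have no outgoing edges, without collecting to the driver.
  def zeroOutDegreeSubgraph(graph: Graph[String, Boolean]): Graph[String, Boolean] = {
    // Step 1: build a graph whose vertex attribute is the out-degree.
    // Vertices missing from outDegrees have no outgoing edges, hence 0.
    val degreeGraph: Graph[Int, Boolean] =
      graph.outerJoinVertices(graph.outDegrees) {
        (id, attr, degOpt) => degOpt.getOrElse(0)
      }

    // Step 2: keep only the vertices whose out-degree is 0.
    val zeroOutGraph = degreeGraph.subgraph(vpred = (id, outDegree) => outDegree == 0)

    // Step 3: mask the original graph with the 0-degree graph, which
    // restores the original vertex attributes.
    graph.mask(zeroOutGraph)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master, purely for trying the sketch out.
    val conf = new SparkConf().setAppName("zero-out-degree").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Same edges as the example in this thread; vertex attributes simplified.
    val vertices = sc.parallelize(Seq(
      (1L, "a"), (2L, "b"), (3L, "c"), (4L, "d"), (5L, "e"), (6L, "f")))
    val edges = sc.parallelize(Seq(
      Edge(1L, 2L, true), Edge(2L, 3L, true), Edge(3L, 4L, true), Edge(5L, 2L, true)))
    val graph: Graph[String, Boolean] = Graph(vertices, edges)

    // Nothing runs until this action; in this data only vertices 4 and 6
    // have no outgoing edges.
    zeroOutDegreeSubgraph(graph).vertices.collect().foreach(println)

    sc.stop()
  }
}
```

Steps 1 to 3 are all lazy transformations, so the vertex IDs never pass through the driver; only the final collect() in main, used here just for printing, triggers execution.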

This is just one variation among several possibilities. The key point is that 
everything is just a graph transformation until you call an action on the 
resulting graph.
-------------------------------------------------------------------------------
Robin East
Spark GraphX in Action Michael Malak and Robin East
Manning Publications Co.
http://www.manning.com/books/spark-graphx-in-action




On 26 Feb 2016, at 11:59, Guillermo Ortiz 
<konstt2...@gmail.com<mailto:konstt2...@gmail.com>> wrote:

I'm new to GraphX. I need to get the vertices without out-edges.
I guess it's pretty easy, but I did it in a pretty complicated and inefficient 
way.


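import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Create an RDD for vertices (sc is the SparkContext, e.g. from spark-shell)
val vertices: RDD[(VertexId, (List[String], List[String]))] =
  sc.parallelize(Array((1L, (List("a"), List[String]())),
    (2L, (List("b"), List[String]())),
    (3L, (List("c"), List[String]())),
    (4L, (List("d"), List[String]())),
    (5L, (List("e"), List[String]())),
    (6L, (List("f"), List[String]()))))

// Create an RDD for edges
val relationships: RDD[Edge[Boolean]] =
  sc.parallelize(Array(Edge(1L, 2L, true), Edge(2L, 3L, true),
    Edge(3L, 4L, true), Edge(5L, 2L, true)))

// Assumption: the message does not show how minGraph is built;
// presumably it comes from the two RDDs above
val minGraph = Graph(vertices, relationships)

// IDs of the vertices that have at least one outgoing edge
val out = minGraph.outDegrees.map(vertex => vertex._1)

// IDs of the vertices with no outgoing edges
val finalVertexes = minGraph.vertices.keys.subtract(out)

// There must be a better way than this..
val nodes = finalVertexes.collect()
val result = minGraph.vertices.filter(v => nodes.contains(v._1))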



What's a good way to do this operation? It seems that it should be pretty 
easy.

