Thank you. I need to work through what the code does, because I am fairly
new to Scala and it is hard for me to understand.

2016-02-27 3:53 GMT+01:00 Mohammed Guller <moham...@glassbeam.com>:

> Here is another solution (minGraph is the graph from your code. I assume
> that is your original graph):
>
>
>
> val graphWithNoOutEdges = minGraph.filter(
>   graph => graph.outerJoinVertices(graph.outDegrees) {
>     (vId, vData, outDegreesOpt) => outDegreesOpt.getOrElse(0)
>   },
>   vpred = (vId: VertexId, vOutDegrees: Int) => vOutDegrees == 0
> )
>
>
>
> val verticesWithNoOutEdges = graphWithNoOutEdges.vertices
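
For readers new to Scala, the logic of the snippet above can be mimicked with plain collections. This is only an illustrative sketch, not GraphX code: the object name and the `Map`/`Seq` stand-ins are assumptions made for the example. The point it shows is that a vertex with no outgoing edges has no entry in the out-degree table at all, so the outer join yields an empty `Option`, which `getOrElse(0)` turns into a degree of 0.

```scala
// Plain Scala collections standing in for the GraphX RDDs (illustration only;
// the names below are hypothetical and not part of the thread's code).
object OutDegreeJoinSketch {
  // Edges as (source, destination) pairs, mirroring the example graph in the thread.
  val edges: Seq[(Long, Long)] = Seq((1L, 2L), (2L, 3L), (3L, 4L), (5L, 2L))
  val vertexIds: Seq[Long] = Seq(1L, 2L, 3L, 4L, 5L, 6L)

  // Like graph.outDegrees: only vertices with at least one outgoing edge appear.
  val outDegrees: Map[Long, Int] =
    edges.groupBy(_._1).map { case (src, es) => src -> es.size }

  // Like outerJoinVertices followed by getOrElse(0): a missing entry becomes degree 0.
  val withDegrees: Seq[(Long, Int)] =
    vertexIds.map(id => id -> outDegrees.get(id).getOrElse(0))

  // Like vpred: keep only the vertices whose out-degree is 0.
  val noOutEdges: Seq[Long] = withDegrees.collect { case (id, 0) => id }

  def main(args: Array[String]): Unit =
    println(noOutEdges) // List(4, 6)
}
```

In the GraphX version the same three stages run as distributed transformations, and `filter` then restricts the original graph to the surviving vertices so that their original attributes are kept.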
>
>
>
> Mohammed
>
> Author: Big Data Analytics with Spark
> <http://www.amazon.com/Big-Data-Analytics-Spark-Practitioners/dp/1484209656/>
>
>
>
> *From:* Guillermo Ortiz [mailto:konstt2...@gmail.com]
> *Sent:* Friday, February 26, 2016 5:46 AM
> *To:* Robin East
> *Cc:* user
> *Subject:* Re: Get all vertexes with outDegree equals to 0 with GraphX
>
>
>
> Yes, I am not really happy with that "collect".
>
> I looked at the subgraph method and other options, but I could not find
> anything simple or direct.
>
>
>
> I'm going to try your idea.
>
>
>
> 2016-02-26 14:16 GMT+01:00 Robin East <robin.e...@xense.co.uk>:
>
> Whilst I can think of other ways to do it I don’t think they would be
> conceptually or syntactically any simpler. GraphX doesn’t have the concept
> of built-in vertex properties which would make this simpler - a vertex in
> GraphX is a Vertex ID (Long) and a bunch of custom attributes that you
> assign. This means you have to find a way of ‘pushing’ the vertex degree
> into the graph so you can do comparisons (cf a join in relational
> databases) or as you have done create a list and filter against that (cf
> filtering against a sub-query in relational database).
>
>
>
> One thing I would point out is that you probably want to avoid
> finalVertexes.collect() for a large-scale system - this will pull all the
> vertices into the driver and then push them out to the executors again as
> part of the filter operation. A better strategy for large graphs would be:
>
>
>
> 1. build a graph based on the existing graph where the vertex attribute is
> the vertex degree - the GraphX documentation shows how to do this
>
> 2. filter this “degrees” graph to just give you 0 degree vertices
>
> 3. use graph.mask, passing in the 0-degree graph, to get the original graph
> with just 0 degree vertices
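
The three steps above can be sketched roughly as follows. This is a minimal sketch, assuming a local Spark setup and a simplified `String` vertex attribute; the object name, the helper names, and the use of out-degree (rather than total degree) are assumptions chosen to match the question in this thread.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Hypothetical example object; names are illustrative only.
object ZeroOutDegreeMask {
  // Small graph with the same edges as the example in this thread (attributes simplified).
  def exampleGraph(sc: SparkContext): Graph[String, Boolean] = {
    val vertices: RDD[(VertexId, String)] =
      sc.parallelize(Seq((1L, "a"), (2L, "b"), (3L, "c"), (4L, "d"), (5L, "e"), (6L, "f")))
    val edges: RDD[Edge[Boolean]] =
      sc.parallelize(Seq(Edge(1L, 2L, true), Edge(2L, 3L, true),
        Edge(3L, 4L, true), Edge(5L, 2L, true)))
    Graph(vertices, edges)
  }

  def keepZeroOutDegree(graph: Graph[String, Boolean]): Graph[String, Boolean] = {
    // Step 1: a graph whose vertex attribute is the out-degree.
    // Vertices missing from outDegrees have no outgoing edges, hence getOrElse(0).
    val degreeGraph: Graph[Int, Boolean] =
      graph.outerJoinVertices(graph.outDegrees) { (_, _, degOpt) => degOpt.getOrElse(0) }

    // Step 2: keep only the vertices whose out-degree is 0.
    val zeroDegreeGraph = degreeGraph.subgraph(vpred = (_, deg) => deg == 0)

    // Step 3: restrict the original graph to those vertices, keeping original attributes.
    graph.mask(zeroDegreeGraph)
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("zero-out-degree").setMaster("local[*]"))
    try {
      // collect() is acceptable here only because the example result is tiny.
      keepZeroOutDegree(exampleGraph(sc)).vertices.collect().sortBy(_._1).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

For the example edges, only vertices 4 and 6 have no outgoing edges, so those are the two vertices left in the masked graph. Nothing is pulled to the driver until the final `collect()`.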
>
>
>
> This is just one variation among several possibilities. The key point is
> that everything is just a graph transformation until you call an action on
> the resulting graph.
>
>
> -------------------------------------------------------------------------------
>
> Robin East
>
> *Spark GraphX in Action *Michael Malak and Robin East
>
> Manning Publications Co.
>
> http://www.manning.com/books/spark-graphx-in-action
>
>
> On 26 Feb 2016, at 11:59, Guillermo Ortiz <konstt2...@gmail.com> wrote:
>
>
>
> I'm new to GraphX. I need to get the vertices that have no outgoing edges.
>
> I guess it's pretty easy, but my solution is complicated and
> inefficient.
>
>
>
> val vertices: RDD[(VertexId, (List[String], List[String]))] =
>   sc.parallelize(Array((1L, (List("a"), List[String]())),
>     (2L, (List("b"), List[String]())),
>     (3L, (List("c"), List[String]())),
>     (4L, (List("d"), List[String]())),
>     (5L, (List("e"), List[String]())),
>     (6L, (List("f"), List[String]()))))
>
> // Create an RDD for edges
> val relationships: RDD[Edge[Boolean]] =
>   sc.parallelize(Array(Edge(1L, 2L, true), Edge(2L, 3L, true),
>     Edge(3L, 4L, true), Edge(5L, 2L, true)))
>
> // Not shown in the original message: minGraph is presumably built
> // from the two RDDs above.
> val minGraph = Graph(vertices, relationships)
>
> val out = minGraph.outDegrees.map(vertex => vertex._1)
>
> val finalVertexes = minGraph.vertices.keys.subtract(out)
>
> // There must be a better way than this..
> val nodes = finalVertexes.collect()
> val result = minGraph.vertices.filter(v => nodes.contains(v._1))
>
>
>
> What's a good way to do this operation? It seems like it should be pretty
> easy.
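
If only the vertices are needed (not a graph), one possible way to drop the `collect()` from the code above is a key-based subtraction between the vertex RDD and the out-degree RDD. This is a sketch under assumptions, not something proposed in the thread: it relies on the standard pair-RDD operation `subtractByKey`, and the object name and simplified `String` attributes are made up for the example.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Hypothetical example object; names are illustrative only.
object NoCollectSketch {
  // Vertices that never appear as a key in outDegrees have no outgoing edges.
  // subtractByKey stays distributed, so nothing is pulled to the driver.
  def verticesWithoutOutEdges(graph: Graph[String, Boolean]): RDD[(VertexId, String)] =
    graph.vertices.subtractByKey(graph.outDegrees)

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("no-collect-sketch").setMaster("local[*]"))
    try {
      val vertices: RDD[(VertexId, String)] =
        sc.parallelize(Seq((1L, "a"), (2L, "b"), (3L, "c"), (4L, "d"), (5L, "e"), (6L, "f")))
      val edges: RDD[Edge[Boolean]] =
        sc.parallelize(Seq(Edge(1L, 2L, true), Edge(2L, 3L, true),
          Edge(3L, 4L, true), Edge(5L, 2L, true)))
      val graph = Graph(vertices, edges)

      // collect() here is only for printing the tiny example result.
      verticesWithoutOutEdges(graph).collect().sortBy(_._1).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

The trade-off is that the result is a plain RDD of vertices rather than a graph; when the surrounding graph structure is still needed, a graph-level transformation is the better fit.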
>
>
>
>
>
