RE: Two joins in GraphX Pregel implementation

2015-07-29 Thread Ulanov, Alexander
: Tuesday, July 28, 2015 12:05 PM
To: Ulanov, Alexander
Cc: Robin East; dev@spark.apache.org
Subject: Re: Two joins in GraphX Pregel implementation
On 27 Jul 2015, at 16:42, Ulanov, Alexander alexander.ula...@hp.com wrote:
It seems that the mentioned two joins can

RE: Two joins in GraphX Pregel implementation

2015-07-28 Thread Ulanov, Alexander
. Do you know the reason why this improvement is not pushed? CC'ing Dave
From: Robin East [mailto:robin.e...@xense.co.uk]
Sent: Monday, July 27, 2015 9:11 AM
To: Ulanov, Alexander
Cc: dev@spark.apache.org
Subject: Re: Two joins in GraphX Pregel implementation
Quite possibly - there is a JIRA open

Re: Two joins in GraphX Pregel implementation

2015-07-28 Thread Ankur Dave
On 27 Jul 2015, at 16:42, Ulanov, Alexander alexander.ula...@hp.com wrote:
It seems that the mentioned two joins can be rewritten as one outer join
You're right. In fact, the outer join can be streamlined further using a method from GraphOps:
    g = g.joinVertices(messages)(vprog).cache()
Then,
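The equivalence discussed in this message can be sketched without Spark. The example below is only a model under stated assumptions: plain Scala Maps stand in for the vertex and message RDDs, and the object name, the helper names twoJoins and oneJoin, and the additive vprog are hypothetical, not GraphX code. It shows an inner join followed by a merge-back step giving the same result as a single pass that applies vprog where a message exists and keeps the old value otherwise, which is the behaviour joinVertices provides.

```scala
object PregelJoinSketch {
  type VertexId = Long

  // Hypothetical vertex program: add the incoming message to the vertex value.
  def vprog(id: VertexId, attr: Int, msg: Int): Int = attr + msg

  // Two-step form: an inner join computes new values only for vertices that
  // received a message, then a second pass merges them back into the full
  // vertex set, keeping the old value everywhere else.
  def twoJoins(vertices: Map[VertexId, Int],
               messages: Map[VertexId, Int]): Map[VertexId, Int] = {
    val newVerts = vertices.collect {
      case (id, attr) if messages.contains(id) => id -> vprog(id, attr, messages(id))
    }
    vertices.map { case (id, old) => id -> newVerts.getOrElse(id, old) }
  }

  // One-step form: a single pass over the vertices, modelled on joinVertices.
  def oneJoin(vertices: Map[VertexId, Int],
              messages: Map[VertexId, Int]): Map[VertexId, Int] =
    vertices.map { case (id, attr) =>
      messages.get(id) match {
        case Some(msg) => id -> vprog(id, attr, msg) // message present: run vprog
        case None      => id -> attr                 // no message: keep old value
      }
    }

  def main(args: Array[String]): Unit = {
    val vertices = Map(1L -> 10, 2L -> 20, 3L -> 30)
    val messages = Map(1L -> 1, 3L -> 3)
    println(twoJoins(vertices, messages) == oneJoin(vertices, messages))
  }
}
```

In both forms a message addressed to a vertex id that is not in the vertex set is ignored, so the two agree on every input in this model. Whether the single join is also faster on a real graph depends on caching and on how the active vertex set is tracked afterwards, which the model does not cover.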

Two joins in GraphX Pregel implementation

2015-07-27 Thread Ulanov, Alexander
Dear Spark developers,
Below is the GraphX Pregel code snippet from https://spark.apache.org/docs/latest/graphx-programming-guide.html#pregel-api (it does not contain the caching step):
while (activeMessages > 0 && i < maxIterations) { // Receive the messages:

RE: Two joins in GraphX Pregel implementation

2015-07-27 Thread Ulanov, Alexander
27, 2015 8:56 AM
To: Ulanov, Alexander
Cc: dev@spark.apache.org
Subject: Re: Two joins in GraphX Pregel implementation
What happens to this line of code:
    messages = g.mapReduceTriplets(sendMsg, mergeMsg, Some((newVerts, activeDir))).cache()
Part of the Pregel 'contract' is that vertices