Graphx traversal and merge interesting edges

2014-07-04 Thread HHB
Hello Gurus,

Pardon me, I am a noob at Spark and GraphX (and Scala), and I seek your wisdom 
here. I want to know how to do a graph traversal and a selective merge on edges. 
Thanks to the documentation :-) I could create a simple graph of employees and 
their colleagues. The structure of the graph is below, where ( ) represents 
nodes and --[: ]--> represents relationships. I want to transform the current 
graph into a more useful one.

Given the current graph:

(Alice) --[:works_at]--> (Acme) <--[:works_at]-- (Bob) --[:worked_at]--> 
(CompanyX) <--[:works_at]-- (Cindy)


Convert that into:

(Alice) --[:get_feedback_from]--> (Cindy)

So the path " --[:works_at]--> (Acme) <--[:works_at]-- (Bob) --[:worked_at]--> 
(CompanyX) <--[:works_at]-- " is merged into a single edge with the same 
direction as --[:worked_at]-->. I want to preserve the other edges of the 
remaining vertices.

I want to create a new relation between an employee and his/her colleague's 
former colleagues. I could not find an example that could help me (maybe I have 
looked in the wrong place).

Thanks!

Re: Graphx traversal and merge interesting edges

2014-07-05 Thread HHB
Thanks Ankur,

I cannot thank you enough for this! I am still reading your example, digesting 
and grokking it :-)

I had been breaking my head over this for the past few hours.

In my last futile attempts I was looking at Pregel, e.g. whether it could be 
used to track which step of a path match each vertex is at and to send a 
message to the next vertex with the history of the traversal. Merging messages 
would then append the historical traversal path of each message :-P
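
A minimal sketch of that message-passing idea, in plain Scala collections 
rather than the GraphX Pregel API (so it runs without Spark). The names Edge, 
Step, send, merge, superstep and run are hypothetical; they only mirror the 
roles that sendMsg and mergeMsg play in Pregel. Each message carries the 
traversal history so far, and messages arriving at the same vertex are merged 
by set union. It assumes one superstep per pattern step and that a path never 
revisits a vertex.

```scala
object PregelStyleSketch {
  // A labelled, directed edge between two named vertices.
  final case class Edge(src: String, dst: String, label: String)
  // One pattern step: the label to match, followed along (forward = true)
  // or against (forward = false) the direction of the edge.
  final case class Step(label: String, forward: Boolean)

  type Path = Vector[String]          // traversal history, start vertex first
  type Inbox = Map[String, Set[Path]] // vertex -> histories that currently end there

  // Role of "sendMsg": push every history one hop along each matching edge.
  def send(edges: Seq[Edge], inbox: Inbox, step: Step): Seq[(String, Set[Path])] =
    edges.filter(_.label == step.label).map { e =>
      val (from, to) = if (step.forward) (e.src, e.dst) else (e.dst, e.src)
      val extended = inbox.getOrElse(from, Set.empty[Path])
        .filterNot(_.contains(to)) // assumption: no vertex is visited twice
        .map(_ :+ to)
      to -> extended
    }.filter(_._2.nonEmpty)

  // Role of "mergeMsg": combine the histories arriving at the same vertex.
  def merge(a: Set[Path], b: Set[Path]): Set[Path] = a ++ b

  // One superstep: send along matching edges, then merge per receiving vertex.
  def superstep(edges: Seq[Edge], inbox: Inbox, step: Step): Inbox =
    send(edges, inbox, step)
      .groupBy(_._1)
      .map { case (vertex, msgs) =>
        vertex -> msgs.map(_._2).foldLeft(Set.empty[Path])(merge)
      }

  // Every vertex starts with a history containing only itself.
  def run(edges: Seq[Edge], pattern: Seq[Step]): Inbox = {
    val vertices = edges.flatMap(e => Seq(e.src, e.dst)).distinct
    val initial: Inbox = vertices.map(v => v -> Set(Vector(v))).toMap
    pattern.foldLeft(initial)((inbox, step) => superstep(edges, inbox, step))
  }

  // The employee graph from the first message of this thread.
  val employees: Seq[Edge] = Seq(
    Edge("Alice", "Acme", "works_at"),
    Edge("Bob", "Acme", "works_at"),
    Edge("Bob", "CompanyX", "worked_at"),
    Edge("Cindy", "CompanyX", "works_at")
  )

  // works_at forward, works_at backward, worked_at forward, works_at backward.
  val pattern: Seq[Step] = Seq(
    Step("works_at", forward = true),
    Step("works_at", forward = false),
    Step("worked_at", forward = true),
    Step("works_at", forward = false)
  )

  def main(args: Array[String]): Unit =
    run(employees, pattern).foreach(println)
}
```

After the last superstep, the only vertex left holding a history is Cindy, and 
that history starts at Alice, which is the pair the new edge should connect.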

--Gautam

On 05-Jul-2014, at 3:23 pm, Ankur Dave  wrote:

> Interesting problem! My understanding is that you want to (1) find paths 
> matching a particular pattern, and (2) add edges between the start and end 
> vertices of the matched paths.
> 
> For (1), I implemented a pattern matcher for GraphX that iteratively 
> accumulates partial pattern matches. I used your example in the unit test.
> 
> For (2), you can take the output of the pattern matcher (the set of matching 
> paths organized by their terminal vertices) and construct a set of new edges 
> using the initial and terminal vertices of each path. Then you can make a new 
> graph consisting of the union of the original edge set and the new edges. Let 
> me know if you'd like help with this.
> 
> Ankur
> 
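
The two steps described above can be sketched roughly as follows. This is not 
Ankur's pattern matcher and it does not use the GraphX API; it is a small, 
self-contained illustration in plain Scala collections, and every name in it 
(Edge, Step, extend, matchPaths, withMergedEdges) is hypothetical. It assumes 
that a pattern is a list of edge labels, each followed either along or against 
the edge direction, and that a matched path may not revisit a vertex.

```scala
object PatternMergeSketch {
  // A labelled, directed edge between two named vertices.
  final case class Edge(src: String, dst: String, label: String)
  // One pattern step: the label to match, followed along (forward = true)
  // or against (forward = false) the direction of the edge.
  final case class Step(label: String, forward: Boolean)

  // Step (1), one iteration: extend every partial match by one pattern step.
  def extend(edges: Seq[Edge], partials: Set[Vector[String]], step: Step): Set[Vector[String]] =
    partials.flatMap { path =>
      edges.filter(_.label == step.label).flatMap { e =>
        val (from, to) = if (step.forward) (e.src, e.dst) else (e.dst, e.src)
        // Assumption: a path continues from its last vertex and never revisits one.
        if (from == path.last && !path.contains(to)) Some(path :+ to) else None
      }
    }

  // Step (1): start a partial match at every vertex and accumulate step by step.
  def matchPaths(edges: Seq[Edge], pattern: Seq[Step]): Set[Vector[String]] = {
    val vertices = edges.flatMap(e => Seq(e.src, e.dst)).toSet
    val seeds: Set[Vector[String]] = vertices.map(v => Vector(v))
    pattern.foldLeft(seeds)((partials, step) => extend(edges, partials, step))
  }

  // Step (2): one new edge per matched path, from its initial to its terminal
  // vertex, unioned with the original edge set.
  def withMergedEdges(edges: Seq[Edge], pattern: Seq[Step], newLabel: String): Seq[Edge] = {
    val merged = matchPaths(edges, pattern).toSeq.map(p => Edge(p.head, p.last, newLabel))
    (edges ++ merged).distinct
  }

  // The employee graph from the first message of this thread.
  val employees: Seq[Edge] = Seq(
    Edge("Alice", "Acme", "works_at"),
    Edge("Bob", "Acme", "works_at"),
    Edge("Bob", "CompanyX", "worked_at"),
    Edge("Cindy", "CompanyX", "works_at")
  )

  // works_at forward, works_at backward, worked_at forward, works_at backward.
  val pattern: Seq[Step] = Seq(
    Step("works_at", forward = true),
    Step("works_at", forward = false),
    Step("worked_at", forward = true),
    Step("works_at", forward = false)
  )

  def main(args: Array[String]): Unit =
    withMergedEdges(employees, pattern, "get_feedback_from").foreach(println)
}
```

On the employee example this keeps the four original edges and adds a single 
get_feedback_from edge from Alice to Cindy. Without the no-revisit assumption, 
Bob would also match as his own colleague and gain an edge to Cindy.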



Re: Graphx traversal and merge interesting edges

2014-07-08 Thread HHB
Hi Ankur,

I was trying out the pattern matcher. It works for shorter paths, but I see 
that for longer ones it runs forever.

Here's what I am trying: 
https://gist.github.com/hihellobolke/dd2dc0fcebba485975d1  (The example of 3 
share traders transacting in appl shares)

The first edge pattern list (Line 66) works okay, but the second one (Line 76) 
never returns.

Thanks,
Gautam





Re: Graphx traversal and merge interesting edges

2014-07-14 Thread HHB
Hi Ankur,

FYI, in a naive attempt to enhance your solution, I managed to create 
MergePatternPath. I think it works as expected (at least for the traversal 
problem in my last email). 

I modified your code a bit. Instead of EdgePattern, I used a List of functions 
that match the whole edge triplets along the path. It returns a *new Graph* 
that preserves the vertex attributes but contains only the new merged 
edges.
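
To illustrate the shape of that idea (not the linked MergePatternPath code), 
here is a hedged sketch in plain Scala collections, without Spark. The names 
Edge, Triplet, Matcher and mergePaths, the toy trader data, and the 
"indirectly_sold_to" label are all hypothetical and are not taken from the 
gist. It assumes each function in the list tests one triplet, that paths follow 
edges along their direction only, and that every edge endpoint has a vertex 
attribute.

```scala
object MergeByTripletPredicates {
  // A directed edge with an attribute, keyed by vertex ids.
  final case class Edge(src: Long, dst: Long, attr: String)
  // An edge together with the attributes of both of its endpoints.
  final case class Triplet(srcAttr: String, attr: String, dstAttr: String)
  // One function per hop of the path; it decides whether a triplet matches.
  type Matcher = Triplet => Boolean

  // Returns the vertex attributes unchanged, plus only the new merged edges.
  def mergePaths(
      vertices: Map[Long, String],
      edges: Seq[Edge],
      matchers: List[Matcher],
      mergedAttr: String): (Map[Long, String], Seq[Edge]) = {
    // A partial match is (start vertex, current vertex); every vertex is a seed.
    val seeds: Seq[(Long, Long)] = vertices.keys.toSeq.map(v => (v, v))
    val ends = matchers.foldLeft(seeds) { (partials, matches) =>
      for {
        p <- partials
        e <- edges
        if e.src == p._2 // continue from the current vertex
        if matches(Triplet(vertices(e.src), e.attr, vertices(e.dst)))
      } yield (p._1, e.dst)
    }
    (vertices, ends.distinct.map { case (s, d) => Edge(s, d, mergedAttr) })
  }

  // Hypothetical toy data: three traders, two direct sales.
  val traders: Map[Long, String] =
    Map(1L -> "trader A", 2L -> "trader B", 3L -> "trader C")
  val trades: Seq[Edge] =
    Seq(Edge(1L, 2L, "sold_to"), Edge(2L, 3L, "sold_to"))

  // Two hops, each of which must be a "sold_to" triplet.
  val sold: Matcher = t => t.attr == "sold_to"
  val (sameVertices, mergedEdges) =
    mergePaths(traders, trades, List(sold, sold), "indirectly_sold_to")

  def main(args: Array[String]): Unit = mergedEdges.foreach(println)
}
```

Because the number of hops is fixed by the length of the matcher list, this 
sketch terminates even on cyclic graphs, although the number of partial matches 
can still grow quickly on dense graphs.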

MergePatternPath:
https://github.com/hihellobolke/spark/blob/graphx-traversal/graphx/src/main/scala/org/apache/spark/graphx/lib/MergePatternPath.scala

Here's a Gist of how I was using it:
https://gist.github.com/hihellobolke/c8e6c97cefed714258ad

This is probably a very naive attempt :-). Is there any possibility of adding 
something like it to graphx.lib, albeit in a more sophisticated and performant 
form?

Thanks
