Graphx traversal and merge interesting edges
Hello Gurus,

Pardon me, I am a noob at Spark and GraphX (and Scala), and I seek your wisdom here. I want to know how to do a graph traversal and selectively merge edges.

Thanks to the documentation :-) I could create a simple graph of employees and their colleagues. The structure of the graph is below, where ( ) represents a node and --[: ]--> represents a relationship.

I want to transform the current graph into a more useful one. Given the current graph:

(Alice) --[:works_at]--> (Acme) <--[:works_at]-- (Bob) --[:worked_at]--> (CompanyX) <--[:works_at]-- (Cindy)

convert it into:

(Alice) --[:get_feedback_from]--> (Cindy)

That is, merge the path " --[:works_at]--> (Acme) <--[:works_at]-- (Bob) --[:worked_at]--> (CompanyX) <--[:works_at]-- " into a single edge with the same direction as --[:worked_at]-->. I want to preserve the other edges of the remaining vertices.

In other words, I want to create a new relation between an employee and his/her colleague's former colleagues. I could not find an example that could help me (maybe I have looked in the wrong place).

Thanks!
Re: Graphx traversal and merge interesting edges
Thanks Ankur,

Cannot thank you enough for this!!! I am still reading your example, digesting and grokking it :-)

I had been breaking my head over this for the past few hours. In my last futile attempts I was looking at Pregel, e.g. whether it could be used to track which step of a path match each vertex is at and send a message to the next vertex carrying the history of the traversal; then, when merging messages, append the historical traversal path from each message :-P

--Gautam

On 05-Jul-2014, at 3:23 pm, Ankur Dave wrote:

> Interesting problem! My understanding is that you want to (1) find paths
> matching a particular pattern, and (2) add edges between the start and end
> vertices of the matched paths.
>
> For (1), I implemented a pattern matcher for GraphX that iteratively
> accumulates partial pattern matches. I used your example in the unit test.
>
> For (2), you can take the output of the pattern matcher (the set of matching
> paths organized by their terminal vertices) and construct a set of new edges
> using the initial and terminal vertices of each path. Then you can make a new
> graph consisting of the union of the original edge set and the new edges. Let
> me know if you'd like help with this.
>
> Ankur
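The two steps in the quoted message, and the idea of carrying the traversal history forward one hop at a time, can be sketched roughly as below. This is not Ankur's pattern matcher (its code is not shown in this thread) and it does not use the GraphX or Pregel APIs; it is a minimal sketch on plain Scala collections so that it runs without Spark. The names PathMerge, Step, matchPaths and mergeEdges are hypothetical, and the rule that a path never revisits a vertex is an assumption added here.

```scala
object PathMerge {
  type VertexId = Long

  // A directed, labelled edge. Plain case class, NOT org.apache.spark.graphx.Edge.
  final case class Edge(src: VertexId, dst: VertexId, label: String)

  // Hypothetical pattern step: the edge label, and whether the edge is
  // walked along its direction (forward = true) or against it.
  final case class Step(label: String, forward: Boolean)

  // Step (1): find all paths matching `pattern`; return (start, end) pairs.
  def matchPaths(edges: Seq[Edge], pattern: Seq[Step]): Set[(VertexId, VertexId)] = {
    val vertices = edges.flatMap(e => Seq(e.src, e.dst)).toSet
    // Every vertex starts with a zero-length partial match containing itself.
    // A partial match is the traversal history, most recent vertex first.
    val initial: Set[List[VertexId]] = vertices.map(v => List(v))

    // One round per pattern step: extend each partial match by one edge.
    val complete = pattern.foldLeft(initial) { (partials, step) =>
      partials.flatMap { path =>
        edges.collect {
          // Assumption: only simple paths are kept (no vertex is revisited).
          case Edge(s, d, l) if l == step.label && step.forward &&
            s == path.head && !path.contains(d) => d :: path
          case Edge(s, d, l) if l == step.label && !step.forward &&
            d == path.head && !path.contains(s) => s :: path
        }
      }
    }
    complete.map(path => (path.last, path.head))
  }

  // Step (2): union of the original edges and one new edge per matched path.
  def mergeEdges(edges: Seq[Edge], pattern: Seq[Step], newLabel: String): Seq[Edge] = {
    val added = matchPaths(edges, pattern).toSeq.map { case (s, e) => Edge(s, e, newLabel) }
    edges ++ added
  }

  // The example from the first message: Alice=1, Bob=2, Cindy=3, Acme=10, CompanyX=11.
  val exampleEdges: Seq[Edge] = Seq(
    Edge(1L, 10L, "works_at"),
    Edge(2L, 10L, "works_at"),
    Edge(2L, 11L, "worked_at"),
    Edge(3L, 11L, "works_at"))

  val examplePattern: Seq[Step] = Seq(
    Step("works_at", forward = true),
    Step("works_at", forward = false),
    Step("worked_at", forward = true),
    Step("works_at", forward = false))

  def main(args: Array[String]): Unit =
    mergeEdges(exampleEdges, examplePattern, "get_feedback_from").foreach(println)
}
```

On the example data the only matched path runs from Alice (1) to Cindy (3), so the merged edge list is the four original edges plus Edge(1, 3, "get_feedback_from"). In GraphX the final union would be done on the edge RDDs rather than on a Seq.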
Re: Graphx traversal and merge interesting edges
Hi Ankur,

I was trying out the PatternMatcher. It works for shorter paths, but for longer ones it seems to run forever.

Here's what I am trying: https://gist.github.com/hihellobolke/dd2dc0fcebba485975d1 (the example of 3 share traders transacting in appl shares)

The first edge pattern list (Line 66) works okay, but the second one (Line 76) never returns.

Thanks,
Gautam
Re: Graphx traversal and merge interesting edges
Hi Ankur,

FYI: in a naive attempt to enhance your solution, I managed to create MergePatternPath. I think it works as expected (at least for the traversal problem in my last email).

I modified your code a bit. Instead of EdgePattern, I used a List of functions that match the whole edge triplets along the path, and it returns a *new Graph* that preserves the vertex attributes but contains only the new merged edges.

MergePatternPath: https://github.com/hihellobolke/spark/blob/graphx-traversal/graphx/src/main/scala/org/apache/spark/graphx/lib/MergePatternPath.scala

Here's a Gist of how I was using it: https://gist.github.com/hihellobolke/c8e6c97cefed714258ad

This is probably a very naive attempt :-). Is there any possibility of adding it to graphx.lib, albeit in a more sophisticated and performant form?

Thanks

On 08-Jul-2014, at 4:57 pm, HHB wrote:

> Hi Ankur,
>
> I was trying out the PatternMatcher. It works for shorter paths, but for
> longer ones it seems to run forever.
>
> Here's what I am trying:
> https://gist.github.com/hihellobolke/dd2dc0fcebba485975d1 (the example of 3
> share traders transacting in appl shares)
>
> The first edge pattern list (Line 66) works okay, but the second one (Line
> 76) never returns.
>
> Thanks,
> Gautam
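For readers without access to the linked file, the MergePatternPath idea described above (a list of functions over whole edge triplets, returning a graph with the original vertex attributes and only the merged edges) might look roughly like this. It is a sketch on plain Scala collections, not the linked implementation and not the GraphX API. SimpleGraph, Triplet, Hop and mergePath are hypothetical names, the forward flag is an assumption (the thread does not say how MergePatternPath handles edges walked against their direction), and so is the rule that a path never revisits a vertex.

```scala
object TripletMerge {
  type VertexId = Long

  final case class Edge(src: VertexId, dst: VertexId, attr: String)

  // An edge plus the attributes of both endpoints, similar in spirit to
  // GraphX's EdgeTriplet but a plain case class.
  final case class Triplet(srcId: VertexId, srcAttr: String,
                           dstId: VertexId, dstAttr: String, attr: String)

  final case class SimpleGraph(vertices: Map[VertexId, String], edges: Seq[Edge]) {
    def triplets: Seq[Triplet] =
      edges.map(e => Triplet(e.src, vertices(e.src), e.dst, vertices(e.dst), e.attr))
  }

  // Hypothetical hop: a predicate over the whole triplet, plus the direction
  // in which the edge is walked (assumption, see the note above).
  final case class Hop(matches: Triplet => Boolean, forward: Boolean)

  // Returns a graph with the same vertices and ONLY the merged edges.
  def mergePath(g: SimpleGraph, hops: Seq[Hop], newAttr: String): SimpleGraph = {
    val ts = g.triplets
    // Partial matches hold the visited vertices, most recent first.
    val starts: Set[List[VertexId]] = g.vertices.keySet.map(v => List(v))
    val paths = hops.foldLeft(starts) { (partials, hop) =>
      partials.flatMap { path =>
        ts.filter(hop.matches).flatMap { t =>
          val (from, to) = if (hop.forward) (t.srcId, t.dstId) else (t.dstId, t.srcId)
          // Assumption: only simple paths are kept (no vertex is revisited).
          if (from == path.head && !path.contains(to)) Some(to :: path) else None
        }
      }
    }
    SimpleGraph(g.vertices, paths.toSeq.map(p => Edge(p.last, p.head, newAttr)))
  }

  // The employee example from the first message.
  val example: SimpleGraph = SimpleGraph(
    Map(1L -> "Alice", 2L -> "Bob", 3L -> "Cindy", 10L -> "Acme", 11L -> "CompanyX"),
    Seq(Edge(1L, 10L, "works_at"), Edge(2L, 10L, "works_at"),
        Edge(2L, 11L, "worked_at"), Edge(3L, 11L, "works_at")))

  val exampleHops: Seq[Hop] = Seq(
    Hop(t => t.attr == "works_at", forward = true),
    Hop(t => t.attr == "works_at", forward = false),
    Hop(t => t.attr == "worked_at", forward = true),
    Hop(t => t.attr == "works_at", forward = false))
}
```

Because each hop is a function of the full triplet, a predicate can also inspect the vertex attributes, for example `t => t.attr == "works_at" && t.dstAttr == "Acme"`.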