Re: How could I do this algorithm in Spark?

2016-02-25 Thread Guillermo Ortiz
I'm going to try to do it with Pregel. If there are other ideas, great! What do you call P time? I think that it's O(number of vertices * N). 2016-02-25 16:17 GMT+01:00 Darren Govoni : > This might be hard to do. One generalization of this problem is > https://en.m.wikipedia.org/wiki/Longest_path

RE: How could I do this algorithm in Spark?

2016-02-25 Thread Darren Govoni
This might be hard to do. One generalization of this problem is https://en.m.wikipedia.org/wiki/Longest_path_problem Given a node (e.g. A), find longest path. All interior relations are transitive and can be inferred. But finding a distributed Spark way of doing it in P time would be intere

Re: How could I do this algorithm in Spark?

2016-02-25 Thread Guillermo Ortiz
Thank you! I'm trying to do it with Pregel; it's proving hard because I have never used GraphX or Pregel before. 2016-02-25 14:00 GMT+01:00 Sabarish Sasidharan : > Like Robin said, pls explore Pregel. You could do it without Pregel but it > might be laborious. I have a simple outline below. You

Re: How could I do this algorithm in Spark?

2016-02-25 Thread Sabarish Sasidharan
Like Robin said, please explore Pregel. You could do it without Pregel but it might be laborious. I have a simple outline below. You will need more iterations if the number of levels is higher. a-b, b-c, c-d, b-e, e-f, f-c; flatMapToPair: a -> (a-b), b -> (a-b), b -> (b-c), c -> (b-c), c -> (c-d), d -> (c-d), b

Re: How could I do this algorithm in Spark?

2016-02-25 Thread Guillermo Ortiz
I'm taking a look at Pregel. It seems like a good way to do it. The only drawback I see is that this is not a really complex graph with a lot of edges between the vertices; it is more like a lot of small, isolated graphs. 2016-02-25 12:32 GMT+01:00 Robin East : > The structures you are describin

Re: How could I do this algorithm in Spark?

2016-02-25 Thread Robin East
The structures you are describing look like edges of a graph and you want to follow the graph to a terminal vertex and then propagate that value back up the path. On this assumption it would be simple to create the structures as graphs in GraphX and use Pregel for the algorithm implementation. -
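
The idea described above (follow the graph to a terminal vertex, then propagate that value back up the path) can be sketched in plain Python as a superstep loop. This is only an illustration of the message-passing pattern, not GraphX or Pregel API code. The function name `propagate_terminals` is hypothetical, and the sketch assumes the graph is acyclic and that each vertex has at most one outgoing edge.

```python
from collections import defaultdict


def propagate_terminals(edges):
    """Return {vertex: terminal vertex} for every vertex with an outgoing edge.

    Assumptions (not from the thread): the graph is acyclic and each
    vertex has at most one outgoing edge.
    """
    preds = defaultdict(list)  # dst -> vertices that point at dst
    has_out = set()            # vertices with an outgoing edge
    vertices = set()
    for src, dst in edges:
        preds[dst].append(src)
        has_out.add(src)
        vertices.update((src, dst))

    # Superstep 0: every terminal vertex (no outgoing edge) sends its own
    # id to its direct predecessors.
    frontier = [(p, t) for t in vertices - has_out for p in preds[t]]

    result = {}
    while frontier:
        next_frontier = []
        for vertex, terminal in frontier:
            # The vertex adopts the terminal it was told about ...
            result[vertex] = terminal
            # ... and forwards it one hop further back up the path.
            next_frontier.extend((p, terminal) for p in preds[vertex])
        frontier = next_frontier
    return result


if __name__ == "__main__":
    sample = [("a", "t"), ("b", "o"), ("t", "k"), ("k", "c")]
    print(sorted(propagate_terminals(sample).items()))
    # [('a', 'c'), ('b', 'o'), ('k', 'c'), ('t', 'c')]
```

Each pass of the `while` loop moves the terminal value one hop back, so the number of passes equals the length of the longest chain.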

Re: How could I do this algorithm in Spark?

2016-02-25 Thread Guillermo Ortiz
Oh, the letters were just an example. It could be: a,t; b,o; t,k; k,c. So a -> t -> k -> c, and the result is: a,c; t,c; k,c and b,o. I don't know if you were thinking about sortBy because the letters in the other example were consecutive. 2016-02-25 9:42 GMT+01:00 Guillermo Ortiz : > I don't
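
The example in this message can be checked with a small plain-Python sketch. It uses pointer jumping, a technique not named in the thread: in every round each vertex replaces its target with its target's target, so the reach of each pointer doubles. The function name `resolve_terminals` is hypothetical, and the sketch assumes each vertex has at most one outgoing edge. It is not Spark code.

```python
def resolve_terminals(edges):
    """Map each source vertex to the last vertex of its chain.

    Assumption (not from the thread): each vertex has at most one
    outgoing edge, so the edges can be stored as a dict.
    """
    target = dict(edges)  # vertex -> its direct successor

    # After k rounds every pointer reaches up to 2**k hops ahead, so
    # len(target).bit_length() rounds are enough for any chain here.
    # The cap also guarantees termination if the input contains a cycle.
    for _ in range(len(target).bit_length()):
        # Jump: point at the target's own target, if it has one.
        jumped = {v: target.get(t, t) for v, t in target.items()}
        if jumped == target:
            break  # nothing moved, every vertex points at its terminal
        target = jumped
    return target


if __name__ == "__main__":
    pairs = [("a", "t"), ("b", "o"), ("t", "k"), ("k", "c")]
    print(resolve_terminals(pairs))
    # {'a': 'c', 'b': 'o', 't': 'c', 'k': 'c'}
```

Because the pointers double their reach each round, the number of rounds grows with the logarithm of the longest chain rather than with its length.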

Re: How could I do this algorithm in Spark?

2016-02-25 Thread Guillermo Ortiz
I don't see that sorting the data helps. The answer has to be all the associations. In this case the answer has to be: a,b --> it was an error in the question, sorry; b,d; c,d; x,y; y,y. I feel like all the data which is associated should be in the same executor. In this case if I order the in

Re: How could I do this algorithm in Spark?

2016-02-24 Thread James Barney
Guillermo, I think you're after an associative algorithm where A is ultimately associated with D, correct? Jakob would be correct if that is a typo; a sort would be all that is necessary in that case. I believe you're looking for something else though, if I understand correctly. This seems like a si

Re: How could I do this algorithm in Spark?

2016-02-24 Thread Jakob Odersky
Hi Guillermo, assuming that the first "a,b" is a typo and you actually meant "a,d", this is a sorting problem. You could easily model your data as an RDD of tuples (or as a dataframe/set) and use the sortBy (or orderBy for dataframe/sets) methods. best, --Jakob On Wed, Feb 24, 2016 at 2:26 PM, G