I'm going to try to do it with Pregel. If there are other ideas,
great!
What do you mean by P time? I think it's O(number of vertices * N).
2016-02-25 16:17 GMT+01:00 Darren Govoni :
This might be hard to do. One generalization of this problem is
https://en.m.wikipedia.org/wiki/Longest_path_problem
Given a node (e.g. A), find the longest path. All interior relations are
transitive and can be inferred.
But finding a distributed Spark way of doing it in P time would be
Thank you! I'm trying to do it with Pregel. It's proving hard because I
have never used GraphX or Pregel before.
2016-02-25 14:00 GMT+01:00 Sabarish Sasidharan :
Like Robin said, pls explore Pregel. You could do it without Pregel but it
might be laborious. I have a simple outline below. You will need more
iterations if the number of levels is higher.
a-b
b-c
c-d
b-e
e-f
f-c
flatMapToPair
a -> (a-b)
b -> (a-b)
b -> (b-c)
c -> (b-c)
c -> (c-d)
d -> (c-d)
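A minimal sketch of the step outlined above, in plain Python rather than Spark. The helper name `flat_map_to_pair` is hypothetical and only mimics what a Spark `flatMapToPair` call would emit: every edge is keyed once by each of its two endpoints, so that all edges touching a vertex can later be grouped together.

```python
from collections import defaultdict

# Edge list from the outline above.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("b", "e"), ("e", "f"), ("f", "c")]


def flat_map_to_pair(edge):
    """Emit the edge twice, keyed once by each endpoint (hypothetical helper)."""
    src, dst = edge
    return [(src, edge), (dst, edge)]


# Flatten the per-edge outputs into a single list of (vertex, edge) pairs.
pairs = [pair for edge in edges for pair in flat_map_to_pair(edge)]

# Group by vertex, roughly what a groupByKey would do after the flat map.
edges_by_vertex = defaultdict(list)
for vertex, edge in pairs:
    edges_by_vertex[vertex].append(edge)

for vertex in sorted(edges_by_vertex):
    print(vertex, "->", edges_by_vertex[vertex])
```

Grouping by vertex is only the first round; as noted above, deeper graphs would need the step repeated for more iterations.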
I'm taking a look at Pregel. It seems like a good way to do it. The only
drawback I see is that this is not really a complex graph with a lot of
edges between the vertices. It is more like a lot of small, isolated
graphs.
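To illustrate the "lot of small, isolated graphs" point, here is a rough sketch in plain Python (not GraphX) that splits a list of pairs into its connected components with a union-find structure. The function name `connected_components` and the sample pairs are assumptions made for this illustration; each resulting group could in principle be processed independently of the others.

```python
def connected_components(pairs):
    """Group vertices into connected components using union-find."""
    parent = {}

    def find(v):
        # Register unseen vertices as their own root, then walk up to the root.
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving keeps trees shallow
            v = parent[v]
        return v

    # Merge the two components joined by each pair.
    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Collect the vertices under each root.
    groups = {}
    for v in list(parent):
        groups.setdefault(find(v), set()).add(v)
    return list(groups.values())


# Hypothetical sample: two small, unrelated graphs.
sample = [("a", "t"), ("b", "o"), ("t", "k"), ("k", "c")]
for component in connected_components(sample):
    print(sorted(component))
```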
2016-02-25 12:32 GMT+01:00 Robin East :
> The
Oh, the letters were just an example. It could be:
a, t
b, o
t, k
k, c
So a -> t -> k -> c, and the result is: a,c; t,c; k,c and b,o
I don't know whether you were thinking of sortBy because, in the other
example, the letters were consecutive.
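A rough sketch of how this result could be computed, in plain Python rather than GraphX. It only imitates the round-by-round style of a Pregel computation: in each round every vertex replaces its current target with that target's successor, until nothing changes. The function name `resolve_chain_ends` is hypothetical, and the sketch assumes each vertex has at most one outgoing pair and that the data contains no cycles other than a vertex pointing at itself.

```python
def resolve_chain_ends(pairs):
    """Map every source vertex to the last vertex of its chain."""
    successor = dict(pairs)  # assumption: at most one outgoing pair per vertex
    end = dict(successor)

    # A chain with n pairs settles in fewer than n rounds; one extra round
    # detects that nothing changed. The bound also guards against cycles.
    for _ in range(len(successor) + 1):
        updated = {v: successor.get(target, target) for v, target in end.items()}
        if updated == end:
            return end
        end = updated
    raise ValueError("did not converge; the input probably contains a cycle")


pairs = [("a", "t"), ("b", "o"), ("t", "k"), ("k", "c")]
result = resolve_chain_ends(pairs)
for source in sorted(result):
    print(f"{source},{result[source]}")
```

For the four pairs above this yields a,c; b,o; k,c; t,c, which matches the expected result. Sorting is not involved at any point, so the outcome does not depend on the letters being consecutive.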
2016-02-25 9:42 GMT+01:00 Guillermo Ortiz
I don't see how sorting the data helps.
The answer has to be all the associations. In this case the answer has to
be:
a, b --> it was an error in the question, sorry.
b, d
c, d
x, y
y, y
I feel that all the associated data should be on the same executor.
In this case, if I order the
Guillermo,
I think you're after an associative algorithm where A is ultimately
associated with D, correct? Jakob would be correct if that is a typo; a
sort would be all that is necessary in that case.
I believe you're looking for something else though, if I understand
correctly.
This seems like a
Hi Guillermo,
assuming that the first "a,b" is a typo and you actually meant "a,d",
this is a sorting problem.
You could easily model your data as an RDD of tuples (or as a
DataFrame/Dataset) and use the sortBy (or orderBy for DataFrames/Datasets)
methods.
best,
--Jakob
On Wed, Feb 24, 2016 at 2:26 PM,