I want to implement an algorithm in Spark. I know how to do it on a single machine where all the data is in one place, but I don't know a good way to do it in Spark.
Maybe someone has an idea. I have some data like this:

a , b
x , y
b , c
y , y
c , d

I want something like:

a , d
b , d
c , d
x , y
y , y

I need to detect that a->b->c->d, and therefore a->d, b->d and c->d. I don't want the code, just an idea of how I could deal with it. Any ideas?
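To make the expected result concrete, here is a minimal single-machine sketch in plain Python (not Spark) of the mapping described above. The function name `resolve_chains` is hypothetical, and the sketch assumes each left-hand value has exactly one successor; it only illustrates the desired output, not a distributed approach.

```python
def resolve_chains(pairs):
    """Map every left-hand key to the last element of its chain."""
    # Assumption: each key appears once on the left, so a dict is enough.
    successor = dict(pairs)

    def terminal(node):
        seen = {node}
        # Follow the chain until a node has no successor, points to
        # itself (like y -> y), or would revisit an already seen node.
        while node in successor and successor[node] not in seen:
            node = successor[node]
            seen.add(node)
        return node

    return sorted((key, terminal(key)) for key in successor)


pairs = [("a", "b"), ("x", "y"), ("b", "c"), ("y", "y"), ("c", "d")]
print(resolve_chains(pairs))
# [('a', 'd'), ('b', 'd'), ('c', 'd'), ('x', 'y'), ('y', 'y')]
```

The difficulty in Spark is that this version relies on having the whole `successor` map in memory on one machine, which is exactly what is not available when the pairs are partitioned across a cluster.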