Pankaj, thanks a lot for this great idea. We will give it a try. Cheers.
On Fri, Mar 28, 2014 at 3:25 PM, Pankaj Malhotra <pankajiit...@gmail.com> wrote:
> maybe it would be better if you use mapreduce such that in the map phase
> each key-value pair at a node is a key and the node is the value...this way
> you get the first level of connections at the reduce-keys...then u can use
> the output of reduce phase as adjacency list for the graph to be processed
> using Giraph...
> Cheers
> Pankaj
> On Mar 28, 2014 6:27 PM, "Matthieu Labour" <matthieu.lab...@gmail.com>
> wrote:
>
>> Hi
>>
>> I am looking for tips on how to leverage Giraph for the use case below:
>>
>> I have a list of Nodes.
>> A Node is a collection of Key-Value pairs.
>> 2 Nodes are related (have an edge) if they share a Key-Value pair.
>>
>> Until now I have been running a Depth First Search algorithm to cluster
>> the Nodes into Connected Components.
>>
>> However, my data set has grown significantly and I need to scale. This is
>> the reason that brought me to Giraph.
>>
>> I have gone through the Connected Component example in Giraph but need a
>> bit of help to get started. Specifically I wonder how I can change it to
>> accommodate the use case described above.
>>
>> I would greatly appreciate any help.
>> Thank you in advance.
>> -matt
>>
>
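
For reference, here is a rough sketch of the map/reduce step suggested above, written in plain Python rather than as an actual Hadoop job. It assumes each Node can be represented as a dict of key-value pairs; the function names (`map_phase`, `reduce_phase`, `build_adjacency`) and the sample data are made up for illustration and are not part of Giraph or Hadoop. The map step emits each key-value pair as the key with the node id as the value, the reduce step groups node ids by shared pair, and the groups are then expanded into an adjacency list.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(nodes):
    """Emit ((key, value), node_id) for every key-value pair of every node."""
    for node_id, attrs in nodes.items():
        for key, value in attrs.items():
            yield (key, value), node_id


def reduce_phase(pairs):
    """Group node ids by the key-value pair they share."""
    groups = defaultdict(set)
    for kv, node_id in pairs:
        groups[kv].add(node_id)
    return groups


def build_adjacency(nodes, groups):
    """Turn the reduce output into an undirected adjacency list."""
    # Start with every node so that isolated nodes are kept.
    adjacency = {node_id: set() for node_id in nodes}
    for members in groups.values():
        # Every two nodes sharing a key-value pair get an edge.
        for a, b in combinations(sorted(members), 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


# Hypothetical sample data: node id -> key-value pairs.
nodes = {
    "n1": {"email": "a@example.com", "phone": "111"},
    "n2": {"email": "a@example.com"},
    "n3": {"phone": "111", "city": "NYC"},
    "n4": {"city": "Paris"},
}

adjacency = build_adjacency(nodes, reduce_phase(map_phase(nodes)))
for node_id in sorted(adjacency):
    print(node_id, sorted(adjacency[node_id]))
```

The pairwise expansion grows quadratically when many nodes share one pair. Since only connected components are needed, linking every member of a group to a single representative would probably be enough. The adjacency list would still have to be written out in whatever input format the Giraph job is configured to read.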