Hi,

I am looking for tips on how to leverage Giraph for the use case below:

I have a list of Nodes.
A Node is a collection of Key-Value pairs.
Two Nodes are related (have an edge) if they share at least one Key-Value pair.

Until now I have been running a Depth First Search algorithm to cluster the
Nodes into Connected Components.
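
To make the setup concrete, here is a minimal sketch of that kind of single-machine DFS. It assumes each Node is a map of String keys to String values and is identified by its position in the list. The class name, the method name, and the inverted index from Key-Value pair to Nodes are illustrative assumptions, not my exact code.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical name, used only for this illustration.
public class ConnectedComponentsSketch {

    /** Returns one sorted list of Node indices per connected component. */
    static List<List<Integer>> cluster(List<Map<String, String>> nodes) {
        // Inverted index: each Key-Value pair -> indices of the Nodes that contain it.
        // This avoids comparing every pair of Nodes to discover edges.
        Map<Map.Entry<String, String>, List<Integer>> index = new HashMap<>();
        for (int i = 0; i < nodes.size(); i++) {
            for (Map.Entry<String, String> kv : nodes.get(i).entrySet()) {
                index.computeIfAbsent(new SimpleImmutableEntry<>(kv),
                                      k -> new ArrayList<>()).add(i);
            }
        }

        boolean[] visited = new boolean[nodes.size()];
        List<List<Integer>> components = new ArrayList<>();
        for (int start = 0; start < nodes.size(); start++) {
            if (visited[start]) {
                continue;
            }
            // Iterative DFS from an unvisited Node, using an explicit stack.
            List<Integer> component = new ArrayList<>();
            Deque<Integer> stack = new ArrayDeque<>();
            stack.push(start);
            visited[start] = true;
            while (!stack.isEmpty()) {
                int current = stack.pop();
                component.add(current);
                // Neighbors are all Nodes sharing any Key-Value pair with the current one.
                for (Map.Entry<String, String> kv : nodes.get(current).entrySet()) {
                    for (int neighbor : index.get(new SimpleImmutableEntry<>(kv))) {
                        if (!visited[neighbor]) {
                            visited[neighbor] = true;
                            stack.push(neighbor);
                        }
                    }
                }
            }
            Collections.sort(component);
            components.add(component);
        }
        return components;
    }

    private static Map<String, String> node(String... keyValues) {
        Map<String, String> m = new HashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            m.put(keyValues[i], keyValues[i + 1]);
        }
        return m;
    }

    public static void main(String[] args) {
        // Toy data: Nodes 0-1 share phone=1, Nodes 1-2 share name=x, Node 3 is isolated.
        List<Map<String, String>> nodes = new ArrayList<>();
        nodes.add(node("email", "a", "phone", "1"));
        nodes.add(node("phone", "1", "name", "x"));
        nodes.add(node("name", "x"));
        nodes.add(node("email", "b"));
        System.out.println(cluster(nodes)); // prints [[0, 1, 2], [3]]
    }
}
```

Note that the edges are never stored explicitly here; they are implied by the index, which is the part I am unsure how to express in Giraph.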

However, my data set has grown significantly and I need to scale, which is
what brought me to Giraph.

I have gone through the Connected Component example in Giraph but need a
bit of help to get started. Specifically, I wonder how I can adapt it to
the use case described above, where edges are not given explicitly but are
implied by shared Key-Value pairs.

I would greatly appreciate any help.
Thank you in advance.
-matt
