Hi all,

I have some questions about the use of aggregators. I want to implement 
algorithms for Community Detection and Community Tracking.
The Community Detection algorithm basically outputs a file where each line 
represents a community and contains the IDs of the nodes in the community.
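For example, a line like

  17 42 105 311

would mean that the nodes with IDs 17, 42, 105 and 311 form one community 
(the IDs here are just made up).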

For the Community Tracking part I cluster a second graph (i.e. another time 
step) and then need to compare each community of the first time step with every 
community of the second time step.  For large graphs the number of communities 
can get quite large as well.

One idea I had was to register an aggregator for each community of the first 
time step. Then, for each community found in the second time step, one node of 
that community sends a message to the aggregator containing the nodes of its 
community. The aggregator then calculates the similarity for each received 
community of time step 2.
I would end up registering several thousand aggregators that I only need after 
one superstep.
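Roughly what I have in mind on the master side is sketched below (against the 
Giraph MasterCompute API; CommunitySimilarityAggregator would be a custom 
aggregator I still have to write, and loadTimeStep1CommunityIds() is only a 
stub for reading the detection output):

  import org.apache.giraph.master.DefaultMasterCompute;

  public class TrackingMasterCompute extends DefaultMasterCompute {
    @Override
    public void initialize() throws InstantiationException,
        IllegalAccessException {
      // One aggregator per community of time step 1; this is the part
      // that worries me, since it can easily be several thousand.
      for (long communityId : loadTimeStep1CommunityIds()) {
        registerAggregator("community-" + communityId,
            CommunitySimilarityAggregator.class);
      }
    }

    private Iterable<Long> loadTimeStep1CommunityIds() {
      // Placeholder: read the community IDs from the detection output file.
      return java.util.Collections.emptyList();
    }
  }

In the relevant superstep one node per time-step-2 community would then call 
aggregate("community-" + id, ...) with the member set of its community, and 
the aggregator would compute the similarity when it combines the received 
values.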

The other idea was to alter the compute method so that the node with the 
smallest ID in each community does the similarity calculation. This then means 
I would have to add (and later remove) several thousand edges to the graph.
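A rough sketch of that variant (LongWritable vertex IDs, custom 
CommunityVertexValue and MemberSetMessage Writables, an 
isSmallestIdInCommunity() placeholder, and Jaccard similarity purely as an 
example measure; none of this is fixed yet):

  // CommunityVertexValue and MemberSetMessage would be custom Writables;
  // isSmallestIdInCommunity() is a placeholder for however that node
  // gets marked as the representative of its community.
  public class TrackingComputation extends BasicComputation<
      LongWritable, CommunityVertexValue, NullWritable, MemberSetMessage> {
    @Override
    public void compute(
        Vertex<LongWritable, CommunityVertexValue, NullWritable> vertex,
        Iterable<MemberSetMessage> messages) {
      if (isSmallestIdInCommunity(vertex)) {
        Set<Long> myMembers = vertex.getValue().getMemberIds();
        for (MemberSetMessage msg : messages) {
          // One message per community of the other time step, sent over
          // the temporarily added edges.
          Set<Long> otherMembers = msg.getMemberIds();
          Set<Long> overlap = new HashSet<Long>(myMembers);
          overlap.retainAll(otherMembers);
          double similarity = (double) overlap.size()
              / (myMembers.size() + otherMembers.size() - overlap.size());
          // ... store or emit the similarity for this pair of communities
        }
      }
      vertex.voteToHalt();
    }
  }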

What do you think would perform better? Or should I do the calculation outside 
of Giraph?

I appreciate any input.

Thanks

Pascal

