Hi,

I implemented a benchmark that generates an arbitrarily large graph (its size depends on the number of iterations). Now I would like to configure Giraph to make the best use of my hardware for this benchmark. Given the number of nodes in my cluster, each node's amount of main memory, and its number of cores, how do I determine the optimal Giraph/Hadoop parameters, specifically:

- the number of mappers used
- the HEAP_SIZE environment variable
- the memory specified in the mapred.map.child.java.opts property

(any other relevant parameters?)
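
To make this concrete, here is roughly what I am doing right now in my driver; the class name and all of the values are placeholders I picked, not tuned settings:

    import org.apache.hadoop.conf.Configuration;

    public class BenchmarkDriver {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Each Giraph worker runs inside one map task, so this effectively
        // caps the number of workers (the value 8 is just a placeholder).
        conf.setInt("mapred.map.tasks", 8);
        // Heap for each map-task JVM (placeholder value); presumably this
        // should fit into a node's main memory divided by its map slots?
        conf.set("mapred.map.child.java.opts", "-Xmx4000m");
        // I also export HEAP_SIZE in the shell before submitting, e.g.
        //   export HEAP_SIZE=4000
        // but I am not sure how these three settings interact.
        // ... then submit the Giraph job with this conf ...
      }
    }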

Also, I was wondering how well Giraph handles computations that start with a very small graph and mutate it into a very large one. For example, if I understand correctly, the number of mappers is not adjusted dynamically during the computation.
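
To illustrate the kind of mutation I mean, here is a compute() sketch written against the BasicComputation API as I understand it; the class name, id scheme, and halting condition are made up purely to show the growth pattern:

    import java.io.IOException;
    import org.apache.giraph.edge.EdgeFactory;
    import org.apache.giraph.graph.BasicComputation;
    import org.apache.giraph.graph.Vertex;
    import org.apache.hadoop.io.DoubleWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;

    public class GrowingGraphComputation extends
        BasicComputation<LongWritable, DoubleWritable, NullWritable, DoubleWritable> {
      @Override
      public void compute(Vertex<LongWritable, DoubleWritable, NullWritable> vertex,
          Iterable<DoubleWritable> messages) throws IOException {
        if (getSuperstep() < 10) {
          // Made-up scheme: every vertex requests one new vertex and an edge
          // to it each superstep, so the graph roughly doubles per iteration.
          long newId = vertex.getId().get() * 2 + getSuperstep() + 1;
          addVertexRequest(new LongWritable(newId), new DoubleWritable(0));
          addEdgeRequest(vertex.getId(),
              EdgeFactory.create(new LongWritable(newId), NullWritable.get()));
        } else {
          vertex.voteToHalt();
        }
      }
    }

Since the number of workers is fixed at submission time, each worker's partitions would have to absorb all of this growth, which is what makes me wonder whether I have to size the heap for the end state rather than for the input.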

Any hints (or links to documentation) would be highly appreciated.

Cheers,
Christian
