Hi, I have a computer cluster consisting of 15 slave machines and 1 master machine.
Each slave machine has two Xeon E5-2620 CPUs, which with Hyper-Threading gives 24 hardware threads per machine. I am wondering how to set the parameters so that a Giraph job runs in parallel across my cluster. I am using the following command to run a PageRank algorithm:

    hadoop jar ~/giraph-examples.jar org.apache.giraph.GiraphRunner SimplePageRank \
        -vif PageRankInputFormat -vip /input \
        -vof PageRankOutputFormat -op /pagerank \
        -w 1 \
        -mc SimplePageRank\$SimplePageRankMasterCompute \
        -wc SimplePageRank\$SimplePageRankWorkerContext

In particular:

1) I know I can use "-w" to specify the number of workers. As I understand it, the number of workers equals the number of Hadoop mappers, excluding the one that runs ZooKeeper. In my case (15 slave machines), which number should I choose? Is 15 a good choice? I ask because when I pass a large number, e.g. 100, the mappers hang.

2) I know I can use "-Dgiraph.numComputeThreads=1" to specify the number of vertex compute threads. However, if I set it to 10, the total runtime is much longer than with the default. I believe the default is 1, based on the source code. If I want to use this parameter, which value should I choose?

3) While the Giraph job is running, I use the "top" command to monitor CPU usage on the slave machines. With the default setting, the Java process uses 200%-300% CPU. If I change the number of vertex compute threads to 10, the Java process uses 800% CPU. That does not look like a linear relationship with the thread count, and I would like to know why.

Thanks for your help.

Best,
-Yi