OK, so I added the partitions flag, going with

    hadoop jar target/giraph-0.1-jar-with-dependencies.jar
    org.apache.giraph.examples.SimpleShortestPathsVertex
    -Dgiraph.SplitMasterWorker=false -Dgiraph.numComputeThreads=12
    -Dhash.userPartitionCount=12 input output 12 1

but I still get no overall speedup at all (compared to using 1 thread), and at most times only 1 of the 12 cores is utilized. Isn't Giraph supposed to exploit parallelism to get some speedup? Any other suggestions?
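In case it helps to be explicit: my understanding is that the three -D options are just shorthand for setting plain keys on the Hadoop job configuration, roughly as in the sketch below (the class name is mine and the mapping is my assumption, so treat it as illustrative only):

    import org.apache.hadoop.conf.Configuration;

    public class GiraphOptionsSketch {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // Key names copied from the command line above; my assumption is
            // that the -D flags end up as exactly these configuration entries.
            conf.setBoolean("giraph.SplitMasterWorker", false); // no separate master task
            conf.setInt("giraph.numComputeThreads", 12);        // compute threads per worker
            conf.setInt("hash.userPartitionCount", 12);         // partition count; each thread
                                                                // computes one partition at a time
        }
    }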
Thanks,
Alexandros


On 29 November 2012 00:20, Avery Ching <ach...@apache.org> wrote:

> Oh, forgot one thing. You need to set the number of partitions to use,
> since each thread works on a single partition at a time.
>
> Try -Dhash.userPartitionCount=<number of threads>
>
>
> On 11/28/12 5:29 AM, Alexandros Daglis wrote:
>
> Dear Avery,
>
> I followed your advice, but the application seems to be totally
> thread-count-insensitive: I literally observe zero performance scaling as
> I increase the thread count. Maybe you can point out if I am doing
> something wrong.
>
> - Using only 4 cores on a single node at the moment
> - Input graph: 14 million vertices, file size is 470 MB
> - Running SSSP as follows:
>   hadoop jar target/giraph-0.1-jar-with-dependencies.jar
>   org.apache.giraph.examples.SimpleShortestPathsVertex
>   -Dgiraph.SplitMasterWorker=false -Dgiraph.numComputeThreads=X
>   input output 12 1
>   where X = 1, 2, 3, 12, 30
> - I notice total insensitivity to the number of threads I specify.
>   Aggregate core utilization is always approximately the same (usually
>   around 25-30%, i.e. only one of the cores running) and the overall
>   execution time is always the same (~8 mins).
>
> Why is Giraph's performance not scaling? Is the input size / number of
> workers inappropriate? It is not an I/O issue either, because even during
> really low core utilization the time is spent idle, not waiting on I/O.
>
> Cheers,
> Alexandros
>
>
> On 28 November 2012 11:13, Alexandros Daglis <alexandros.dag...@epfl.ch> wrote:
>
>> Thank you Avery, that helped a lot!
>>
>> Regards,
>> Alexandros
>>
>>
>> On 27 November 2012 20:57, Avery Ching <ach...@apache.org> wrote:
>>
>>> Hi Alexandros,
>>>
>>> The extra task is for the master process (a coordination task). In your
>>> case, since you are using a single machine, you can use a single task:
>>>
>>> -Dgiraph.SplitMasterWorker=false
>>>
>>> and you can try multithreading instead of multiple workers:
>>>
>>> -Dgiraph.numComputeThreads=12
>>>
>>> The reason why CPU usage increases is the netty threads that handle
>>> network requests. By using multithreading instead, you should bypass
>>> this.
>>>
>>> Avery
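[Interjecting here for readers who find this thread later: the way I picture Avery's partition tip above, each compute thread works through whole partitions one at a time, so with fewer partitions than threads the extra threads simply have nothing to do. A purely illustrative sketch of that model, not Giraph's actual code:]

    import java.util.Arrays;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class PartitionsVsThreads {
        public static void main(String[] args) throws InterruptedException {
            // 12 compute threads but only 2 partitions: at most 2 threads can
            // ever be busy at once, no matter how many threads are configured.
            List<String> partitions = Arrays.asList("partition-0", "partition-1");
            ExecutorService pool = Executors.newFixedThreadPool(12);
            for (String p : partitions) {
                pool.submit(() -> System.out.println(
                        Thread.currentThread().getName() + " computes " + p));
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        }
    }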
>>>
>>> On 11/27/12 9:40 AM, Alexandros Daglis wrote:
>>>
>>>> Hello everybody,
>>>>
>>>> I went through most of the documentation I could find for Giraph, and
>>>> also most of the messages on this mailing list, but I still have not
>>>> figured out precisely what a "worker" really is. I would really
>>>> appreciate it if you could help me understand how the framework works.
>>>>
>>>> At first I thought that a worker had a one-to-one correspondence with a
>>>> map task. Apparently this is not exactly the case, since I have noticed
>>>> that if I ask for x workers, the job finishes after having used x+1 map
>>>> tasks. What is this extra task for?
>>>>
>>>> I have been trying out the example SSSP application on a single node
>>>> with 12 cores. Given an input graph of ~400 MB and using 1 worker,
>>>> around 10 GB of memory is used during execution. What intrigues me is
>>>> that if I use 2 workers for the same input (and without limiting the
>>>> memory per map task), double the memory is used. Furthermore, there is
>>>> no improvement in performance; I rather notice a slowdown. Are these
>>>> observations normal?
>>>>
>>>> Might it be the case that 1 and 2 workers are very few, and that I
>>>> should go to the 30-100 range that is the proposed number of mappers
>>>> for a conventional MapReduce job?
>>>>
>>>> Finally, a last observation: even though I use only 1 worker, I see
>>>> that there are significant periods during execution when up to 90% of
>>>> the 12 cores' computing power is consumed, that is, almost 10 cores are
>>>> used in parallel. Does each worker spawn multiple threads and
>>>> dynamically balance the load to utilize the available hardware?
>>>>
>>>> Thanks a lot in advance!
>>>>
>>>> Best,
>>>> Alexandros