RE: What a "worker" really is and other interesting runtime information

Magyar, Bence (US SSA) Thu, 29 Nov 2012 05:03:00 -0800

Folks,

I have some of the same questions as Alexandros below.  What exactly is "a 
worker"?  I am not sure I understood Avery's answer below.  I have a 4-node 
cluster.  Each node has 24 cores.  My first node is functioning (in MapReduce 
parlance) as both a "job tracker" and a "task tracker".  So I have 4 
compute nodes.  (I have verified that master/slave config is correct).  I am 
launching the Giraph SimpleShortestPathsVertex example on an input graph with 
approximately 140,000 nodes/ 410,000 edges and the computation is taking 
approx. 6 minutes.  Although I don't know what a "good" number is, 6 minutes 
seems rather "slow" given all the compute horsepower I have at my disposal.  
When I monitor "top" on my machines while the compute is running, my cores are 
~ 80-90% idle.


I am launching my job with the following parameters:

./giraph -Dgiraph.useSuperstepCounters=false 
-DSimpleShortestPathsVertex.sourceId=100 ../target/giraph.jar 
org.apache.giraph.examples.SimpleShortestPathsVertex -if 
org.apache.giraph.io.JsonLongDoubleFloatDoubleVertexInputFormat -ip 
/user/hduser/in -of 
org.apache.giraph.io.JsonLongDoubleFloatDoubleVertexOutputFormat -op 
/user/hduser/out -w 3

Note that I have set my number of workers to 3 (-w 3).  Should this be some other 
value?  Does anyone have any simple configuration suggestions that will help me 
tune Giraph to my problem?

Thanks!

Bence

From: Alexandros Daglis [mailto:alexandros.dag...@epfl.ch]
Sent: Thursday, November 29, 2012 6:19 AM
To: user@giraph.apache.org
Subject: Re: What a "worker" really is and other interesting runtime information

Ok, so I added the partitions flag, going with

 hadoop jar target/giraph-0.1-jar-with-dependencies.jar 
org.apache.giraph.examples.SimpleShortestPathsVertex 
-Dgiraph.SplitMasterWorker=false -Dgiraph.numComputeThreads=12 
-Dhash.userPartitionCount=12 input output 12 1

but still I got no overall speedup at all (compared to using 1 thread) and only 
1 out of 12 cores is utilized at most times. Isn't Giraph supposed to exploit 
parallelism to get some speedup? Any other suggestion?

Thanks,
Alexandros
On 29 November 2012 00:20, Avery Ching <ach...@apache.org> wrote:
Oh, forgot one thing.  You need to set the number of partitions to use, since 
each thread works on a single partition at a time.

Try -Dhash.userPartitionCount=<number of threads>
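
A toy model of the point above, assuming only what is stated there: a thread works on one whole partition at a time, so the number of threads that can do useful work is capped by the number of partitions. This is not Giraph's scheduler; the round-robin assignment and the function names are illustrative assumptions.

```python
def assign_partitions(num_partitions, num_threads):
    """Hand out partitions to threads round-robin (an assumed policy).

    Returns a list where entry i is the number of partitions thread i owns.
    """
    if num_threads < 1:
        raise ValueError("need at least one compute thread")
    load = [0] * num_threads
    for partition_id in range(num_partitions):
        load[partition_id % num_threads] += 1
    return load


def busy_threads(load):
    """Count the threads that received at least one partition."""
    return sum(1 for partitions in load if partitions > 0)


if __name__ == "__main__":
    # One partition, twelve threads: eleven threads have nothing to do.
    print(busy_threads(assign_partitions(1, 12)))
    # Twelve partitions, twelve threads: every thread gets one partition.
    print(busy_threads(assign_partitions(12, 12)))
```

In this model, raising the thread count without raising the partition count cannot add parallelism. Whether that is the actual bottleneck in a given job is a separate question.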


On 11/28/12 5:29 AM, Alexandros Daglis wrote:
Dear Avery,

I followed your advice, but the application seems to be totally 
thread-count-insensitive: I observe literally zero performance scaling 
as I increase the thread count. Maybe you can point out if I am doing 
something wrong.

- Using only 4 cores on a single node at the moment
- Input graph: 14 million vertices, file size is 470 MB
- Running SSSP as follows: hadoop jar 
target/giraph-0.1-jar-with-dependencies.jar 
org.apache.giraph.examples.SimpleShortestPathsVertex 
-Dgiraph.SplitMasterWorker=false -Dgiraph.numComputeThreads=X input output 12 1
where X=1,2,3,12,30
- I notice a total insensitivity to the number of threads I specify. Aggregate 
core utilization is always approximately the same (usually around 25-30% => 
only one of the cores running) and overall execution time is always the same 
(~8 mins)
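
The "only one of the cores running" reading follows from simple arithmetic on the figures above. A minimal sketch, assuming the aggregate utilization is averaged evenly over all cores (the helper name is made up for illustration):

```python
def estimated_busy_cores(aggregate_utilization_pct, total_cores):
    """Convert an aggregate CPU utilization percentage into an
    equivalent number of fully busy cores."""
    if total_cores < 1:
        raise ValueError("total_cores must be at least 1")
    return aggregate_utilization_pct * total_cores / 100.0


if __name__ == "__main__":
    # 25-30% aggregate utilization on 4 cores, as reported above.
    low = estimated_busy_cores(25, 4)
    high = estimated_busy_cores(30, 4)
    print(f"roughly {low:.1f} to {high:.1f} cores busy")
```

With 4 cores, 25% corresponds to one fully loaded core, which is consistent with a single compute thread doing all the work.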

Why is Giraph's performance not scaling? Is the input size / number of workers 
inappropriate? It's not an IO issue either, because even during really low core 
utilization, time is wasted on idle, not on IO.

Cheers,
Alexandros


On 28 November 2012 11:13, Alexandros Daglis <alexandros.dag...@epfl.ch> wrote:
Thank you Avery, that helped a lot!

Regards,
Alexandros

On 27 November 2012 20:57, Avery Ching <ach...@apache.org> wrote:
Hi Alexandros,

The extra task is for the master process (a coordination task). In your case, 
since you are using a single machine, you can use a single task.

-Dgiraph.SplitMasterWorker=false

and you can try multithreading instead of multiple workers.

-Dgiraph.numComputeThreads=12

CPU usage increases because of the netty threads that handle network 
requests.  By using multithreading instead, you should bypass this.
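
The task accounting described here can be written down as a small sketch. It assumes that with giraph.SplitMasterWorker=false the master shares a task with a worker, so no extra task is needed; the function name and that reading of the flag are assumptions for illustration, not Giraph code.

```python
def map_tasks_needed(num_workers, split_master_worker=True):
    """Estimate how many map tasks a job occupies.

    With the master split out, one extra coordination task is added on
    top of the workers. With the split disabled, the master is assumed
    to run inside one of the worker tasks.
    """
    if num_workers < 1:
        raise ValueError("need at least one worker")
    return num_workers + 1 if split_master_worker else num_workers


if __name__ == "__main__":
    # x workers use x + 1 map tasks when the master has its own task.
    print(map_tasks_needed(1))         # prints 2
    # Single machine with -Dgiraph.SplitMasterWorker=false: one task.
    print(map_tasks_needed(1, False))  # prints 1
```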

Avery


On 11/27/12 9:40 AM, Alexandros Daglis wrote:
Hello everybody,

I went through most of the documentation I could find for Giraph and also most 
of the messages in this email list, but still I have not figured out precisely 
what a "worker" really is. I would really appreciate it if you could help me 
understand how the framework works.

At first I thought that a worker has a one-to-one correspondence to a map task. 
Apparently this is not exactly the case, since I have noticed that if I ask for 
x workers, the job finishes after having used x+1 map tasks. What is this extra 
task for?

I have been trying out the example SSSP application on a single node with 12 
cores. Giving an input graph of ~400MB and using 1 worker, around 10 GBs of 
memory are used during execution. What intrigues me is that if I use 2 workers 
for the same input (and without limiting memory per map task), double the 
memory will be used. Furthermore, there will be no improvement in performance. 
I rather notice a slowdown. Are these observations normal?

Might it be the case that 1 and 2 workers are very few and I should go to the 
30-100 range that is the proposed number of mappers for a conventional 
MapReduce job?

Finally, a last observation. Even though I use only 1 worker, I see that there 
are significant periods during execution where up to 90% of the 12 cores' 
computing power is consumed, that is, almost 10 cores are used in parallel. 
Does each worker spawn multiple threads and dynamically balance the load to 
utilize the available hardware?

Thanks a lot in advance!

Best,
Alexandros
