Fwd: Giraph Performance Tuning

Sonja Koenig Mon, 24 Aug 2015 02:18:55 -0700

Hey there everyone!

On the user list there was no one to help me, so I thought I'd just start bugging the devs..

I am currently writing my bachelor thesis about Giraph and GraphX, where I am trying to compare their scalability and features and put them into context with different graph types. In order to compare the two on a fair basis, I want to tune the frameworks to get the most out of them :-) I was hoping to get some tips and tricks from you all on which configuration settings I can change to impact my computations..


My set up:

10 machines, each with 1 CPU with a single 3.3 GHz core, 4 GB RAM, 100 GB HDD -> one is the designated master

Giraph 1.10
Hadoop 1.2.1

So far I haven't done any special configuration for Hadoop or Giraph besides the basic settings during setup.

These settings might be performance-critical:
In *mapred-site.xml*:
    mapred.tasktracker.map.tasks.maximum = 4
    mapred.map.tasks=4
In *hdfs-site.xml*:
    dfs.replication=3
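
For reference, written out in Hadoop's XML property format, the settings above would look roughly like the sketch below. The values are the ones listed above; the layout assumes the standard `<configuration>`/`<property>` structure of a stock Hadoop 1.x install, and the two parts belong in two separate files.

```xml
<!-- mapred-site.xml (MapReduce settings) -->
<configuration>
  <property>
    <!-- maximum number of map tasks a single TaskTracker runs at once -->
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
  <property>
    <!-- default number of map tasks per job (a hint to the framework) -->
    <name>mapred.map.tasks</name>
    <value>4</value>
  </property>
</configuration>

<!-- HDFS site configuration file (hdfs-site.xml in a stock Hadoop 1.x install) -->
<configuration>
  <property>
    <!-- number of copies kept of each HDFS block -->
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
```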

If I am correctly informed, the default heap size is 1000 MB, which I haven't changed. I am also not sure where I can actually increase memory usage. Any advice? Also, I read somewhere that it is smarter to increase the number of threads per worker rather than the number of workers per machine? But I am somewhat handicapped anyway with only one core per machine..

Lastly, has anyone noticed any performance changes when using checkpointing, combiners, aggregators and so on? Is the use of combiners and aggregators decided in the application code or in my execution command?


I would appreciate any advice and comments greatly! :-)

Greetings from Ulm,
Sonja
