Re: Giraph : newbie questions

2012-07-21 Thread Claudio Martella
There are already a couple of partitioners in the codebase; check those out. Also, keep in mind that using fewer workers reduces network communication but also decreases parallelism. On Fri, Jul 20, 2012 at 8:52 PM, Jonathan Bishop jbishop@gmail.com wrote: Avery, Is there an

Re: Giraph : newbie questions

2012-07-16 Thread David Garcia
Giraph partitions the vertices using a hashing function that's basically the equivalent of (hash(vertexID) mod #ofComputeNodes). You can mitigate memory issues by starting the job with a minimal set of vertices in your file and then adding more dynamically as your job progresses (assuming that your job
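
To make the (hash(vertexID) mod #ofComputeNodes) idea concrete, here is a minimal standalone sketch in plain Java. It does not use Giraph's own partitioner classes; the class and method names (HashPartitionSketch, partitionFor, assign) are made up for illustration, and Giraph's real implementation may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of hash-mod partitioning; not Giraph code.
public class HashPartitionSketch {

    // Map a vertex ID to a worker index in [0, numWorkers).
    // Math.floorMod keeps the result non-negative even when hashCode() is negative.
    static int partitionFor(Object vertexId, int numWorkers) {
        if (numWorkers <= 0) {
            throw new IllegalArgumentException("numWorkers must be positive");
        }
        return Math.floorMod(vertexId.hashCode(), numWorkers);
    }

    // Group a set of vertex IDs into one bucket per worker.
    static List<List<Long>> assign(long[] vertexIds, int numWorkers) {
        List<List<Long>> buckets = new ArrayList<>();
        for (int i = 0; i < numWorkers; i++) {
            buckets.add(new ArrayList<>());
        }
        for (long id : vertexIds) {
            buckets.get(partitionFor(Long.valueOf(id), numWorkers)).add(id);
        }
        return buckets;
    }

    public static void main(String[] args) {
        long[] ids = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
        List<List<Long>> buckets = assign(ids, 3);
        for (int w = 0; w < buckets.size(); w++) {
            System.out.println("worker " + w + " -> " + buckets.get(w));
        }
    }
}
```

With a scheme like this, the number of workers only changes how many buckets the same vertices are spread over, so fewer workers means more vertices (and more memory) per worker.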