Yes, Giraph "hijacks" mapper tasks, and then does everything else on its own.
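
To make the "hijack" idea concrete, here is a toy sketch in plain Java. It is not Giraph or Hadoop code: MapTask and WorkerTask are hypothetical names invented for illustration, and the ring-shaped "graph" with max-value propagation is only an assumed example workload. The point it tries to show is that the host framework only starts the task, and the task then drives its own superstep loop in memory.

```java
import java.util.Arrays;

public class HijackedMapperSketch {

    /** Hypothetical stand-in for a map task: the host framework only ever calls run(). */
    interface MapTask {
        void run();
    }

    /** Toy "worker" that lives inside one map task and drives its own supersteps. */
    static class WorkerTask implements MapTask {
        final int[] values;   // one value per vertex, held in memory
        int supersteps = 0;   // number of supersteps executed so far

        WorkerTask(int[] initialValues) {
            this.values = initialValues.clone();
        }

        @Override
        public void run() {
            // Assumed topology: a ring, where vertex i sends to vertex (i + 1) % n.
            int n = values.length;
            boolean changed = true;
            while (changed) {
                changed = false;
                int[] inbox = new int[n];
                for (int i = 0; i < n; i++) {
                    inbox[(i + 1) % n] = values[i];   // "send" phase
                }
                for (int i = 0; i < n; i++) {         // "compute" phase
                    if (inbox[i] > values[i]) {
                        values[i] = inbox[i];
                        changed = true;
                    }
                }
                supersteps++;
            }
        }
    }

    public static void main(String[] args) {
        WorkerTask task = new WorkerTask(new int[] {3, 7, 1, 5});
        task.run();   // the only thing the host framework does is start the task
        System.out.println(Arrays.toString(task.values)
                + " after " + task.supersteps + " supersteps");
    }
}
```

In the sketch nothing is mapped record by record; the loop ends when a superstep changes no vertex value, which loosely mirrors the task doing "everything else on its own".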

On Fri, Feb 7, 2014 at 12:39 PM, Alexander Frolov <alexndr.fro...@gmail.com> wrote:

> On Fri, Feb 7, 2014 at 2:30 PM, Claudio Martella <claudio.marte...@gmail.com> wrote:
>
>> On Fri, Feb 7, 2014 at 9:44 AM, Alexander Frolov <alexndr.fro...@gmail.com> wrote:
>>
>>> Thank you, I will try to do this. As I understand it, I should set the
>>> number of threads manually through the Giraph API.
>>>
>>> BTW, what is the conceptual difference between running multiple workers on
>>> the TaskTracker and running a single worker with multiple threads? In terms
>>> of vertex fetching, memory sharing, etc.
>>
>> Basically, better usage of resources: one single JVM, no duplication of
>> core data structures, fewer netty threads and communication points, more
>> locality (fewer messages over the network), fewer actors accessing ZooKeeper,
>> etc.
>>
>>> Also, I would like to ask how message transfer between vertices is
>>> implemented in terms of Hadoop primitives. A source code reference will be
>>> enough.
>>
>> Communication does not happen via Hadoop primitives, but ad hoc via
>> netty.
>
> Ok. It seems that Hadoop has minimal influence on Giraph application
> execution after the graph is loaded into memory (that is, mapping is done).

--
   Claudio Martella