On Fri, Feb 7, 2014 at 3:53 PM, Claudio Martella <claudio.marte...@gmail.com> wrote:
> Yes, Giraph "hijacks" mapper tasks, and then does everything else on its
> own.

Thanks, that is important for understanding.

> On Fri, Feb 7, 2014 at 12:39 PM, Alexander Frolov <alexndr.fro...@gmail.com> wrote:
>
>> On Fri, Feb 7, 2014 at 2:30 PM, Claudio Martella <claudio.marte...@gmail.com> wrote:
>>
>>> On Fri, Feb 7, 2014 at 9:44 AM, Alexander Frolov <alexndr.fro...@gmail.com> wrote:
>>>
>>>> Thank you, I will try to do this. As I understood I should set number
>>>> of threads manually through Giraph API.
>>>>
>>>> BTW, what is conceptual difference between running multiple workers on
>>>> the TaskTracker and running single worker and multiple threads? In terms
>>>> of vertex fetching, memory sharing etc.
>>>
>>> Basically, better usage of resources: one single JVM, no duplication of
>>> core data structures, less netty threads and communication points, more
>>> locality (less messages over the network), less actors accessing zookeeper
>>> etc.
>>>
>>>> Also I would like to ask how message transfer between vertices is
>>>> implemented in terms of Hadoop primitives? Source code reference will be
>>>> enough.
>>>
>>> Communication does not happen via Hadoop primitives, but ad-hoc via
>>> netty.
>>
>> Ok. It seems that Hadoop has minimalistic influence on Giraph application
>> execution after graph is loaded into memory (that is mapping is done).
>
> --
> Claudio Martella