Yes, Giraph "hijacks" mapper tasks, and then does everything else on its
own.
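
To make the single-worker, multi-threaded model discussed in the quoted thread a bit more concrete, here is a toy sketch in plain Java. It is not Giraph code: ToyWorker and run are hypothetical names, the vertices form a simple ring, and plain in-memory arrays stand in for the message transfer that Giraph does over netty. It only shows the general idea of several compute threads in one JVM sharing the same vertex data and meeting at a barrier after every superstep.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Toy model only: none of these names are Giraph classes.
public class ToyWorker {

    // Propagates the maximum vertex id around a ring of numVertices vertices.
    // All compute threads live in one JVM and share the same arrays.
    public static long[] run(int numVertices, int numThreads, int supersteps)
            throws Exception {
        final long[] values = new long[numVertices];
        for (int v = 0; v < numVertices; v++) {
            values[v] = v; // initial vertex value is its own id
        }
        long[] inbox = new long[numVertices];
        Arrays.fill(inbox, Long.MIN_VALUE); // no messages before superstep 0

        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            for (int s = 0; s < supersteps; s++) {
                final long[] in = inbox;
                final long[] out = new long[numVertices];
                List<Callable<Void>> tasks = new ArrayList<>();
                for (int t = 0; t < numThreads; t++) {
                    final int threadId = t;
                    tasks.add(() -> {
                        // Each thread owns the vertices v with v % numThreads == threadId.
                        for (int v = threadId; v < numVertices; v += numThreads) {
                            values[v] = Math.max(values[v], in[v]);
                            // "Send" the value to the next vertex in the ring.
                            out[(v + 1) % numVertices] = values[v];
                        }
                        return null;
                    });
                }
                // Barrier: wait for every compute thread before the next superstep.
                for (Future<Void> f : pool.invokeAll(tasks)) {
                    f.get();
                }
                inbox = out;
            }
        } finally {
            pool.shutdown();
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(Arrays.toString(run(8, 4, 8)));
    }
}
```

With 8 vertices, the largest id needs 8 supersteps (0 through 7) to reach every vertex. In Giraph itself the number of compute threads per worker is, as far as I know, set with the giraph.numComputeThreads option rather than by hand-written thread pools like this one.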


On Fri, Feb 7, 2014 at 12:39 PM, Alexander Frolov
<alexndr.fro...@gmail.com> wrote:

>
>
>
> On Fri, Feb 7, 2014 at 2:30 PM, Claudio Martella <
> claudio.marte...@gmail.com> wrote:
>
>>
>>
>>
>> On Fri, Feb 7, 2014 at 9:44 AM, Alexander Frolov <
>> alexndr.fro...@gmail.com> wrote:
>>
>>>  Thank you, I will try to do this. As I understand it, I should set the
>>>> number of threads manually through the Giraph API.
>>>>
>>>> BTW, what is the conceptual difference between running multiple workers on
>>>> the TaskTracker and running a single worker with multiple threads? In terms
>>>> of vertex fetching, memory sharing, etc.
>>>>
>>>
>> Basically, better usage of resources: one single JVM, no duplication of
>> core data structures, fewer Netty threads and communication points, more
>> locality (fewer messages over the network), fewer actors accessing ZooKeeper,
>> etc.
>>
>>
>>>
>>>>  Also, I would like to ask how message transfer between vertices is
>>> implemented in terms of Hadoop primitives. A source code reference will be
>>> enough.
>>>
>>
>> Communication does not happen via Hadoop primitives, but ad hoc via
>> Netty.
>>
>
> Ok. It seems that Hadoop has minimal influence on Giraph application
> execution after the graph is loaded into memory (that is, once mapping is done).
>



-- 
   Claudio Martella
