[ 
https://issues.apache.org/jira/browse/GIRAPH-273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440675#comment-13440675
 ] 

Maja Kabiljo commented on GIRAPH-273:
-------------------------------------

We actually ended up with something better than an aggregation tree. 

Say we have A aggregators and W workers. With the tree approach the whole 
aggregation would take about:
A * (aggregation_time + transfer_time) * log W
What we can do instead is perform aggregations in a completely distributed 
way. Each aggregator would have one worker which owns it and does the 
aggregation for it, so we would end up with about:
A * (aggregation_time + transfer_time)
After performing aggregations, all workers would send the final values of the 
aggregators they own to the master, and after master.compute the aggregators 
would go back the same way. For applications without master compute, we can 
even skip sending aggregated values to the master altogether. 
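
To make the ownership idea concrete, here is a minimal single-process sketch. 
It is not the Giraph implementation: the class name, the hash-based ownerOf 
rule and the use of Long sums as the aggregated value are all assumptions made 
for illustration. Each worker's partial values are routed to the owning worker, 
the owner combines them, and the owners' final values are then collected as 
they would be at the master.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OwnedAggregation {
    // Hypothetical ownership rule: hash the aggregator name onto a worker index.
    static int ownerOf(String aggregatorName, int numWorkers) {
        return Math.floorMod(aggregatorName.hashCode(), numWorkers);
    }

    // partials.get(w) holds worker w's partial value for each aggregator name.
    // Returns the final values as the master would see them.
    static Map<String, Long> aggregate(List<Map<String, Long>> partials) {
        int numWorkers = partials.size();

        // owned.get(o) is the state kept by worker o for the aggregators it owns.
        List<Map<String, Long>> owned = new ArrayList<>();
        for (int w = 0; w < numWorkers; w++) {
            owned.add(new HashMap<>());
        }

        // Step 1: every worker sends each partial value to that aggregator's owner,
        // and the owner combines it (summation stands in for any aggregation).
        for (Map<String, Long> partial : partials) {
            for (Map.Entry<String, Long> e : partial.entrySet()) {
                int owner = ownerOf(e.getKey(), numWorkers);
                owned.get(owner).merge(e.getKey(), e.getValue(), Long::sum);
            }
        }

        // Step 2: every owner sends its final values to the master.
        Map<String, Long> atMaster = new HashMap<>();
        for (Map<String, Long> finals : owned) {
            atMaster.putAll(finals);
        }
        return atMaster;
    }

    public static void main(String[] args) {
        List<Map<String, Long>> partials = new ArrayList<>();
        partials.add(Map.of("sum", 1L, "count", 2L));
        partials.add(Map.of("sum", 3L, "count", 4L));
        partials.add(Map.of("sum", 5L));
        System.out.println(aggregate(partials));
    }
}
```

Since every aggregator name maps to exactly one owner, no value is combined 
twice, and the work for different aggregators is spread across the workers 
rather than passing through every level of a tree.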

Is having all the workers connect to the master an issue? The master will have 
the same number of connections as any other worker has, and in this approach 
we just send a smaller amount of data through each of the connections, instead 
of having that same amount sent through just two.
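
As a rough illustration of the per-connection load, the sketch below counts 
how many aggregator values cross each master connection. It assumes that 
ownership is spread evenly over the W workers and that the tree is binary with 
the master at the root; neither is stated above, and the class and method 
names are made up for this example.

```java
public class MasterConnectionLoad {
    // Distributed approach: each of the W workers sends only the aggregators
    // it owns. With even ownership (an assumption) that is ceil(A / W) values.
    static long valuesPerConnectionOwned(long aggregators, long workers) {
        return (aggregators + workers - 1) / workers;
    }

    // Tree approach, assuming a binary tree rooted at the master: each of the
    // two child connections carries a value for every aggregator.
    static long valuesPerConnectionTree(long aggregators) {
        return aggregators;
    }

    public static void main(String[] args) {
        long a = 1000;
        long w = 100;
        System.out.println("owned: " + w + " connections x "
                + valuesPerConnectionOwned(a, w) + " values each");
        System.out.println("tree:  2 connections x "
                + valuesPerConnectionTree(a) + " values each");
    }
}
```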
                
> Aggregators shouldn't use Zookeeper
> -----------------------------------
>
>                 Key: GIRAPH-273
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-273
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Maja Kabiljo
>            Assignee: Maja Kabiljo
>
> We use Zookeeper znodes to transfer aggregated values from workers to master 
> and back. Zookeeper is supposed to be used for coordination, and it also has 
> a memory limit which prevents users from having aggregators with large value 
> objects. These are the reasons why we should implement aggregators gathering 
> and distribution in a different way.


        
