[ 
https://issues.apache.org/jira/browse/GIRAPH-273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13445738#comment-13445738
 ] 

Eli Reisman commented on GIRAPH-273:
------------------------------------

The tree sounds a lot better to me. In what case would an aggregator be a 
large chunk of data? Won't it mostly be a counter, numerical value, or other 
fixed-size datatype that is augmented or altered by the values at each worker? 

As Netty has added more features, I have already seen memory and network issues 
start to build up as we scale to thousands of worker nodes, which is definitely 
our use case here, so that many more connections and that much more traffic on 
the network kind of scares me. I hear what you're saying about the master not 
having much to do right now, though.

If you have some use cases where aggregators accumulate instead of aggregate, 
then maybe having all the extra network connections would be worth it. O(log W) 
time (W being the number of workers) on a small message passed around that way 
seems ok to me.
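
To make the log W point concrete, here is a minimal sketch, assuming a binary 
tree over the workers and a plain sum aggregator. The class and method names 
are hypothetical and are not Giraph's API; the array stands in for the partial 
values held by each worker, and each loop pass stands in for one round of 
small messages.

```java
import java.util.Arrays;

// Hypothetical illustration only: not Giraph code.
public class TreeAggregationSketch {

    /**
     * Combines per-worker partial sums pairwise, tree style.
     * After the call, partials[0] holds the total.
     * Returns the number of communication rounds that were needed.
     */
    public static int reduceInPlace(long[] partials) {
        int rounds = 0;
        // In each round, worker i receives the partial value of worker i + stride.
        for (int stride = 1; stride < partials.length; stride *= 2) {
            for (int i = 0; i + stride < partials.length; i += 2 * stride) {
                partials[i] += partials[i + stride];
            }
            rounds++;
        }
        return rounds;
    }

    public static void main(String[] args) {
        // Assumed scenario: 1000 workers, each contributing a count of 1.
        long[] partials = new long[1000];
        Arrays.fill(partials, 1L);
        int rounds = reduceInPlace(partials);
        System.out.println("sum=" + partials[0] + " rounds=" + rounds);
    }
}
```

With 1000 workers the sum reaches the root in 10 rounds, and no node ever 
receives more than one message per round, in contrast to a master that takes 
one message from every worker.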

If Pregel likes it, Eli likes it. ;)

> Aggregators shouldn't use Zookeeper
> -----------------------------------
>
>                 Key: GIRAPH-273
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-273
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Maja Kabiljo
>            Assignee: Maja Kabiljo
>
> We use Zookeeper znodes to transfer aggregated values from workers to master 
> and back. Zookeeper is supposed to be used for coordination, and it also has 
> a memory limit which prevents users from having aggregators with large value 
> objects. These are the reasons why we should implement aggregators gathering 
> and distribution in a different way.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
