[ https://issues.apache.org/jira/browse/GIRAPH-273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13445738#comment-13445738 ]
Eli Reisman commented on GIRAPH-273:
------------------------------------

The tree sounds a lot better to me. In what case would an aggregator be a large chunk of data? Won't it mostly be a counter, numerical value, or other fixed-size datatype that is augmented or altered by the values at each worker?

As Netty has added more features, I have already seen memory and network issues start to build up as we scale to thousands of worker nodes, which is definitely our use case here, so that many more connections and that much more traffic on the network kind of scares me. I hear what you're saying about the master not having much to do right now, though. If you have some use cases where aggregators accumulate instead of aggregate, then maybe having all the extra network connections would be worth it.

Log W time on a small message passed around that way seems OK to me. If Pregel likes it, Eli likes it. ;)

> Aggregators shouldn't use Zookeeper
> -----------------------------------
>
>                 Key: GIRAPH-273
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-273
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Maja Kabiljo
>            Assignee: Maja Kabiljo
>
> We use Zookeeper znodes to transfer aggregated values from workers to master
> and back. Zookeeper is supposed to be used for coordination, and it also has
> a memory limit which prevents users from having aggregators with large value
> objects. These are the reasons why we should implement aggregator gathering
> and distribution in a different way.
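
To make the "Log W time" estimate in the comment concrete, here is a minimal sketch of a tree-shaped reduction over per-worker values. It is a plain in-process simulation, not Giraph or Netty code: the function name `tree_aggregate`, the `fan_in` parameter, and the reading of W as the number of workers are all assumptions made for illustration.

```python
from operator import add


def tree_aggregate(values, combine, fan_in=2):
    """Combine per-worker values up a tree; return (result, rounds).

    Hypothetical helper for illustration only. Each round, every parent
    combines up to `fan_in` children, so the number of rounds grows
    logarithmically with the number of workers.
    """
    if not values:
        raise ValueError("need at least one worker value")
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")

    level = list(values)
    rounds = 0
    while len(level) > 1:
        next_level = []
        # Group siblings and fold each group into a single parent value.
        for i in range(0, len(level), fan_in):
            group = level[i:i + fan_in]
            acc = group[0]
            for v in group[1:]:
                acc = combine(acc, v)
            next_level.append(acc)
        level = next_level
        rounds += 1
    return level[0], rounds


if __name__ == "__main__":
    # 1000 workers each contribute a counter value of 1.
    total, rounds = tree_aggregate([1] * 1000, add)
    print(total, rounds)  # prints "1000 10"
```

With a binary tree, 1000 workers need 10 combining rounds, and each node only talks to its parent and children instead of every worker talking to one endpoint. A larger `fan_in` trades fewer rounds for more connections per node.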