[ https://issues.apache.org/jira/browse/GIRAPH-273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424862#comment-13424862 ]

Gianmarco De Francisci Morales commented on GIRAPH-273:
-------------------------------------------------------

Hi,
good idea.
Regarding option 2, why would it be better than option 1 for large files?
In either case the files would need to be read and written over the network, and on HDFS 
they would also be replicated.
I don't think HDFS is a good place for temporary files.

I guess the best way to implement aggregators would be a Dremel-like solution 
with aggregation trees, so that you can reduce the pressure on the master while 
keeping the latency low (though maybe this is overkill for only hundreds of 
machines).
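
To make the aggregation-tree idea concrete, here is a minimal sketch in plain Java 
(not Giraph code; the long-sum aggregator and the in-memory rounds stand in for what 
would really be messages between workers): partial values are merged in groups of a 
fixed fan-in per level, so no single node, including the master, has to merge all 
worker values at once.

import java.util.ArrayList;
import java.util.List;

public class AggregationTreeSketch {

    // Combine one group of partial values (here: a simple long sum).
    static long combine(List<Long> group) {
        long sum = 0;
        for (long v : group) {
            sum += v;
        }
        return sum;
    }

    // Merge values fanIn at a time, level by level, until one value remains.
    static long aggregate(List<Long> workerValues, int fanIn) {
        List<Long> current = new ArrayList<Long>(workerValues);
        while (current.size() > 1) {
            List<Long> next = new ArrayList<Long>();
            for (int i = 0; i < current.size(); i += fanIn) {
                int end = Math.min(i + fanIn, current.size());
                next.add(combine(current.subList(i, end)));
            }
            current = next;  // one level up the tree
        }
        return current.get(0);
    }

    public static void main(String[] args) {
        // e.g. 1000 workers, each contributing a partial count of 1
        List<Long> partials = new ArrayList<Long>();
        for (int i = 0; i < 1000; i++) {
            partials.add(1L);
        }
        // With fanIn = 10 no node (including the master) merges more than
        // 10 values, at the cost of ceil(log_10(1000)) = 3 tree levels.
        System.out.println(aggregate(partials, 10));  // prints 1000
    }
}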
                
> Aggregators shouldn't use Zookeeper
> -----------------------------------
>
>                 Key: GIRAPH-273
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-273
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Maja Kabiljo
>            Assignee: Maja Kabiljo
>
> We use Zookeeper znodes to transfer aggregated values from workers to master 
> and back. Zookeeper is supposed to be used for coordination, and it also has 
> a memory limit which prevents users from having aggregators with large value 
> objects. These are the reasons why we should implement aggregator gathering 
> and distribution in a different way.
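
For reference, a minimal hypothetical sketch of the znode-based exchange the 
description refers to (connection string, path, and payload are made up; real 
Giraph code differs): a worker writes its serialized partial value into a znode 
and the master reads it back, so the znode payload limit (jute.maxbuffer, roughly 
1 MB by default) caps the size of aggregator values.

import java.nio.charset.StandardCharsets;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ZnodeAggregatorSketch {

    public static void main(String[] args) throws Exception {
        // Connection string and znode path are made up for the example.
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, new Watcher() {
            public void process(WatchedEvent event) { }
        });
        String path = "/example_aggregator_worker_0";

        byte[] partial = "partial-sum=42".getBytes(StandardCharsets.UTF_8);

        // Worker side: publish the serialized partial aggregate as znode data.
        // Payload size is bounded by jute.maxbuffer (about 1 MB by default).
        zk.create(path, partial, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

        // Master side: read the worker's znode and combine it with the others.
        byte[] read = zk.getData(path, false, null);
        System.out.println(new String(read, StandardCharsets.UTF_8));

        zk.close();
    }
}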
