[ 
https://issues.apache.org/jira/browse/GIRAPH-477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550370#comment-13550370
 ] 

Alessandro Presta commented on GIRAPH-477:
------------------------------------------

Thanks for the review, Eli.
The main problem here is not that there is more data to store, but that (at 
least at our scale) the loop is a real bottleneck. We also need to take another 
look at our split reservation strategy, because I suspect we could be a lot 
smarter about not having all workers (and threads) fetch the same data from 
ZooKeeper over and over. We could even consider having the master take care of 
this instead of ZK.
But at least making this code path optional fixes the issue.
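
For illustration, a minimal sketch of what an optional locality code path could look like. This is not the attached patch: the class name, the method signature, and the `fetchLocations` callback (a stand-in for the per-split ZooKeeper read) are all hypothetical. In practice the boolean would presumably come from a job configuration flag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch; not the actual InputSplitPathOrganizer. */
public class SplitOrganizerSketch {
    /**
     * Orders split paths for one worker.
     *
     * @param splitPaths     all input split paths
     * @param hostName       host name of the calling worker
     * @param useLocality    when false, no per-split lookup is performed
     * @param fetchLocations stand-in for the ZooKeeper read that returns
     *                       the hosts holding a given split (assumption)
     */
    public static List<String> organize(List<String> splitPaths,
                                        String hostName,
                                        boolean useLocality,
                                        Function<String, List<String>> fetchLocations) {
        if (!useLocality) {
            // Feature disabled: keep the original order and skip the
            // expensive loop entirely.
            return new ArrayList<>(splitPaths);
        }
        List<String> local = new ArrayList<>();
        List<String> remote = new ArrayList<>();
        for (String path : splitPaths) {
            // One lookup per split: this is the loop that becomes a
            // bottleneck with thousands of splits and threads.
            List<String> hosts = fetchLocations.apply(path);
            if (hosts != null && hosts.contains(hostName)) {
                local.add(path);
            } else {
                remote.add(path);
            }
        }
        // Local splits first, then the rest in their original order.
        local.addAll(remote);
        return local;
    }
}
```

With the flag off, the lookup callback is never invoked, so the cost of the loop disappears at the price of ignoring locality.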
                
> Fetching locality info in InputSplitPathOrganizer causes jobs to hang
> ---------------------------------------------------------------------
>
>                 Key: GIRAPH-477
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-477
>             Project: Giraph
>          Issue Type: Bug
>            Reporter: Alessandro Presta
>            Assignee: Alessandro Presta
>         Attachments: GIRAPH-477.patch
>
>
> In the presence of many input splits (>6000 in our case) and input split 
> threads (3000), the loop that fetches locality info for all splits from 
> ZooKeeper becomes a bottleneck. A few workers aren't able to even iterate 
> once over the list, run into increased GC pauses, and eventually time out.
> Furthermore, depending on the cluster configuration, it's not always 
> possible/useful to exploit locality.
> We should add a flag so that the feature can be optionally disabled.

