[ https://issues.apache.org/jira/browse/GIRAPH-477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alessandro Presta updated GIRAPH-477:
-------------------------------------

    Attachment: GIRAPH-477.patch

This patch adds a flag to optionally disable storing locality info and fetching it from ZK.

> Fetching locality info in InputSplitPathOrganizer causes jobs to hang
> ---------------------------------------------------------------------
>
>                 Key: GIRAPH-477
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-477
>             Project: Giraph
>          Issue Type: Bug
>            Reporter: Alessandro Presta
>            Assignee: Alessandro Presta
>         Attachments: GIRAPH-477.patch
>
>
> In the presence of many input splits (>6000 in our case) and input split
> threads (3000), the loop that fetches locality info for all splits from
> ZooKeeper becomes a bottleneck. A few workers aren't able to even iterate
> once over the list, run into increased GC pauses, and eventually time out.
> Furthermore, depending on the cluster configuration, it's not always
> possible/useful to exploit locality.
> We should add a flag so that the feature can be optionally disabled.
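
For readers who want to see the shape of the change, here is a minimal, self-contained sketch of gating a per-split locality lookup behind a boolean flag. It is not the attached patch and does not use Giraph or ZooKeeper classes: the class SplitOrderingSketch, the organize method, the useLocality parameter, and the fetchHosts function (a stand-in for the per-split read from ZK) are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class SplitOrderingSketch {
    /** Counts simulated remote lookups so the effect of the flag is visible. */
    static int lookups = 0;

    /**
     * Orders split paths for one worker (hypothetical helper, not Giraph code).
     * With useLocality == true, one lookup is made per split and splits that
     * report localHost are moved to the front. With useLocality == false, the
     * list is returned in its original order and no lookup is made at all.
     */
    static List<String> organize(List<String> splitPaths,
                                 String localHost,
                                 boolean useLocality,
                                 Function<String, List<String>> fetchHosts) {
        if (!useLocality) {
            // Flag disabled: skip the per-split fetch loop entirely.
            return new ArrayList<>(splitPaths);
        }
        List<String> local = new ArrayList<>();
        List<String> remote = new ArrayList<>();
        for (String path : splitPaths) {
            // In the scenario described above, this call would be a ZK read.
            List<String> hosts = fetchHosts.apply(path);
            (hosts.contains(localHost) ? local : remote).add(path);
        }
        local.addAll(remote);
        return local;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList("/splits/0", "/splits/1", "/splits/2");
        // Stand-in for the remote store: only /splits/2 lives on hostA.
        Function<String, List<String>> fetchHosts = path -> {
            lookups++;
            return path.equals("/splits/2")
                ? Arrays.asList("hostA")
                : Arrays.asList("hostB");
        };
        System.out.println(organize(paths, "hostA", true, fetchHosts)
            + " lookups=" + lookups);
        lookups = 0;
        System.out.println(organize(paths, "hostA", false, fetchHosts)
            + " lookups=" + lookups);
    }
}
```

With the flag off, the cost of the loop drops from one remote read per split per thread to zero, at the price of losing any locality-aware ordering. That trade-off matches the case described in the issue, where locality is not always possible or useful to exploit.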