[ https://issues.apache.org/jira/browse/GIRAPH-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13457299#comment-13457299 ]
Eli Reisman commented on GIRAPH-326:
------------------------------------

Thinking a little more on this, I agree speed is good, but I've developed a few reservations. Using more threads for anything non-crucial is kind of a no-go in my book. Many use cases for Giraph are on existing, in-use Hadoop clusters, where our worker tasks share hardware resources directly with other MR/Pig/Hive/etc. jobs coming and going while Giraph runs, and we don't want to claim anything we don't badly need.

At one point I was instrumenting INPUT_SUPERSTEP a lot and testing on large cluster runs with thousands of input splits. Even with thousands of splits, I saw the entire INPUT_SUPERSTEP finish in a couple of minutes, ready to go on to superstep 0. I never once saw the initial master's write of the splits to the znode tree take more than a few seconds in any of these job runs. This was on a cluster, not in pseudo-distributed mode, but it raises the question: what was the nature of the setup when this ZK slowdown was observed?

One more thought: since ZK has to sync() writes with the rest of the quorum to provide total ordering, the speed of write requests (or the identities of the processes delivering them) is not the bottleneck here. The bottleneck is typically how fast the ZK quorum can update and come to agreement about the delivery of each proposed write. From this perspective, does it matter speed-wise which threads, or even which clients, are making proposals to the quorum? Does this mean the issue might be with the user's ZK config and not Giraph at all?

The multithreaded ZK write idea does make sense, but I'd like to hear more details on how this problem was encountered before I'm convinced it makes sense for Giraph.
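To make the bottleneck argument concrete, here is a minimal toy model in Java. It is only a sketch under stated assumptions, not a measurement and not how ZooKeeper is implemented: it assumes each client thread issues synchronous writes one at a time, and that the quorum agrees on writes strictly one after another at a fixed cost. The class name, method names, and all per-write costs are hypothetical; only the 2m45s and 15s figures come from the issue description.

```java
// Toy model only: every parameter value used below is an assumption for illustration.
class SplitWriteModel {

    /**
     * Estimated wall-clock seconds to write `writes` znodes.
     *
     * clientSecondsPerWrite: time one client thread spends per synchronous
     *                        write (assumed constant).
     * quorumSecondsPerWrite: time the quorum needs to agree on one write,
     *                        assumed to be fully serialized.
     */
    static double estimateSeconds(int writes, int clientThreads,
                                  double clientSecondsPerWrite,
                                  double quorumSecondsPerWrite) {
        // Writes are spread evenly; the busiest thread handles the ceiling.
        int writesPerThread = (writes + clientThreads - 1) / clientThreads;
        double clientBound = writesPerThread * clientSecondsPerWrite;
        double quorumBound = writes * quorumSecondsPerWrite;
        // The slower of the two sides determines the total time.
        return Math.max(clientBound, quorumBound);
    }

    /** Ratio of serial time to parallel time. */
    static double speedup(double serialSeconds, double parallelSeconds) {
        return serialSeconds / parallelSeconds;
    }

    public static void main(String[] args) {
        // Client-bound case: extra threads help until the quorum bound is hit.
        System.out.println(estimateSeconds(1000, 1, 0.5, 0.125));
        System.out.println(estimateSeconds(1000, 16, 0.5, 0.125));

        // Quorum-bound case: extra threads make no difference.
        System.out.println(estimateSeconds(1000, 1, 0.125, 0.5));
        System.out.println(estimateSeconds(1000, 16, 0.125, 0.5));

        // Figures reported in the issue: serial 2m45s (165 s), 16 cores 15 s.
        System.out.println(speedup(165.0, 15.0));
    }
}
```

In this model, adding threads only helps while the client side is the slower one; once the quorum bound dominates, the thread count is irrelevant. Which regime a real deployment falls into depends on the ZK setup, which is the open question raised above.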
> Writing input splits to ZooKeeper in parallel
> ---------------------------------------------
>
>                 Key: GIRAPH-326
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-326
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Maja Kabiljo
>         Attachments: GIRAPH-326.patch
>
>
> (Posting issue and the patch from a colleague)
> Writing input splits to ZooKeeper can take a lot of time. From his
> experiments: serial 2m45s, with 16 cores 15s.