[ https://issues.apache.org/jira/browse/GIRAPH-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13458948#comment-13458948 ]
Eli Reisman commented on GIRAPH-326:
------------------------------------

I agree. If that spot needs progress calls, put up a JIRA to add them right away; we need those with or without the threaded writes.

I am not against the multithreaded option to write to ZK. I am still wondering: if the quorum agrees on the order of each proposed write before it is delivered, is the real speed bottleneck how fast the writes from the master reach ZK, or the ZK quorum syncing itself on the writes as it delivers them? It's surprising to me that the extra writers would speed this up. Do the repeated write calls from the single-threaded writer block until the watch is signalled for each call, or something like that?

Since the system isn't really doing anything important during the ZK connections and the input split write, it shouldn't hurt to have the threads used this way. I would like to see those thread resources cleaned up or repurposed before the next stages of the job begin. Other than that, it sounds good to me.

> Writing input splits to ZooKeeper in parallel
> ---------------------------------------------
>
>                 Key: GIRAPH-326
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-326
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Maja Kabiljo
>         Attachments: GIRAPH-326.patch
>
>
> (Posting issue and the patch from a colleague)
> Writing input splits to ZooKeeper can take a lot of time. From his
> experiments: serial 2m45s, with 16 cores 15s.
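
To make the cleanup request concrete, here is a minimal sketch of one way the writes could be fanned out and the threads released before the next stage starts. It is not taken from GIRAPH-326.patch. The names ParallelSplitWriter, SplitWriter, and writeAll are hypothetical, and the SplitWriter callback stands in for whatever call actually creates the znode for a split.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Illustrative only: fan out split writes, then release the threads. */
final class ParallelSplitWriter {

  /** Hypothetical stand-in for the call that writes one split to ZK. */
  public interface SplitWriter {
    void write(String path) throws Exception;
  }

  /**
   * Writes every path using a fixed-size pool and blocks until all
   * writes have finished. The pool is always shut down before returning,
   * so no writer threads outlive this stage of the job.
   */
  public static void writeAll(List<String> paths, int threads,
                              SplitWriter writer) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Callable<Void>> tasks = new ArrayList<>();
      for (String path : paths) {
        tasks.add(() -> {
          writer.write(path);
          return null;
        });
      }
      // invokeAll returns only after every task has completed or failed.
      for (Future<Void> result : pool.invokeAll(tasks)) {
        result.get(); // surfaces the first failed write, if any
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while writing splits", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("Split write failed", e.getCause());
    } finally {
      pool.shutdown(); // release the writer threads
    }
  }
}
```

Shutting the pool down in the finally block means the threads are released even if one of the writes fails. Whether a pool like this should instead be handed on to a later stage is a separate design question.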