[ https://issues.apache.org/jira/browse/GIRAPH-374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477591#comment-13477591 ]
Hudson commented on GIRAPH-374:
-------------------------------

Integrated in Giraph-trunk-Commit #244 (See [https://builds.apache.org/job/Giraph-trunk-Commit/244/])
    GIRAPH-374: Multithreading in input split loading and compute (aching). (Revision 1399090)

    Result = SUCCESS
aching : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1399090
Files :
* /giraph/trunk/CHANGELOG
* /giraph/trunk/giraph-formats-contrib/src/main/java/org/apache/giraph/io/hbase/HBaseVertexInputFormat.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/GiraphConfiguration.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/bsp/CentralizedService.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/bsp/CentralizedServiceMaster.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/bsp/CentralizedServiceWorker.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/SendMessageCache.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/SendPartitionCache.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/WorkerClient.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/WorkerClientRequestProcessor.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/WorkerServer.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/ChannelRotater.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyClient.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyServer.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyWorkerClient.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyWorkerClientRequestProcessor.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyWorkerClientServer.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyWorkerServer.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/handler/AddressRequestIdGenerator.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/examples/SimpleSuperstepVertex.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/AggregatorWrapper.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/BspServiceMaster.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/BspServiceWorker.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/ComputeCallable.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/FinishedSuperstepStats.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/GraphMapper.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/GraphState.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/InputSplitsCallable.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/MutableVertex.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/SimpleMutableVertex.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/Vertex.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/partition/HashWorkerPartitioner.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/partition/PartitionStats.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/partition/PartitionStore.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/utils/LoggerUtils.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/utils/ProgressableUtils.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/utils/Time.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/zk/ZooKeeperExt.java
* /giraph/trunk/giraph/src/test/java/org/apache/giraph/BspCase.java
* /giraph/trunk/giraph/src/test/java/org/apache/giraph/TestBspBasic.java
* /giraph/trunk/giraph/src/test/java/org/apache/giraph/TestPageRank.java
* /giraph/trunk/giraph/src/test/java/org/apache/giraph/utils/MockUtils.java

> Multithreading in input split loading and compute
> -------------------------------------------------
>
>                 Key: GIRAPH-374
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-374
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Avery Ching
>            Assignee: Avery Ching
>         Attachments: GIRAPH-374.2.patch
>
>
> Cleaned up the WorkerClient hierarchy
> - WorkerClientRequestProcessor is a request cache for every thread (input split loading / compute)
> - With RPC gone, got rid of ugly WorkerClientServer and NettyWorkerClientServer
> SendPartitionCache
> Made GraphState immutable for multi-threading
> Added multithreading for loading the input splits
> Added multithreading for compute
> Added thread-level debugging as an option
> Added additional testing on the number of vertices, edges
> Optimization on HashWorkerPartitioner to use CopyOnWriteArrayList instead of synchronized list (this is a bottleneck)
> Added multithreaded TestPageRank test case
>
> I ran the PageRankBenchmark on 20 workers with 10M vertices, 1B edges. All supersteps take about the same time, so I just compared superstep 0 from every test. Compute performance gains are quite nice (even a little faster than before with one thread). Actual gains will depend heavily on the number of cores you have and possible parallelism of the application.
> {code}
> Trunk
> # threads   compute time (secs)   total time (secs)
> 1           89                    97.543
> Multithreading
> 1           86.70094              92.477
> 2           50.41521              57.850
> 4           38.07716              50.246
> 8           38.63188              45.940
> 16          22.999943             48.607
> 24          23.649189             45.112
> 32          21.412325             44.201
> {code}
> We also saw similar gains on the input split loading on an internal app. Future work can be to further improve the scalability of multithreading.
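
The issue description lists the pieces of the change, but this notification contains no code. As a rough illustration only, the sketch below shows one way multithreaded compute over partitions can be arranged with the standard `java.util.concurrent` classes: each thread drains partition ids from a shared queue, and a read-mostly list is held in a `CopyOnWriteArrayList` so that reads take no lock. The names `MultithreadedComputeSketch`, `PartitionCallable` and `computeSuperstep` are hypothetical and are not taken from the GIRAPH-374 patch; "processing" a partition is reduced to summing an assumed per-partition vertex count.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch; none of these names are Giraph classes. */
class MultithreadedComputeSketch {

  /** One instance per thread: drains partition ids until the queue is empty. */
  static final class PartitionCallable implements Callable<Long> {
    private final Queue<Integer> partitionIds;
    private final List<Integer> vertexCounts;

    PartitionCallable(Queue<Integer> partitionIds, List<Integer> vertexCounts) {
      this.partitionIds = partitionIds;
      this.vertexCounts = vertexCounts;
    }

    @Override
    public Long call() {
      long processed = 0;
      Integer id;
      // poll() is thread-safe, so each partition id is handed to exactly one thread.
      while ((id = partitionIds.poll()) != null) {
        // Lock-free read from the copy-on-write list.
        processed += vertexCounts.get(id);
      }
      return processed;
    }
  }

  /** Runs one simulated superstep and returns the total vertices processed. */
  static long computeSuperstep(List<Integer> partitionVertexCounts, int numThreads) {
    // Read-mostly data: reads never lock, writes copy the backing array.
    List<Integer> counts = new CopyOnWriteArrayList<>(partitionVertexCounts);
    Queue<Integer> partitionIds = new ConcurrentLinkedQueue<>();
    for (int i = 0; i < counts.size(); i++) {
      partitionIds.add(i);
    }
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    try {
      List<Future<Long>> futures = new ArrayList<>();
      for (int t = 0; t < numThreads; t++) {
        futures.add(pool.submit(new PartitionCallable(partitionIds, counts)));
      }
      long total = 0;
      for (Future<Long> future : futures) {
        total += future.get(); // blocks until that thread has finished
      }
      return total;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    long total = computeSuperstep(Arrays.asList(3, 5, 7), 4);
    System.out.println("vertices processed: " + total);
  }
}
```

A copy-on-write list suits data that is read on every lookup but changed rarely, since a synchronized list makes every reader contend for the same lock. Whether the same trade-off holds in a given deployment depends on how often the list is modified.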