-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9449/
-----------------------------------------------------------

(Updated Feb. 15, 2013, 1:24 a.m.)


Review request for giraph.


Changes
-------

Refactored with common abstractions for sending edges/messages, as per Maja's advice.

ByteArrayVertexIdData extended by ByteArrayVertexId(Messages/Edges)
SendCache extended by Send(Message/Edge)Cache
SendWorkerDataRequest extended by SendWorker(Messages/Edges)Request


Description
-------

This patch adds the following classes:
- SendWorkerEdgesRequest: a request used to send edges during the input superstep, similar to the corresponding one for messages.
- SendEdgeCache: similar to SendMessageCache.
- ByteArrayVertexIdEdges: serialized representation for lists of edges (for different source vertices), similar to the corresponding one for messages.
- EdgeStore: a server-side structure that stores transient edges from incoming requests, and later moves them to the owning vertices.
- ByteArrayEdges: an edge list (for the same source vertex) stored as a byte-array. The standard way of iterating is by reusing Edge objects, but an alternative iterator that instantiates new objects is provided. Depending on the vertex implementation, we use one or the other.

ByteArrayEdges is a refactor of the byte-array code in RepresentativeVertex, which now contains an instance of ByteArrayEdges. When setEdges() is called, RepresentativeVertex is smart enough to recognize that the passed Iterable is actually an instance of ByteArrayEdges, and simply takes ownership of it (without iterating). If using something like EdgeListVertex (which keeps references to the passed edges), we will use the alternative iterable (this is of course less memory-efficient).

I've also renamed RepresentativeVertex to ByteArrayVertex because the old name was misleading (it doesn't need to be used with ByteArrayPartition; it's perfectly fine to have multiple Vertex objects, each storing its edges in a byte-array).

Future work: EdgeStore could become an interface, allowing for different implementations (e.g. out-of-core) and handling permanent edge storage in place of Vertex. That way, we would have only one Vertex class, and pluggable storage implementations (which makes it easier to switch without changing user code).


This addresses bug GIRAPH-515.
    https://issues.apache.org/jira/browse/GIRAPH-515


Diffs (updated)
-----

  giraph-core/src/main/java/org/apache/giraph/benchmark/ByteArrayVertexPageRankBenchmark.java PRE-CREATION
  giraph-core/src/main/java/org/apache/giraph/benchmark/MultiGraphByteArrayVertexPageRankBenchmark.java PRE-CREATION
  giraph-core/src/main/java/org/apache/giraph/benchmark/MultiGraphRepresentativeVertexPageRankBenchmark.java 96288323e6028e779113d2520ea9edad497bb0e1
  giraph-core/src/main/java/org/apache/giraph/benchmark/PageRankBenchmark.java 19b08bdb19df21b1dc56dad2cebb499222f9b19e
  giraph-core/src/main/java/org/apache/giraph/benchmark/RepresentativeVertexPageRankBenchmark.java 331ae41a2c0df6b124cbf33944b05f080b49ce94
  giraph-core/src/main/java/org/apache/giraph/comm/SendCache.java PRE-CREATION
  giraph-core/src/main/java/org/apache/giraph/comm/SendEdgeCache.java PRE-CREATION
  giraph-core/src/main/java/org/apache/giraph/comm/SendMessageCache.java 3cbf0eb4775fa3ff0b0351f247df87783bf05995
  giraph-core/src/main/java/org/apache/giraph/comm/ServerData.java 3655d79d8f249338da30ae2bb38b9cfd6b7b1f56
  giraph-core/src/main/java/org/apache/giraph/comm/WorkerClientRequestProcessor.java 0c043e29ae3160bbfc389c435427cf57010a91e1
  giraph-core/src/main/java/org/apache/giraph/comm/WorkerServer.java e60db5529b7fef0b16441ef88df7053d6856ffc5
  giraph-core/src/main/java/org/apache/giraph/comm/messages/ByteArrayMessagesPerVertexStore.java 65caa5d2777b90fa8e14bee7c8d69316d512c651
  giraph-core/src/main/java/org/apache/giraph/comm/netty/NettyWorkerClientRequestProcessor.java d4e919ed1aa1f977a2e487531f57b3a2fc0fad47
  giraph-core/src/main/java/org/apache/giraph/comm/netty/NettyWorkerServer.java 1b7cc5410aa4d7e1b9ae4580dd5ed484e09ff7ed
  giraph-core/src/main/java/org/apache/giraph/comm/requests/RequestType.java aac00289f915f61e61334cdcd92c93c1ef3b5419
  giraph-core/src/main/java/org/apache/giraph/comm/requests/SendWorkerDataRequest.java PRE-CREATION
  giraph-core/src/main/java/org/apache/giraph/comm/requests/SendWorkerEdgesRequest.java PRE-CREATION
  giraph-core/src/main/java/org/apache/giraph/comm/requests/SendWorkerMessagesRequest.java 641c795521006c460138d6b3b6d9ceb3c3e7eccf
  giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java 9e129efebe39c42bab9d59b3246055b79cdbdfa3
  giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 8797c0e80824558bf544650f7c896bddd3f873fb
  giraph-core/src/main/java/org/apache/giraph/conf/ImmutableClassesGiraphConfiguration.java 3e158afdc480656b3937508f5d86ec294bfa3b99
  giraph-core/src/main/java/org/apache/giraph/graph/EdgeStore.java PRE-CREATION
  giraph-core/src/main/java/org/apache/giraph/partition/ByteArrayPartition.java 12989180a4aabed19c3aefa52ef38ad6d7aa6851
  giraph-core/src/main/java/org/apache/giraph/partition/DiskBackedPartitionStore.java 844a229096005059e9cd05b5bf213d2afa1d41dd
  giraph-core/src/main/java/org/apache/giraph/utils/ByteArrayEdges.java PRE-CREATION
  giraph-core/src/main/java/org/apache/giraph/utils/ByteArrayVertexIdData.java PRE-CREATION
  giraph-core/src/main/java/org/apache/giraph/utils/ByteArrayVertexIdEdges.java PRE-CREATION
  giraph-core/src/main/java/org/apache/giraph/utils/ByteArrayVertexIdMessages.java dea4229f10224edb30f59626d5987ea840e8a271
  giraph-core/src/main/java/org/apache/giraph/utils/VertexIdIterator.java PRE-CREATION
  giraph-core/src/main/java/org/apache/giraph/vertex/ByteArrayVertex.java PRE-CREATION
  giraph-core/src/main/java/org/apache/giraph/vertex/ByteArrayVertexBase.java PRE-CREATION
  giraph-core/src/main/java/org/apache/giraph/vertex/EdgeListVertex.java 9ae692fc00432e28f0b87f11ed5981e600c95019
  giraph-core/src/main/java/org/apache/giraph/vertex/MultiGraphByteArrayVertex.java PRE-CREATION
  giraph-core/src/main/java/org/apache/giraph/vertex/MultiGraphRepresentativeVertex.java 4733e2a6011ec8e1cc4eef1d2eb61abe777ec310
  giraph-core/src/main/java/org/apache/giraph/vertex/RepresentativeVertex.java f805007b8bb8f89e9388cf89c2e81f92328b2b1c
  giraph-core/src/main/java/org/apache/giraph/vertex/RepresentativeVertexBase.java 4de6ed85b499e74b04e93c3780324a6b9e9f2b83
  giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java fa3ab49f11d61352a5f6f69699375abd2bf1e527
  giraph-core/src/main/java/org/apache/giraph/worker/EdgeInputSplitsCallable.java bdf9f5705811340748172a70dc952493d5ececfc
  giraph-core/src/test/java/org/apache/giraph/comm/RequestFailureTest.java 2845c90cbfd38f2f35e70e3b79767e1386d54a7e
  giraph-core/src/test/java/org/apache/giraph/comm/RequestTest.java d779fe46377eaa8fa2debf0836f975a30ec6e21f
  giraph-core/src/test/java/org/apache/giraph/utils/MockUtils.java 82dc2839d83f80ebcf52bad252886d50310eacc5
  giraph-core/src/test/java/org/apache/giraph/vertex/TestMultiGraphVertex.java a5a3545de7dc9e30ab0f30926122049fdbe1173b
  giraph-core/src/test/java/org/apache/giraph/vertex/TestMutableVertex.java ca4ba1a336f68b584c4fdbaf74be60dbe41644d5

Diff: https://reviews.apache.org/r/9449/diff/


Testing
-------

mvn verify

Tested on both benchmarks and real-world applications. This typically brings resource requirements down substantially: in an application using a few hundred billion edges, which previously only ran with 300 workers, we're now able to run with 100 workers, with a lot of memory to spare and even faster than before (from around 600s to 400s).


Thanks,

Alessandro Presta
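
P.S. To illustrate the two iteration modes described for ByteArrayEdges above, here is a minimal sketch. It is not the code in the patch: the class ByteArrayEdgesSketch, its MutableEdge holder, and the fixed long-id/double-value layout are all assumptions made for the example (the real classes are generic over Giraph's Writable types).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical, simplified stand-in for a byte-array edge list (not the patch's class). */
final class ByteArrayEdgesSketch {
  /** Hypothetical mutable edge holder; not Giraph's Edge interface. */
  static final class MutableEdge {
    long targetId;
    double value;
  }

  private final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
  private final DataOutputStream out = new DataOutputStream(bytes);
  private int edgeCount;

  /** Appends one edge in serialized form (8 bytes id + 8 bytes value). */
  void add(long targetId, double value) {
    try {
      out.writeLong(targetId);
      out.writeDouble(value);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    edgeCount++;
  }

  /** Reuses ONE edge object for all elements: cheap, but callers must not keep references. */
  Iterable<MutableEdge> reusing() {
    return () -> newIterator(true);
  }

  /** Allocates a new edge object per element: safe to store, but uses more memory. */
  Iterable<MutableEdge> copying() {
    return () -> newIterator(false);
  }

  private Iterator<MutableEdge> newIterator(final boolean reuse) {
    // toByteArray() copies the buffer; acceptable for a sketch, wasteful in real code.
    final DataInputStream in =
        new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
    final int total = edgeCount;
    final MutableEdge shared = new MutableEdge();
    return new Iterator<MutableEdge>() {
      private int consumed = 0;

      @Override
      public boolean hasNext() {
        return consumed < total;
      }

      @Override
      public MutableEdge next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        MutableEdge edge = reuse ? shared : new MutableEdge();
        try {
          edge.targetId = in.readLong();
          edge.value = in.readDouble();
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
        consumed++;
        return edge;
      }
    };
  }

  public static void main(String[] args) {
    ByteArrayEdgesSketch edges = new ByteArrayEdgesSketch();
    edges.add(1L, 0.5);
    edges.add(2L, 1.5);

    List<MutableEdge> kept = new ArrayList<>();
    for (MutableEdge e : edges.copying()) {
      kept.add(e);
    }
    System.out.println(kept.get(0).targetId + " " + kept.get(1).targetId); // prints "1 2"

    List<MutableEdge> aliased = new ArrayList<>();
    for (MutableEdge e : edges.reusing()) {
      aliased.add(e);
    }
    // Both entries are the same object, so both show the last edge read.
    System.out.println(aliased.get(0).targetId + " " + aliased.get(1).targetId); // prints "2 2"
  }
}
```

The second loop shows why a vertex that keeps references to the passed edges (as EdgeListVertex is described to do) needs the copying iterable, while a vertex that keeps only the serialized bytes can use the reusing one.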