-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19405/#review39247
-----------------------------------------------------------


giraph-core/src/main/java/org/apache/giraph/partition/primitives/LongByteArrayPartition.java
<https://reviews.apache.org/r/19405/#comment71612>

    I am just curious: ByteArrayPartition was marked as @NotThreadSafe. Do we
    really need to serialize the collection? Why don't we serialize it at the
    Partition level?


giraph-core/src/main/java/org/apache/giraph/partition/primitives/LongByteArrayPartition.java
<https://reviews.apache.org/r/19405/#comment71611>

    Small note:
    When the partition is configured to use UnsafeByteArrayOutputStream, its
    ensureSize method grows the buffer to twice the sum of the current length
    and the requested size. That might make sense for graphs that are heavily
    mutated during computation, but for a lot of applications the buffer size
    never changes after the first iteration starts, or changes only slightly.

    LOG:
    Current buffer size is 153 bytes,
    Current buffer position is 150 (bytes),
    I need 10 more bytes
    Allocating new buffer with size 326 bytes

      private void ensureSize(int size) {
        if (pos + size > buf.length) {
          byte[] newBuf = new byte[(buf.length + size) << 1];
          System.arraycopy(buf, 0, newBuf, 0, pos);
          buf = newBuf;
        }
      }


giraph-core/src/main/java/org/apache/giraph/partition/primitives/LongByteArrayPartition.java
<https://reviews.apache.org/r/19405/#comment71610>

    Small note:
    Some algorithms may benefit from iterating over vertices ordered by key.
    We can't use a sorted iterator by default, since partitions could be big,
    but there could be a configuration option to turn it on.


- Lukas Nalezenec


On March 21, 2014, 4:22 p.m., Craig Muchinsky wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19405/
> -----------------------------------------------------------
>
> (Updated March 21, 2014, 4:22 p.m.)
>
>
> Review request for giraph.
>
>
> Repository: giraph-git
>
>
> Description
> -------
>
> This patch adds 2 new byte array partition variations that are optimized for
> int/long ids.
> They leverage fastutil primitive maps and allow for vertex
> object reuse during iteration because they don't keep a reference to the
> vertexId object in the primitive map.
>
> Additional unit tests were added to TestPartitionStores which cover the new
> IntByteArrayPartition class, which is functionally identical to
> LongByteArrayPartition.
>
>
> Diffs
> -----
>
>   giraph-core/src/main/java/org/apache/giraph/partition/primitives/IntByteArrayPartition.java PRE-CREATION
>   giraph-core/src/main/java/org/apache/giraph/partition/primitives/LongByteArrayPartition.java PRE-CREATION
>   giraph-core/src/main/java/org/apache/giraph/partition/primitives/package-info.java PRE-CREATION
>   giraph-core/src/test/java/org/apache/giraph/partition/TestPartitionStores.java 08f4544
>
> Diff: https://reviews.apache.org/r/19405/diff/
>
>
> Testing
> -------
>
> Successful "mvn clean verify" with hadoop_2 profile, and 4B vertex 5B edge
> graph tested on 18 node 432 core cluster.
>
>
> Thanks,
>
> Craig Muchinsky
>
>
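
A note on the buffer-growth comment in the review: the sketch below contrasts the quoted policy, (buf.length + size) << 1, with an exact-fit alternative, using the numbers from the log excerpt. It is only an illustration under stated assumptions. GrowableBuffer, its exactFit flag, and the exact-fit policy itself are hypothetical and are not part of Giraph or of UnsafeByteArrayOutputStream. Integer overflow for very large buffers is not handled.

```java
import java.util.Arrays;

/**
 * Hypothetical stand-in for a growable byte buffer. It is NOT Giraph's
 * UnsafeByteArrayOutputStream; it only mirrors the ensureSize logic quoted
 * in the review so the two growth policies can be compared.
 */
final class GrowableBuffer {
  private byte[] buf;
  private int pos;
  /** Assumed switch: true selects the exact-fit policy, false the quoted one. */
  private final boolean exactFit;

  GrowableBuffer(int initialCapacity, boolean exactFit) {
    this.buf = new byte[initialCapacity];
    this.exactFit = exactFit;
  }

  /** Policy quoted in the review: twice (current length + requested size). */
  static int doublingCapacity(int length, int size) {
    return (length + size) << 1;
  }

  /** Assumed alternative: grow to exactly the number of bytes needed. */
  static int exactCapacity(int pos, int size) {
    return pos + size;
  }

  /** Grows the backing array only when the next write would not fit. */
  private void ensureSize(int size) {
    if (pos + size > buf.length) {
      int newCapacity = exactFit
          ? exactCapacity(pos, size)
          : doublingCapacity(buf.length, size);
      // copyOf keeps the existing bytes and zero-fills the new tail.
      buf = Arrays.copyOf(buf, newCapacity);
    }
  }

  void write(byte[] data) {
    ensureSize(data.length);
    System.arraycopy(data, 0, buf, pos, data.length);
    pos += data.length;
  }

  int capacity() {
    return buf.length;
  }

  int position() {
    return pos;
  }

  public static void main(String[] args) {
    // Same situation as the log excerpt: 153-byte buffer, position 150,
    // 10 more bytes requested.
    GrowableBuffer doubling = new GrowableBuffer(153, false);
    doubling.write(new byte[150]);
    doubling.write(new byte[10]);

    GrowableBuffer exact = new GrowableBuffer(153, true);
    exact.write(new byte[150]);
    exact.write(new byte[10]);

    System.out.println("doubling capacity: " + doubling.capacity());
    System.out.println("exact-fit capacity: " + exact.capacity());
  }
}
```

The trade-off is the usual one: exact-fit wastes no memory when a vertex grows once and then stays stable, but it reallocates on every subsequent growth, so it would suit the mostly-static graphs mentioned in the review and hurt heavily mutated ones.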