> On April 2, 2014, 9:11 a.m., Lukas Nalezenec wrote:
> > giraph-core/src/main/java/org/apache/giraph/partition/primitives/LongByteArrayPartition.java, line 232
> > <https://reviews.apache.org/r/19405/diff/2/?file=531353#file531353line232>
> >
> > Small note:
> >
> > When the partition is configured to use UnsafeByteArrayOutputStream, the
> > ensureSize method roughly doubles the buffer on every reallocation. That
> > might make sense for graphs that are heavily mutated during computation,
> > but for a lot of applications the buffer size never changes after the
> > first iteration starts, or changes only slightly.
> >
> > LOG:
> > Current buffer size is 153 bytes, current buffer position is 150
> > (bytes), I need 10 more bytes
> > Allocating new buffer with size 326 bytes
> >
> >     private void ensureSize(int size) {
> >       if (pos + size > buf.length) {
> >         byte[] newBuf = new byte[(buf.length + size) << 1];
> >         System.arraycopy(buf, 0, newBuf, 0, pos);
> >         buf = newBuf;
> >       }
> >     }
> >
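To make the arithmetic in the quoted log concrete, here is a minimal sketch of a buffer whose growth rule is pluggable. The names GrowableBuffer, GrowthPolicy, DOUBLING and TIGHT are hypothetical and exist only for this illustration; none of this is Giraph's actual API, and a real change would have to go into the existing output stream classes. DOUBLING mirrors the `(buf.length + size) << 1` expression from the quoted ensureSize, while TIGHT is an assumed alternative that grows only by the requested amount.

```java
import java.util.Arrays;

/** Illustration only: a byte buffer with a pluggable growth rule (hypothetical class). */
class GrowableBuffer {
  /** Hypothetical strategy: picks the new capacity when the buffer is too small. */
  interface GrowthPolicy {
    int newCapacity(int currentCapacity, int extraBytes);
  }

  /** Same rule as the quoted ensureSize: (buf.length + size) << 1. */
  static final GrowthPolicy DOUBLING = (cap, extra) -> (cap + extra) << 1;

  /** Assumed alternative: grow by the requested amount only (less memory, more copies). */
  static final GrowthPolicy TIGHT = (cap, extra) -> cap + extra;

  private byte[] buf;
  private int pos;
  private final GrowthPolicy policy;

  GrowableBuffer(int initialCapacity, GrowthPolicy policy) {
    this.buf = new byte[initialCapacity];
    this.policy = policy;
  }

  /** Reallocates only when the pending write does not fit. Integer overflow is not handled. */
  private void ensureSize(int size) {
    if (pos + size > buf.length) {
      buf = Arrays.copyOf(buf, policy.newCapacity(buf.length, size));
    }
  }

  /** Appends data at the current position, growing the buffer if needed. */
  void write(byte[] data) {
    ensureSize(data.length);
    System.arraycopy(data, 0, buf, pos, data.length);
    pos += data.length;
  }

  int capacity() {
    return buf.length;
  }

  int position() {
    return pos;
  }
}
```

With a 153-byte buffer at position 150 and a 10-byte write, DOUBLING ends up at 326 bytes, matching the log above, while TIGHT ends up at 163 bytes. The trade-off is that TIGHT copies the buffer on every write that does not fit, so it only pays off when the buffer rarely grows after it is first filled.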
Sure, I think we should make it configurable, with the default implemented as doubling whenever the buffer fills up. Would you like to write the code for that? Also, please take a look at GIRAPH-892 before doing so, since it defines a few more DataOutputs.


> On April 2, 2014, 9:11 a.m., Lukas Nalezenec wrote:
> > giraph-core/src/main/java/org/apache/giraph/partition/primitives/LongByteArrayPartition.java, line 293
> > <https://reviews.apache.org/r/19405/diff/2/?file=531353#file531353line293>
> >
> > Small note:
> > Some algorithms may benefit from iterating over vertices ordered by key. We
> > can't use a sorted iterator by default since partitions could be big, but
> > there could be a configuration option to turn it on.

Any examples of such algorithms? Since anything done on or by a vertex in a superstep is not visible until the next superstep, I cannot see how this would be helpful.


- Pavan Kumar


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19405/#review39247
-----------------------------------------------------------


On March 21, 2014, 4:22 p.m., Craig Muchinsky wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19405/
> -----------------------------------------------------------
>
> (Updated March 21, 2014, 4:22 p.m.)
>
>
> Review request for giraph.
>
>
> Repository: giraph-git
>
>
> Description
> -------
>
> This patch adds 2 new byte array partition variations that are optimized for
> int/long ids. They leverage fastutil primitive maps and allow for vertex
> object reuse during iteration because they don't keep a reference to the
> vertexId object in the primitive map.
>
> Additional unit tests were added to TestPartitionStores which cover the new
> IntByteArrayPartition class, which is functionally identical to
> LongByteArrayPartition.
>
>
> Diffs
> -----
>
>   giraph-core/src/main/java/org/apache/giraph/partition/primitives/IntByteArrayPartition.java PRE-CREATION
>   giraph-core/src/main/java/org/apache/giraph/partition/primitives/LongByteArrayPartition.java PRE-CREATION
>   giraph-core/src/main/java/org/apache/giraph/partition/primitives/package-info.java PRE-CREATION
>   giraph-core/src/test/java/org/apache/giraph/partition/TestPartitionStores.java 08f4544
>
> Diff: https://reviews.apache.org/r/19405/diff/
>
>
> Testing
> -------
>
> Successful "mvn clean verify" with hadoop_2 profile, and 4B vertex 5B edge
> graph tested on 18 node 432 core cluster.
>
>
> Thanks,
>
> Craig Muchinsky
>
>
