> On April 2, 2014, 9:11 a.m., Lukas Nalezenec wrote:
> > giraph-core/src/main/java/org/apache/giraph/partition/primitives/LongByteArrayPartition.java,
> >  line 232
> > <https://reviews.apache.org/r/19405/diff/2/?file=531353#file531353line232>
> >
> >     Small note:
> >     
> >     When the partition is configured to use UnsafeByteArrayOutputStream, 
> >     ensureSize allocates twice the required size whenever it grows the 
> >     buffer. That might make sense for graphs that are heavily mutated 
> >     during computation, but for a lot of applications the buffer size 
> >     never changes after the first iteration starts, or changes only 
> >     slightly.
> >     
> >     LOG:
> >     Current buffer size is 153 bytes, current buffer position is 150 
> >     (bytes), I need 10 more bytes
> >     Allocating new buffer with size 326 bytes
> >     
> >       private void ensureSize(int size) {
> >         if (pos + size > buf.length) {
> >           byte[] newBuf = new byte[(buf.length + size) << 1];
> >           System.arraycopy(buf, 0, newBuf, 0, pos);
> >           buf = newBuf;
> >         }
> >       }
> >     
> >

Sure, I think we should make the growth strategy configurable, with doubling 
as the default. 
Would you like to write the code for that? Please also take a look at 
GIRAPH-892 before doing so, since it defines a few more DataOutputs.
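
To make the suggestion concrete, here is a minimal sketch of a pluggable 
growth policy. All names in it (GrowthPolicySketch, GrowthPolicy, DOUBLING, 
EXACT_FIT, Buffer) are hypothetical and exist only for illustration. It is 
not the UnsafeByteArrayOutputStream API and it does not reflect what 
GIRAPH-892 defines. DOUBLING mirrors the quoted ensureSize, and EXACT_FIT is 
one possible alternative for buffers that rarely grow.

```java
import java.util.Arrays;

/** Hypothetical sketch of a configurable buffer growth policy (not Giraph code). */
public class GrowthPolicySketch {
  /** Decides the new capacity when pos + size no longer fits. */
  interface GrowthPolicy {
    int newCapacity(int capacity, int pos, int size);
  }

  /** Same arithmetic as the quoted ensureSize: (buf.length + size) << 1. */
  static final GrowthPolicy DOUBLING =
      (capacity, pos, size) -> (capacity + size) << 1;

  /** Assumed alternative: grow to exactly the size that is needed. */
  static final GrowthPolicy EXACT_FIT =
      (capacity, pos, size) -> pos + size;

  /** Tiny append-only byte buffer that delegates growth to a policy. */
  static final class Buffer {
    private final GrowthPolicy policy;
    private byte[] buf;
    private int pos;

    Buffer(int initialCapacity, GrowthPolicy policy) {
      this.buf = new byte[initialCapacity];
      this.policy = policy;
    }

    void write(byte[] data) {
      ensureSize(data.length);
      System.arraycopy(data, 0, buf, pos, data.length);
      pos += data.length;
    }

    private void ensureSize(int size) {
      if (pos + size > buf.length) {
        // Never accept less than what is required, whatever the policy says.
        // Integer overflow is not handled in this sketch.
        int required = pos + size;
        int newCapacity =
            Math.max(policy.newCapacity(buf.length, pos, size), required);
        buf = Arrays.copyOf(buf, newCapacity);
      }
    }

    int capacity() {
      return buf.length;
    }

    int position() {
      return pos;
    }
  }

  public static void main(String[] args) {
    // Reproduce the numbers from the log above: 153-byte buffer,
    // position 150, 10 more bytes requested.
    for (GrowthPolicy policy : new GrowthPolicy[] {DOUBLING, EXACT_FIT}) {
      Buffer b = new Buffer(153, policy);
      b.write(new byte[150]);
      b.write(new byte[10]);
      System.out.println(b.capacity()); // 326 for DOUBLING, 160 for EXACT_FIT
    }
  }
}
```

With the numbers from the log, the doubling policy ends up at 326 bytes while 
the exact-fit policy stops at 160. The trade-off is that exact fit 
reallocates on every overflow, which is why doubling looks like the safer 
default for graphs that mutate a lot.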


> On April 2, 2014, 9:11 a.m., Lukas Nalezenec wrote:
> > giraph-core/src/main/java/org/apache/giraph/partition/primitives/LongByteArrayPartition.java,
> >  line 293
> > <https://reviews.apache.org/r/19405/diff/2/?file=531353#file531353line293>
> >
> >     Small note:
> >     Some algorithms may benefit from iterating over vertices ordered by 
> >     key. We can't use a sorted iterator by default, since partitions could 
> >     be big, but there could be a configuration option to turn it on.
> >

Do you have an example of such an algorithm? 
Since anything done on or by a vertex in a superstep is not visible until 
the next superstep, I cannot see how this would be helpful.


- Pavan Kumar


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19405/#review39247
-----------------------------------------------------------


On March 21, 2014, 4:22 p.m., Craig Muchinsky wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19405/
> -----------------------------------------------------------
> 
> (Updated March 21, 2014, 4:22 p.m.)
> 
> 
> Review request for giraph.
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> This patch adds 2 new byte array partition variations that are optimized for 
> int/long ids. They leverage fastutil primitive maps and allow for vertex 
> object reuse during iteration because they don't keep a reference to the 
> vertexId object in the primitive map.
> 
> Additional unit tests were added to TestPartitionStores which cover the new 
> IntByteArrayPartition class, which is functionally identical to 
> LongByteArrayPartition.
> 
> 
> Diffs
> -----
> 
>   giraph-core/src/main/java/org/apache/giraph/partition/primitives/IntByteArrayPartition.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/partition/primitives/LongByteArrayPartition.java PRE-CREATION 
>   giraph-core/src/main/java/org/apache/giraph/partition/primitives/package-info.java PRE-CREATION 
>   giraph-core/src/test/java/org/apache/giraph/partition/TestPartitionStores.java 08f4544 
> 
> Diff: https://reviews.apache.org/r/19405/diff/
> 
> 
> Testing
> -------
> 
> Successful "mvn clean verify" with hadoop_2 profile, and 4B vertex 5B edge 
> graph tested on 18 node 432 core cluster.
> 
> 
> Thanks,
> 
> Craig Muchinsky
> 
>
