-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19405/#review39247
-----------------------------------------------------------



giraph-core/src/main/java/org/apache/giraph/partition/primitives/LongByteArrayPartition.java
<https://reviews.apache.org/r/19405/#comment71612>

    I am just curious:
    ByteArrayPartition was marked as @NotThreadSafe - do we really need to
    serialize the collection? Why don't we serialize it at the Partition level?
    
    
    



giraph-core/src/main/java/org/apache/giraph/partition/primitives/LongByteArrayPartition.java
<https://reviews.apache.org/r/19405/#comment71611>

    Small note:
    
    When the partition is configured to use UnsafeByteArrayOutputStream, the
    ensureSize method allocates a new buffer twice as large as the current
    buffer length plus the requested size. That might make sense for graphs
    that are heavily mutated during computation, but for a lot of applications
    the buffer size never changes after the first iteration starts, or it
    changes only slightly.
    
    LOG:
    Current buffer size is 153 bytes, current buffer position is 150 (bytes),
    I need 10 more bytes
    Allocating new buffer with size 326 bytes
    
      private void ensureSize(int size) {
        if (pos + size > buf.length) {
          byte[] newBuf = new byte[(buf.length + size) << 1];
          System.arraycopy(buf, 0, newBuf, 0, pos);
          buf = newBuf;
        }
      }
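
    To make the arithmetic concrete, here is a minimal sketch (not the Giraph
    code) that reproduces the size chosen by the quoted ensureSize and compares
    it with one possible alternative. The class BufferGrowthSketch, the method
    slackCapacity and its slackPercent parameter are hypothetical names used
    only for illustration.

```java
final class BufferGrowthSketch {
  /** Capacity chosen by the quoted ensureSize: (buf.length + size) << 1. */
  static int doubledCapacity(int bufLength, int size) {
    return (bufLength + size) << 1;
  }

  /**
   * Hypothetical alternative: allocate what is required (pos + size) plus a
   * small percentage of slack instead of doubling.
   */
  static int slackCapacity(int pos, int size, int slackPercent) {
    long required = (long) pos + size;
    long withSlack = required + required * slackPercent / 100;
    // Clamp so the result always fits in an int array length.
    return (int) Math.min(Integer.MAX_VALUE, withSlack);
  }

  public static void main(String[] args) {
    // Values from the log above: buffer of 153 bytes, position 150, 10 more bytes needed.
    System.out.println("doubling: " + doubledCapacity(153, 10));
    System.out.println("10% slack: " + slackCapacity(150, 10, 10));
  }
}
```

    With the numbers from the log, doubling gives 326 bytes while 10% slack
    gives 176 bytes. Less slack means more reallocations for graphs that
    really do mutate a lot, so this would have to be a trade-off, probably
    a configurable one.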
    
    



giraph-core/src/main/java/org/apache/giraph/partition/primitives/LongByteArrayPartition.java
<https://reviews.apache.org/r/19405/#comment71610>

    Small note:
    Some algorithms may benefit from iterating over vertices ordered by key.
    We can't use a sorted iterator by default, since partitions could be big,
    but there could be a configuration option to turn it on.
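
    A rough sketch of what I mean, using plain java.util collections rather
    than the fastutil maps from the patch. The class SortedIterationSketch,
    the sortedKeys helper and the sortedIteration flag are hypothetical; no
    such option exists in the patch as far as I know.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class SortedIterationSketch {
  /**
   * Copies the vertex ids into an array and sorts it. This costs O(n) extra
   * memory and O(n log n) time per call, which is why it should be opt-in.
   */
  static long[] sortedKeys(Map<Long, byte[]> vertices) {
    long[] keys = new long[vertices.size()];
    int i = 0;
    for (Long key : vertices.keySet()) {
      keys[i++] = key;
    }
    Arrays.sort(keys);
    return keys;
  }

  public static void main(String[] args) {
    // Stand-in for the partition's id -> serialized vertex map.
    Map<Long, byte[]> vertices = new HashMap<>();
    vertices.put(42L, new byte[0]);
    vertices.put(7L, new byte[0]);
    vertices.put(19L, new byte[0]);

    boolean sortedIteration = true; // hypothetical configuration flag
    if (sortedIteration) {
      for (long id : sortedKeys(vertices)) {
        System.out.println(id);
      }
    } else {
      for (Long id : vertices.keySet()) {
        System.out.println(id);
      }
    }
  }
}
```

    The extra key array is the cost I was referring to: for a big partition it
    is one more long per vertex, so the default should stay unsorted.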
    


- Lukas Nalezenec


On March 21, 2014, 4:22 p.m., Craig Muchinsky wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19405/
> -----------------------------------------------------------
> 
> (Updated March 21, 2014, 4:22 p.m.)
> 
> 
> Review request for giraph.
> 
> 
> Repository: giraph-git
> 
> 
> Description
> -------
> 
> This patch adds 2 new byte array partition variations that are optimized for 
> int/long ids. They leverage fastutil primitive maps and allow for vertex 
> object reuse during iteration because they don't keep a reference to the 
> vertexId object in the primitive map.
> 
> Additional unit tests were added to TestPartitionStores which cover the new 
> IntByteArrayPartition class, which is functionally identical to 
> LongByteArrayPartition.
> 
> 
> Diffs
> -----
> 
>   
> giraph-core/src/main/java/org/apache/giraph/partition/primitives/IntByteArrayPartition.java
>  PRE-CREATION 
>   
> giraph-core/src/main/java/org/apache/giraph/partition/primitives/LongByteArrayPartition.java
>  PRE-CREATION 
>   
> giraph-core/src/main/java/org/apache/giraph/partition/primitives/package-info.java
>  PRE-CREATION 
>   
> giraph-core/src/test/java/org/apache/giraph/partition/TestPartitionStores.java
>  08f4544 
> 
> Diff: https://reviews.apache.org/r/19405/diff/
> 
> 
> Testing
> -------
> 
> Successful "mvn clean verify" with hadoop_2 profile, and 4B vertex 5B edge 
> graph tested on 18 node 432 core cluster.
> 
> 
> Thanks,
> 
> Craig Muchinsky
> 
>
