-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8611/#review16868
-----------------------------------------------------------

Ship it!


Forgot to say, I'm +1 on this :-)

- Maja Kabiljo


On Feb. 21, 2013, 6:17 p.m., Nitay Joffe wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8611/
> -----------------------------------------------------------
> 
> (Updated Feb. 21, 2013, 6:17 p.m.)
> 
> 
> Review request for giraph.
> 
> 
> Description
> -------
> 
> One particular thing I added is the concept of "profiles", which make it 
> easy to read from and write to multiple tables. This should remove a lot of 
> the cruft around the GiraphHCat* classes.
> 
> Note that in the diff I separated the code so that there is a 
> Giraph-independent, Hive-only portion (under package org.apache.hadoop.hive). 
> Things under this package (and its children) do not touch any Giraph code, 
> and so can be contributed back to Hive itself as an IOFormat.
> 
> Also note the new (I think improved) interface: users no longer need to 
> implement an XInputFormat. They just create a class that implements 
> the HiveToVertex (HiveToEdge, VertexToHive) interface, plug that in, and use 
> HiveVertexInputFormat. This should make user code much cleaner.
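
To illustrate the shape of the interface described above, here is a minimal, self-contained sketch. It does not use the real giraph-hive classes: `Row`, `RowToVertex`, `PageRankRowToVertex`, `readVertices`, and the column names `id` and `rank` are all hypothetical stand-ins, and the actual HiveToVertex signatures in the diff may differ. The point is only that user code shrinks to a row-to-vertex mapping while a shared reader does the iteration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HiveToVertexSketch {
  /** Hypothetical stand-in for one Hive row: column name to value. */
  public interface Row {
    Object get(String column);
  }

  /** Hypothetical stand-in for the HiveToVertex idea: map one row to a vertex. */
  public interface RowToVertex<I, V> {
    I vertexId(Row row);
    V vertexValue(Row row);
  }

  /** User code: only the mapping from columns to vertex fields. */
  public static class PageRankRowToVertex implements RowToVertex<Long, Double> {
    @Override
    public Long vertexId(Row row) {
      return (Long) row.get("id");       // assumed column name
    }

    @Override
    public Double vertexValue(Row row) {
      return (Double) row.get("rank");   // assumed column name
    }
  }

  /** Shared reader playing the role of a generic input format. */
  public static <I, V> Map<I, V> readVertices(List<Row> rows,
                                              RowToVertex<I, V> converter) {
    Map<I, V> vertices = new HashMap<>();
    for (Row row : rows) {
      vertices.put(converter.vertexId(row), converter.vertexValue(row));
    }
    return vertices;
  }

  public static void main(String[] args) {
    Map<String, Object> data = new HashMap<>();
    data.put("id", 1L);
    data.put("rank", 0.25);
    Row row = data::get;  // back the hypothetical Row with a plain map

    Map<Long, Double> vertices =
        readVertices(Arrays.asList(row), new PageRankRowToVertex());
    System.out.println(vertices);
  }
}
```

In this sketch the converter is the only class a user would write; swapping in a different table layout means writing a different converter, not a new input format.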
> 
> 
> This addresses bug GIRAPH-453.
>     https://issues.apache.org/jira/browse/GIRAPH-453
> 
> 
> Diffs
> -----
> 
>   giraph-accumulo/pom.xml cb9fbc02e6fc8adcb0ec41e0c6aeff75b1ef3f06
>   giraph-core/src/main/java/org/apache/giraph/comm/netty/NettyClient.java 89ef87fea7a370354156fb7be02ef4249e0a6111
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java ddeaeb769b548eb1002ccf8c18ffe048eb096f8d
>   giraph-hbase/pom.xml 7bbbd98c0b3db6878aee4be21eecd821448da7ef
>   giraph-hcatalog/pom.xml 019f02083012704a997ffe715cefe3adeb153dd9
>   giraph-hcatalog/src/main/java/org/apache/giraph/io/hcatalog/HCatGiraphRunner.java PRE-CREATION
>   giraph-hcatalog/src/main/java/org/apache/giraph/io/hcatalog/HiveGiraphRunner.java 313bab04c50ed6be7143254de80e36a4ba291516
>   giraph-hcatalog/src/main/java/org/apache/giraph/io/hcatalog/HiveUtils.java c1f76f1a46d1fc9af489a916256884520c138cb4
>   giraph-hive/pom.xml PRE-CREATION
>   giraph-hive/src/main/assembly/compile.xml PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveProfiles.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/common/package-info.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeInputFormat.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeReader.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveToEdge.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/package-info.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/package-info.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/HiveToVertex.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/HiveVertexInputFormat.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/HiveVertexReader.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/package-info.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/output/HiveVertexOutputFormat.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/output/HiveVertexWriter.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/output/VertexToHive.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/output/package-info.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/package-info.java PRE-CREATION
>   pom.xml c075762cddd7a698c92aaad4017cd74915160e41
> 
> Diff: https://reviews.apache.org/r/8611/diff/
> 
> 
> Testing
> -------
> 
> Ran on some production jobs and verified results were exactly the same.
> 
> Here's a performance comparison on real workloads ("base" is hcatalog, 
> "mine" is hive):
> https://gist.github.com/nitay/880d8fb20d2ac86015d4/raw/6b297fcb287bf8d3dc8175bad217aa86544b4f18/high+school
> 
> Basically we see a slight improvement, which is expected because I haven't 
> done much performance work yet.
> A few performance improvement ideas are coming; this is just the first 
> working version.
> 
> 
> Thanks,
> 
> Nitay Joffe
> 
>
