-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8611/#review16868
-----------------------------------------------------------

Ship it!


Forgot to say, I'm +1 on this :-)


- Maja Kabiljo


On Feb. 21, 2013, 6:17 p.m., Nitay Joffe wrote:
>
> (Updated Feb. 21, 2013, 6:17 p.m.)
>
>
> Review request for giraph.
>
>
> Description
> -------
>
> One particular thing I added was the concept of "profiles", which makes it
> easy to read from and write to multiple tables. This should remove a lot of
> the cruft around the GiraphHCat* classes.
>
> Note that in the diff I separated the code so that there is a
> Giraph-unrelated, Hive-only portion (under package org.apache.hadoop.hive).
> Things under this package (and its children) do not touch any Giraph code,
> and so can be contributed as an IOFormat back to Hive itself.
>
> Also note the new (I think improved) interface: users no longer need to
> implement an XInputFormat. They just create a class that implements
> the HiveToVertex (HiveToEdge, VertexToHive) interface, plug that in, and use
> HiveVertexInputFormat. This should make user code much cleaner.
>
>
> This addresses bug GIRAPH-453.
>     https://issues.apache.org/jira/browse/GIRAPH-453
>
>
> Diffs
> -----
>
>   giraph-accumulo/pom.xml cb9fbc02e6fc8adcb0ec41e0c6aeff75b1ef3f06
>   giraph-core/src/main/java/org/apache/giraph/comm/netty/NettyClient.java 89ef87fea7a370354156fb7be02ef4249e0a6111
>   giraph-core/src/main/java/org/apache/giraph/conf/GiraphConfiguration.java ddeaeb769b548eb1002ccf8c18ffe048eb096f8d
>   giraph-hbase/pom.xml 7bbbd98c0b3db6878aee4be21eecd821448da7ef
>   giraph-hcatalog/pom.xml 019f02083012704a997ffe715cefe3adeb153dd9
>   giraph-hcatalog/src/main/java/org/apache/giraph/io/hcatalog/HCatGiraphRunner.java PRE-CREATION
>   giraph-hcatalog/src/main/java/org/apache/giraph/io/hcatalog/HiveGiraphRunner.java 313bab04c50ed6be7143254de80e36a4ba291516
>   giraph-hcatalog/src/main/java/org/apache/giraph/io/hcatalog/HiveUtils.java c1f76f1a46d1fc9af489a916256884520c138cb4
>   giraph-hive/pom.xml PRE-CREATION
>   giraph-hive/src/main/assembly/compile.xml PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/HiveGiraphRunner.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/common/HiveProfiles.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/common/package-info.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeInputFormat.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveEdgeReader.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/HiveToEdge.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/edge/package-info.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/package-info.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/HiveToVertex.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/HiveVertexInputFormat.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/HiveVertexReader.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/input/vertex/package-info.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/output/HiveVertexOutputFormat.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/output/HiveVertexWriter.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/output/VertexToHive.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/output/package-info.java PRE-CREATION
>   giraph-hive/src/main/java/org/apache/giraph/hive/package-info.java PRE-CREATION
>   pom.xml c075762cddd7a698c92aaad4017cd74915160e41
>
> Diff: https://reviews.apache.org/r/8611/diff/
>
>
> Testing
> -------
>
> Ran on some production jobs and verified that the results were exactly the same.
>
> Here's a comparison of performance on real workloads ("base" is hcatalog,
> "mine" is hive):
> https://gist.github.com/nitay/880d8fb20d2ac86015d4/raw/6b297fcb287bf8d3dc8175bad217aa86544b4f18/high+school
>
> Basically we see a slight improvement, which is expected because I haven't
> done much in terms of performance yet.
> There are a few performance improvement ideas coming; this is just the first
> working version.
>
>
> Thanks,
>
> Nitay Joffe
>
>
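
For readers skimming the thread, the interface change in the description (implement a small row-to-vertex converter instead of a whole input format) might look roughly like the sketch below. This is not the code from the patch: `Row`, `RowToVertex`, `SimpleVertex`, `readVertices` and the column names `id` and `rank` are all hypothetical stand-ins, and the real HiveToVertex / HiveVertexInputFormat signatures in the diff may well differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HiveToVertexSketch {
  /** Hypothetical stand-in for one table row: column name to value. */
  public interface Row {
    Object get(String column);
  }

  /**
   * Hypothetical, simplified analogue of the HiveToVertex idea:
   * user code only says how a row becomes a vertex.
   */
  public interface RowToVertex<I, V> {
    I vertexId(Row row);
    V vertexValue(Row row);
  }

  /** Minimal vertex holder used only by this sketch. */
  public static final class SimpleVertex<I, V> {
    public final I id;
    public final V value;

    public SimpleVertex(I id, V value) {
      this.id = id;
      this.value = value;
    }
  }

  /**
   * Plays the role of the shared input format: it iterates over rows
   * and delegates the per-row conversion to the user's converter.
   */
  public static <I, V> List<SimpleVertex<I, V>> readVertices(
      Iterable<Row> rows, RowToVertex<I, V> converter) {
    List<SimpleVertex<I, V>> vertices = new ArrayList<>();
    for (Row row : rows) {
      vertices.add(new SimpleVertex<>(
          converter.vertexId(row), converter.vertexValue(row)));
    }
    return vertices;
  }

  /** The only class a user would write in this sketch. */
  public static final class PageRankRowToVertex
      implements RowToVertex<Long, Double> {
    @Override
    public Long vertexId(Row row) {
      return (Long) row.get("id");
    }

    @Override
    public Double vertexValue(Row row) {
      return (Double) row.get("rank");
    }
  }

  /** Builds an in-memory row with the two hypothetical columns. */
  public static Row rowOf(long id, double rank) {
    Map<String, Object> columns = new HashMap<>();
    columns.put("id", id);
    columns.put("rank", rank);
    return columns::get;
  }

  public static void main(String[] args) {
    List<Row> rows = Arrays.asList(rowOf(1L, 0.5), rowOf(2L, 0.25));
    for (SimpleVertex<Long, Double> v
        : readVertices(rows, new PageRankRowToVertex())) {
      System.out.println(v.id + " -> " + v.value);
    }
  }
}
```

The point of the split is that the iteration and I/O plumbing lives in one shared place, while per-job code shrinks to the two conversion methods.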