-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35491/
-----------------------------------------------------------
Review request for pig.
Bugs: PIG-4574
https://issues.apache.org/jira/browse/PIG-4574
Repository: pig
Description
-------
Read the order by/skewed join data from HDFS in the Partitioner vertex,
instead of getting it from the sampler vertex.
This JIRA does not optimize the following case:
A = LOAD 'x' ...;
B = LOAD 'y' ...;
C = UNION A, B;
D = ORDER C BY ..;
Optimizing this case depends on UnionOptimizer being turned on and will need
more changes, so it is left for another JIRA.
Diffs
-----
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezCompiler.java
1685498
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/operator/POIdentityInOutTez.java
1685498
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/operator/POLocalRearrangeTez.java
1685498
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Limit-2.gld
1685498
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Order-1.gld
1685498
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Order-2.gld
PRE-CREATION
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-SkewJoin-1.gld
1685498
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-SkewJoin-2.gld
PRE-CREATION
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-16-OPTOFF.gld
1685498
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-16.gld
1685498
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezAutoParallelism.java
1685498
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezCompiler.java
1685498
Diff: https://reviews.apache.org/r/35491/diff/
Testing
-------
Ran a subset of the e2e tests:
SkewedJoin, Union, Order, MultiQuery_Self, MultiQuery_Union
Ran L9.pig. Before the patch:
File System Counters
FILE_BYTES_READ=2028282366911
FILE_BYTES_WRITTEN=4049785379197
HDFS_BYTES_READ=1011533488395
HDFS_BYTES_WRITTEN=1010554380555
After the patch:
File System Counters
FILE_BYTES_READ=1007449863330
FILE_BYTES_WRITTEN=2016036957653
HDFS_BYTES_READ=2023066976790
HDFS_BYTES_WRITTEN=1010554380555
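To make the trade-off in these counters easier to read, here is a small sketch
that computes the per-counter change between the two L9.pig runs. The counter
values are copied from the output above; the helper name counter_deltas and
the output formatting are illustrative only and are not part of the patch.

```python
# Counter values copied from the two L9.pig runs reported above.
before = {
    "FILE_BYTES_READ": 2028282366911,
    "FILE_BYTES_WRITTEN": 4049785379197,
    "HDFS_BYTES_READ": 1011533488395,
    "HDFS_BYTES_WRITTEN": 1010554380555,
}
after = {
    "FILE_BYTES_READ": 1007449863330,
    "FILE_BYTES_WRITTEN": 2016036957653,
    "HDFS_BYTES_READ": 2023066976790,
    "HDFS_BYTES_WRITTEN": 1010554380555,
}


def counter_deltas(before, after):
    """Hypothetical helper: map counter name -> (after - before, after / before)."""
    return {
        name: (after[name] - before[name], after[name] / before[name])
        for name in before
    }


deltas = counter_deltas(before, after)
for name, (delta, ratio) in deltas.items():
    print(f"{name:<20} {delta:>+20,d} bytes  ratio {ratio:.2f}")

# Net change in bytes moved across all four counters (negative means less I/O).
net_change = sum(delta for delta, _ in deltas.values())
print(f"{'NET':<20} {net_change:>+20,d} bytes")
```

With these numbers, local file reads and writes each drop by roughly half,
HDFS reads exactly double, and HDFS writes are unchanged. The doubling suggests
the input is now read from HDFS twice, presumably once by the sampler and once
by the Partitioner vertex.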
Thanks,
Rohini Palaniswamy
