-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15949/
-----------------------------------------------------------
Review request for pig, Alex Bain, Cheolsoo Park, Daniel Dai, and Mark Wagner.
Bugs: PIG-3564 and PIG-3565
https://issues.apache.org/jira/browse/PIG-3564
https://issues.apache.org/jira/browse/PIG-3565
Repository: pig
Description
-------
- POStore and POLocalRearrange are replaced by POStoreTez and
POLocalRearrangeTez, which hold the name of their LogicalOutput. Output is
written directly through them, and the output-related code has been removed
from PigProcessor. With a combiner, PigCombiner writes through the reduce
Context, which is routed to the LogicalOutput (MRCombiner in Tez handles this).
- This patch also contains the security-related fixes for PIG-3564. I did not
separate them out because I was doing most of the e2e testing with them. I will
use PIG-3564 to check in any incremental changes required after TEZ-606 is
fixed.
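For reviewers who want the shape of the first change without opening the diff, here is a minimal, self-contained sketch of the idea: an operator stores only the name of its output and resolves it against the outputs it is handed. This is not the code in the patch and does not use the Tez API. KeyValueSink, ListSink, NamedOutputOperator, NamedOutputDemo and the output names "vertexA" and "vertexB" are all hypothetical names made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a writer obtained from a logical output (not the Tez API).
interface KeyValueSink {
    void write(Object key, Object value);
}

// In-memory sink so the example can be run without a cluster.
class ListSink implements KeyValueSink {
    final List<String> records = new ArrayList<>();

    @Override
    public void write(Object key, Object value) {
        records.add(key + "=" + value);
    }
}

// Hypothetical operator that knows only the name of its output.
class NamedOutputOperator {
    private final String outputKey;
    private KeyValueSink sink;

    NamedOutputOperator(String outputKey) {
        this.outputKey = outputKey;
    }

    // Resolve the stored name once, when the processor hands over its outputs.
    void attachOutputs(Map<String, KeyValueSink> outputs) {
        sink = outputs.get(outputKey);
        if (sink == null) {
            throw new IllegalStateException("No output named " + outputKey);
        }
    }

    // Write directly to the resolved output; the processor is not involved.
    void emit(Object key, Object value) {
        sink.write(key, value);
    }
}

class NamedOutputDemo {
    // Returns {records seen by vertexA, records seen by vertexB}.
    static List<List<String>> run() {
        ListSink a = new ListSink();
        ListSink b = new ListSink();
        Map<String, KeyValueSink> outputs = new HashMap<>();
        outputs.put("vertexA", a);
        outputs.put("vertexB", b);

        NamedOutputOperator op = new NamedOutputOperator("vertexB");
        op.attachOutputs(outputs);
        op.emit("k", 1);

        List<List<String>> result = new ArrayList<>();
        result.add(a.records);
        result.add(b.records);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this sketch the record reaches only the sink registered under the operator's stored name. The surrounding processor needs no output-specific code.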
Still need to handle a few cases:
- custom partitioner
- secondary sort key
- memory management (in Pig or Tez?): I was hitting OOM with multiple logical
outputs, as the sort on the split vertex was taking three times the memory
for 3 logical outputs (OOM in Tez DefaultSorter.java at kvbuffer = new
byte[maxMemUsage];)
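To make the memory point concrete, here is a rough arithmetic sketch. It assumes each logical output allocates its own sort buffer of the same size. The 100 MB buffer and 256 MB heap are illustrative values, not measurements from the test runs, and SortMemoryEstimate is a hypothetical helper, not Tez or Pig code.

```java
// Illustrative arithmetic only; the class and the numbers are hypothetical.
class SortMemoryEstimate {
    // Total bytes needed if every logical output allocates its own sort buffer.
    static long totalSortBytes(long perOutputBufferBytes, int logicalOutputs) {
        return Math.multiplyExact(perOutputBufferBytes, (long) logicalOutputs);
    }

    // True when the combined buffers fit within the given heap budget.
    static boolean fits(long perOutputBufferBytes, int logicalOutputs, long heapBytes) {
        return totalSortBytes(perOutputBufferBytes, logicalOutputs) <= heapBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long buffer = 100 * mb; // assumed per-sorter buffer size
        long heap = 256 * mb;   // assumed heap available for sorting

        System.out.println(totalSortBytes(buffer, 3) / mb);
        System.out.println(fits(buffer, 1, heap));
        System.out.println(fits(buffer, 3, heap));
    }
}
```

With these assumed numbers a single output fits in the budget but three do not. That is the failure pattern described above. Whether the fix belongs in Pig or in Tez is still open.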
Diffs
-----
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/PigServer.java
1546896
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigInputFormat.java
1546896
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/PhysicalOperator.java
1546896
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POLocalRearrange.java
1546896
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POStore.java
1546896
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/CombinerOptimizer.java
1546896
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POLocalRearrangeTez.java
PRE-CREATION
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POStoreTez.java
PRE-CREATION
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java
1546896
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/SecurityHelper.java
PRE-CREATION
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java
1546896
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDAG.java
PRE-CREATION
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java
1546896
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezJob.java
1546896
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezJobControlCompiler.java
1546896
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java
1546896
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOutput.java
PRE-CREATION
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezPOPackageAnnotator.java
1546896
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezSessionManager.java
1546896
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/tools/pigstats/tez/TezStats.java
1546896
http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC6.gld
PRE-CREATION
http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC7.gld
PRE-CREATION
http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC8.gld
PRE-CREATION
http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/tez/TestTezCompiler.java
1546896
http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/tez/TestTezJobControlCompiler.java
1546896
Diff: https://reviews.apache.org/r/15949/diff/
Testing
-------
- Manually tested SPLIT and store within a single vertex, SPLIT output to
multiple vertices, and the case where there is a POSplit when grouping the
same data on different keys.
- Yet to test different combiners on different edges, but that should mostly
work.
- Have some problems getting e2e to run. Will update tez.conf with e2e
tests in a separate JIRA later.
Thanks,
Rohini Palaniswamy
