> On Jan. 23, 2014, 12:09 a.m., Cheolsoo Park wrote:
> > http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java,
> >  lines 757-760
> > <https://reviews.apache.org/r/17191/diff/1/?file=435393#file435393line757>
> >
> >     Do we need to set GROUP_COMPARATOR and SECONDARY_COMPARATOR when 
> > !tezOp.isUsedSecondaryKey()?

I just put that there in for now with all three having same value and a TODO so 
that we can revisit that later. SkewedJoin should be using 
PigGroupingPartitionWritableComparator  as GROUP_COMPARATOR 
(SECONDARY_COMPARATOR is same. In Tez MR both names were used. Kept that till 
we know the reasoning behind two names) and a normal comparator for 
TEZ_RUNTIME_INTERMEDIATE_INPUT_KEY_COMPARATOR_CLASS. That is what MR does. I 
started with it, but decided to do that later as with too many changes was 
difficult to track which change was breaking what. 


> On Jan. 23, 2014, 12:09 a.m., Cheolsoo Park wrote:
> > http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java,
> >  lines 749-751
> > <https://reviews.apache.org/r/17191/diff/1/?file=435393#file435393line749>
> >
> >     This gets overwritten if tezOp.isSkewedJoin(). Why not moving it into 
> > an else block like you did for selectOutputComparator()?

Will move that to the else block before committing. My initial intention was to 
set INTERMEDIATE_INPUT_KEY_COMPARATOR_CLASS in common, and if skewed join only 
set the group comparator. But that did not work right out of the box. So 
overwrote it again in if(tezOp.isSkewedJoin()). We can switch again when we fix 
group comparator for skewed join.


- Rohini


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17191/#review32581
-----------------------------------------------------------


On Jan. 22, 2014, 6:18 p.m., Rohini Palaniswamy wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17191/
> -----------------------------------------------------------
> 
> (Updated Jan. 22, 2014, 6:18 p.m.)
> 
> 
> Review request for pig, Alex Bain, Cheolsoo Park, Daniel Dai, and Mark Wagner.
> 
> 
> Repository: pig
> 
> 
> Description
> -------
> 
> Added support for secondary key sort. Also fixed combiners and custom 
> partitioners to work for multiple outputs. Fixed bugs in couple of places as 
> I encountered them.
> 
> 
> Diffs
> -----
> 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/shims/test/hadoop23/org/apache/pig/test/MiniCluster.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/shims/test/hadoop23/org/apache/pig/test/TezMiniCluster.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/PigConfiguration.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/ColumnInfo.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizerMR.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/optimizer/SecondaryKeyOptimizer.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/PhysicalOperator.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POLocalRearrange.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POReservoirSample.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/CombinerOptimizer.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POShuffleTezLoad.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/SecondaryKeyOptimizerTez.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/SkewedPartitionerTez.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompilerUtil.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezEdgeDescriptor.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezPrinter.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/util/CombinerOptimizerUtil.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/util/SecondaryKeyOptimizerUtil.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/impl/io/NullableTuple.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/impl/io/PigNullableWritable.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/MiniGenericCluster.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/TestAccumulator.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/TestCombiner.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/TestCustomPartitioner.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/TestEvalPipeline2.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/TestPigServer.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/TestSecondarySort.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/TestSecondarySortMR.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/TestSkewedJoin.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/TestSplitStore.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/Util.java
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC14.gld
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC15.gld
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC16.gld
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC4.gld
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC5.gld
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC7.gld
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC8.gld
>  1559126 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/tez/TestSecondarySortTez.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/tez/TestTezCompiler.java
>  1559126 
>   http://svn.apache.org/repos/asf/pig/branches/tez/test/tez-tests 1559126 
> 
> Diff: https://reviews.apache.org/r/17191/diff/
> 
> 
> Testing
> -------
> 
> Enabled more unit tests. Memory is a problem with few tests. Also seeing 
> cases of Tez failing with file.out.index (map output) not found and hangs on 
> waiting for input. Not sure if it is with me upgrading latest tez trunk. But 
> definitely seems to be a Tez bug. Will create a TEZ jira once I dig a little 
> more.
> 
> Currently skewed join e2e tests - Join_7 and Join_8 fail with a NPE with this 
> patch. Others pass. Will try and fix that shortly or in a different jira. 
> Would like to get this patch in quickly as it has been long pending and 
> rebasing is a pain.
> 
> 
> Thanks,
> 
> Rohini Palaniswamy
> 
>

Reply via email to