----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/16463/#review30867 -----------------------------------------------------------
Patch looks good. Just few minor review comments. > Note that 3 test cases in TestAccumulator are annotated as @Ignore because > SecondaryKeyOptimizer in Tez is not implement yet. The test cases expect > accumulator optimizer is applied when order-by and distinct are present in a > nested foreash because sort operator is removed by SecondaryKeyOptimizer. > Added TODO comments accordingly. I have the SecondaryKeyOptimizer fixed in my patch for PIG-3626. Will take care of reverting them back in that patch. src/org/apache/pig/backend/hadoop/executionengine/tez/POShuffleTezLoad.java <https://reviews.apache.org/r/16463/#comment59082> I see that unsafe typecast to Iterable<NullableTuple> is not possible in java. But copying over to a new arraylist seems to be inefficient just to get a Iterable<NullableTuple>. Can we write a wrapper iterator that implements Iterable<NullableTuple> wrapping Iterable<Object> instead? test/org/apache/pig/test/TestAccumulator.java <https://reviews.apache.org/r/16463/#comment59079> Why do we specifically need homeDirOnDfs? Isn't that the default current directory? test/org/apache/pig/test/TestAccumulator.java <https://reviews.apache.org/r/16463/#comment59078> Any reason for adding this newly within tests? tearDown() already has it. test/org/apache/pig/test/TestAccumulator.java <https://reviews.apache.org/r/16463/#comment59080> Can be removed. tearDown already does it test/org/apache/pig/test/TestCombiner.java <https://reviews.apache.org/r/16463/#comment59077> We seem to be doing this in a couple of places now. If there are issues we hit because of session reuse (static variables, etc), then we should try fix them so that session reuse works with grunt shell. Not important for this jira as we need to get the core stuff out first. Can you create a separate jira mentioning the issues that you faced so far with session reuse so that we come back later and fix them some time. - Rohini Palaniswamy On Dec. 26, 2013, 6:11 a.m., Cheolsoo Park wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/16463/ > ----------------------------------------------------------- > > (Updated Dec. 26, 2013, 6:11 a.m.) > > > Review request for pig, Alex Bain, Daniel Dai, Mark Wagner, and Rohini > Palaniswamy. > > > Bugs: PIG-3636 > https://issues.apache.org/jira/browse/PIG-3636 > > > Repository: pig-git > > > Description > ------- > > The patch implements accumulator optimization in Tez. The changes include- > * Create AccumulatorOptimizer in Tez. > * Create AccumulatorOptimizerUtil class and factor out common functions in MR > and Tez. > * Implement accumulator logic in POShuffleTezLoad. > * Update TestAccumulator to make it run in Tez mode. > > > Diffs > ----- > > src/org/apache/pig/PigConfiguration.java 0a26e8c > > src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java > 7f9e15a > > src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java > 9eed25c > > src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java > 6e04513 > > src/org/apache/pig/backend/hadoop/executionengine/tez/AccumulatorOptimizer.java > e69de29 > src/org/apache/pig/backend/hadoop/executionengine/tez/POShuffleTezLoad.java > 722b9f6 > src/org/apache/pig/backend/hadoop/executionengine/tez/POUnionTezLoad.java > 742a33a > src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java > d42ce89 > src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java > c6af682 > src/org/apache/pig/backend/hadoop/executionengine/tez/TezPlanContainer.java > e33a7c6 > > src/org/apache/pig/backend/hadoop/executionengine/util/AccumulatorOptimizerUtil.java > e69de29 > test/org/apache/pig/test/TestAccumulator.java b979649 > test/org/apache/pig/test/TestCombiner.java a227d18 > test/tez-tests fcb573e > > Diff: https://reviews.apache.org/r/16463/diff/ > > > Testing > ------- > > * TestAccumulator passes in Tez mode. > * All unit tests pass. > * All e2e tests pass. > > Note that 3 test cases in TestAccumulator are annotated as @Ignore because > SecondaryKeyOptimizer in Tez is not implement yet. The test cases expect > accumulator optimizer is applied when order-by and distinct are present in a > nested foreash because sort operator is removed by SecondaryKeyOptimizer. > Added TODO comments accordingly. > > > Thanks, > > Cheolsoo Park > >
