> On March 31, 2014, 5:22 p.m., Cheolsoo Park wrote:
> > Hi Rohini, you gave me an aha moment with this patch. It would be nice if 
> > TEZ-1003 could be fixed before Tez 0.4.0 release.
> > 
> > The approach that you're taking looks great! I had few minor comments while 
> > reading through your patch.

Thanks :). Just took the idea from MultiQueryOptimizer Daniel did as could not 
manage that within TezCompiler itself. And it was painful debugging and getting 
it to work :). I might not have handled some stuff and broken somethings with 
this patch. Will try to run the full suite of e2e after TEZ-1003 is fixed. I 
don't think we have test cases for union followed by groupby, join, etc and 
think we need to add new e2e tests. Created a separate jira (PIG-3855) for 
that. Union followed by Limit did look too complex, but my brain was totally 
fried it could not process any more. I think we need to take a look at 
optimizing Limit as well. Created PIG-3854 for that. 


> On March 31, 2014, 5:22 p.m., Cheolsoo Park wrote:
> > http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java,
> >  lines 578-598
> > <https://reviews.apache.org/r/19836/diff/1/?file=542223#file542223line578>
> >
> >     This can be simplified as follows-
> >     
> >                 if (tezOp.getVertexGroupStores() != null) {
> >                     if 
> > (tezOp.getVertexGroupStores().containsKey(store.getOperatorKey())) {
> >                         continue;
> >                     }
> >                 }
> >                 vertex.addOutput(store.getOperatorKey().toString(),
> >                         storeOutDescriptor, MROutputCommitter.class);

Good suggestion. Had the key as the vertex group and value as store for some 
other reason. That is not used anymore. So will switch the key and value as it 
simplifies code.


> On March 31, 2014, 5:22 p.m., Cheolsoo Park wrote:
> > http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/optimizers/UnionOptimizer.java,
> >  line 174
> > <https://reviews.apache.org/r/19836/diff/1/?file=542237#file542237line174>
> >
> >     Why do we need this conversion: List => ArrayList?

Since we are iterating over it and also disconnecting it from union which 
removes it from the succs list we need to copy over to a new ArrayList to avoid 
ConcurrentModificationException.


- Rohini


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19836/#review39048
-----------------------------------------------------------


On March 31, 2014, 9:18 a.m., Rohini Palaniswamy wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19836/
> -----------------------------------------------------------
> 
> (Updated March 31, 2014, 9:18 a.m.)
> 
> 
> Review request for pig, Cheolsoo Park and Daniel Dai.
> 
> 
> Bugs: PIG-3835
>     https://issues.apache.org/jira/browse/PIG-3835
> 
> 
> Repository: pig
> 
> 
> Description
> -------
> 
> Changes done:
> Changed POLocalRearrangeTez to POValueOutputTez
> Wrote a UnionOptimizer
> Got union followed by store working using vertexgroup.
> Also implemented the case where union followed by group by or join only has 3 
> vertices instead of 4 using vertexgroup. But that is not working because 
> ConcatenatedMergedKeyValuesInput is not working as expected. It does not 
> group values from the two input together. The values are grouped only within 
> each input. Will file a Tez bug on that.
> Putting up the patch to get approach vetted. Still to run unit and e2e tests 
> and some minor cleanup pending.
> 
> 
> Diffs
> -----
> 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/PigConfiguration.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POLimit.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/MultiQueryOptimizerTez.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POFRJoinTez.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POIdentityInOutTez.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POLocalRearrangeTez.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POShuffleTezLoad.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POSimpleTezLoad.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POStoreTez.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POValueInputTez.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POValueOutputTez.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POVertexGroupInputTez.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/ReadScalarsTez.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezEdgeDescriptor.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezInput.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLoad.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperPlan.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOutput.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezPOPackageAnnotator.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezPrinter.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/operators/POCounterStatsTez.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/operators/POCounterTez.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/operators/PORankTez.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/operators/POShuffledValueInputTez.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/optimizers/UnionOptimizer.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/util/TezCompilerUtil.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/tools/pigstats/tez/TezStats.java
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC-Union-1.gld
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC-Union-2.gld
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC-Union-3.gld
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC-Union-4.gld
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC-Union-5.gld
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC-Union-6.gld
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC-Union-7.gld
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC16.gld
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC17.gld
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC19.gld
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC21.gld
>  1583081 
>   
> http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/tez/TestTezCompiler.java
>  1583081 
> 
> Diff: https://reviews.apache.org/r/19836/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Rohini Palaniswamy
> 
>

Reply via email to