> On Feb. 17, 2014, 5:30 a.m., Rohini Palaniswamy wrote:
> > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java,
> >  lines 772-773
> > <https://reviews.apache.org/r/17878/diff/2/?file=486610#file486610line772>
> >
> >     It would be easier to keep this here for curTezOp.getSplitParent() != 
> > null (no multiquery)
> 
> Daniel Dai wrote:
>     Not sure if I understand, but seems there is nothing special for filter 
> for splitter, can you be more specific?
> 
> Rohini Palaniswamy wrote:
>     There is always a extra empty POFilter after POSplit. In MR it is removed 
> in NoopFilterRemover.java. It is easy/efficient to do it here as it saves on 
> traversing the whole plan again and then finding it and removing it.

I try to do it in MultiQueryOptimizerTez, there is a TODO tag for that. 
Actually we need to check the filter condition, cuz not filter in splittee is 
redundant, right?


> On Feb. 17, 2014, 5:30 a.m., Rohini Palaniswamy wrote:
> > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java,
> >  line 2110
> > <https://reviews.apache.org/r/17878/diff/2/?file=486610#file486610line2110>
> >
> >     You will need these, else there will be hanging operators in the plan
> 
> Daniel Dai wrote:
>     I also changed compile(PhysicalOperator op), which now handles split 
> differently, splitter will only be added once
> 
> Rohini Palaniswamy wrote:
>     I am not very sure. Can you check once if removing this does not 
> introduce PIG-3620 again?

During my debugging, I manually examined bunch of plans, I am pretty sure it 
should not be the case.


> On Feb. 17, 2014, 5:30 a.m., Rohini Palaniswamy wrote:
> > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java,
> >  line 2112
> > <https://reviews.apache.org/r/17878/diff/2/?file=486610#file486610line2112>
> >
> >     You will need this as well, else input will be pointing to wrong 
> > physical operataors.
> 
> Daniel Dai wrote:
>     Checked seems to be Ok. Guess that is also due the change of how I 
> generate the first split. Splitter's physical plan is composed with 
> PhysicalPlan.connect, which should deal with inputs correctly.
> 
> Rohini Palaniswamy wrote:
>     I hit this problem with multiquery off. With multiquery on this was never 
> a problem.

I can run e2e tests with mutiquery off to see.


> On Feb. 17, 2014, 5:30 a.m., Rohini Palaniswamy wrote:
> > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java,
> >  line 172
> > <https://reviews.apache.org/r/17878/diff/2/?file=486612#file486612line172>
> >
> >     Might be easier if this optimizer is the first one to be called before 
> > Combiner and secondary key optimizer. Just a thought. If there are no 
> > issues with having it later, then fine.
> 
> Daniel Dai wrote:
>     This is follow MR version. In MR, multiquery and combiner optimization 
> are mutually exclusive (multiquery MR cannot have combiner), and we find 
> combiner is more desired than multiquery, so we run combiner optimization 
> first. Fortunately that is not the case for Tez. But I still want to keep the 
> sequence of rules triggered to minimize Tez only issue.
> 
> Rohini Palaniswamy wrote:
>     Earlier I had fixed the combiner and secondary key optimizer of Tez to 
> work with splits. If you don't move it to beginning then you will have to fix 
> MultiQueryOptimizer to copy over combine plans and secondary sort orders in 
> the TezEdgeDescriptor.

The good thing is combiner/secondary key property is on edge. When POSplit 
consume a vertex A, I keep the original edge from A to its successor. So 
combiner/secondary key property is still there.


- Daniel


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17878/#review34061
-----------------------------------------------------------


On Feb. 17, 2014, 4:25 a.m., Daniel Dai wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17878/
> -----------------------------------------------------------
> 
> (Updated Feb. 17, 2014, 4:25 a.m.)
> 
> 
> Review request for pig, Cheolsoo Park and Rohini Palaniswamy.
> 
> 
> Bugs: PIG-3757
>     https://issues.apache.org/jira/browse/PIG-3757
> 
> 
> Repository: pig
> 
> 
> Description
> -------
> 
> See PIG-3757
> 
> 
> Diffs
> -----
> 
>   
> branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java
>  1567297 
>   
> branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java
>  1567297 
>   
> branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java
>  1567297 
>   
> branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/MultiQueryOptimizerTez.java
>  PRE-CREATION 
>   
> branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java
>  1567297 
>   
> branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/SecondaryKeyOptimizerTez.java
>  1567297 
>   
> branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java
>  1567297 
>   
> branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java
>  1567297 
>   
> branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java
>  1567297 
>   
> branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java
>  1567297 
>   
> branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/util/TezCompilerUtil.java
>  1567297 
>   branches/tez/src/org/apache/pig/impl/builtin/ReadScalarsTez.java 
> PRE-CREATION 
>   branches/tez/src/org/apache/pig/newplan/logical/visitor/ScalarVisitor.java 
> 1567297 
> 
> Diff: https://reviews.apache.org/r/17878/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Daniel Dai
> 
>

Reply via email to