> On Feb. 17, 2014, 5:30 a.m., Rohini Palaniswamy wrote: > > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java, > > lines 772-773 > > <https://reviews.apache.org/r/17878/diff/2/?file=486610#file486610line772> > > > > It would be easier to keep this here for curTezOp.getSplitParent() != > > null (no multiquery) > > Daniel Dai wrote: > Not sure if I understand, but seems there is nothing special for filter > for splitter, can you be more specific? > > Rohini Palaniswamy wrote: > There is always a extra empty POFilter after POSplit. In MR it is removed > in NoopFilterRemover.java. It is easy/efficient to do it here as it saves on > traversing the whole plan again and then finding it and removing it.
I try to do it in MultiQueryOptimizerTez, there is a TODO tag for that. Actually we need to check the filter condition, cuz not filter in splittee is redundant, right? > On Feb. 17, 2014, 5:30 a.m., Rohini Palaniswamy wrote: > > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java, > > line 2110 > > <https://reviews.apache.org/r/17878/diff/2/?file=486610#file486610line2110> > > > > You will need these, else there will be hanging operators in the plan > > Daniel Dai wrote: > I also changed compile(PhysicalOperator op), which now handles split > differently, splitter will only be added once > > Rohini Palaniswamy wrote: > I am not very sure. Can you check once if removing this does not > introduce PIG-3620 again? During my debugging, I manually examined bunch of plans, I am pretty sure it should not be the case. > On Feb. 17, 2014, 5:30 a.m., Rohini Palaniswamy wrote: > > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java, > > line 2112 > > <https://reviews.apache.org/r/17878/diff/2/?file=486610#file486610line2112> > > > > You will need this as well, else input will be pointing to wrong > > physical operataors. > > Daniel Dai wrote: > Checked seems to be Ok. Guess that is also due the change of how I > generate the first split. Splitter's physical plan is composed with > PhysicalPlan.connect, which should deal with inputs correctly. > > Rohini Palaniswamy wrote: > I hit this problem with multiquery off. With multiquery on this was never > a problem. I can run e2e tests with mutiquery off to see. > On Feb. 17, 2014, 5:30 a.m., Rohini Palaniswamy wrote: > > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java, > > line 172 > > <https://reviews.apache.org/r/17878/diff/2/?file=486612#file486612line172> > > > > Might be easier if this optimizer is the first one to be called before > > Combiner and secondary key optimizer. Just a thought. If there are no > > issues with having it later, then fine. > > Daniel Dai wrote: > This is follow MR version. In MR, multiquery and combiner optimization > are mutually exclusive (multiquery MR cannot have combiner), and we find > combiner is more desired than multiquery, so we run combiner optimization > first. Fortunately that is not the case for Tez. But I still want to keep the > sequence of rules triggered to minimize Tez only issue. > > Rohini Palaniswamy wrote: > Earlier I had fixed the combiner and secondary key optimizer of Tez to > work with splits. If you don't move it to beginning then you will have to fix > MultiQueryOptimizer to copy over combine plans and secondary sort orders in > the TezEdgeDescriptor. The good thing is combiner/secondary key property is on edge. When POSplit consume a vertex A, I keep the original edge from A to its successor. So combiner/secondary key property is still there. - Daniel ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17878/#review34061 ----------------------------------------------------------- On Feb. 17, 2014, 4:25 a.m., Daniel Dai wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/17878/ > ----------------------------------------------------------- > > (Updated Feb. 17, 2014, 4:25 a.m.) > > > Review request for pig, Cheolsoo Park and Rohini Palaniswamy. > > > Bugs: PIG-3757 > https://issues.apache.org/jira/browse/PIG-3757 > > > Repository: pig > > > Description > ------- > > See PIG-3757 > > > Diffs > ----- > > > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java > 1567297 > > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java > 1567297 > > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java > 1567297 > > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/MultiQueryOptimizerTez.java > PRE-CREATION > > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java > 1567297 > > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/SecondaryKeyOptimizerTez.java > 1567297 > > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java > 1567297 > > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java > 1567297 > > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java > 1567297 > > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java > 1567297 > > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/util/TezCompilerUtil.java > 1567297 > branches/tez/src/org/apache/pig/impl/builtin/ReadScalarsTez.java > PRE-CREATION > branches/tez/src/org/apache/pig/newplan/logical/visitor/ScalarVisitor.java > 1567297 > > Diff: https://reviews.apache.org/r/17878/diff/ > > > Testing > ------- > > > Thanks, > > Daniel Dai > >
