-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43498/
-----------------------------------------------------------
Review request for pig and Daniel Dai.
Bugs: PIG-4759
https://issues.apache.org/jira/browse/PIG-4759
Repository: pig
Description
-------
After the initial fix which changed to a separate DAG for LOAD after STORE in
TezCompiler, TestMultiQueryLocal.testStoreOrder broke.
Problem was split code was iterating over the operators picking the ones to be
split and splitting them. The dependencies were messed up and the resulting
plan created DAGs that were not correctly connected and there were two empty
DAGs which caused an exception. The new logic is as follows
1) Traverse the plan from root to leaves and find operators to segment
from the root.
2) Disconnect all its successors before moving into new plans. Previously
this was not done and due to dependencies the operToSegment was also moved to
new plan causing empty DAGs.
3) Move each of the successors into their new plans and recursively split
the successor. When successors are moved into new plans, they can also move
other successors along with them due to dependencies and with the recursive
splits they can end up in any container. So we scan the plan to see which
container they ended up and then connect with that to maintain the dependency
correctly.
4) Repeat steps 1 to 3 for remaining operators to segment in the initial
plan.
Diffs
-----
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPlanContainer.java
1729745
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPlanContainerPrinter.java
1729745
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPrinter.java
1729745
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/tools/pigstats/tez/TezDAGStats.java
1729745
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-1.gld
1729745
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-2-JDK7.gld
PRE-CREATION
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-2.gld
PRE-CREATION
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Native-1.gld
1729745
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezCompiler.java
1729745
Diff: https://reviews.apache.org/r/43498/diff/
Testing
-------
Added additional tests in TestTezCompiler and also printed more information in
the explain output to see the DAG dependency graph.
Was trying to get the ordering same for JDK7 and JDK8, but it was kind of
wasting a lot of time and I did not get to the bottom of it. So ended up doing
two different golden files. Will track that down sometime later to see if it
can be changed to make explain output same for JDK7 and JDK8.
Thanks,
Rohini Palaniswamy