-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34006/
-----------------------------------------------------------

(Updated May 8, 2015, 5:20 p.m.)


Review request for drill and Aman Sinha.


Changes
-------

Code cleanup.


Repository: drill-git


Description
-------

Drill currently uses VolcanoPlanner for join planning. This planner has two known issues:

1. The search space grows exponentially with the number of tables joined. If a query joins more than 10 tables, planning alone can take minutes, if not longer.
2. Because of the search space problem, Drill does not enable a rule that swaps the two sides of a join. We only apply a join swap afterwards; see DRILL-2236. This means the join order chosen by Drill's VolcanoPlanner might not be optimal.

To address these two issues, we are going to provide another planner dedicated to join ordering. This planner uses a different set of optimization rules, and its search space does not grow exponentially with the number of tables.

The main logic of the new planner:

1) Let VolcanoPlanner apply all the rule transformations used in the current planner's logical planning, except for the join permutation rule.
2) Pass the result to HepPlanner with Calcite's LOPT optimization rule, and let it do the join ordering. Feed the HepPlanner with Drill's RelMetadataProvider so that it can leverage the statistics (row count) available for Drill's tables/files.
3) Continue with the same physical planning as before.

Even with the limited statistics available in Drill, the new planner seems to produce better query plans than the current one for several TPC-H queries.

Preliminary performance results show that this planner runs faster than the existing one, and that the join plan is the same as or better than the plan chosen by the existing planner. I will follow up with a more detailed comparison.
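
To make issue 1 concrete, here is a minimal sketch that counts how many distinct join orders exist for n tables. It is standard combinatorics, not code from Drill or Calcite: the class and method names are hypothetical, and the counts are an upper bound on candidate orderings (cross products are not excluded), not the number of plans VolcanoPlanner actually visits.

```java
import java.math.BigInteger;

// Hypothetical helper for illustration only; not part of Drill or Calcite.
final class JoinOrderCount {

    // Left-deep plans: one per permutation of the n tables, so n!.
    static BigInteger leftDeep(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    // Bushy plans where swapping the two inputs of a join counts as a
    // different plan: (2n-2)! / (n-1)!, i.e. n * (n+1) * ... * (2n-2).
    static BigInteger bushy(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = n; i <= 2 * n - 2; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // Print the growth for a few table counts.
        for (int n = 2; n <= 12; n += 2) {
            System.out.println(n + " tables: left-deep=" + leftDeep(n)
                    + ", bushy=" + bushy(n));
        }
    }
}
```

For 10 tables this gives 3,628,800 left-deep orders and 17,643,225,600 bushy orders, which is why enabling a join swap rule inside an exhaustive search is expensive and why a heuristic join ordering pass is attractive.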


Diffs (updated)
-----

  exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillJoinRelBase.java 5ab416c
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillProjectRelBase.java 42ef6ac
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillDefaultRelMetadataProvider.java PRE-CREATION
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillRelMdDistinctRowCount.java PRE-CREATION
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillFilterRel.java dbd08f4
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillJoinRel.java dcccdb0
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillProjectRel.java 6e132aa
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushProjIntoScan.java 2981de8
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRelFactories.java PRE-CREATION
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java 53e1bff
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java 7d8dd97
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlWorker.java 3c78c08
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java eda1b5f
  exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java 4d8b034

Diff: https://reviews.apache.org/r/34006/diff/


Testing
-------

Unit test / Regression suite.


Thanks,

Jinfeng Ni
