-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34006/
-----------------------------------------------------------

(Updated May 8, 2015, 5:20 p.m.)


Review request for drill and Aman Sinha.


Changes
-------

code cleanup.


Repository: drill-git


Description
-------

Drill currently uses VolcanoPlanner for join planning. This planner has two 
known issues:

1. The search space grows exponentially with the number of tables joined. If a 
query joins more than 10 tables, the planning time itself could be minutes, if 
not longer.

2. Drill does not enable a rule to swap both sides of a join, due to the search 
space problem. We only swap the join inputs afterwards; see DRILL-2236. This 
means the join order chosen by Drill's VolcanoPlanner might not be optimal.
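
To give a rough sense of the growth described in issue 1, the sketch below 
counts the possible join orderings for n tables using the standard 
combinatorial formulas (n! left-deep orders, (2(n-1))!/(n-1)! bushy trees with 
operand order counted). It assumes every pair of tables may be joined, and it 
is not a measurement of VolcanoPlanner's actual search space. The class and 
method names are made up for illustration.

```java
import java.math.BigInteger;

public class JoinSearchSpace {
    /** Number of left-deep join orders over n tables: n! */
    public static BigInteger leftDeepOrders(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    /**
     * Number of bushy join trees over n tables, counting left/right operand
     * order: (2(n-1))! / (n-1)!, i.e. the product n * (n+1) * ... * (2n-2).
     */
    public static BigInteger bushyOrders(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = n; i <= 2 * (n - 1); i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        for (int n : new int[] {2, 5, 10, 12}) {
            System.out.println(n + " tables: left-deep=" + leftDeepOrders(n)
                + ", bushy=" + bushyOrders(n));
        }
    }
}
```

At 10 tables the bushy count is already in the tens of billions, which is why 
an exhaustive, cost-based search becomes impractical around that size.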

To address these two issues, we are going to provide another planner for join 
ordering. This planner uses a different set of optimization rules, and its 
search space does not grow exponentially with the number of tables. 

The main logic of this new planner:
1) Let VolcanoPlanner apply all the rule transformations used in the current 
planner's logical planning, except for the join permutation rule.
2) After that, pass the result to HepPlanner with Calcite's LOPT optimization 
rule, to let it do the join ordering. Feed the HepPlanner with Drill's 
RelMetadataProvider, to leverage the statistics (rowcount) available in Drill's 
tables/files. 
3) Continue with the same physical planning as before.
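
As a rough illustration of why a heuristic pass in step 2 avoids the 
exponential search, the toy sketch below picks a join order from row counts 
alone, in O(n log n) time. It is NOT Calcite's LOPT algorithm and not code from 
this patch; the Table class, the ordering heuristic (smallest input first), and 
the row counts are all assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class GreedyJoinOrder {
    /** Hypothetical table descriptor: a name plus an estimated row count. */
    public static final class Table {
        final String name;
        final double rowCount;

        public Table(String name, double rowCount) {
            this.name = name;
            this.rowCount = rowCount;
        }
    }

    /**
     * Returns table names in ascending row-count order, so the smallest
     * inputs are joined first. A single sort replaces enumeration of all
     * possible join orders.
     */
    public static List<String> order(List<Table> tables) {
        List<Table> sorted = new ArrayList<>(tables);
        sorted.sort(Comparator.comparingDouble((Table t) -> t.rowCount));
        List<String> names = new ArrayList<>();
        for (Table t : sorted) {
            names.add(t.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Table> tables = new ArrayList<>();
        tables.add(new Table("lineitem", 6_000_000)); // illustrative counts
        tables.add(new Table("nation", 25));
        tables.add(new Table("orders", 1_500_000));
        System.out.println(order(tables));
    }
}
```

A real heuristic would also weigh join predicates and selectivity; the point 
here is only that row counts let a planner commit to one order without 
exploring every alternative.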

With the limited statistics available in Drill, the new planner seems to 
produce better query plans than the current one for several TPCH queries. 

Preliminary performance results show this planner runs faster than the existing 
one, and the join plan seems to be the same as or better than the plan chosen 
by the existing planner. 

Will follow up with a more detailed comparison.


Diffs (updated)
-----

  exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillJoinRelBase.java 5ab416c 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillProjectRelBase.java 42ef6ac 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillDefaultRelMetadataProvider.java PRE-CREATION 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/cost/DrillRelMdDistinctRowCount.java PRE-CREATION 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillFilterRel.java dbd08f4 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillJoinRel.java dcccdb0 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillProjectRel.java 6e132aa 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushProjIntoScan.java 2981de8 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRelFactories.java PRE-CREATION 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java 53e1bff 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java 7d8dd97 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlWorker.java 3c78c08 
  exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java eda1b5f 
  exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java 4d8b034 

Diff: https://reviews.apache.org/r/34006/diff/


Testing
-------

Unit test / Regression suite.


Thanks,

Jinfeng Ni
