Github user ioana-delaney commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17546#discussion_r110740755
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala ---
    @@ -54,8 +54,6 @@ case class CostBasedJoinReorder(conf: SQLConf) extends Rule[LogicalPlan] with Pr
     
       private def reorder(plan: LogicalPlan, output: Seq[Attribute]): LogicalPlan = {
         val (items, conditions) = extractInnerJoins(plan)
    -    // TODO: Compute the set of star-joins and use them in the join enumeration
    -    // algorithm to prune un-optimal plan choices.
    --- End diff --
    
    @cloud-fan Star-schema detection is called first to compute the set of
    tables connected by a star-schema relationship, e.g. {F1, D1, D2} in our
    code example. This call does not reorder any joins; it only computes the
    set of tables in a star-schema relationship. Then DP join enumeration
    generates all possible plan combinations among the entire set of tables in
    the join, e.g. {F1, D1}, {F1, T1}, {T2, T3}, etc. Star-filter, if called,
    eliminates plan combinations that mix star and non-star tables until the
    star-join combinations are built. For example, the {F1, D1} combination is
    retained since it involves only tables in the star schema, but {F1, T1} is
    eliminated since it mixes star and non-star tables. Star-filter only
    decides which combinations to retain; it does not decide the order in
    which those tables are executed. The order of the joins within a star join
    and for the overall plan is decided by the DP join enumeration. Star-filter
    only ensures that tables in a star join are planned together.
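
To make the retain/eliminate rule concrete, here is a minimal sketch of such a filter. It is not the code in this PR: `StarFilterSketch` and `retain` are hypothetical names, tables are represented as plain strings, and keeping combinations made only of non-star tables (such as {T2, T3}) is an assumption on my part.

```scala
object StarFilterSketch {
  // Hypothetical helper, not Spark's API: decides whether the candidate
  // combination formed by joining `left` and `right` should be kept.
  def retain(left: Set[String], right: Set[String], star: Set[String]): Boolean = {
    val combined = left ++ right
    val disjoint = (left & right).isEmpty
    disjoint && (
      star.isEmpty ||                 // no star join detected: nothing to filter
      combined.subsetOf(star) ||      // only star tables, e.g. {F1, D1}
      star.subsetOf(combined) ||      // the star join is already fully built
      (combined & star).isEmpty       // only non-star tables (assumption)
    )
  }

  def main(args: Array[String]): Unit = {
    val star = Set("F1", "D1", "D2")
    println(retain(Set("F1"), Set("D1"), star))             // star tables only
    println(retain(Set("F1"), Set("T1"), star))             // mixes star and non-star
    println(retain(Set("F1", "D1", "D2"), Set("T1"), star)) // star join complete
  }
}
```

Note that the filter only returns a yes/no answer for a candidate combination. It never picks a join order, which stays with the DP enumeration.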



