[ 
https://issues.apache.org/jira/browse/SPARK-20366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenhua Wang updated SPARK-20366:
---------------------------------
    Description: 
If a plan has multi-level successive joins, e.g.:

            Join
           /    \
       Union     t5
       /    \
    Join     t4
    /   \
  Join   t3
  /  \
 t1   t2

Currently we fail to reorder the inside joins, i.e. the joins among t1, t2 
and t3 below the Union.

In join reorder, we use `OrderedJoin` to mark a join as already ordered, so 
that it does not need to be reordered again when transforming down the plan.

But there's a problem in the definition of `OrderedJoin`:
the real join node is a parameter of `OrderedJoin`, not its child. This breaks 
the transform procedure because `mapChildren` applies the transform function 
only to parameters that are children, so the transform does not reach the 
joins nested under the wrapped join.
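
The effect can be illustrated with a small toy model. This is only a sketch in 
Python, not Spark's Catalyst code: `Node`, `map_children`, `transform_down`, 
`BrokenOrderedJoin` and `ChildOrderedJoin` are hypothetical names, and 
`map_children` is assumed to behave as described above (it rewrites only 
constructor parameters that are also children). `ChildOrderedJoin` shows one 
possible alternative shape for the wrapper, not necessarily the fix that was 
adopted.

```python
from dataclasses import dataclass, fields, replace
from typing import Tuple


class Node:
    """Toy stand-in for a Catalyst-style tree node (not Spark's real class)."""

    def children(self) -> Tuple["Node", ...]:
        return ()

    def map_children(self, f):
        # Assumed behaviour: f is applied only to constructor parameters
        # that are also reported as children of this node.
        kids = self.children()
        updates = {}
        for fld in fields(self):
            value = getattr(self, fld.name)
            if any(value is k for k in kids):
                updates[fld.name] = f(value)
        return replace(self, **updates) if updates else self

    def transform_down(self, rule):
        # Apply the rule to this node first, then recurse into its children.
        after = rule(self)
        return after.map_children(lambda c: c.transform_down(rule))


@dataclass(frozen=True)
class Table(Node):
    name: str


@dataclass(frozen=True)
class Join(Node):
    left: Node
    right: Node

    def children(self):
        return (self.left, self.right)


@dataclass(frozen=True)
class BrokenOrderedJoin(Node):
    join: Join  # the real join is a parameter, but not a child

    def children(self):
        return (self.join.left, self.join.right)


@dataclass(frozen=True)
class ChildOrderedJoin(Node):
    left: Node  # hypothetical alternative: children are parameters
    right: Node

    def children(self):
        return (self.left, self.right)


def visited_tables(plan):
    """Return the table names a top-down transform actually reaches."""
    seen = []

    def rule(node):
        if isinstance(node, Table):
            seen.append(node.name)
        return node

    plan.transform_down(rule)
    return seen


inner = Join(Table("t1"), Table("t2"))
broken = BrokenOrderedJoin(inner)
fixed = ChildOrderedJoin(inner.left, inner.right)

print(visited_tables(broken))  # []
print(visited_tables(fixed))   # ['t1', 't2']
```

In the toy model the transform never descends past `BrokenOrderedJoin`, 
because none of its parameters is one of its children; once the children are 
themselves parameters, the same rule reaches t1 and t2.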





> Fix recursive join reordering: inside joins are not reordered
> -------------------------------------------------------------
>
>                 Key: SPARK-20366
>                 URL: https://issues.apache.org/jira/browse/SPARK-20366
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Zhenhua Wang


