[ 
https://issues.apache.org/jira/browse/SPARK-27815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-27815:
---------------------------------
    Description: Catalyst's predicate pushdown is currently split into two 
separate optimizer rules: PushDownPredicate and PushThroughJoin. This is 
inefficient for cascading joins such as TPC-DS q64, where the whole default 
batch is re-executed solely because of this split. We need a more efficient 
approach that pushes predicates down as far as possible in a single pass.  (was: 
Currently there is a hack in `DataFrameWriter`, which passes `SaveMode` to file 
source v2. This should be removed and file source v2 should not accept 
SaveMode.)

> do not leak SaveMode to file source v2
> --------------------------------------
>
>                 Key: SPARK-27815
>                 URL: https://issues.apache.org/jira/browse/SPARK-27815
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Wenchen Fan
>            Priority: Blocker
>
> Catalyst's predicate pushdown is currently split into two separate 
> optimizer rules: PushDownPredicate and PushThroughJoin. This is inefficient 
> for cascading joins such as TPC-DS q64, where the whole default batch is 
> re-executed solely because of this split. We need a more efficient approach 
> that pushes predicates down as far as possible in a single pass.
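
To illustrate why two separate pushdown rules can force extra batch
iterations on cascading joins, here is a minimal toy sketch. It is not
Spark's Catalyst code: the plan classes, the rule functions
(`push_non_join`, `push_join`, `combined`), and the `run_batch` driver are
all hypothetical names invented for this example, and it assumes each rule
is applied as one top-down traversal inside a batch that repeats until the
plan stops changing.

```python
from dataclasses import dataclass


# Toy logical plan nodes (hypothetical, not Catalyst classes).
@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Project:
    child: object


@dataclass(frozen=True)
class Join:
    left: object
    right: object


@dataclass(frozen=True)
class Filter:
    table: str  # the single table this predicate references
    child: object


def tables(plan):
    """Return the set of table names produced by a plan subtree."""
    if isinstance(plan, Scan):
        return {plan.table}
    if isinstance(plan, Join):
        return tables(plan.left) | tables(plan.right)
    return tables(plan.child)


def push_non_join(node):
    """Rule 1: move a Filter below a Project."""
    if isinstance(node, Filter) and isinstance(node.child, Project):
        return Project(Filter(node.table, node.child.child))
    return node


def push_join(node):
    """Rule 2: move a Filter below a Join, to the side that has its table."""
    if isinstance(node, Filter) and isinstance(node.child, Join):
        join = node.child
        if node.table in tables(join.left):
            return Join(Filter(node.table, join.left), join.right)
        return Join(join.left, Filter(node.table, join.right))
    return node


def combined(node):
    """Single rule that tries rule 1 first and falls back to rule 2."""
    pushed = push_non_join(node)
    return pushed if pushed != node else push_join(node)


def transform_down(plan, rule):
    """Apply the rule to a node, then recurse into the result's children."""
    plan = rule(plan)
    if isinstance(plan, Scan):
        return plan
    if isinstance(plan, Join):
        return Join(transform_down(plan.left, rule),
                    transform_down(plan.right, rule))
    if isinstance(plan, Project):
        return Project(transform_down(plan.child, rule))
    return Filter(plan.table, transform_down(plan.child, rule))


def run_batch(plan, rules):
    """Run all rules repeatedly until a full iteration changes nothing."""
    iterations = 0
    while True:
        iterations += 1
        new_plan = plan
        for rule in rules:
            new_plan = transform_down(new_plan, rule)
        if new_plan == plan:
            return plan, iterations
        plan = new_plan


# A filter on table "a" above three cascading Project/Join layers.
plan = Filter("a", Project(Join(
    Project(Join(Project(Join(Scan("a"), Scan("b"))), Scan("c"))),
    Scan("d"))))

separate_plan, separate_iters = run_batch(plan, [push_non_join, push_join])
combined_plan, combined_iters = run_batch(plan, [combined])
print(separate_iters, combined_iters)  # prints "4 2"
```

In this toy model both variants reach the same final plan, but the separate
rules advance the filter by only one Project/Join layer per batch iteration,
while the combined rule pushes it all the way down in one traversal (the
second iteration only confirms that nothing changed).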



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
