[ https://issues.apache.org/jira/browse/SPARK-27815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon updated SPARK-27815:
---------------------------------
Description: The current Catalyst optimizer's predicate pushdown is split
into two separate rules: PushDownPredicate and PushThroughJoin. This is
inefficient for cascading joins such as TPC-DS q64, where the whole default
batch is re-executed solely because of this split. We need a more efficient
approach that pushes predicates down as far as possible in a single pass. (was:
Currently there is a hack in `DataFrameWriter`, which passes `SaveMode` to file
source v2. This should be removed and file source v2 should not accept
SaveMode.)
> do not leak SaveMode to file source v2
> --------------------------------------
>
> Key: SPARK-27815
> URL: https://issues.apache.org/jira/browse/SPARK-27815
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Wenchen Fan
> Priority: Blocker
>
> The current Catalyst optimizer's predicate pushdown is split into two
> separate rules: PushDownPredicate and PushThroughJoin. This is inefficient
> for cascading joins such as TPC-DS q64, where the whole default batch is
> re-executed solely because of this split. We need a more efficient approach
> that pushes predicates down as far as possible in a single pass.
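
To illustrate what "as far as possible in a single pass" could mean, here is a
minimal toy sketch in Python. It is not Catalyst code and not the fix proposed
in this issue: the names Pred, Scan, Filter, Join, and push_down are
hypothetical, only inner joins are assumed, and each predicate is reduced to
the set of columns it references. The idea shown is that one top-down
traversal carries pending predicates through both filters and joins, so
cascading joins do not need repeated rule applications.

```python
from dataclasses import dataclass
from typing import FrozenSet, Union


@dataclass(frozen=True)
class Pred:
    """A predicate, reduced to its text and the columns it references."""
    text: str
    cols: FrozenSet[str]


@dataclass(frozen=True)
class Scan:
    table: str
    cols: FrozenSet[str]


@dataclass(frozen=True)
class Filter:
    pred: Pred
    child: "Plan"


@dataclass(frozen=True)
class Join:
    """Inner join only (assumption): pushing through outer joins needs more care."""
    left: "Plan"
    right: "Plan"


Plan = Union[Scan, Filter, Join]


def output_cols(plan):
    """Columns produced by a plan node."""
    if isinstance(plan, Scan):
        return plan.cols
    if isinstance(plan, Filter):
        return output_cols(plan.child)
    return output_cols(plan.left) | output_cols(plan.right)


def wrap(plan, preds):
    """Place the given predicates as filters directly above the plan."""
    for p in preds:
        plan = Filter(p, plan)
    return plan


def push_down(plan, pending=()):
    """Single top-down traversal that carries pending predicates downward."""
    pending = list(pending)
    if isinstance(plan, Filter):
        # Absorb the filter and keep descending with its predicate.
        return push_down(plan.child, pending + [plan.pred])
    if isinstance(plan, Join):
        lcols, rcols = output_cols(plan.left), output_cols(plan.right)
        to_left = [p for p in pending if p.cols <= lcols]
        to_right = [p for p in pending if p.cols <= rcols and not p.cols <= lcols]
        # Predicates that reference both sides must stay above the join.
        stay = [p for p in pending if p not in to_left and p not in to_right]
        joined = Join(push_down(plan.left, to_left), push_down(plan.right, to_right))
        return wrap(joined, stay)
    # Scan: predicates cannot go any lower.
    return wrap(plan, pending)


# Cascading join (a JOIN b) JOIN c with two filters stacked on top.
a = Scan("a", frozenset({"a.x"}))
b = Scan("b", frozenset({"b.y"}))
c = Scan("c", frozenset({"c.z"}))
pa = Pred("a.x > 1", frozenset({"a.x"}))
pc = Pred("c.z < 5", frozenset({"c.z"}))
plan = Filter(pa, Filter(pc, Join(Join(a, b), c)))
optimized = push_down(plan)
```

In this toy model, a single call moves the predicate on a.x through both joins
down to the scan of a, and the predicate on c.z down to the scan of c. With
separate filter and join rules, the same result would take several
alternating applications.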