[jira] [Updated] (SPARK-36819) DPP: Don't insert redundant filters in case static partition pruning can be done
[ https://issues.apache.org/jira/browse/SPARK-36819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Swinky Mann updated SPARK-36819:

Description:
Don't insert dynamic partition pruning (DPP) filters when the same filter can already be applied statically. When the filtering predicate on the dimension table is on the join key, there is no need to insert a DPP filter.
DPP is not required in this sample query:
{code:java}
SELECT f.date_id, f.pid, f.sid FROM
(select date_id, product_id as pid, store_id as sid from fact_stats) as f
JOIN dim_stats s
ON f.sid = s.store_id WHERE s.store_id = 3{code}

was:
Don't insert dynamic partition pruning (DPP) filters when the same filter can already be applied statically. When the filtering predicate on the dimension table is on the join key, there is no need to insert a DPP filter.
DPP is not required in this sample query:
{{SELECT f.date_id, f.pid, f.sid FROM
(select date_id, product_id as pid, store_id as sid from fact_stats) as f
JOIN dim_stats s
ON f.sid = s.store_id WHERE s.store_id = 3}}

> DPP: Don't insert redundant filters in case static partition pruning can be
> done
>
> Key: SPARK-36819
> URL: https://issues.apache.org/jira/browse/SPARK-36819
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.1.2
> Reporter: Swinky Mann
> Priority: Minor
>
> Don't insert dynamic partition pruning (DPP) filters when the same filter can
> already be applied statically. When the filtering predicate on the dimension
> table is on the join key, there is no need to insert a DPP filter.
> DPP is not required in this sample query:
> {code:java}
> SELECT f.date_id, f.pid, f.sid FROM
> (select date_id, product_id as pid, store_id as sid from fact_stats) as f
> JOIN dim_stats s
> ON f.sid = s.store_id WHERE s.store_id = 3{code}

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36819) DPP: Don't insert redundant filters in case static partition pruning can be done
[ https://issues.apache.org/jira/browse/SPARK-36819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17418300#comment-17418300 ]

Swinky Mann commented on SPARK-36819:

https://github.com/apache/spark/pull/34062

> DPP: Don't insert redundant filters in case static partition pruning can be
> done
>
> Key: SPARK-36819
> URL: https://issues.apache.org/jira/browse/SPARK-36819
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.1.2
> Reporter: Swinky Mann
> Priority: Minor
>
> Don't insert dynamic partition pruning (DPP) filters when the same filter can
> already be applied statically. When the filtering predicate on the dimension
> table is on the join key, there is no need to insert a DPP filter.
> DPP is not required in this sample query:
> {{SELECT f.date_id, f.pid, f.sid FROM
> (select date_id, product_id as pid, store_id as sid from fact_stats) as f
> JOIN dim_stats s
> ON f.sid = s.store_id WHERE s.store_id = 3}}
[jira] [Created] (SPARK-36819) DPP: Don't insert redundant filters in case static partition pruning can be done
Swinky Mann created SPARK-36819:

Summary: DPP: Don't insert redundant filters in case static partition pruning can be done
Key: SPARK-36819
URL: https://issues.apache.org/jira/browse/SPARK-36819
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.1.2
Reporter: Swinky Mann

Don't insert dynamic partition pruning (DPP) filters when the same filter can already be applied statically. When the filtering predicate on the dimension table is on the join key, there is no need to insert a DPP filter.
DPP is not required in this sample query:
{{SELECT f.date_id, f.pid, f.sid FROM
(select date_id, product_id as pid, store_id as sid from fact_stats) as f
JOIN dim_stats s
ON f.sid = s.store_id WHERE s.store_id = 3}}
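The condition described in this issue can be sketched as a small predicate. This is a toy model in Python, not Spark's optimizer code: the helper name `dpp_filter_needed`, its parameters, and the `region` column are hypothetical and only restate the description. For an inner equi-join, a dimension-side predicate on the join key (here `s.store_id = 3` with `f.sid = s.store_id`) implies the same constant on the fact side, so a dynamic filter would add nothing.

```python
def dpp_filter_needed(dim_join_key, dim_filter_columns):
    """Toy check (hypothetical helper, not a Spark API).

    Returns False when the dimension-side filter references only the join key,
    because the same predicate can then be applied to the fact table statically.
    """
    filter_cols = set(dim_filter_columns)
    if not filter_cols:
        # No filter on the dimension table, so there is nothing to prune with.
        return False
    # DPP only helps if the filter touches a column other than the join key.
    return not filter_cols <= {dim_join_key}


# Sample query from the issue: JOIN ... ON f.sid = s.store_id WHERE s.store_id = 3
print(dpp_filter_needed("store_id", ["store_id"]))

# A filter on another dimension column (hypothetical `region`) can still benefit from DPP.
print(dpp_filter_needed("store_id", ["region"]))
```

The first call prints `False` and the second prints `True`. The real rule would have to inspect the plan's join keys and filter expressions; the sketch only captures the decision itself.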
[jira] [Updated] (SPARK-35911) DPP: Update exprId for IN subquery
[ https://issues.apache.org/jira/browse/SPARK-35911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Swinky Mann updated SPARK-35911:

Summary: DPP: Update exprId for IN subquery (was: DPP: Update exprId for in subquery)

> DPP: Update exprId for IN subquery
>
> Key: SPARK-35911
> URL: https://issues.apache.org/jira/browse/SPARK-35911
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.1.2
> Reporter: Swinky Mann
> Priority: Minor
>
> Change the exprId of the IN subquery for DPP in the executed plan so that it
> matches the exprId of the DynamicPruning filter in the optimized plan.
> This minor change should make debugging easier in complex queries.
[jira] [Created] (SPARK-35911) DPP: Fix exprId for in subquery
Swinky Mann created SPARK-35911:

Summary: DPP: Fix exprId for in subquery
Key: SPARK-35911
URL: https://issues.apache.org/jira/browse/SPARK-35911
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.1.2
Reporter: Swinky Mann

Change the exprId of the IN subquery for DPP in the executed plan so that it matches the exprId of the DynamicPruning filter in the optimized plan.
This minor change should make debugging easier in complex queries.
[jira] [Updated] (SPARK-35911) DPP: Update exprId for in subquery
[ https://issues.apache.org/jira/browse/SPARK-35911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Swinky Mann updated SPARK-35911:

Summary: DPP: Update exprId for in subquery (was: DPP: Fix exprId for in subquery)

> DPP: Update exprId for in subquery
>
> Key: SPARK-35911
> URL: https://issues.apache.org/jira/browse/SPARK-35911
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.1.2
> Reporter: Swinky Mann
> Priority: Minor
>
> Change the exprId of the IN subquery for DPP in the executed plan so that it
> matches the exprId of the DynamicPruning filter in the optimized plan.
> This minor change should make debugging easier in complex queries.
[jira] [Commented] (SPARK-34598) RewritePredicateSubquery Rule must not update Filters without subqueries
[ https://issues.apache.org/jira/browse/SPARK-34598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17293930#comment-17293930 ]

Swinky Mann commented on SPARK-34598:

I am working on this and will submit a PR.

> RewritePredicateSubquery Rule must not update Filters without subqueries
>
> Key: SPARK-34598
> URL: https://issues.apache.org/jira/browse/SPARK-34598
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.1.1
> Reporter: Swinky Mann
> Priority: Minor
>
> 1. Currently, the RewritePredicateSubquery rule also updates the Filter node
> for queries that contain no subquery. This shouldn't happen.
> 2. Also, `Filter(conditions.reduce(And), child)` in the rule can create a
> skewed expression tree even when the original expression is balanced.
>
> {noformat}
> === Applying Rule org.apache.spark.sql.catalyst.optimizer.RewritePredicateSubquery ===
>  Project [a#0]                                                         Project [a#0]
> !+- Filter (((a#0 > 1) OR (b#1 > 2)) AND ((c#2 > 1) AND (d#3 > 2)))   +- Filter ((((a#0 > 1) OR (b#1 > 2)) AND (c#2 > 1)) AND (d#3 > 2))
>     +- LocalRelation , [a#0, b#1, c#2, d#3]                               +- LocalRelation , [a#0, b#1, c#2, d#3]
> {noformat}
[jira] [Created] (SPARK-34598) RewritePredicateSubquery Rule must not update Filters without subqueries
Swinky Mann created SPARK-34598:

Summary: RewritePredicateSubquery Rule must not update Filters without subqueries
Key: SPARK-34598
URL: https://issues.apache.org/jira/browse/SPARK-34598
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.1.1
Reporter: Swinky Mann

1. Currently, the RewritePredicateSubquery rule also updates the Filter node for queries that contain no subquery. This shouldn't happen.
2. Also, `Filter(conditions.reduce(And), child)` in the rule can create a skewed expression tree even when the original expression is balanced.

{noformat}
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.RewritePredicateSubquery ===
 Project [a#0]                                                         Project [a#0]
!+- Filter (((a#0 > 1) OR (b#1 > 2)) AND ((c#2 > 1) AND (d#3 > 2)))   +- Filter ((((a#0 > 1) OR (b#1 > 2)) AND (c#2 > 1)) AND (d#3 > 2))
    +- LocalRelation , [a#0, b#1, c#2, d#3]                               +- LocalRelation , [a#0, b#1, c#2, d#3]
{noformat}
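The skew described in point 2 can be reproduced outside Spark. The following is a minimal Python sketch, not Catalyst code: the `Pred` and `And` classes, the `has_subquery` flag, and both `rewrite_*` functions are hypothetical stand-ins. `rewrite_always` mimics splitting a condition into conjuncts and rebuilding it with `reduce(And)`, which always yields a left-deep tree; `rewrite_guarded` shows one possible guard that leaves the condition untouched when no conjunct contains a subquery.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Pred:
    """Leaf predicate; `has_subquery` is a hypothetical marker for this sketch."""
    text: str
    has_subquery: bool = False


@dataclass(frozen=True)
class And:
    left: object
    right: object


def split_conjuncts(expr):
    """Flatten nested And nodes into a left-to-right list of conjuncts."""
    if isinstance(expr, And):
        return split_conjuncts(expr.left) + split_conjuncts(expr.right)
    return [expr]


def show(expr):
    if isinstance(expr, And):
        return f"({show(expr.left)} AND {show(expr.right)})"
    return expr.text


def rewrite_always(cond):
    """Mimics Filter(conditions.reduce(And), child): rebuilds a left-deep tree."""
    return reduce(And, split_conjuncts(cond))


def rewrite_guarded(cond):
    """Assumed guard: return the original condition if no conjunct has a subquery."""
    conjuncts = split_conjuncts(cond)
    if not any(p.has_subquery for p in conjuncts):
        return cond
    return reduce(And, conjuncts)


# The balanced condition from the trace above.
cond = And(Pred("((a > 1) OR (b > 2))"), And(Pred("(c > 1)"), Pred("(d > 2)")))
print(show(cond))
print(show(rewrite_always(cond)))
print(show(rewrite_guarded(cond)))
```

The second line printed has the same left-deep shape as the right-hand side of the trace, while the guarded version prints the original balanced condition unchanged.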
[jira] [Commented] (SPARK-34222) Enhance Boolean Simplification Rule
[ https://issues.apache.org/jira/browse/SPARK-34222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17271214#comment-17271214 ]

Swinky Mann commented on SPARK-34222:

https://github.com/apache/spark/pull/31318

> Enhance Boolean Simplification Rule
>
> Key: SPARK-34222
> URL: https://issues.apache.org/jira/browse/SPARK-34222
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.0.1
> Reporter: Swinky Mann
> Priority: Minor
>
> Enhance the boolean simplification rule by handling the following scenarios:
> # (((a && b) && a && (a && c))) => a && b && c
> # (((a || b) || a || (a || c))) => a || b || c
[jira] [Commented] (SPARK-34222) Enhance Boolean Simplification Rule
[ https://issues.apache.org/jira/browse/SPARK-34222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17271208#comment-17271208 ]

Swinky Mann commented on SPARK-34222:

I am working on this and will open a PR.

> Enhance Boolean Simplification Rule
>
> Key: SPARK-34222
> URL: https://issues.apache.org/jira/browse/SPARK-34222
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.0.1
> Reporter: Swinky Mann
> Priority: Minor
>
> Enhance the boolean simplification rule by handling the following scenarios:
> # (((a && b) && a && (a && c))) => a && b && c
> # (((a || b) || a || (a || c))) => a || b || c
[jira] [Created] (SPARK-34222) Enhance Boolean Simplification Rule
Swinky Mann created SPARK-34222:

Summary: Enhance Boolean Simplification Rule
Key: SPARK-34222
URL: https://issues.apache.org/jira/browse/SPARK-34222
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.0.1
Reporter: Swinky Mann

Enhance the boolean simplification rule by handling the following scenarios:
# (((a && b) && a && (a && c))) => a && b && c
# (((a || b) || a || (a || c))) => a || b || c
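The two scenarios above amount to removing repeated operands from a chain of the same operator. A minimal Python sketch of that idea follows; it is an illustration under assumptions, not the change proposed for Spark's BooleanSimplification rule. The `Attr`, `And`, and `Or` classes and the `dedupe_chain` helper are hypothetical, and the sketch assumes deterministic operands, so that dropping a repeated operand does not change the result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attr:
    name: str


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Or:
    left: object
    right: object


def flatten(expr, op):
    """Collect the operands of a nested chain of `op` nodes, left to right."""
    if isinstance(expr, op):
        return flatten(expr.left, op) + flatten(expr.right, op)
    return [expr]


def dedupe_chain(expr):
    """Drop repeated operands from a chain of And nodes or a chain of Or nodes."""
    if not isinstance(expr, (And, Or)):
        return expr
    op = type(expr)
    # Simplify each operand first so chains of the other operator are handled too.
    operands = [dedupe_chain(e) for e in flatten(expr, op)]
    unique = list(dict.fromkeys(operands))  # keeps the first occurrence of each operand
    result = unique[0]
    for e in unique[1:]:
        result = op(result, e)
    return result


def show(expr):
    if isinstance(expr, Attr):
        return expr.name
    sym = "&&" if isinstance(expr, And) else "||"
    return f"({show(expr.left)} {sym} {show(expr.right)})"


a, b, c = Attr("a"), Attr("b"), Attr("c")
conj = And(And(And(a, b), a), And(a, c))  # ((a && b) && a) && (a && c)
disj = Or(Or(Or(a, b), a), Or(a, c))      # ((a || b) || a) || (a || c)
print(show(dedupe_chain(conj)))
print(show(dedupe_chain(disj)))
```

The frozen dataclasses are hashable, which is what lets `dict.fromkeys` detect structurally equal operands while preserving their original order.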