[jira] [Updated] (SPARK-36819) DPP: Don't insert redundant filters in case static partition pruning can be done

2021-09-21 Thread Swinky Mann (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Swinky Mann updated SPARK-36819:

Description: 
Don't insert dynamic partition pruning (DPP) filters when the same pruning can 
already be done statically. If the filtering predicate on the dimension table is 
on the join key, there is no need to insert a DPP filter.

DPP is not required in this sample query:
{code:java}
SELECT f.date_id, f.pid, f.sid FROM
 (select date_id, product_id as pid, store_id as sid from fact_stats) as f
 JOIN dim_stats s
 ON f.sid = s.store_id WHERE s.store_id = 3{code}

  was:
Don't insert dynamic partition pruning filters in case the filters already 
referred statically. In case the filtering predicate on dimension table is in 
joinKey, no need to insert DPP filter in that case.

DPP is not required in this Sample query:
 {{SELECT f.date_id, f.pid, f.sid FROM
(select date_id, product_id as pid, store_id as sid from fact_stats) as f
JOIN dim_stats s
ON f.sid = s.store_id WHERE s.store_id = 3}}


> DPP: Don't insert redundant filters in case static partition pruning can be 
> done
> 
>
> Key: SPARK-36819
> URL: https://issues.apache.org/jira/browse/SPARK-36819
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.1.2
> Reporter: Swinky Mann
> Priority: Minor
>
> Don't insert dynamic partition pruning (DPP) filters when the same pruning can 
> already be done statically. If the filtering predicate on the dimension table is 
> on the join key, there is no need to insert a DPP filter.
> DPP is not required in this sample query:
> {code:java}
> SELECT f.date_id, f.pid, f.sid FROM
>  (select date_id, product_id as pid, store_id as sid from fact_stats) as f
>  JOIN dim_stats s
>  ON f.sid = s.store_id WHERE s.store_id = 3{code}
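
The reasoning in the description can be sketched in a few lines of Python. This is a toy model, not Spark's optimizer code: `JoinKey`, `EqualsLiteral`, `inferred_static_filter` and `dpp_is_redundant` are hypothetical names introduced only for illustration, and only equality predicates are modelled. The idea is that `f.sid = s.store_id` together with `s.store_id = 3` implies `f.sid = 3`, so the fact side can be pruned statically and a DPP filter on that key adds nothing.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class JoinKey:
    """One equi-join condition fact_col = dim_col (hypothetical model)."""
    fact_col: str
    dim_col: str


@dataclass(frozen=True)
class EqualsLiteral:
    """A filter of the form column = literal."""
    column: str
    value: object


def inferred_static_filter(key: JoinKey,
                           dim_filters: List[EqualsLiteral]) -> Optional[EqualsLiteral]:
    """Return the fact-side filter implied by a dimension filter on the join key."""
    for f in dim_filters:
        if f.column == key.dim_col:
            # fact_col = dim_col AND dim_col = v  implies  fact_col = v
            return EqualsLiteral(key.fact_col, f.value)
    return None


def dpp_is_redundant(key: JoinKey, dim_filters: List[EqualsLiteral]) -> bool:
    """A DPP filter adds nothing when the join key is already constrained statically."""
    return inferred_static_filter(key, dim_filters) is not None


# The sample query: ON f.sid = s.store_id WHERE s.store_id = 3
key = JoinKey(fact_col="f.sid", dim_col="s.store_id")
print(dpp_is_redundant(key, [EqualsLiteral("s.store_id", 3)]))    # True
print(dpp_is_redundant(key, [EqualsLiteral("s.country", "NL")]))  # False
```

In the second call the dimension filter is on a column other than the join key, so nothing can be inferred for the fact table and dynamic pruning may still be useful.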



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-36819) DPP: Don't insert redundant filters in case static partition pruning can be done

2021-09-21 Thread Swinky Mann (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-36819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17418300#comment-17418300
 ] 

Swinky Mann commented on SPARK-36819:
-

https://github.com/apache/spark/pull/34062

> DPP: Don't insert redundant filters in case static partition pruning can be 
> done
> 
>
> Key: SPARK-36819
> URL: https://issues.apache.org/jira/browse/SPARK-36819
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.1.2
> Reporter: Swinky Mann
> Priority: Minor
>
> Don't insert dynamic partition pruning (DPP) filters when the same pruning can 
> already be done statically. If the filtering predicate on the dimension table is 
> on the join key, there is no need to insert a DPP filter.
> DPP is not required in this sample query:
>  {{SELECT f.date_id, f.pid, f.sid FROM
> (select date_id, product_id as pid, store_id as sid from fact_stats) as f
> JOIN dim_stats s
> ON f.sid = s.store_id WHERE s.store_id = 3}}






[jira] [Created] (SPARK-36819) DPP: Don't insert redundant filters in case static partition pruning can be done

2021-09-21 Thread Swinky Mann (Jira)
Swinky Mann created SPARK-36819:
---

 Summary: DPP: Don't insert redundant filters in case static 
partition pruning can be done
 Key: SPARK-36819
 URL: https://issues.apache.org/jira/browse/SPARK-36819
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.1.2
Reporter: Swinky Mann


Don't insert dynamic partition pruning (DPP) filters when the same pruning can 
already be done statically. If the filtering predicate on the dimension table is 
on the join key, there is no need to insert a DPP filter.

DPP is not required in this sample query:
 {{SELECT f.date_id, f.pid, f.sid FROM
(select date_id, product_id as pid, store_id as sid from fact_stats) as f
JOIN dim_stats s
ON f.sid = s.store_id WHERE s.store_id = 3}}






[jira] [Updated] (SPARK-35911) DPP: Update exprId for IN subquery

2021-06-27 Thread Swinky Mann (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-35911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Swinky Mann updated SPARK-35911:

Summary: DPP: Update exprId for IN subquery  (was: DPP: Update exprId for 
in subquery)

> DPP: Update exprId for IN subquery
> --
>
> Key: SPARK-35911
> URL: https://issues.apache.org/jira/browse/SPARK-35911
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.1.2
> Reporter: Swinky Mann
> Priority: Minor
>
> Change the exprId of the DPP IN subquery in the executed plan so that it 
> matches the exprId of the DynamicPruning filter in the optimized plan.
> This minor change should make debugging complex queries easier.
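
A toy model of the intent, in Python. It assumes nothing about Spark's actual planner classes: `DynamicPruningFilter`, `InSubquery`, `new_expr_id` and both `plan_*` functions are hypothetical stand-ins. It only illustrates the difference between allocating a fresh id during physical planning and carrying over the id from the optimized plan, which is what makes the two plans easy to correlate.

```python
import itertools
from dataclasses import dataclass

_ids = itertools.count(1)


def new_expr_id() -> int:
    """Allocate a fresh, unique expression id (toy counter)."""
    return next(_ids)


@dataclass(frozen=True)
class DynamicPruningFilter:
    """Toy stand-in for the pruning filter in the optimized plan."""
    pruning_key: str
    expr_id: int


@dataclass(frozen=True)
class InSubquery:
    """Toy stand-in for the IN subquery in the executed plan."""
    pruning_key: str
    expr_id: int


def plan_with_fresh_id(f: DynamicPruningFilter) -> InSubquery:
    # The ids in the two plans differ, so nodes must be matched by eye.
    return InSubquery(f.pruning_key, new_expr_id())


def plan_reusing_id(f: DynamicPruningFilter) -> InSubquery:
    # The id is carried over, so the same number appears in both plans.
    return InSubquery(f.pruning_key, f.expr_id)


optimized = DynamicPruningFilter("f.sid", new_expr_id())
print(optimized.expr_id, plan_with_fresh_id(optimized).expr_id)
print(optimized.expr_id, plan_reusing_id(optimized).expr_id)
```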






[jira] [Created] (SPARK-35911) DPP: Fix exprId for in subquery

2021-06-27 Thread Swinky Mann (Jira)
Swinky Mann created SPARK-35911:
---

 Summary: DPP: Fix exprId for in subquery
 Key: SPARK-35911
 URL: https://issues.apache.org/jira/browse/SPARK-35911
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.1.2
Reporter: Swinky Mann


Change the exprId of the DPP IN subquery in the executed plan so that it 
matches the exprId of the DynamicPruning filter in the optimized plan.
This minor change should make debugging complex queries easier.






[jira] [Updated] (SPARK-35911) DPP: Update exprId for in subquery

2021-06-27 Thread Swinky Mann (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-35911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Swinky Mann updated SPARK-35911:

Summary: DPP: Update exprId for in subquery  (was: DPP: Fix exprId for in 
subquery)

> DPP: Update exprId for in subquery
> --
>
> Key: SPARK-35911
> URL: https://issues.apache.org/jira/browse/SPARK-35911
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.1.2
> Reporter: Swinky Mann
> Priority: Minor
>
> Change the exprId of the DPP IN subquery in the executed plan so that it 
> matches the exprId of the DynamicPruning filter in the optimized plan.
> This minor change should make debugging complex queries easier.






[jira] [Commented] (SPARK-34598) RewritePredicateSubquery Rule must not update Filters without subqueries

2021-03-02 Thread Swinky Mann (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17293930#comment-17293930
 ] 

Swinky Mann commented on SPARK-34598:
-

I am working on this and will submit a PR.

> RewritePredicateSubquery Rule must not update Filters without subqueries
> 
>
> Key: SPARK-34598
> URL: https://issues.apache.org/jira/browse/SPARK-34598
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.1.1
> Reporter: Swinky Mann
> Priority: Minor
>
> 1. Currently the RewritePredicateSubquery rule rewrites the Filter node even for 
> queries without any subquery. This shouldn't happen. 
> 2. Also, `Filter(conditions.reduce(And), child)` in the rule can produce a 
> skewed (left-deep) expression tree even when the original expression is balanced.
>  
> {noformat}
> === Applying Rule org.apache.spark.sql.catalyst.optimizer.RewritePredicateSubquery ===
>  Project [a#0]                                                        Project [a#0]
> !+- Filter (((a#0 > 1) OR (b#1 > 2)) AND ((c#2 > 1) AND (d#3 > 2)))   +- Filter ((((a#0 > 1) OR (b#1 > 2)) AND (c#2 > 1)) AND (d#3 > 2))
>     +- LocalRelation <empty>, [a#0, b#1, c#2, d#3]                       +- LocalRelation <empty>, [a#0, b#1, c#2, d#3]{noformat}
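
Both points of the report can be reproduced with a small Python model. This is a sketch under stated assumptions, not the Catalyst rule: `Leaf`, `Or`, `And`, `split_conjuncts`, `naive_rewrite` and `guarded_rewrite` are hypothetical names, and `guarded_rewrite` is only one possible way to express the first point, not necessarily what the eventual fix does. `functools.reduce` is a left fold, which is why rebuilding the condition from its conjuncts yields a left-deep tree.

```python
from dataclasses import dataclass
from functools import reduce
from typing import List


@dataclass(frozen=True)
class Leaf:
    text: str
    is_subquery: bool = False


@dataclass(frozen=True)
class Or:
    left: object
    right: object


@dataclass(frozen=True)
class And:
    left: object
    right: object


def split_conjuncts(e) -> List[object]:
    """Flatten nested ANDs into a list of conjuncts, left to right."""
    if isinstance(e, And):
        return split_conjuncts(e.left) + split_conjuncts(e.right)
    return [e]


def depth(e) -> int:
    """Height of the expression tree (a leaf has depth 0)."""
    if isinstance(e, Leaf):
        return 0
    return 1 + max(depth(e.left), depth(e.right))


def has_subquery(e) -> bool:
    if isinstance(e, Leaf):
        return e.is_subquery
    return has_subquery(e.left) or has_subquery(e.right)


def naive_rewrite(cond):
    """Mimics conditions.reduce(And): always rebuilds the condition, left-deep."""
    return reduce(And, split_conjuncts(cond))


def guarded_rewrite(cond):
    """Return the condition unchanged when no conjunct contains a subquery."""
    if not any(has_subquery(c) for c in split_conjuncts(cond)):
        return cond
    return naive_rewrite(cond)


a_or_b = Or(Leaf("a > 1"), Leaf("b > 2"))
original = And(a_or_b, And(Leaf("c > 1"), Leaf("d > 2")))  # balanced
skewed = naive_rewrite(original)                            # ((a_or_b AND c) AND d)

print(depth(original), depth(skewed))          # 2 3
print(guarded_rewrite(original) is original)   # True
```

The filter above contains no subquery, so the guarded version returns the very same object and the plan is left untouched.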






[jira] [Created] (SPARK-34598) RewritePredicateSubquery Rule must not update Filters without subqueries

2021-03-02 Thread Swinky Mann (Jira)
Swinky Mann created SPARK-34598:
---

 Summary: RewritePredicateSubquery Rule must not update Filters 
without subqueries
 Key: SPARK-34598
 URL: https://issues.apache.org/jira/browse/SPARK-34598
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.1.1
Reporter: Swinky Mann


1. Currently the RewritePredicateSubquery rule rewrites the Filter node even for 
queries without any subquery. This shouldn't happen. 
2. Also, `Filter(conditions.reduce(And), child)` in the rule can produce a 
skewed (left-deep) expression tree even when the original expression is balanced.

 
{noformat}
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.RewritePredicateSubquery ===
 Project [a#0]                                                        Project [a#0]
!+- Filter (((a#0 > 1) OR (b#1 > 2)) AND ((c#2 > 1) AND (d#3 > 2)))   +- Filter ((((a#0 > 1) OR (b#1 > 2)) AND (c#2 > 1)) AND (d#3 > 2))
    +- LocalRelation <empty>, [a#0, b#1, c#2, d#3]                       +- LocalRelation <empty>, [a#0, b#1, c#2, d#3]{noformat}






[jira] [Commented] (SPARK-34222) Enhance Boolean Simplification Rule

2021-01-25 Thread Swinky Mann (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17271214#comment-17271214
 ] 

Swinky Mann commented on SPARK-34222:
-

https://github.com/apache/spark/pull/31318

> Enhance Boolean Simplification Rule
> ---
>
> Key: SPARK-34222
> URL: https://issues.apache.org/jira/browse/SPARK-34222
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.0.1
> Reporter: Swinky Mann
> Priority: Minor
>
> Enhance the boolean simplification rule to handle the following scenarios:
>  # (((a && b) && a && (a && c))) => a && b && c
>  # (((a || b) || a || (a || c))) => a || b || c
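
The two scenarios can be sketched as "flatten the chain, drop repeated operands, rebuild". The Python below is an illustration only, not the Catalyst rule: expressions are modelled as nested tuples `("and", left, right)` / `("or", left, right)` with strings as leaves, the helper names are hypothetical, and it assumes all operands are deterministic so that dropping a repeated operand cannot change the result.

```python
from functools import reduce


def flatten(expr, op):
    """Collect the operands of a chain of the same operator, left to right."""
    if isinstance(expr, tuple) and expr[0] == op:
        return flatten(expr[1], op) + flatten(expr[2], op)
    return [expr]


def simplify(expr):
    """Remove repeated operands from a nested AND chain or OR chain."""
    if not isinstance(expr, tuple):
        return expr
    op = expr[0]
    operands = [simplify(e) for e in flatten(expr, op)]
    unique = list(dict.fromkeys(operands))  # dedupe, keep first occurrence
    return reduce(lambda left, right: (op, left, right), unique)


def show(expr):
    """Render an expression with explicit parentheses."""
    if not isinstance(expr, tuple):
        return expr
    symbol = " && " if expr[0] == "and" else " || "
    return "(" + show(expr[1]) + symbol + show(expr[2]) + ")"


# (a && b) && a && (a && c)
conj = ("and", ("and", ("and", "a", "b"), "a"), ("and", "a", "c"))
# (a || b) || a || (a || c)
disj = ("or", ("or", ("or", "a", "b"), "a"), ("or", "a", "c"))

print(show(simplify(conj)))  # ((a && b) && c)
print(show(simplify(disj)))  # ((a || b) || c)
```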






[jira] [Commented] (SPARK-34222) Enhance Boolean Simplification Rule

2021-01-25 Thread Swinky Mann (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-34222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17271208#comment-17271208
 ] 

Swinky Mann commented on SPARK-34222:
-

I am working on this to open a PR.

> Enhance Boolean Simplification Rule
> ---
>
> Key: SPARK-34222
> URL: https://issues.apache.org/jira/browse/SPARK-34222
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.0.1
> Reporter: Swinky Mann
> Priority: Minor
>
> Enhance the boolean simplification rule to handle the following scenarios:
>  # (((a && b) && a && (a && c))) => a && b && c
>  # (((a || b) || a || (a || c))) => a || b || c






[jira] [Created] (SPARK-34222) Enhance Boolean Simplification Rule

2021-01-25 Thread Swinky Mann (Jira)
Swinky Mann created SPARK-34222:
---

 Summary: Enhance Boolean Simplification Rule
 Key: SPARK-34222
 URL: https://issues.apache.org/jira/browse/SPARK-34222
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 3.0.1
Reporter: Swinky Mann


Enhance the boolean simplification rule to handle the following scenarios:
 # (((a && b) && a && (a && c))) => a && b && c
 # (((a || b) || a || (a || c))) => a || b || c


