Github user ioana-delaney commented on the issue:
https://github.com/apache/spark/pull/20868
Note to the reviewers: This performance PR contains two commits: (1)
dependent DDL changes from SPARK-21784 and (2) the actual rewrite changes. The
DDL changes should be reviewed as part of
GitHub user ioana-delaney opened a pull request:
https://github.com/apache/spark/pull/20868
[SPARK-23750][SQL] Inner Join Elimination based on Informational RI
constraints
## What changes were proposed in this pull request?
This transformation detects RI joins and
Github user ioana-delaney commented on the issue:
https://github.com/apache/spark/pull/18994
@sureshthalamati Hi Suresh, We are planning to proceed with the performance
improvements. Will you be able to continue working on this PR? Thanks
Github user ioana-delaney closed the pull request at:
https://github.com/apache/spark/pull/13867
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the
Github user ioana-delaney commented on the issue:
https://github.com/apache/spark/pull/13867
With the ongoing changes to the subquery design, queries with deep
correlation will return more meaningful errors. For example, the above
mentioned query will issue the following error
Github user ioana-delaney commented on the issue:
https://github.com/apache/spark/pull/13867
@gatorsmile There were changes to the subquery design since this PR was
implemented, including changes in the error propagation. I will need to
reinvestigate how this PR fits into the new sub
Github user ioana-delaney commented on the issue:
https://github.com/apache/spark/pull/17546
@cloud-fan Thank you for merging!
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r94456
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -327,3 +349,109 @@ object
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r92785
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -327,3 +349,109 @@ object
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r92197
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -150,12 +148,15 @@ object
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r91433
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -54,8 +54,6 @@ case class
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r111063912
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -54,8 +54,6 @@ case class
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110989528
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/StarJoinCostBasedReorderSuite.scala
---
@@ -0,0 +1,426
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110988847
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/StarSchemaDetection.scala
---
@@ -76,7 +76,7 @@ case class
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110987936
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -218,28 +220,48 @@ object
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110987408
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -327,3 +349,110 @@ object
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110987294
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/StarJoinCostBasedReorderSuite.scala
---
@@ -0,0 +1,428
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110987218
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -218,28 +220,48 @@ object
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110984951
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -54,8 +54,6 @@ case class
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110742657
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -218,28 +220,44 @@ object
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110742361
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -327,3 +345,104 @@ object
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110741246
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -327,3 +345,104 @@ object
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110740755
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -54,8 +54,6 @@ case class
Github user ioana-delaney commented on the issue:
https://github.com/apache/spark/pull/17546
@cloud-fan Do you have any comments?
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110522547
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -54,14 +54,12 @@ case class
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110495345
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -736,6 +736,12 @@ object SQLConf {
.checkValue
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110466604
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -736,6 +736,12 @@ object SQLConf {
.checkValue
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110438558
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -736,6 +736,12 @@ object SQLConf {
.checkValue
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110437532
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -134,7 +132,7 @@ case class
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110318802
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -134,7 +132,7 @@ case class
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110318621
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -736,6 +736,12 @@ object SQLConf {
.checkValue
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110318101
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -327,3 +345,104 @@ object
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110314839
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/StarJoinCostBasedReorderSuite.scala
---
@@ -0,0 +1,428
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110314588
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/StarJoinCostBasedReorderSuite.scala
---
@@ -0,0 +1,428
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110314369
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -736,6 +736,12 @@ object SQLConf {
.checkValue
Github user ioana-delaney commented on the issue:
https://github.com/apache/spark/pull/17546
@wzhfy Yes, star-schema is called from both ```ReorderJoin``` and
```CostBasedJoinReorder```.
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110221774
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/StarJoinCostBasedReorderSuite.scala
---
@@ -0,0 +1,426
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110213152
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/StarJoinCostBasedReorderSuite.scala
---
@@ -0,0 +1,426
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110212976
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/StarJoinCostBasedReorderSuite.scala
---
@@ -0,0 +1,426
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110211908
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -327,3 +345,104 @@ object
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110211689
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -327,3 +345,104 @@ object
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110211394
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -327,3 +345,104 @@ object
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110210819
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -327,3 +345,104 @@ object
Github user ioana-delaney commented on the issue:
https://github.com/apache/spark/pull/17546
@wzhfy @gatorsmile @cloud-fan I've integrated star-join with join
enumeration. Would you please take a look? Thanks.
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17546#discussion_r110076651
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -150,12 +148,15 @@ object
GitHub user ioana-delaney opened a pull request:
https://github.com/apache/spark/pull/17546
[SPARK-20233] [SQL] Apply star-join filter heuristics to dynamic
programming join enumeration
## What changes were proposed in this pull request?
Implements star-join filter to
Github user ioana-delaney commented on the issue:
https://github.com/apache/spark/pull/17544
@gatorsmile Thank you!
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/17544#discussion_r11001
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -20,339 +20,13 @@ package
Github user ioana-delaney commented on the issue:
https://github.com/apache/spark/pull/17544
@gatorsmile I did a small refactoring for star schema. Would you please
review it? Thank you.
GitHub user ioana-delaney opened a pull request:
https://github.com/apache/spark/pull/17544
[SPARK-20231] [SQL] Refactor star schema code for the subsequent star join
detection in CBO
## What changes were proposed in this pull request?
This commit moves star schema code
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r107019056
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/StarJoinReorderSuite.scala
---
@@ -0,0 +1,580
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r107018483
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/StarJoinReorderSuite.scala
---
@@ -0,0 +1,580
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r107018102
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -20,19 +20,340 @@ package
Github user ioana-delaney commented on the issue:
https://github.com/apache/spark/pull/15363
retest this please
Github user ioana-delaney commented on the issue:
https://github.com/apache/spark/pull/15363
@gatorsmile @cloud-fan I rewrote the test cases to align to the join
reorder suite. Please take a look. Thanks.
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106828049
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -20,19 +20,347 @@ package
Github user ioana-delaney commented on the issue:
https://github.com/apache/spark/pull/15363
@gatorsmile Thank you. It fails on a clean build as well.
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106794032
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -20,19 +20,340 @@ package
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106793898
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -20,19 +20,347 @@ package
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106791067
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/StarJoinSuite.scala
---
@@ -0,0 +1,488 @@
+/*
+ * Licensed to the
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106790932
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -20,19 +20,347 @@ package
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106790507
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SimpleCatalystConf.scala
---
@@ -40,6 +40,9 @@ case class SimpleCatalystConf
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106790475
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -20,19 +20,347 @@ package
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106790425
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -20,19 +20,347 @@ package
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106789403
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -20,19 +20,347 @@ package
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106789422
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -20,19 +20,347 @@ package
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106789293
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -20,19 +20,347 @@ package
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106789103
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -83,9 +411,19 @@ object ReorderJoin extends Rule
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106789008
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -20,19 +20,347 @@ package
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106788869
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -20,19 +20,347 @@ package
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106788720
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -20,19 +20,347 @@ package
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106788630
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -20,19 +20,347 @@ package
Github user ioana-delaney commented on the issue:
https://github.com/apache/spark/pull/17286
@gatorsmile Your example is correct.
Given A J1 B J2 C:
• level 0: (A), (B), (C)
• level 1: {A, B}, ~{A, C}~, {B, C}
• level 2: {A, B, C}
Given A J1 B J2 C
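The level-by-level enumeration in this comment can be illustrated with a small sketch. It is written in Python for brevity and is not Spark's `CostBasedJoinReorder` code: the function name `enumerate_join_levels` and the representation of join conditions as pairs of item names are assumptions made for this example. Levels are indexed from 0 (single items), and a candidate such as {A, C} is dropped because no join condition connects its two sides.

```python
def enumerate_join_levels(items, join_edges):
    """Hypothetical helper: level k holds the connected item sets of size k + 1."""
    # Each join condition is modelled as an unordered pair of item names.
    edges = {frozenset(edge) for edge in join_edges}

    def connected(left, right):
        # Two item sets can be joined only if some condition links them.
        return any(frozenset((a, b)) in edges for a in left for b in right)

    # Level 0: every input item on its own.
    levels = [[frozenset([item]) for item in items]]
    for k in range(1, len(items)):
        found = set()
        # Combine a set from level i with a set from level j so that
        # the sizes add up to k + 1.
        for i in range(k):
            j = k - i - 1
            for left in levels[i]:
                for right in levels[j]:
                    if left.isdisjoint(right) and connected(left, right):
                        found.add(left | right)
        levels.append(sorted(found, key=sorted))
    return levels


# A J1 B J2 C: conditions exist between A and B, and between B and C.
for level, plans in enumerate(enumerate_join_levels(["A", "B", "C"],
                                                    [("A", "B"), ("B", "C")])):
    print(level, [sorted(p) for p in plans])
```

With these inputs the pair {A, C} never appears among the two-item sets, which mirrors the struck-through candidate in the list above. A real optimizer would also attach a cost to each set and keep only the cheapest plan per set; that part is omitted here.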
Github user ioana-delaney commented on the issue:
https://github.com/apache/spark/pull/15363
@cloud-fan Thank you for the comments. I am looking at them.
Github user ioana-delaney commented on the issue:
https://github.com/apache/spark/pull/17286
@wzhfy Given a set of input plans (either base table access or plans over
derived/complex plans), one can build a graph based on the join conditions
among the plans. I think join enumeration
Github user ioana-delaney commented on the issue:
https://github.com/apache/spark/pull/17286
@wzhfy One way to solve the Cartesian problem as part of the
join enumeration algorithm is to apply a strategy similar to the one that we
discussed for star-plans. You keep track
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106767465
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -20,19 +20,347 @@ package
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106706226
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala
---
@@ -167,8 +167,8 @@ object
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106558961
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -51,6 +51,11 @@ case class
Github user ioana-delaney commented on the issue:
https://github.com/apache/spark/pull/15363
@gatorsmile @hvanhovell @wzhfy @ron8hu Please let me know if we can move
forward with this review. @wzhfy I removed the star-join call from
CostBasedJoinReorder until the two are integrated
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106282966
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -51,6 +51,11 @@ case class
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106255224
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -51,6 +51,11 @@ case class
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106088842
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -51,6 +51,11 @@ case class
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106074890
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -51,6 +51,11 @@ case class
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r106012628
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
---
@@ -51,6 +51,11 @@ case class
Github user ioana-delaney commented on the issue:
https://github.com/apache/spark/pull/15363
@wzhfy I made some changes to call star join detection from
```CostBasedJoinReorder```. I didn't integrate with the DP algorithm though.
For now, I only made the star join available to cbo
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r105515039
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -20,19 +20,340 @@ package
Github user ioana-delaney commented on the issue:
https://github.com/apache/spark/pull/15363
@wzhfy Thank you, Zhenhua. With the cost-based optimizer in place, yes, it
makes sense to only call star schema detection in the context of the new
algorithm.
There are two parts to
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r105089766
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -20,19 +20,340 @@ package
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r105089751
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -389,6 +389,18 @@ object SQLConf {
.booleanConf
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r105089699
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala
---
@@ -167,7 +167,8 @@ object
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r105089676
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala
---
@@ -257,3 +258,28 @@ object PhysicalAggregation
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r105044802
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala
---
@@ -257,3 +258,28 @@ object PhysicalAggregation
Github user ioana-delaney commented on the issue:
https://github.com/apache/spark/pull/15363
@wzhfy I've looked at the new CBO join reordering. The star schema
detection can be used as follows:
Assume a four-way join: A, B, C, D.
Star schema join detection is c
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r105023746
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -20,19 +20,340 @@ package
Github user ioana-delaney commented on the issue:
https://github.com/apache/spark/pull/15363
@wzhfy Star schema detection works with both positional join and the cost-based
optimizer, as it finds relationships among the joined tables. So I don't see a
major reason for postponin
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r104825507
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -20,19 +20,342 @@ package
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r104825494
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -20,19 +20,342 @@ package
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r104825547
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -20,19 +20,342 @@ package
Github user ioana-delaney commented on a diff in the pull request:
https://github.com/apache/spark/pull/15363#discussion_r104825251
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
---
@@ -20,19 +20,342 @@ package