[GitHub] spark issue #20868: [SPARK-23750][SQL] Inner Join Elimination based on Infor...

2018-03-20 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/20868 Note to the reviewers: This performance PR contains two commits: (1) dependent DDL changes from SPARK-21784 and (2) the actual rewrite changes. The DDL changes should be reviewed as part

[GitHub] spark pull request #20868: [SPARK-23750][SQL] Inner Join Elimination based o...

2018-03-20 Thread ioana-delaney
GitHub user ioana-delaney opened a pull request: https://github.com/apache/spark/pull/20868 [SPARK-23750][SQL] Inner Join Elimination based on Informational RI constraints ## What changes were proposed in this pull request? This transformation detects RI joins

[GitHub] spark issue #18994: [SPARK-21784][SQL] Adds support for defining information...

2018-03-10 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/18994 @sureshthalamati Hi Suresh, We are planning to proceed with the performance improvements. Will you be able to continue working on this PR? Thanks

[GitHub] spark pull request #13867: [SPARK-16161][SQL] Ambiguous error message for un...

2017-06-28 Thread ioana-delaney
Github user ioana-delaney closed the pull request at: https://github.com/apache/spark/pull/13867 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so

[GitHub] spark issue #13867: [SPARK-16161][SQL] Ambiguous error message for unsupport...

2017-06-28 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/13867 With the ongoing changes to the subquery design, queries with deep correlation will return more meaningful errors. For example, the above mentioned query will issue the following error

[GitHub] spark issue #13867: [SPARK-16161][SQL] Ambiguous error message for unsupport...

2017-06-26 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/13867 @gatorsmile There were changes to the subquery design since this PR was implemented, including changes in the error propagation. I will need to reinvestigate how this PR fits into the new sub

[GitHub] spark issue #17546: [SPARK-20233] [SQL] Apply star-join filter heuristics to...

2017-04-13 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/17546 @cloud-fan Thank you for merging!

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-12 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r94456 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -327,3 +349,109 @@ object

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-12 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r92785 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -327,3 +349,109 @@ object

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-12 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r92197 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -150,12 +148,15 @@ object

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-12 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r91433 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -54,8 +54,6 @@ case class

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-11 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r111063912 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -54,8 +54,6 @@ case class

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-11 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110989528 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/StarJoinCostBasedReorderSuite.scala --- @@ -0,0 +1,426

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-11 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110988847 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/StarSchemaDetection.scala --- @@ -76,7 +76,7 @@ case class

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-11 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110987936 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -218,28 +220,48 @@ object

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-11 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110987408 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -327,3 +349,110 @@ object

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-11 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110987294 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/StarJoinCostBasedReorderSuite.scala --- @@ -0,0 +1,428

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-11 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110987218 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -218,28 +220,48 @@ object

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-11 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110984951 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -54,8 +54,6 @@ case class

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-10 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110742657 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -218,28 +220,44 @@ object

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-10 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110742361 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -327,3 +345,104 @@ object

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-10 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110741246 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -327,3 +345,104 @@ object

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-10 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110740755 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -54,8 +54,6 @@ case class

[GitHub] spark issue #17546: [SPARK-20233] [SQL] Apply star-join filter heuristics to...

2017-04-08 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/17546 @cloud-fan Do you have any comments?

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-08 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110522547 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -54,14 +54,12 @@ case class

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-07 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110495345 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -736,6 +736,12 @@ object SQLConf { .checkValue

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-07 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110466604 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -736,6 +736,12 @@ object SQLConf { .checkValue

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-07 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110438558 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -736,6 +736,12 @@ object SQLConf { .checkValue

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-07 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110437532 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -134,7 +132,7 @@ case class

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-06 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110318802 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -134,7 +132,7 @@ case class

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-06 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110318621 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -736,6 +736,12 @@ object SQLConf { .checkValue

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-06 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110318101 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -327,3 +345,104 @@ object

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-06 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110314839 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/StarJoinCostBasedReorderSuite.scala --- @@ -0,0 +1,428

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-06 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110314588 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/StarJoinCostBasedReorderSuite.scala --- @@ -0,0 +1,428

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-06 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110314369 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -736,6 +736,12 @@ object SQLConf { .checkValue

[GitHub] spark issue #17546: [SPARK-20233] [SQL] Apply star-join filter heuristics to...

2017-04-06 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/17546 @wzhfy Yes, star-schema is called from both ```ReorderJoin``` and ```CostBasedJoinReorder```.

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-06 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110221774 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/StarJoinCostBasedReorderSuite.scala --- @@ -0,0 +1,426

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-06 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110213152 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/StarJoinCostBasedReorderSuite.scala --- @@ -0,0 +1,426

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-06 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110212976 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/StarJoinCostBasedReorderSuite.scala --- @@ -0,0 +1,426

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-06 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110211908 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -327,3 +345,104 @@ object

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-06 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110211689 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -327,3 +345,104 @@ object

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-06 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110211394 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -327,3 +345,104 @@ object

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-06 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110210819 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -327,3 +345,104 @@ object

[GitHub] spark issue #17546: [SPARK-20233] [SQL] Apply star-join filter heuristics to...

2017-04-05 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/17546 @wzhfy @gatorsmile @cloud-fan I've integrated star-join with join enumeration. Would you please take a look? Thanks.

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-05 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110076651 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -150,12 +148,15 @@ object

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-05 Thread ioana-delaney
GitHub user ioana-delaney opened a pull request: https://github.com/apache/spark/pull/17546 [SPARK-20233] [SQL] Apply star-join filter heuristics to dynamic programming join enumeration ## What changes were proposed in this pull request? Implements star-join filter

[GitHub] spark issue #17544: [SPARK-20231] [SQL] Refactor star schema code for the su...

2017-04-05 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/17544 @gatorsmile Thank you!

[GitHub] spark pull request #17544: [SPARK-20231] [SQL] Refactor star schema code for...

2017-04-05 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17544#discussion_r11001 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,339 +20,13 @@ package

[GitHub] spark issue #17544: [SPARK-20231] [SQL] Refactor star schema code for the su...

2017-04-05 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/17544 @gatorsmile I did a small refactoring for star schema. Would you please review? Thank you.

[GitHub] spark pull request #17544: [SPARK-20231] [SQL] Refactor star schema code for...

2017-04-05 Thread ioana-delaney
GitHub user ioana-delaney opened a pull request: https://github.com/apache/spark/pull/17544 [SPARK-20231] [SQL] Refactor star schema code for the subsequent star join detection in CBO ## What changes were proposed in this pull request? This commit moves star schema code

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-20 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r107019056 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/StarJoinReorderSuite.scala --- @@ -0,0 +1,580

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-20 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r107018483 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/StarJoinReorderSuite.scala --- @@ -0,0 +1,580

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-20 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r107018102 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,340 @@ package

[GitHub] spark issue #15363: [SPARK-17791][SQL] Join reordering using star schema det...

2017-03-19 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/15363 retest this please

[GitHub] spark issue #15363: [SPARK-17791][SQL] Join reordering using star schema det...

2017-03-19 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/15363 @gatorsmile @cloud-fan I rewrote the test cases to align to the join reorder suite. Please take a look. Thanks.

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-19 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106828049 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,347 @@ package

[GitHub] spark issue #15363: [SPARK-17791][SQL] Join reordering using star schema det...

2017-03-18 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/15363 @gatorsmile Thank you. It fails on a clean build as well.

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-18 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106794032 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,340 @@ package

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-18 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106793898 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,347 @@ package

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-18 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106791067 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/StarJoinSuite.scala --- @@ -0,0 +1,488 @@ +/* + * Licensed

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-18 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106790932 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,347 @@ package

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-18 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106790507 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SimpleCatalystConf.scala --- @@ -40,6 +40,9 @@ case class SimpleCatalystConf

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-18 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106790475 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,347 @@ package

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-18 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106790425 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,347 @@ package

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-18 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106789403 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,347 @@ package

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-18 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106789422 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,347 @@ package

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-18 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106789293 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,347 @@ package

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-18 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106789103 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -83,9 +411,19 @@ object ReorderJoin extends Rule

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-18 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106789008 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,347 @@ package

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-18 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106788869 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,347 @@ package

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-18 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106788720 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,347 @@ package

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-18 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106788630 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,347 @@ package

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-18 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/17286 @gatorsmile Your example is correct. Given A J1 B J2 C: • level 0: (A), (B), (C) • level 1: {A, B}, ~{A, C}~, {B, C} • level 2: {A, B, C} Given A J1 B J2 C
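
The level listing in the comment above can be reproduced with a small sketch. This is not Spark's `CostBasedJoinReorder` code; it is a minimal Python illustration that assumes a hypothetical helper `enumerate_join_sets`, treats each join condition as an undirected edge between two inputs, and keeps a candidate set only when the two halves being combined share at least one join condition. Under that assumption, the pair {A, C} is dropped because A and C are joined only through B.

```python
def enumerate_join_sets(items, join_edges):
    """Level-wise enumeration of joinable item sets (hypothetical helper).

    levels[k] holds the sets of k + 1 inputs that can be formed without
    a cartesian product, given the join conditions in join_edges.
    """
    edges = {frozenset(edge) for edge in join_edges}

    def connected(left, right):
        # True if at least one join condition links the two sets.
        return any(frozenset((a, b)) in edges for a in left for b in right)

    # First level: every input on its own.
    levels = [[frozenset([item]) for item in items]]

    for size in range(2, len(items) + 1):
        found = []
        # Build each set of `size` inputs from two smaller, disjoint sets.
        for left_size in range(1, size):
            right_size = size - left_size
            for left in levels[left_size - 1]:
                for right in levels[right_size - 1]:
                    if left.isdisjoint(right) and connected(left, right):
                        candidate = left | right
                        if candidate not in found:
                            found.append(candidate)
        levels.append(found)
    return levels


# A J1 B J2 C: J1 links A and B, J2 links B and C.
levels = enumerate_join_sets(["A", "B", "C"], [("A", "B"), ("B", "C")])
for index, level in enumerate(levels):
    print(index, [sorted(s) for s in level])
```

A real optimizer would also carry a plan and a cost with each set; the sketch only tracks which sets survive the cartesian-product filter.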

[GitHub] spark issue #15363: [SPARK-17791][SQL] Join reordering using star schema det...

2017-03-18 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/15363 @cloud-fan Thank you for the comments. I am looking at them.

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-18 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/17286 @wzhfy Given a set of input plans (either base table access or plans over derived/complex plans), one can build a graph based on the join conditions among the plans. I think join enumeration
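
The idea of building a graph from the join conditions can be sketched as follows. The comment is truncated in this archive, so what the graph is then used for is an assumption here: the sketch uses it to check whether a set of plans is connected, which is one way to tell whether joining them would require a cartesian product. The names `build_join_graph` and `is_connected` are hypothetical and do not come from Spark.

```python
def build_join_graph(plans, join_conditions):
    """Adjacency sets: one node per input plan, one edge per join condition."""
    graph = {plan: set() for plan in plans}
    for left, right in join_conditions:
        graph[left].add(right)
        graph[right].add(left)
    return graph


def is_connected(graph, subset):
    """True if every plan in subset is reachable from any other within subset."""
    subset = set(subset)
    if not subset:
        return True
    start = next(iter(subset))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        # Follow only edges that stay inside the subset.
        for neighbour in graph[node] & subset:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == subset


graph = build_join_graph(["A", "B", "C"], [("A", "B"), ("B", "C")])
print(is_connected(graph, {"A", "B"}))
print(is_connected(graph, {"A", "C"}))
```

The inputs are plain labels here; in an optimizer each node would be a base table access or a plan over a derived or complex plan, as the comment describes.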

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-17 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/17286 @wzhfy One way to solve the Cartesian problem as part of the join enumeration algorithm is to apply a strategy similar to the one that we discuss for star-plans. You keep track

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-17 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106767465 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,347 @@ package

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-17 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106706226 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala --- @@ -167,8 +167,8 @@ object

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-16 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106558961 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -51,6 +51,11 @@ case class

[GitHub] spark issue #15363: [SPARK-17791][SQL] Join reordering using star schema det...

2017-03-15 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/15363 @gatorsmile @hvanhovell @wzhfy @ron8hu Please let me know if we can move forward with this review. @wzhfy I removed the star-join call from CostBasedJoinReorder until the two are integrated

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-15 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106282966 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -51,6 +51,11 @@ case class

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-15 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106255224 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -51,6 +51,11 @@ case class

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-14 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106088842 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -51,6 +51,11 @@ case class

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-14 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106074890 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -51,6 +51,11 @@ case class

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-14 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106012628 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -51,6 +51,11 @@ case class

[GitHub] spark issue #15363: [SPARK-17791][SQL] Join reordering using star schema det...

2017-03-10 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/15363 @wzhfy I made some changes to call star join detection from ```CostBasedJoinReorder```. I didn’t integrate with the DP algorithm though. For now, I only made the star join available to CBO

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-10 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r105515039 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,340 @@ package

[GitHub] spark issue #15363: [SPARK-17791][SQL] Join reordering using star schema det...

2017-03-09 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/15363 @wzhfy Thank you, Zhenhua. With the cost based optimizer in place, yes, it makes sense to only call Star schema detection in the context of the new algorithm. There are two parts

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-08 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r105089766 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,340 @@ package

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-08 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r105089751 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -389,6 +389,18 @@ object SQLConf { .booleanConf

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-08 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r105089699 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala --- @@ -167,7 +167,8 @@ object

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-08 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r105089676 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala --- @@ -257,3 +258,28 @@ object PhysicalAggregation

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-08 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r105044802 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala --- @@ -257,3 +258,28 @@ object PhysicalAggregation

[GitHub] spark issue #15363: [SPARK-17791][SQL] Join reordering using star schema det...

2017-03-08 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/15363 @wzhfy I've looked at the new CBO join reordering. The star schema detection can be used as follows: Assume a four-way join: A, B, C, D. Star schema join detection is called

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-08 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r105023746 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,340 @@ package

[GitHub] spark issue #15363: [SPARK-17791][SQL] Join reordering using star schema det...

2017-03-07 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/15363 @wzhfy Star schema detection works with both positional join and cost based optimizer as it finds relationships among the joined tables. So I don't see a major reason for postponing its

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-07 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r104825507 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,342 @@ package

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-07 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r104825494 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,342 @@ package

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-07 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r104825547 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,342 @@ package

[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...

2017-03-07 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r104825251 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,19 +20,342 @@ package
