[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-19 Thread wzhfy
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/17286 @ioana-delaney Thanks for the replies.
> Given A J1 B J2 C:
- case 1
• level 0: (A), (B), (C)
• level 1: {A, B}, ~~{A, C}~~, {B, C}
• level 3: {A, B, C}
Given A J1 B J2

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-18 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/17286 Right. I misread it. If there is no join predicate between a table and any cluster of tables, we should not consider that table in the join enumeration at all. We can simply push that table to be the

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-18 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/17286 @gatorsmile Your example is correct. Given A J1 B J2 C:
• level 0: (A), (B), (C)
• level 1: {A, B}, ~{A, C}~, {B, C}
• level 3: {A, B, C}
Given A J1 B J2 C

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17286 My example is not related to inequality join or equi join. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-18 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/17286 @gatorsmile In most cases an equality join filters better than an inequality join. This can be used heuristically. However, this is not always true. An equality join can be a lookup join from

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17286 Given {A, B, C} with the join conditions {A.a = B.b, B.b > C.c}. `{A, C}` should be pruned.
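The pruning in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the pull request: the names `connected` and `enumerate_levels`, and the representation of a join condition as the set of tables it references, are hypothetical. A pair of plans is kept only if some condition references both sides, whether the predicate is an equality or an inequality.

```python
# Hypothetical representation: each join condition is (tables it references, text).
conditions = [
    (frozenset({"A", "B"}), "A.a = B.b"),
    (frozenset({"B", "C"}), "B.b > C.c"),
]


def connected(left, right, conds):
    """True if some condition references both sides and no table outside them."""
    union = left | right
    return any(
        refs <= union and refs & left and refs & right
        for refs, _ in conds
    )


def enumerate_levels(tables, conds):
    """Build join sets level by level, skipping pairs with no join condition."""
    # Level 0 holds the single tables; level k holds sets of k + 1 tables.
    levels = [{frozenset({t}) for t in tables}]
    for k in range(1, len(tables)):
        found = set()
        for i in range(k):
            j = k - 1 - i  # sizes (i + 1) + (j + 1) add up to k + 1
            for left in levels[i]:
                for right in levels[j]:
                    if left & right:
                        continue  # overlapping sets cannot be joined
                    if connected(left, right, conds):
                        found.add(left | right)
        levels.append(found)
    return levels


levels = enumerate_levels(["A", "B", "C"], conditions)
for depth, sets in enumerate(levels):
    print(depth, sorted(sorted(s) for s in sets))
```

With the two conditions above, the two-table level contains {A, B} and {B, C} but not {A, C}, because no condition references both A and C.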

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-18 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/17286 @wzhfy Given a set of input plans (either base table access or plans over derived/complex plans), one can build a graph based on the join conditions among the plans. I think join enumeration
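One way to picture the graph mentioned here is the following Python sketch, which groups input plans into connected components of the join graph. It is an assumption-laden illustration, not the approach taken in the pull request: the function name `connected_components` is hypothetical, plans are treated as opaque names, and each join condition is reduced to the set of plans it references. For subplans, a real implementation would have to match predicates against each plan's output attributes.

```python
def connected_components(items, conditions):
    """Group items that are linked, directly or transitively, by join conditions."""
    # Adjacency list: two items are neighbours if one condition references both.
    adjacency = {item: set() for item in items}
    for refs in conditions:
        for a in refs:
            for b in refs:
                if a != b:
                    adjacency[a].add(b)

    seen = set()
    components = []
    for start in items:
        if start in seen:
            continue
        # Depth-first traversal collects everything reachable from `start`.
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in sorted(adjacency[node]):
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(sorted(component))
    return components


# D has no join predicate with any other plan, so it forms its own component.
components = connected_components(
    ["A", "B", "C", "D"],
    [frozenset({"A", "B"}), frozenset({"B", "C"})],
)
print(components)
```

Each component could then be reordered on its own, with the components combined by cross joins afterwards; that final step is outside this sketch.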

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-18 Thread wzhfy
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/17286 @ioana-delaney How can we get disconnected parts before reordering? They can be not only leaf tables (this case is easier to deal with) but also subplans (tables can be joined internally but among

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17286 thanks, merging to master!

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-17 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/17286 @wzhfy One idea for solving the Cartesian problem as part of the join enumeration algorithm is to apply a strategy similar to the one that we discuss for star-plans. You keep track

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17286 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74741/ Test PASSed.

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17286 Merged build finished. Test PASSed.

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17286 **[Test build #74741 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74741/testReport)** for PR 17286 at commit

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17286 **[Test build #74741 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74741/testReport)** for PR 17286 at commit

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-17 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17286 retest this please

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17286 Merged build finished. Test FAILed.

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17286 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74737/ Test FAILed.

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17286 **[Test build #74737 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74737/testReport)** for PR 17286 at commit

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17286 **[Test build #74737 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74737/testReport)** for PR 17286 at commit

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-16 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17286 LGTM except some minor comments

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17286 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74656/ Test PASSed.

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17286 Merged build finished. Test PASSed.

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17286 **[Test build #74656 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74656/testReport)** for PR 17286 at commit

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17286 **[Test build #74656 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74656/testReport)** for PR 17286 at commit

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-16 Thread wzhfy
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/17286 Co-existing cross joins (joins without a condition) and inner joins make the reordering procedure cumbersome. On the one hand, putting cross join candidates into the memo is not good in terms of search

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17286 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74528/ Test PASSed.

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17286 Merged build finished. Test PASSed.

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17286 **[Test build #74528 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74528/testReport)** for PR 17286 at commit

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17286 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74527/ Test FAILed.

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17286 Build finished. Test FAILed.

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17286 **[Test build #74527 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74527/testReport)** for PR 17286 at commit

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17286 **[Test build #74528 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74528/testReport)** for PR 17286 at commit

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17286 **[Test build #74527 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74527/testReport)** for PR 17286 at commit

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17286 Merged build finished. Test PASSed.

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17286 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74500/ Test PASSed.

[GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...

2017-03-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17286 **[Test build #74500 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74500/testReport)** for PR 17286 at commit