[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-07-08 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13765 thanks, merging to master! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-07-08 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13765 Unfortunately I'm having a network issue here and failed to fetch the branch from GitHub :( @cloud-fan Could you please help merge this one? Thanks.

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-07-08 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13765 Definitely, it's my bad. I should have spent more time coming up with a reasonable scenario from the start. Thank you, @liancheng, @cloud-fan, @gatorsmile.

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-07-08 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13765 Thanks for the examples. This makes sense and LGTM now. Merging into master.

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-07-08 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13765 @gatorsmile Does it make sense to you? I added a SQL case for you. :)

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-07-08 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13765 Hi, All. Here is a better example. I'll update the PR description with this. Thank you, all. **Target Scenario** ```scala scala> val dsView1 =

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-07-08 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13765 By the way, this PR is about improving the existing `CollapseRepartition`, not about introducing a new one. :)

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-07-08 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13765 Hi, @cloud-fan, @liancheng, @gatorsmile. We can imagine a data science environment with many predefined system-wide tables or datasets. But, here, I simplify that into an

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-07-08 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13765 Thank you for the review, @gatorsmile. You work at night, too! :) BTW, as you know, I don't have a real query. May I make up some sample queries?

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-07-08 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/13765 Can you show us an example query?

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-07-07 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13765 There are three possibilities. 1. User mistakes. (Rarely.) 2. Intermediate results of optimization. (More frequently.) 3. `View` (or pre-designed `Dataset`).

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-07-07 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13765 Under what circumstances will a user use 2 or more adjacent re-partitioning operators?

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-07-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13765 Ping @cloud-fan and @yhuai . :)

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-07-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13765 Merged build finished. Test PASSed.

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-07-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13765 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61834/ Test PASSed.

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-07-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13765 **[Test build #61834 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61834/consoleFull)** for PR 13765 at commit

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-07-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13765 **[Test build #61834 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61834/consoleFull)** for PR 13765 at commit

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-07-05 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13765 Hi, @yhuai . Could you review this PR when you have some time?

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-07-03 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13765 LGTM, cc @yhuai to take another look

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13765 Merged build finished. Test PASSed.

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13765 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61663/ Test PASSed.

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-07-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13765 **[Test build #61663 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61663/consoleFull)** for PR 13765 at commit

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-07-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13765 **[Test build #61663 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61663/consoleFull)** for PR 13765 at commit

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-07-02 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13765 Rebased to the master.

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-06-30 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13765 Hi, @cloud-fan . Could you review the `CollapseRepartition` optimizer again when you have some time?

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13765 Merged build finished. Test PASSed.

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-06-30 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13765 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61564/ Test PASSed.

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-06-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13765 **[Test build #61564 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61564/consoleFull)** for PR 13765 at commit

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-06-30 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13765 **[Test build #61564 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61564/consoleFull)** for PR 13765 at commit

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-06-28 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13765 @rxin . For this `CollapseRepartition` optimizer, may I split it into another file?

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13765 Merged build finished. Test PASSed.

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13765 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61398/ Test PASSed.

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13765 **[Test build #61398 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61398/consoleFull)** for PR 13765 at commit

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13765 **[Test build #61398 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61398/consoleFull)** for PR 13765 at commit

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-06-28 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13765 Retest this please.

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13765 **[Test build #61390 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61390/consoleFull)** for PR 13765 at commit

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-06-28 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13765 Hi, @cloud-fan and @yhuai . I added more description about the logic. I hope that makes the intention of this PR clearer.

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-06-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13765 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61309/ Test PASSed.

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-06-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13765 Merged build finished. Test PASSed.

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-06-27 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13765 **[Test build #61309 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61309/consoleFull)** for PR 13765 at commit

[GitHub] spark issue #13765: [SPARK-16052][SQL] Improve `CollapseRepartition` optimiz...

2016-06-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/13765 Hi, @cloud-fan . Now, this PR can handle all combinations of Repartition and RepartitionBy. I updated the description of PR and JIRA, too. Thank you so much for making this PR much
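
For readers following the thread, the idea of collapsing adjacent repartitioning operators can be illustrated on a toy plan tree. This is only a minimal sketch and not the code from the PR: `Plan`, `Relation`, `Repartition`, `RepartitionBy`, `collapse`, and `stripRepartitions` are hypothetical stand-ins rather than Spark's Catalyst classes, and the sketch assumes that within a run of directly adjacent repartitioning operators only the outermost one determines the final partitioning.

```scala
// Toy model only: hypothetical stand-ins, not Spark's Catalyst classes.
object CollapseSketch {
  sealed trait Plan
  case class Relation(name: String) extends Plan
  // Models a repartition/coalesce to a fixed number of partitions.
  case class Repartition(numPartitions: Int, shuffle: Boolean, child: Plan) extends Plan
  // Models repartitioning by column expressions (kept as plain strings here).
  case class RepartitionBy(exprs: Seq[String], child: Plan) extends Plan

  // Drop every repartitioning operator sitting directly at the top of `plan`.
  @scala.annotation.tailrec
  private def stripRepartitions(plan: Plan): Plan = plan match {
    case Repartition(_, _, child) => stripRepartitions(child)
    case RepartitionBy(_, child)  => stripRepartitions(child)
    case other                    => other
  }

  // Keep the outermost repartitioning operator and remove the ones directly below it.
  def collapse(plan: Plan): Plan = plan match {
    case Repartition(n, shuffle, child) =>
      Repartition(n, shuffle, collapse(stripRepartitions(child)))
    case RepartitionBy(exprs, child) =>
      RepartitionBy(exprs, collapse(stripRepartitions(child)))
    case leaf => leaf
  }
}
```

In this toy model, a plan such as `Repartition(10, true, RepartitionBy(Seq("a"), Relation("t")))` collapses to `Repartition(10, true, Relation("t"))`, covering the mixed `Repartition`/`RepartitionBy` combinations mentioned in the thread.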