[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2018-01-23 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19082 Sorry for missing the excellent discussions here. After more thoughts, I feel this patch still makes sense. Currently we have 3 cases that may cause large method with whole stage codegen:

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-13 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/19082 @gatorsmile can you close the jira ticket with @viirya assigned? Thanks! --- - To unsubscribe, e-mail:

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-13 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/19082 Since https://github.com/apache/spark/pull/19813 fixes this issue, so I'll close this. Thanks! --- - To unsubscribe, e-mail:

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84807/ Test PASSed. ---

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #84807 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84807/testReport)** for PR 19082 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #84807 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84807/testReport)** for PR 19082 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84774/ Test PASSed. ---

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #84774 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84774/testReport)** for PR 19082 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84770/ Test PASSed. ---

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #84770 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84770/testReport)** for PR 19082 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #84774 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84774/testReport)** for PR 19082 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #84770 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84770/testReport)** for PR 19082 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-12 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/19082 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84734/ Test FAILed. ---

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #84734 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84734/testReport)** for PR 19082 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #84734 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84734/testReport)** for PR 19082 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-11 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19082 Just FYI, the test failure is not related to this PR. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84724/ Test FAILed. ---

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #84724 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84724/testReport)** for PR 19082 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #84724 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84724/testReport)** for PR 19082 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-11 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19082 Jenkins, retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84717/ Test FAILed. ---

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #84717 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84717/testReport)** for PR 19082 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84713/ Test FAILed. ---

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #84713 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84713/testReport)** for PR 19082 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84711/ Test FAILed. ---

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #84711 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84711/testReport)** for PR 19082 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #84717 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84717/testReport)** for PR 19082 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #84713 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84713/testReport)** for PR 19082 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #84711 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84711/testReport)** for PR 19082 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-10 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/19082 Ya, I was just trying to do that tonight! Thanks for pinging me. --- - To unsubscribe, e-mail:

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-12-10 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19082 @maropu could you please resolve conflicts? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-11-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84126/ Test PASSed. ---

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-11-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-11-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #84126 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84126/testReport)** for PR 19082 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-11-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #84126 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84126/testReport)** for PR 19082 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-11-21 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/19082 ya, enjoy! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-11-21 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19082 Will review it carefully after I finish my vacation. Thanks! --- - To unsubscribe, e-mail:

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-11-18 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/19082 ping --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-11-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-11-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83982/ Test PASSed. ---

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-11-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #83982 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83982/testReport)** for PR 19082 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-11-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #83982 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83982/testReport)** for PR 19082 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-11-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-11-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83981/ Test FAILed. ---

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-11-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #83981 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83981/testReport)** for PR 19082 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-11-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #83981 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83981/testReport)** for PR 19082 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82615/ Test PASSed. ---

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #82615 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82615/testReport)** for PR 19082 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-10 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 @kiszk @gatorsmile @rednaxelafx Based on the following reasoning, I don't think the cutting of whole-stage operators you may propose would be much beneficial than the approach did in #18931:

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #82615 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82615/testReport)** for PR 19082 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-10 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 > If we can cut the boundary of whole-stage operators better, we still can make the byte code size smaller than 8K. That means, we are not completely disabling the whole-stage codegen.

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-10 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 >In the above case, 1. is slower than 2. due to the advantage of JIT compilation. >#18931 can cut the boundary of whole-stage operators. Hopefully, this cutting makes the byte code size smaller

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-10 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 > You know, when we disable the whole-stage codegen, we still do the expression codegen, whose byte code size is smaller than 8K at most cases. Thus, it could be faster than whole-stage codegen.

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-10 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 > At the beginning, I did not notice #18931 is using private or final attributes; otherwise, inlining does not benefit. It will be the same as the whole-stage codegen is turned off. The current

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-10 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 @kiszk Thanks for explanation! It sounds reasonable behavior by JIT compiler. --- - To unsubscribe, e-mail:

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-10 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19082 @kiszk This is one of the solutions if we can remove the limit of num of parameters. However, this does not resolve all the issues. For example, the method becomes too big to inline. ---

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-10 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19082 @gatorsmile let me clarify. In other words, do you want to find how to cut the boundary to ensure enabling method inlining for callee methods? It could maximize advantage of both whole-stage codegen

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-10 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19082 https://github.com/apache/spark/pull/18931 is not what we want, although it can partially resolves some issues. Simply disabling the whole-stage codegen might trigger the regression like Q66.

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-10 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19082 I agree with you. #18810 compares the following two code. 1. Interpreter execution of Java code by whole-stage codegen with passing row data in scalar values 2. JITted execution of Java code

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-10 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19082 You know, when we disable the whole-stage codegen, we still do the expression codegen, which byte code size is smaller than 8K at most cases. Thus, it could be faster than whole-stage codegen.

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-10 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19082 The whole-stage codegen has two advantages according to [this paper](http://www.vldb.org/pvldb/vol4/p539-neumann.pdf). 1. enable compiler optimizations among operations (3. in page 2) 2. pass

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-10 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19082 @kiszk Thanks for your summary. I had a few related discussions with @rednaxelafx and @liancheng in the recent weeks. Vertical cuts like https://github.com/apache/spark/pull/19082 is pretty

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-10 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19082 To my best guess, trying to make huge method by method inlining will not fail JIT compilation in Hotspot. It may fail method inlining. According to these blog entries

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-10 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 @kiszk Thanks for summarizing the PRs. I just have a question about inlining method by JIT compiler. So you mean JIT compiler will inline methods into larger unit and then do JIT compilation

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-10 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19082 Let me summarize recent interesting PRs for code generation regarding JVM bytecode limit for JIT compilation. These PRs encourages to apply JIT compilation to more methods since most of JIT compilers

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-09 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/19082 Aha, I feel fair enough. Based the insight, there is one of solutions to make the wholestage codegen consider #calls of gen'd functions though, it seems the approach is not simple. So, splitting

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 The above reasoning also explains the motivation and the effect of #18931 too. The generated codes of query operators are extracted to individual smaller functions. It is beneficial to step

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 @maropu The codes to do aggregation are actually wrapped in a function `doAggregateWithKeys`/`doAggregateWithoutKey`. This is also the part of generated codes this PR improves by extracting

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-09 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/19082 Either way, I think we first need to know why the regression on `q66` happens when turning off wholestage codegen. We first thought turning off too-long functions had better performance, but it is

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-09 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19082 Yes no need for a back-and-forth. @gatorsmile I think it's reasonable to ask for a little more detail on your comment. --- - To

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 @gatorsmile If my words make you upset, I'm sorry. It's you right to raise suspicion against any PRs. I do respect this right. Maybe I'm wrong and there actually is a possible regression.

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19082 OK. Based on your attitudes, I do not care it. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 If you really can't or don't willing to explain, we can wait for @kiszk or @rednaxelafx to explain it. --- - To unsubscribe,

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 @gatorsmile You can't just drop few words without explanation and simply say that others works don't work. I do respect your work/comments and you should respect my works too. ---

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19082 Just try to help you understand it. If you don't want it, I can keep quiet. --- - To unsubscribe, e-mail:

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 No. I think you should explain it because this is your concern. You just need to point out, where is the single blocking loop for the two operators? ---

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19082 Sorry, it is really hard to explain it to you. Maybe you can first think about why the whole-stage codegen works better the previous solution? ---

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 > After horizontal/ring cutting, each operator will be in a single blocking loop. That means, whole-stage codegen is off. And do we have anything to cut the operators to individual single

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 Yeah, once we exceed `hugeMethodLimit`, wholestage codegen is disabled. Can you tell me how `hugeMethodLimit` makes each operator to a single blocking loop? We didn't run codegen now. ---

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19082 The relation is very clear. Let me copy what I said above > Just imagine we have two nodes, we want to do a horizontal/ring cut. Basically, in this scenario, horizontal/ring cutting

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 Ok. Any comments from @kiszk and @rednaxelafx on this topic? :) --- - To unsubscribe, e-mail:

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19082 Nope. You can wait for the answers from the JVM experts. : ) --- - To unsubscribe, e-mail:

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-09 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 @gatorsmile Are you talking about this: https://en.wikipedia.org/wiki/Loop_nest_optimization? --- - To unsubscribe,

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-08 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19082 Basically, cutting is to decide the boundaries of `blocking loop`. @kiszk and @rednaxelafx can explain what I said above better. This is related to how JVM works and how whole-stage

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-08 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 > The regression caused by spark.sql.codegen.hugeMethodLimit shows the potential regression caused by horizontal cuts, although spark.sql.codegen.hugeMethodLimit does nothing.

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-08 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 Btw, I'd like to know what the horizontal/vertical cuts you meant. Can you give a simple example? --- - To unsubscribe, e-mail:

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-08 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19082 The regression caused by `spark.sql.codegen.hugeMethodLimit` shows the potential regression caused by horizontal cuts, although `spark.sql.codegen.hugeMethodLimit` does nothing. ---

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-08 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 I don't think `spark.sql.codegen.hugeMethodLimit` is the same level thing as #18931 or this PR. `hugeMethodLimit` didn't do anything to affect how generated codes are split. ---

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-08 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19082 The current `spark.sql.codegen.hugeMethodLimit` shows an extreme case. Just imagine we have two nodes, we want to do a horizontal/ring cut. Basically, in this scenario, horizontal/ring

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-08 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 @gatorsmile hmm, I don't know how you get to the conclusion. Is `spark.sql.codegen.hugeMethodLimit` any related to codegen cut? I think it is just a threshold used to determine whether to enable

  1   2   >