Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/19082
Sorry for missing the excellent discussions here. After more thoughts, I
feel this patch still makes sense. Currently we have 3 cases that may cause
large method with whole stage codegen:
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/19082
@gatorsmile can you close the jira ticket with @viirya assigned? Thanks!
---
-
To unsubscribe, e-mail:
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/19082
Since https://github.com/apache/spark/pull/19813 fixes this issue, so I'll
close this. Thanks!
---
-
To unsubscribe, e-mail:
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19082
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84807/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19082
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19082
**[Test build #84807 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84807/testReport)**
for PR 19082 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19082
**[Test build #84807 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84807/testReport)**
for PR 19082 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19082
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19082
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84774/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19082
**[Test build #84774 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84774/testReport)**
for PR 19082 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19082
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19082
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84770/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19082
**[Test build #84770 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84770/testReport)**
for PR 19082 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19082
**[Test build #84774 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84774/testReport)**
for PR 19082 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19082
**[Test build #84770 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84770/testReport)**
for PR 19082 at commit
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/19082
retest this please
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail:
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19082
Merged build finished. Test FAILed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19082
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84734/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19082
**[Test build #84734 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84734/testReport)**
for PR 19082 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19082
**[Test build #84734 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84734/testReport)**
for PR 19082 at commit
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/19082
Just FYI, the test failure is not related to this PR.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19082
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84724/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19082
Merged build finished. Test FAILed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19082
**[Test build #84724 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84724/testReport)**
for PR 19082 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19082
**[Test build #84724 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84724/testReport)**
for PR 19082 at commit
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/19082
Jenkins, retest this please
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail:
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19082
Merged build finished. Test FAILed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19082
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84717/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19082
**[Test build #84717 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84717/testReport)**
for PR 19082 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19082
Merged build finished. Test FAILed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19082
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84713/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19082
**[Test build #84713 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84713/testReport)**
for PR 19082 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19082
Merged build finished. Test FAILed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19082
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84711/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19082
**[Test build #84711 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84711/testReport)**
for PR 19082 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19082
**[Test build #84717 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84717/testReport)**
for PR 19082 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19082
**[Test build #84713 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84713/testReport)**
for PR 19082 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19082
**[Test build #84711 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84711/testReport)**
for PR 19082 at commit
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/19082
Ya, I was just trying to do that tonight! Thanks for pinging me.
---
-
To unsubscribe, e-mail:
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/19082
@maropu could you please resolve conflicts?
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19082
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84126/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19082
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19082
**[Test build #84126 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84126/testReport)**
for PR 19082 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19082
**[Test build #84126 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84126/testReport)**
for PR 19082 at commit
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/19082
ya, enjoy!
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail:
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/19082
Will review it carefully after I finish my vacation. Thanks!
---
-
To unsubscribe, e-mail:
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/19082
ping
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail:
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19082
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19082
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83982/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19082
**[Test build #83982 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83982/testReport)**
for PR 19082 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19082
**[Test build #83982 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83982/testReport)**
for PR 19082 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19082
Merged build finished. Test FAILed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19082
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83981/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19082
**[Test build #83981 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83981/testReport)**
for PR 19082 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19082
**[Test build #83981 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83981/testReport)**
for PR 19082 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19082
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19082
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82615/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19082
**[Test build #82615 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82615/testReport)**
for PR 19082 at commit
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19082
@kiszk @gatorsmile @rednaxelafx Based on the following reasoning, I don't
think the cutting of whole-stage operators you may propose would be much
beneficial than the approach did in #18931:
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19082
**[Test build #82615 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82615/testReport)**
for PR 19082 at commit
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19082
> If we can cut the boundary of whole-stage operators better, we still can
make the byte code size smaller than 8K. That means, we are not completely
disabling the whole-stage codegen.
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19082
>In the above case, 1. is slower than 2. due to the advantage of JIT
compilation.
>#18931 can cut the boundary of whole-stage operators. Hopefully, this
cutting makes the byte code size smaller
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19082
> You know, when we disable the whole-stage codegen, we still do the
expression codegen, whose byte code size is smaller than 8K at most cases.
Thus, it could be faster than whole-stage codegen.
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19082
> At the beginning, I did not notice #18931 is using private or final
attributes; otherwise, inlining does not benefit. It will be the same as the
whole-stage codegen is turned off. The current
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19082
@kiszk Thanks for explanation! It sounds reasonable behavior by JIT
compiler.
---
-
To unsubscribe, e-mail:
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/19082
@kiszk This is one of the solutions if we can remove the limit of num of
parameters. However, this does not resolve all the issues. For example, the
method becomes too big to inline.
---
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/19082
@gatorsmile let me clarify. In other words, do you want to find how to cut
the boundary to ensure enabling method inlining for callee methods? It could
maximize advantage of both whole-stage codegen
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/19082
https://github.com/apache/spark/pull/18931 is not what we want, although it
can partially resolves some issues. Simply disabling the whole-stage codegen
might trigger the regression like Q66.
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/19082
I agree with you. #18810 compares the following two code.
1. Interpreter execution of Java code by whole-stage codegen with passing
row data in scalar values
2. JITted execution of Java code
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/19082
You know, when we disable the whole-stage codegen, we still do the
expression codegen, which byte code size is smaller than 8K at most cases.
Thus, it could be faster than whole-stage codegen.
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/19082
The whole-stage codegen has two advantages according to [this
paper](http://www.vldb.org/pvldb/vol4/p539-neumann.pdf).
1. enable compiler optimizations among operations (3. in page 2)
2. pass
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/19082
@kiszk Thanks for your summary. I had a few related discussions with
@rednaxelafx and @liancheng in the recent weeks. Vertical cuts like
https://github.com/apache/spark/pull/19082 is pretty
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/19082
To my best guess, trying to make huge method by method inlining will not
fail JIT compilation in Hotspot. It may fail method inlining.
According to these blog entries
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19082
@kiszk Thanks for summarizing the PRs.
I just have a question about inlining method by JIT compiler. So you mean
JIT compiler will inline methods into larger unit and then do JIT compilation
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/19082
Let me summarize recent interesting PRs for code generation regarding JVM
bytecode limit for JIT compilation. These PRs encourages to apply JIT
compilation to more methods since most of JIT compilers
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/19082
Aha, I feel fair enough. Based the insight, there is one of solutions to
make the wholestage codegen consider #calls of gen'd functions though, it seems
the approach is not simple. So, splitting
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19082
The above reasoning also explains the motivation and the effect of #18931
too.
The generated codes of query operators are extracted to individual smaller
functions. It is beneficial to step
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19082
@maropu The codes to do aggregation are actually wrapped in a function
`doAggregateWithKeys`/`doAggregateWithoutKey`. This is also the part of
generated codes this PR improves by extracting
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/19082
Either way, I think we first need to know why the regression on `q66`
happens when turning off wholestage codegen. We first thought turning off
too-long functions had better performance, but it is
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/19082
Yes no need for a back-and-forth. @gatorsmile I think it's reasonable to
ask for a little more detail on your comment.
---
-
To
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19082
@gatorsmile If my words make you upset, I'm sorry. It's you right to raise
suspicion against any PRs. I do respect this right.
Maybe I'm wrong and there actually is a possible regression.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/19082
OK. Based on your attitudes, I do not care it.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19082
If you really can't or don't willing to explain, we can wait for @kiszk or
@rednaxelafx to explain it.
---
-
To unsubscribe,
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19082
@gatorsmile You can't just drop few words without explanation and simply
say that others works don't work. I do respect your work/comments and you
should respect my works too.
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/19082
Just try to help you understand it. If you don't want it, I can keep quiet.
---
-
To unsubscribe, e-mail:
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19082
No. I think you should explain it because this is your concern. You just
need to point out, where is the single blocking loop for the two operators?
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/19082
Sorry, it is really hard to explain it to you. Maybe you can first think
about why the whole-stage codegen works better the previous solution?
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19082
> After horizontal/ring cutting, each operator will be in a single blocking
loop. That means, whole-stage codegen is off.
And do we have anything to cut the operators to individual single
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19082
Yeah, once we exceed `hugeMethodLimit`, wholestage codegen is disabled. Can
you tell me how `hugeMethodLimit` makes each operator to a single blocking
loop? We didn't run codegen now.
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/19082
The relation is very clear. Let me copy what I said above
> Just imagine we have two nodes, we want to do a horizontal/ring cut.
Basically, in this scenario, horizontal/ring cutting
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19082
Ok. Any comments from @kiszk and @rednaxelafx on this topic? :)
---
-
To unsubscribe, e-mail:
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/19082
Nope. You can wait for the answers from the JVM experts. : )
---
-
To unsubscribe, e-mail:
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19082
@gatorsmile Are you talking about this:
https://en.wikipedia.org/wiki/Loop_nest_optimization?
---
-
To unsubscribe,
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/19082
Basically, cutting is to decide the boundaries of `blocking loop`.
@kiszk and @rednaxelafx can explain what I said above better. This is
related to how JVM works and how whole-stage
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19082
> The regression caused by spark.sql.codegen.hugeMethodLimit shows the
potential regression caused by horizontal cuts, although
spark.sql.codegen.hugeMethodLimit does nothing.
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19082
Btw, I'd like to know what the horizontal/vertical cuts you meant. Can you
give a simple example?
---
-
To unsubscribe, e-mail:
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/19082
The regression caused by `spark.sql.codegen.hugeMethodLimit` shows the
potential regression caused by horizontal cuts, although
`spark.sql.codegen.hugeMethodLimit` does nothing.
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19082
I don't think `spark.sql.codegen.hugeMethodLimit` is the same level thing
as #18931 or this PR.
`hugeMethodLimit` didn't do anything to affect how generated codes are
split.
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/19082
The current `spark.sql.codegen.hugeMethodLimit` shows an extreme case.
Just imagine we have two nodes, we want to do a horizontal/ring cut.
Basically, in this scenario, horizontal/ring
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19082
@gatorsmile hmm, I don't know how you get to the conclusion. Is
`spark.sql.codegen.hugeMethodLimit` any related to codegen cut? I think it is
just a threshold used to determine whether to enable
1 - 100 of 179 matches
Mail list logo