[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19720 I don't have a strong preference, but there were many 64KB compile error fixes for 2.2 or prior (e.g. `CreateStruct`, `CreateArray`, `Invoke`, `CreateExternalRow`, etc.). They all add more global variables, and it's weird to me that we stop doing this because the master branch has SPARK-18016. @kiszk what do you think? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/19720 No, a query with a `coalesce` with many/complex parameters will hit this problem. A query with a lot of small `coalesce` calls will not have the problem. For `AtLeastNNonNulls` the fix would be safe to backport, because no class variables are defined, but for `coalesce` it is safer to fix it only together with SPARK-18016. In particular, the ongoing PR will solve the issue. The same is true for all the other similar PRs. Maybe what we can do to backport this to branch-2.2 is to do the splitting and define class-level variables only after a threshold on the number of parameters is met; otherwise we go on with the previous code generation (without splitting). In this way we don't have any regression. Or maybe we can backport to 2.2 only those fixes which do not introduce class-level variables, like the one for `AtLeastNNonNulls`. Actually, I think that the most important of all these fixes is indeed `AtLeastNNonNulls`, because it is used to drop rows containing all nulls, and before this PR this fails on datasets with a lot of columns. All the other functions are less likely to have a huge number of parameters, although this may happen and we should support it.
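The threshold idea proposed above can be sketched with a toy generator. This is only an illustration under stated assumptions: `gen_coalesce`, `SPLIT_THRESHOLD` and the emitted Java-like text are hypothetical and are not Spark's actual codegen API. Below the threshold it emits a single inline block that uses only local variables; above it, it splits the arguments into helper methods and promotes the result to two class-level variables.

```python
# Toy model of threshold-gated code generation; not Spark's real codegen API.
SPLIT_THRESHOLD = 100  # hypothetical cutoff on the number of coalesce arguments


def gen_coalesce(args, threshold=SPLIT_THRESHOLD):
    """Return (java_like_source, class_level_variable_count) for coalesce(args)."""
    if len(args) <= threshold:
        # Previous strategy: one inline block, result kept in local variables only.
        body = "\n".join(
            f"if (isNull) {{ value = {a}; isNull = (value == null); }}" for a in args
        )
        source = f"Object value = null; boolean isNull = true;\n{body}"
        return source, 0

    # Splitting strategy: one helper method per chunk of arguments. The result has
    # to survive across method calls, so it is promoted to two class-level variables.
    chunks = [args[i:i + threshold] for i in range(0, len(args), threshold)]
    methods = []
    for n, chunk in enumerate(chunks):
        lines = "\n".join(
            f"  if (this.isNull) {{ this.value = {a}; this.isNull = (this.value == null); }}"
            for a in chunk
        )
        methods.append(f"private void coalesce_{n}() {{\n{lines}\n}}")
    calls = " ".join(f"coalesce_{n}();" for n in range(len(chunks)))
    return "\n".join(methods) + "\n" + calls, 2
```

With this shape, a query made of many small `coalesce` calls never reaches the splitting branch, so it declares no additional class-level variables compared with the previous code generation.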
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19720 If there is a query with a lot of `coalesce` functions, wouldn't it hit the 64KB issue?
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/19720 It's not about running slower. This PR solves the problem that causes the user to face an exception if there are a lot of arguments in `coalesce` (or `AtLeastNNonNulls`), but what I am doing in the `coalesce` function here is adding two variables for each `coalesce` function. If there is a query with a lot of `coalesce` functions (instead of one `coalesce` with a lot of parameters), this might result in many more variables than before. This can cause the problem and the exception described in SPARK-18016. Thus a query that was previously running can fail. The same is true for all the other PRs similar to this one submitted by @kiszk. So we should keep all these changes only on master, where part of SPARK-18016 is landing and hopefully it will soon be completely solved.
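The difference between one wide `coalesce` and many small ones can be made concrete with a rough count. This is a minimal sketch with illustrative numbers, not a measurement; `class_level_vars` and `VARS_PER_COALESCE` are hypothetical names, and the figure of two variables per `coalesce` is taken from the comment above.

```python
# Toy accounting of class-level variables; numbers are illustrative, not measured.
VARS_PER_COALESCE = 2  # assumed from the comment above: two variables per coalesce


def class_level_vars(arg_counts, split_always):
    """arg_counts holds the number of arguments of each coalesce in one query."""
    if not split_always:
        return 0  # previous code generation kept the result in local variables
    return VARS_PER_COALESCE * len(arg_counts)


one_wide = [3000]         # a single coalesce with 3000 arguments
many_small = [2] * 3000   # 3000 coalesce calls with 2 arguments each

print(class_level_vars(one_wide, split_always=True))    # 2
print(class_level_vars(many_small, split_always=True))  # 6000
```

In this model the number of variables depends on how many `coalesce` expressions the query contains, not on how many arguments each one takes, which is why the second query puts more pressure on the constant pool.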
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19720 hmm, isn't running slower better than can't run?
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/19720 @cloud-fan please do not backport this to 2.2. In 2.2 we don't have SPARK-18016, and this adds new variables in the case of `coalesce`. Thus it can put higher pressure on the constant pool, and this may even cause a regression IMHO.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19720 thanks, merging to master/2.2!
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19720 LGTM
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19720 Merged build finished. Test PASSed.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19720 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83843/ Test PASSed.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19720 **[Test build #83843 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83843/testReport)** for PR 19720 at commit [`3a5c683`](https://github.com/apache/spark/commit/3a5c683149a198c79578453518c1d8170a52ff94). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19720 **[Test build #83843 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83843/testReport)** for PR 19720 at commit [`3a5c683`](https://github.com/apache/spark/commit/3a5c683149a198c79578453518c1d8170a52ff94).
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19720 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83797/ Test PASSed.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19720 Merged build finished. Test PASSed.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19720 **[Test build #83797 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83797/testReport)** for PR 19720 at commit [`a548ddb`](https://github.com/apache/spark/commit/a548ddb567141143e89aee2099f1c3309e02475c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19720 **[Test build #83797 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83797/testReport)** for PR 19720 at commit [`a548ddb`](https://github.com/apache/spark/commit/a548ddb567141143e89aee2099f1c3309e02475c).
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/19720 I revised the PR according to what came out of the conversations on the other PRs by @kiszk. @viirya @kiszk May I kindly ask you to review it again now? Thank you very much.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19720 Merged build finished. Test PASSed.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19720 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83781/ Test PASSed.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19720 **[Test build #83781 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83781/testReport)** for PR 19720 at commit [`423245e`](https://github.com/apache/spark/commit/423245e9d2211acb0bef5191dd3a3745f63b49ae). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19720 **[Test build #83781 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83781/testReport)** for PR 19720 at commit [`423245e`](https://github.com/apache/spark/commit/423245e9d2211acb0bef5191dd3a3745f63b49ae).
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19720 Merged build finished. Test PASSed.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19720 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83754/ Test PASSed.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19720 **[Test build #83754 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83754/testReport)** for PR 19720 at commit [`924aab9`](https://github.com/apache/spark/commit/924aab9f41d29c303fa69ecbf37d3f4a3c9df4ef). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19720 **[Test build #83754 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83754/testReport)** for PR 19720 at commit [`924aab9`](https://github.com/apache/spark/commit/924aab9f41d29c303fa69ecbf37d3f4a3c9df4ef).
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19720 LGTM
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19720 Merged build finished. Test PASSed.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19720 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83743/ Test PASSed.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19720 **[Test build #83743 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83743/testReport)** for PR 19720 at commit [`e8320d6`](https://github.com/apache/spark/commit/e8320d61266e1e5835cfae0d037ede6ae92c8666). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19720 **[Test build #83743 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83743/testReport)** for PR 19720 at commit [`e8320d6`](https://github.com/apache/spark/commit/e8320d61266e1e5835cfae0d037ede6ae92c8666).
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19720 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83742/ Test FAILed.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19720 Merged build finished. Test FAILed.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19720 **[Test build #83742 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83742/testReport)** for PR 19720 at commit [`911e172`](https://github.com/apache/spark/commit/911e1727155ec511609a2380f026c81f615370a0). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19720 **[Test build #83742 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83742/testReport)** for PR 19720 at commit [`911e172`](https://github.com/apache/spark/commit/911e1727155ec511609a2380f026c81f615370a0).
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19720 LGTM
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19720 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83719/ Test PASSed.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19720 Merged build finished. Test PASSed.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19720 **[Test build #83719 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83719/testReport)** for PR 19720 at commit [`1722d12`](https://github.com/apache/spark/commit/1722d12430c8676e4e9ec8a7d62a7907e09baea9). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19720 **[Test build #83719 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83719/testReport)** for PR 19720 at commit [`1722d12`](https://github.com/apache/spark/commit/1722d12430c8676e4e9ec8a7d62a7907e09baea9).
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19720 Merged build finished. Test FAILed.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19720 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83707/ Test FAILed.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19720 **[Test build #83707 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83707/testReport)** for PR 19720 at commit [`f0edd7e`](https://github.com/apache/spark/commit/f0edd7e077c84b6a890f7ed9cff2eefadf5eee33). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #19720: [SPARK-22494][SQL] Fix 64KB limit exception with Coalesc...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19720 **[Test build #83707 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83707/testReport)** for PR 19720 at commit [`f0edd7e`](https://github.com/apache/spark/commit/f0edd7e077c84b6a890f7ed9cff2eefadf5eee33).