[GitHub] spark pull request: [SPARK-4664][Core] Throw an exception when spa...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3527 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4664][Core] Throw an exception when spa...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/3527#issuecomment-65033421 Merging in master.
[GitHub] spark pull request: [SPARK-4664][Core] Throw an exception when spa...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/3527#issuecomment-65033393 Yea as @aarondav pointed out, I don't think akka framesize is going to be a problem anymore in 1.2+, regardless of the number of partitions. Still good to have this check to be defensive.
[GitHub] spark pull request: [SPARK-4664][Core] Throw an exception when spa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3527#issuecomment-65028820 [Test build #23974 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23974/consoleFull) for PR 3527 at commit [`0089c7a`](https://github.com/apache/spark/commit/0089c7abaf58c7c8d014d0e0d86b00efcee4e100). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4664][Core] Throw an exception when spa...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3527#issuecomment-65028826 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23974/ Test PASSed.
[GitHub] spark pull request: [SPARK-4664][Core] Throw an exception when spa...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/3527#issuecomment-65028666
> I believe it is only 1 bit, not byte, per block
Thank you for correcting me. Was not aware of `HighlyCompressedMapStatus`.
[GitHub] spark pull request: [SPARK-4664][Core] Throw an exception when spa...
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/3527#issuecomment-65028176 I believe it is only 1 bit, not byte, per block. Further, I would estimate compression on largely uniform data to be at least around 10x. So your example would ideally only use around 1.2MB. Anyway, you can arbitrarily multiply the number of partitions to demonstrate the issue. 1mil by 1mil is still a tough cookie to crack, but we don't really want users to have to meddle with frame sizes. Having this check is fine, of course, whether or not users should have to change it.
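The size estimate in the comment above can be worked through as a rough sketch. It assumes 10,000 map partitions by 10,000 reducers, which is what the 100MB figure quoted elsewhere in this thread implies at one byte per block; the function name and parameters are hypothetical and are not part of Spark.

```python
def map_status_estimate_bytes(num_mappers, num_reducers, bits_per_block, compression_ratio):
    """Rough size of the map output statuses, in bytes.

    Hypothetical helper for illustration only: one entry per
    (mapper, reducer) block, then a flat compression ratio applied.
    """
    raw_bits = num_mappers * num_reducers * bits_per_block
    return raw_bits / 8 / compression_ratio


# One byte (8 bits) per block, no compression: the 100MB worst case.
uncompressed = map_status_estimate_bytes(10_000, 10_000, 8, 1)

# One bit per block with an assumed 10x compression ratio.
compressed = map_status_estimate_bytes(10_000, 10_000, 1, 10)

# "1mil by 1mil" under the same one-bit, 10x assumptions.
million = map_status_estimate_bytes(1_000_000, 1_000_000, 1, 10)

print(f"{uncompressed / 1e6:.2f} MB")  # 1 byte per block
print(f"{compressed / 1e6:.2f} MB")    # 1 bit per block, 10x compression
print(f"{million / 1e9:.2f} GB")       # 1mil by 1mil
```

Under these assumptions the 10,000 by 10,000 case drops from 100MB to about 1.25MB, in line with the "around 1.2MB" estimate, while 1mil by 1mil still comes to roughly 12.5GB.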
[GitHub] spark pull request: [SPARK-4664][Core] Throw an exception when spa...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3527#issuecomment-65027426 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23973/ Test PASSed.
[GitHub] spark pull request: [SPARK-4664][Core] Throw an exception when spa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3527#issuecomment-65027423 [Test build #23973 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23973/consoleFull) for PR 3527 at commit [`0089c7a`](https://github.com/apache/spark/commit/0089c7abaf58c7c8d014d0e0d86b00efcee4e100). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4664][Core] Throw an exception when spa...
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/3527#issuecomment-65027150 10000 partitions doesn't sound that extreme to me.
[GitHub] spark pull request: [SPARK-4664][Core] Throw an exception when spa...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/3527#issuecomment-65026993
> @zsxwing Note that the case you mentioned should no longer cause this issue either, as we use an extra compressed data structure when dealing with very large numbers of map partitions.
In an extreme case, it's still possible. For example, assume that there are 10000 partitions on the map side. If the user does not set a new `numPartition`, there will be 10000 reducers. If all of these blocks are not 0, there will be huge `MapStatus`s: 10000 * 10000 * 1 = 100MB. I'm not sure what the compression ratio of `GZIPOutputStream` will be, but it may exceed `spark.akka.frameSize`. Admittedly, this might be a user mistake and the user should set a proper `numPartition`.
[GitHub] spark pull request: [SPARK-4664][Core] Throw an exception when spa...
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/3527#issuecomment-65025980 @zsxwing Note that the case you mentioned should no longer cause this issue either, as we use an extra compressed data structure when dealing with very large numbers of map partitions.
[GitHub] spark pull request: [SPARK-4664][Core] Throw an exception when spa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3527#issuecomment-65024260 [Test build #23974 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23974/consoleFull) for PR 3527 at commit [`0089c7a`](https://github.com/apache/spark/commit/0089c7abaf58c7c8d014d0e0d86b00efcee4e100). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4664][Core] Throw an exception when spa...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3527#issuecomment-65023903 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-4664][Core] Throw an exception when spa...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3527#issuecomment-65023301 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23971/ Test FAILed.
[GitHub] spark pull request: [SPARK-4664][Core] Throw an exception when spa...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/3527#issuecomment-65023051 A potential reason to set `spark.akka.frameSize` is when the size of the `MapStatus`s exceeds the current `spark.akka.frameSize`, for example with a large number of mappers and reducers. A relevant issue is discussed in the following thread: http://apache-spark-developers-list.1001551.n3.nabble.com/Eliminate-copy-while-sending-data-any-Akka-experts-here-td7127.html
[GitHub] spark pull request: [SPARK-4664][Core] Throw an exception when spa...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3527#issuecomment-65022875 [Test build #23973 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23973/consoleFull) for PR 3527 at commit [`0089c7a`](https://github.com/apache/spark/commit/0089c7abaf58c7c8d014d0e0d86b00efcee4e100). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4664][Core] Throw an exception when spa...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3527#issuecomment-65022618 Nice catch. I don't think that it's very common to set `spark.akka.frameSize` these days, since 1.1's task broadcasting should have addressed the most common causes of messages that exceeded the frame size, but it certainly doesn't hurt to warn / guard against bad inputs.
[GitHub] spark pull request: [SPARK-4664][Core] Throw an exception when spa...
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/3527

[SPARK-4664][Core] Throw an exception when spark.akka.frameSize > 2047

If `spark.akka.frameSize` > 2047, the value converted to bytes will overflow and become negative. There should be an assertion in `maxFrameSizeBytes` to warn people.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zsxwing/spark SPARK-4664

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3527.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3527

commit f12f0b6d323f1b7b7e62e24950122ae95c257050
Author: zsxwing
Date: 2014-12-01T05:27:27Z
Throw an exception when spark.akka.frameSize > 2047
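The overflow described in this pull request can be illustrated with a small sketch. It is not Spark's code: it assumes the frame size in MB is converted to bytes by multiplying by 1024 * 1024 in 32-bit signed arithmetic (as a JVM `Int` would be), and since Python integers do not overflow, the wraparound is emulated explicitly. All function names here are hypothetical.

```python
MAX_FRAME_SIZE_MB = 2047  # largest MB value whose byte count fits in a signed 32-bit int


def to_int32(value: int) -> int:
    """Wrap a Python int to a signed 32-bit value, emulating JVM Int overflow."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def unchecked_frame_size_bytes(frame_size_mb: int) -> int:
    """MB to bytes with 32-bit wraparound and no guard (the behavior being reported)."""
    return to_int32(frame_size_mb * 1024 * 1024)


def checked_frame_size_bytes(frame_size_mb: int) -> int:
    """MB to bytes, rejecting values that would overflow (the kind of check proposed)."""
    if frame_size_mb > MAX_FRAME_SIZE_MB:
        raise ValueError(
            f"frame size should not be greater than {MAX_FRAME_SIZE_MB} MB, "
            f"got {frame_size_mb}"
        )
    return frame_size_mb * 1024 * 1024


print(unchecked_frame_size_bytes(2047))  # still positive
print(unchecked_frame_size_bytes(2048))  # wraps to a negative number
```

With these assumptions, 2047 MB maps to 2146435072 bytes, just under the 32-bit limit of 2147483647, while 2048 MB wraps to -2147483648, which is why 2047 is the cutoff named in the title.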