[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/9243#issuecomment-150580739 @srowen @drcrallen I've updated this patch to remove RoaringBitmap from KryoSerializer and pom. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...
Github user drcrallen commented on the pull request: https://github.com/apache/spark/pull/9243#issuecomment-150579219 @srowen I was improperly asking if that change would be appropriate to include in this PR :)
[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/9243#issuecomment-150576264 @drcrallen yes they would have to be.
[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...
Github user drcrallen commented on the pull request: https://github.com/apache/spark/pull/9243#issuecomment-150575409 Regarding SPARK-11016, can references to Roaring also be removed from core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala?
[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9243#issuecomment-150564407 **[Test build #1946 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1946/consoleFull)** for PR 9243 at commit [`392975d`](https://github.com/apache/spark/commit/392975d3b5c48bc61bfa6caff7bfd23b9e095cde).
[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9243#issuecomment-150558750 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44213/ Test FAILed.
[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9243#issuecomment-150558712 **[Test build #44213 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44213/consoleFull)** for PR 9243 at commit [`392975d`](https://github.com/apache/spark/commit/392975d3b5c48bc61bfa6caff7bfd23b9e095cde).
* This patch **fails from timeout after a configured wait of `250m`**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9243#issuecomment-150558748 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/9243#issuecomment-150538206 @viirya oh, well, that would certainly kill two birds with one stone. It can come out of the poms and the Kryo serializer too. I personally favor this, even if only on the grounds of simplification.
[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/9243#issuecomment-150534793 @srowen Actually, I can't find anywhere in Spark where RoaringBitmap is used other than MapStatus.
[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/9243#issuecomment-150532600 @viirya we have an outstanding issue about RoaringBitmap not being serialized by Kryo at the moment: https://issues.apache.org/jira/browse/SPARK-11016 It sounds like we have two bitset implementations (three if you count the JDK). What's your view on replacing RoaringBitmap everywhere? It's kind of unfortunate that Spark reinvented the wheel here but, given that this is done (and maybe has some marginal advantage), what about removing this dependency entirely? That is, do you see an argument for RoaringBitmap in Spark?
[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9243#issuecomment-150506790 **[Test build #44213 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44213/consoleFull)** for PR 9243 at commit [`392975d`](https://github.com/apache/spark/commit/392975d3b5c48bc61bfa6caff7bfd23b9e095cde).
[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/9243#issuecomment-150505415 Why is an uncompressed bitset using less space than a compressed one?
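The thread does not answer this question. One possible explanation, offered here only as an illustration, is that a compressed format pays a fixed cost per stored value, so it only wins when the set is sparse. The sketch below uses a deliberately simplified model of a Roaring-style layout: values are grouped into chunks of 2^16, a chunk with at most 4096 values is stored as 16-bit entries, and a denser chunk is stored as an 8 KiB bitmap. It ignores headers, JVM object overhead, and run-length containers, and it is not a measurement of the library version used in this PR. The function names are hypothetical.

```python
ARRAY_CONTAINER_MAX = 4096                 # max values per chunk stored as 16-bit entries
BITMAP_CONTAINER_BYTES = (1 << 16) // 8    # dense chunk: fixed 8192-byte bitmap


def plain_bitset_bytes(universe_bits: int) -> int:
    """Payload of an uncompressed bitset packed into 64-bit words."""
    return ((universe_bits + 63) // 64) * 8


def roaring_like_bytes(set_positions) -> int:
    """Payload under the simplified two-container model described above."""
    per_chunk = {}
    for pos in set(set_positions):         # duplicates do not count twice
        chunk = pos >> 16                  # high bits select the chunk
        per_chunk[chunk] = per_chunk.get(chunk, 0) + 1
    return sum(
        2 * count if count <= ARRAY_CONTAINER_MAX else BITMAP_CONTAINER_BYTES
        for count in per_chunk.values()
    )


if __name__ == "__main__":
    universe = 10_000
    dense = range(0, universe, 2)          # half of the positions are set
    sparse = [1, 5, 9]
    print("plain:", plain_bitset_bytes(universe))
    print("roaring-like, dense:", roaring_like_bytes(dense))
    print("roaring-like, sparse:", roaring_like_bytes(sparse))
```

Under this model, a 10,000-bit universe with half of its bits set costs 1256 bytes as a plain bitset but 8192 bytes in the chunked layout, while a set of three values costs only 6 bytes in the chunked layout. Whether MapStatus data is dense enough for the first case to apply is something the thread leaves open.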
[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9243#issuecomment-150504119 Merged build started.
[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9243#issuecomment-150504102 Merged build triggered.
[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/9243

[SPARK-11271][Core] Use Spark BitSet instead of RoaringBitmap to reduce memory usage

JIRA: https://issues.apache.org/jira/browse/SPARK-11271

As reported in the JIRA ticket, when there are too many tasks, the memory usage of MapStatus causes problems. Using BitSet instead of RoaringBitmap can reduce the memory usage.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/viirya/spark-1 mapstatus-bitset

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/9243.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #9243

commit 392975d3b5c48bc61bfa6caff7bfd23b9e095cde
Author: Liang-Chi Hsieh
Date: 2015-10-23T07:59:17Z

    Use Spark BitSet instead of RoaringBitmap to reduce memory usage.
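To put the memory concern in rough numbers, here is a minimal back-of-the-envelope sketch of what an uncompressed, word-packed bitset costs. It assumes, and this is an assumption rather than something stated in the PR, that each map task's status keeps one bit per reduce partition and that bits are packed into 64-bit words. The function names are hypothetical and the figures exclude JVM object headers.

```python
def bitset_bytes(num_bits: int, word_bits: int = 64) -> int:
    """Payload size of an uncompressed bitset packed into fixed-size words."""
    if num_bits < 0:
        raise ValueError("num_bits must be non-negative")
    words = (num_bits + word_bits - 1) // word_bits   # ceiling division
    return words * (word_bits // 8)


def total_bitset_bytes(num_map_tasks: int, num_reduce_partitions: int) -> int:
    """Assumed model: one bitset per map task, one bit per reduce partition."""
    return num_map_tasks * bitset_bytes(num_reduce_partitions)


if __name__ == "__main__":
    # Illustrative job size only; not taken from the JIRA ticket.
    print(total_bitset_bytes(10_000, 10_000))
```

With 10,000 map tasks and 10,000 reduce partitions, this model gives 1256 bytes per task and 12,560,000 bytes in total. The cost grows with the product of the two counts, which is consistent with the description's point that memory becomes a problem when there are many tasks.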