[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...

2015-10-23 Thread viirya
Github user viirya commented on the pull request:

https://github.com/apache/spark/pull/9243#issuecomment-150580739
  
@srowen @drcrallen I've updated this patch to remove RoaringBitmap from 
KryoSerializer and pom.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...

2015-10-23 Thread drcrallen
Github user drcrallen commented on the pull request:

https://github.com/apache/spark/pull/9243#issuecomment-150579219
  
@srowen I was improperly asking if that change would be appropriate to 
include in this PR :)





[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...

2015-10-23 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/9243#issuecomment-150576264
  
@drcrallen yes they would have to be.





[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...

2015-10-23 Thread drcrallen
Github user drcrallen commented on the pull request:

https://github.com/apache/spark/pull/9243#issuecomment-150575409
  
Regarding SPARK-11016 , can references to Roaring also be removed from 
core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala ?





[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...

2015-10-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9243#issuecomment-150564407
  
**[Test build #1946 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1946/consoleFull)**
 for PR 9243 at commit 
[`392975d`](https://github.com/apache/spark/commit/392975d3b5c48bc61bfa6caff7bfd23b9e095cde).





[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...

2015-10-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9243#issuecomment-150558750
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44213/
Test FAILed.





[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...

2015-10-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9243#issuecomment-150558712
  
**[Test build #44213 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44213/consoleFull)**
 for PR 9243 at commit 
[`392975d`](https://github.com/apache/spark/commit/392975d3b5c48bc61bfa6caff7bfd23b9e095cde).
 * This patch **fails from timeout after a configured wait of `250m`**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...

2015-10-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9243#issuecomment-150558748
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...

2015-10-23 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/9243#issuecomment-150538206
  
@viirya oh, well, that would certainly kill two birds with one stone. It 
can come out of the poms, kryo serializer too. I personally favor this even 
only on the grounds of simplification.





[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...

2015-10-23 Thread viirya
Github user viirya commented on the pull request:

https://github.com/apache/spark/pull/9243#issuecomment-150534793
  
@srowen Actually, I can't find anywhere in Spark where RoaringBitmap is used 
other than MapStatus.





[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...

2015-10-23 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/9243#issuecomment-150532600
  
@viirya we have an outstanding issue about RoaringBitmap not being 
serialized by kryo at the moment: 
https://issues.apache.org/jira/browse/SPARK-11016  It sounds like we have two 
bitset implementations (three if you count the JDK). What's your view on 
replacing RoaringBitmap everywhere? It's kind of unfortunate Spark 
reimplemented the wheel here but, given that this is done (and maybe has some 
marginal advantage), what about removing this dependency entirely? That is, do 
you see an argument for RoaringBitmap in Spark?





[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...

2015-10-23 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9243#issuecomment-150506790
  
**[Test build #44213 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44213/consoleFull)**
 for PR 9243 at commit 
[`392975d`](https://github.com/apache/spark/commit/392975d3b5c48bc61bfa6caff7bfd23b9e095cde).





[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...

2015-10-23 Thread rxin
Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/9243#issuecomment-150505415
  
Why is an uncompressed bitset using less space than a compressed one?






[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...

2015-10-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9243#issuecomment-150504119
  
Merged build started.





[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...

2015-10-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9243#issuecomment-150504102
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-11271][Core] Use Spark BitSet instead o...

2015-10-23 Thread viirya
GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/9243

[SPARK-11271][Core] Use Spark BitSet instead of RoaringBitmap to reduce 
memory usage

JIRA: https://issues.apache.org/jira/browse/SPARK-11271

As reported in the JIRA ticket, when there are too many tasks, the memory 
usage of MapStatus causes problems. Using BitSet instead of RoaringBitmap can 
reduce the memory usage.
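
For illustration only, here is a minimal sketch of why a plain bitset has a small, predictable footprint. It is not the code in this patch: it uses the JDK's `java.util.BitSet` as a stand-in for Spark's BitSet, and the class name, the helper methods, and the idea of marking empty blocks from an array of block sizes are assumptions made for the example.

```java
import java.util.BitSet;

// Hypothetical sketch; names are illustrative and not taken from the patch.
public class EmptyBlockBitSetSketch {

    // Number of 64-bit words an uncompressed bitset needs to hold numBits bits.
    static int wordsFor(int numBits) {
        return (numBits + 63) >>> 6;
    }

    // Approximate payload size in bytes (ignores object and array headers).
    static long payloadBytes(int numBits) {
        return 8L * wordsFor(numBits);
    }

    // Sets one bit per block whose size is zero (assumed use case).
    static BitSet emptyBlocks(long[] blockSizes) {
        BitSet empty = new BitSet(blockSizes.length);
        for (int i = 0; i < blockSizes.length; i++) {
            if (blockSizes[i] == 0L) {
                empty.set(i);
            }
        }
        return empty;
    }

    public static void main(String[] args) {
        long[] sizes = {0L, 128L, 0L, 4096L};
        System.out.println(emptyBlocks(sizes));   // prints {0, 2}
        System.out.println(payloadBytes(10000));  // prints 1256
    }
}
```

Under these assumptions, the payload grows by one bit per block regardless of how the set bits are distributed, so 10000 blocks cost roughly 1.2 KB per bitset before object headers.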


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 mapstatus-bitset

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/9243.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9243


commit 392975d3b5c48bc61bfa6caff7bfd23b9e095cde
Author: Liang-Chi Hsieh 
Date:   2015-10-23T07:59:17Z

Use Spark BitSet instead of RoaringBitmap to reduce memory usage.



