GitHub user rxin commented on the pull request:

https://github.com/apache/spark/pull/9243#issuecomment-150708942

If I understand the problem here correctly, we are using a bitmap to track non-zero-size blocks (or zero-size blocks) when the number of partitions is large (> 2000). If we use an uncompressed bitset, the tracking structure is fixed size. The compression is, at a high level, essentially run-length encoding, so if we use a compressed bitmap and there are lots of zeros, it should compress to a very small size. The worst case for the compressed bitmap is alternating 0s and 1s. Am I understanding the situation incorrectly?
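A minimal sketch of the intuition above, assuming simple run-length encoding (the actual implementation uses a compressed bitmap library, not this code): a bitmap that is mostly zeros collapses to a handful of runs, while alternating 0s and 1s produce one run per bit and compress not at all.

```python
def rle_encode(bits):
    """Encode a bit sequence as a list of [bit, run_length] pairs."""
    runs = []
    for b in bits:
        if runs and runs[-1][0] == b:
            runs[-1][1] += 1
        else:
            runs.append([b, 1])
    return runs

n = 2000  # e.g. one bit per partition past the 2000-partition threshold

mostly_zero = [0] * n                     # all blocks zero-size
alternating = [i % 2 for i in range(n)]   # worst case for RLE

print(len(rle_encode(mostly_zero)))   # 1 run: compresses to almost nothing
print(len(rle_encode(alternating)))   # 2000 runs: no compression at all
```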