[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

andrewor14 Mon, 21 Jul 2014 17:44:30 -0700

Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/1165#issuecomment-49685219
  
    @mateiz Yes, currently we unroll the partition in deserialized form even if
we only want to store it in serialized form. One issue with storing it directly
as bytes is that `CacheManager` still needs to return an iterator over the
original values, meaning that after we serialize the values we need to
deserialize them again. This may cause a performance regression for the
`MEMORY_*_SER` storage levels because deserialization can be expensive. (This
was attempted in #1083, but we later decided against it for this reason.)
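
    To make the trade-off concrete, here is a rough sketch in plain Java. It
does not use Spark; `UnrollSketch`, `unrollDeserialized`, `serialize`, and
`deserialize` are hypothetical names chosen for illustration, and Java
serialization stands in for whatever serializer is configured. Path A buffers
the values as objects and returns an iterator over the buffer. Path B writes the
values straight to bytes, and then has to read them all back before it can
return an iterator to the caller.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Spark.
final class UnrollSketch {

    // Path A: unroll into a deserialized buffer and return an iterator over it.
    // No serialization work is done on this path.
    static Iterator<String> unrollDeserialized(Iterator<String> values, List<String> buffer) {
        while (values.hasNext()) {
            buffer.add(values.next());
        }
        return buffer.iterator();
    }

    // Path B, step 1: write the values directly to bytes.
    // A boolean flag precedes each value; a final 'false' marks the end.
    static byte[] serialize(Iterator<String> values) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            while (values.hasNext()) {
                out.writeBoolean(true);
                out.writeObject(values.next());
            }
            out.writeBoolean(false);
        }
        return bytes.toByteArray();
    }

    // Path B, step 2: the caller still needs an iterator over the values,
    // so the stored bytes must be deserialized again. This is the extra cost.
    static Iterator<String> deserialize(byte[] stored) throws IOException, ClassNotFoundException {
        List<String> values = new ArrayList<>();
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(stored))) {
            while (in.readBoolean()) {
                values.add((String) in.readObject());
            }
        }
        return values.iterator();
    }

    public static void main(String[] args) throws Exception {
        List<String> partition = Arrays.asList("a", "b", "c");

        Iterator<String> fromObjects = unrollDeserialized(partition.iterator(), new ArrayList<>());
        Iterator<String> fromBytes = deserialize(serialize(partition.iterator()));

        // Both iterators yield the same values; only Path B paid for a
        // serialize/deserialize round trip to get there.
        while (fromObjects.hasNext()) {
            System.out.println(fromObjects.next() + " " + fromBytes.next());
        }
    }
}
```

    In this sketch the round trip in Path B is pure overhead from the caller's
point of view, which is the cost described above for the serialized storage
levels.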



