The Iterable returned by cogroup is a CompactBuffer, which is already materialized; it is not a lazy Iterable. So at the moment Spark cannot handle skewed data in which some key has more values than can fit in memory.
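
To make the distinction concrete, here is a rough plain-Python sketch. It is not Spark code and does not reflect how CompactBuffer is implemented; the function names `cogroup_materialized` and `grouped_lazily` are made up for illustration. The first function buffers every value for a key before returning, so a heavily skewed key costs memory proportional to its value count. The second streams values for one key at a time, but it assumes the input is already sorted by key.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def cogroup_materialized(left, right):
    """Group two (key, value) inputs by key, buffering every value in a list.

    Hypothetical stand-in for a materialized cogroup result: all values
    for a key are held in memory at once.
    """
    groups = defaultdict(lambda: ([], []))
    for key, value in left:
        groups[key][0].append(value)
    for key, value in right:
        groups[key][1].append(value)
    return dict(groups)


def grouped_lazily(sorted_pairs):
    """Yield (key, value_iterator) pairs from input already sorted by key.

    Values are pulled one at a time, so a single key's values never need
    to be fully resident in memory. Each value iterator must be consumed
    before moving on to the next key.
    """
    for key, group in groupby(sorted_pairs, key=itemgetter(0)):
        yield key, (value for _, value in group)


if __name__ == "__main__":
    pairs = [("a", 1), ("a", 2), ("b", 3)]
    print(cogroup_materialized(pairs, [("a", "x")]))
    for key, values in grouped_lazily(pairs):
        print(key, sum(values))
```

The lazy version only works here because the input is sorted and each group is consumed in order, which is roughly the trade-off a lazy Iterable would have to make.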
- recent join/iterator fix Stephen Haberman
- Re: recent join/iterator fix Sean Owen
- Re: recent join/iterator fix Stephen Haberman
- Re: recent join/iterator fix Sean Owen
- Re: recent join/iterator fix Shixiong Zhu
- Re: recent join/iterator fix Stephen Haberman