View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/all-values-for-a-key-must-fit-in-memory-tp6342p6791.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
for this for the meantime? I'm out of
ideas.
Thanks,
Nilesh
of which thankfully works
with 0.9.1 too, no new API changes there.
Cheers,
Nilesh
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/all-values-for-a-key-must-fit-in-memory-tp6342p6794.html
OK for me here, though it might
turn out to be slow.
Cheers,
Nilesh
PS: Can't wait for 1.0! ^_^ Looks like it's been RC10 till now.
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/all-values-for-a-key-must-fit-in-memory-tp6342p6796.html
Thanks Matei and Mridul - was basically wondering whether we would be able
to change the shuffle to accommodate this after 1.0, and from your answers
it sounds like we can.
On Mon, Apr 21, 2014 at 12:31 AM, Mridul Muralidharan mri...@gmail.com wrote:
As Matei mentioned, the Values is now an
An iterator does not imply data has to be memory resident.
Think merge sort output as an iterator (disk backed).
Tom is actually planning to work with me on something similar,
hopefully this month or next.
Regards,
Mridul
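The "merge sort output as an iterator (disk backed)" idea above can be illustrated with a small sketch. This is plain Python using only the standard library, not Spark's shuffle code; the helper names (spill_sorted_runs, read_run, merged) and the one-integer-per-line file format are assumptions made up for the illustration. Sorted runs are spilled to files, and the merged output is consumed through an iterator that holds roughly one value per run in memory at a time.

```python
import heapq
import os
import tempfile
from typing import Iterable, Iterator, List


def spill_sorted_runs(values: Iterable[int], run_size: int, directory: str) -> List[str]:
    """Sort fixed-size chunks in memory and write each chunk to its own file.

    Hypothetical helper: the file naming and text format are illustrative only.
    """
    paths: List[str] = []
    chunk: List[int] = []

    def flush() -> None:
        # Write the current chunk as one sorted run, then reuse the buffer.
        if not chunk:
            return
        chunk.sort()
        path = os.path.join(directory, f"run-{len(paths)}.txt")
        with open(path, "w") as f:
            f.writelines(f"{v}\n" for v in chunk)
        paths.append(path)
        chunk.clear()

    for v in values:
        chunk.append(v)
        if len(chunk) >= run_size:
            flush()
    flush()  # spill whatever is left over
    return paths


def read_run(path: str) -> Iterator[int]:
    """Lazily yield one value per line from a sorted run on disk."""
    with open(path) as f:
        for line in f:
            yield int(line)


def merged(paths: List[str]) -> Iterator[int]:
    """Lazily merge the sorted runs; the full data set is never loaded at once."""
    return heapq.merge(*(read_run(p) for p in paths))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        runs = spill_sorted_runs([5, 3, 9, 1, 7, 2, 8], run_size=3, directory=d)
        for value in merged(runs):
            print(value)
```

The caller only sees an iterator; whether the values behind it live in memory or on disk is an implementation detail of the producer.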
On Sun, Apr 20, 2014 at 11:46 PM, Sandy Ryza
The issue isn't that the Iterator[P] can't be disk-backed. It's that, with
a groupBy, each P is a (Key, Values) tuple, and the entire tuple is read
into memory at once. The ShuffledRDD is agnostic to what goes inside P.
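The distinction drawn above, between a lazy outer iterator and the (Key, Values) elements it yields, can be sketched in plain Python. This is not Spark's ShuffledRDD; the function names are hypothetical and the input is assumed to be already sorted by key. In the first variant each yielded tuple carries a fully built list of values, so one whole group must fit in memory even though the outer iterator is lazy. In the second, the values are themselves a lazy iterator over the underlying stream.

```python
from itertools import groupby
from operator import itemgetter
from typing import Dict, Iterable, Iterator, List, Tuple

Record = Tuple[str, int]


def grouped_materialized(sorted_records: Iterable[Record]) -> Iterator[Tuple[str, List[int]]]:
    """Yield (key, values) where values is a list: a whole group is held in memory."""
    for key, group in groupby(sorted_records, key=itemgetter(0)):
        yield key, [v for _, v in group]


def grouped_streaming(sorted_records: Iterable[Record]) -> Iterator[Tuple[str, Iterator[int]]]:
    """Yield (key, values) where values is a lazy iterator over the same stream.

    The values for one key must be consumed before advancing to the next key.
    """
    for key, group in groupby(sorted_records, key=itemgetter(0)):
        yield key, (v for _, v in group)


def sum_per_key(groups: Iterable[Tuple[str, Iterable[int]]]) -> Dict[str, int]:
    """Consume each group's values as soon as the group is yielded."""
    return {key: sum(values) for key, values in groups}


if __name__ == "__main__":
    records = [("a", 1), ("a", 2), ("b", 10)]
    print(sum_per_key(grouped_materialized(records)))
    print(sum_per_key(grouped_streaming(records)))
```

Both variants give the same per-key sums; they differ only in how much of a single group is resident at once, which is the concern when one key has very many values.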
On Sun, Apr 20, 2014 at 11:36 AM, Mridul Muralidharan