Hello. I've got big RDD(1gb) in yarn cluster. On local machine, which use this cluster I have only 512 mb. I'd like to iterate over values in result RDD on my local machine. I can't use collect(), because it would create too big array locally which more then my heap. I need some iterative way. There is method iterator(), but it requires some additional information, I can't provide. ( http://stackoverflow.com/questions/21698443/best-practice-for-retrieving-big-data-from-rdd-to-local-machine )
-- *Sincerely yoursEgor PakhomovScala Developer, Yandex*