, make the whole
process inconsistent and unstable.
These are just some rough opinions on the immutability of RDDs; a fuller
discussion would make things clearer. Any ideas?
--
From: hequn cheng chenghe...@gmail.com
Sent: 2014/3/25 10:40
To: user@spark.apache.org
Subject: Re: Reply: RDD
Hi hequn, a related question: does that mean the memory usage will double?
Furthermore, if the compute function of an RDD is not idempotent, the RDD
will change while the job is running, is that right?
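
To make the idempotency question concrete, here is a minimal sketch, assuming a local Spark setup. The object name, app name, and the random-number function are made up for illustration and are not from the thread. It shows that an uncached RDD is recomputed from its lineage on every action, so a non-idempotent function can yield different data each time, while cache() keeps the first computed partitions for reuse as long as they are not evicted.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import scala.util.Random

// Hypothetical demo object; names are illustrative only.
object NonIdempotentSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("non-idempotent-sketch")
    val sc = new SparkContext(conf)
    try {
      // Deliberately non-idempotent: every evaluation draws fresh random numbers.
      val noisy = sc.parallelize(1 to 5).map(i => (i, Random.nextLong()))

      // No caching: each action re-runs the lineage, so the two results
      // will almost certainly differ.
      val first = noisy.collect()
      val second = noisy.collect()
      println(s"uncached runs equal: ${first.sameElements(second)}")

      // cache() only marks the RDD for persistence. The next action computes
      // and stores the partitions; later actions reuse them unless evicted.
      noisy.cache()
      val third = noisy.collect()
      val fourth = noisy.collect()
      println(s"cached runs equal: ${third.sameElements(fourth)}")
    } finally {
      sc.stop()
    }
  }
}
```

So the RDD definition itself never changes; what can change is the data produced each time the lineage is re-evaluated.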
-Original Message-
From: hequn cheng chenghe...@gmail.com
Sent: 2014/3/25 9:35
To:
First question:
If you save your modified RDD like this:
points.map(p => { p.y = another_value; p }).collect() or
points.map(p => { p.y = another_value; p }).saveAsTextFile(...)
the modified RDD will be materialized, and this will not use any worker's
memory.
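
As a rough illustration of the point above, here is a minimal sketch, assuming a local Spark setup. The Point case class, the object name, and the concrete values are hypothetical, since the thread does not show how the points are defined. It builds the modified RDD with map() and materializes it with collect(), without calling cache() or persist().

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical record type; not taken from the thread.
case class Point(x: Double, y: Double)

object MaterializeSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("materialize-sketch")
    val sc = new SparkContext(conf)
    try {
      val points = sc.parallelize(Seq(Point(1.0, 2.0), Point(3.0, 4.0)))
      val anotherValue = 0.0 // stand-in for another_value

      // map() is lazy: it records the transformation and computes nothing yet.
      // copy() returns a new Point instead of mutating the input in place.
      val modified = points.map(p => p.copy(y = anotherValue))

      // collect() is an action: it runs the job and returns the results to
      // the driver as a local array. Nothing is cached on the workers
      // because cache()/persist() was never called.
      val result: Array[Point] = modified.collect()
      result.foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Using copy() rather than assigning to a field keeps the input objects untouched, which fits the immutable RDD model discussed in this thread.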
If you have more transformations after the map(), the
-Original Message-
From: hequn cheng chenghe...@gmail.com
Sent: 2014/3/25 10:40
To: user@spark.apache.org user@spark.apache.org
Subject: Re: Reply: RDD usage
First question:
If you save your modified RDD like this:
points.map(p => { p.y = another_value; p }).collect