Re: Reply: Reply: RDD usage

2014-03-29 Thread Chieh-Yen
, make the whole process inconsistent and unstable. These are some rough opinions on the immutability of RDDs; a fuller discussion would make it clearer. Any ideas? -- From: hequn cheng chenghe...@gmail.com Sent: 2014/3/25 10:40 To: user@spark.apache.org Subject: Re: Reply: RDD

Reply: RDD usage

2014-03-24 Thread 林武康
Hi hequn, a related question: does that mean the memory usage will double? Furthermore, if the compute function in an RDD is not idempotent, the RDD will change while the job is running, is that right? -Original Message- From: hequn cheng chenghe...@gmail.com Sent: 2014/3/25 9:35 To:

Re: Reply: RDD usage

2014-03-24 Thread hequn cheng
First question: If you save your modified RDD like this: points.foreach(p => p.y = another_value).collect() or points.foreach(p => p.y = another_value).saveAsTextFile(...) the modified RDD will be materialized and this will not use any worker's memory. If you have more transformations after the map(), the

Reply: Reply: RDD usage

2014-03-24 Thread 林武康
discussion would make it clearer. Any ideas? -Original Message- From: hequn cheng chenghe...@gmail.com Sent: 2014/3/25 10:40 To: user@spark.apache.org user@spark.apache.org Subject: Re: Reply: RDD usage First question: If you save your modified RDD like this: points.foreach(p => p.y = another_value).collect