RE: Possible long lineage issue when using DStream to update a normal RDD

2015-05-08 Thread Shao, Saisai
Subject: Possible long lineage issue when using DStream to update a normal RDD Hi all, Recently in our project we needed to update an RDD using data regularly received from a DStream. I plan to use the foreachRDD API to achieve this: var MyRDD = ... dstream.foreachRDD { rdd => MyRDD = MyRDD.join(rdd

Re: Possible long lineage issue when using DStream to update a normal RDD

2015-05-08 Thread Chunnan Yao
appreciated. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Possible-long-lineage-issue-when-using-DStream-to-update-a-normal-RDD-tp22812.html Sent from the Apache Spark User List mailing list archive at Nabble.com

RE: Possible long lineage issue when using DStream to update a normal RDD

2015-05-08 Thread Shao, Saisai
...@gmail.com] Sent: Friday, May 8, 2015 2:51 PM To: Shao, Saisai Cc: user@spark.apache.org Subject: Re: Possible long lineage issue when using DStream to update a normal RDD Thank you for this suggestion! But may I ask what the advantage of using checkpoint instead of cache is here? Because they both cut lineage
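
For readers following the checkpoint-versus-cache question, here is a minimal sketch of the pattern under discussion. It is not code from the thread. The object `LineageSketch`, the helpers `merge` and `attach`, the `(String, Long)` element type, and the `checkpointEvery` parameter are all hypothetical names chosen for illustration. It assumes Spark 1.3 or later and that `sc.setCheckpointDir(...)` has already been called on the SparkContext.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.dstream.DStream

object LineageSketch {
  // Hypothetical merge step: add each batch's increments to the running state.
  // leftOuterJoin keeps keys that are absent from the current batch.
  def merge(state: RDD[(String, Long)],
            batch: RDD[(String, Long)]): RDD[(String, Long)] =
    state.leftOuterJoin(batch).mapValues {
      case (old, inc) => old + inc.getOrElse(0L)
    }

  // Assumption: sc.setCheckpointDir(...) was called beforehand;
  // otherwise checkpoint() throws an exception.
  def attach(initial: RDD[(String, Long)],
             updates: DStream[(String, Long)],
             checkpointEvery: Int = 10): Unit = {
    var state = initial
    var batches = 0L
    // The body of foreachRDD runs on the driver once per batch interval.
    updates.foreachRDD { rdd =>
      val previous = state
      val next = merge(previous, rdd).cache()
      batches += 1
      // checkpoint() must be requested before the first action on the RDD.
      if (batches % checkpointEvery == 0) next.checkpoint()
      // Run an action so the cache is filled and, if requested,
      // the checkpoint is written.
      next.count()
      previous.unpersist()
      state = next
    }
  }
}
```

In general Spark terms, `cache()` only keeps computed partitions in memory or on local disk. The RDD still remembers its full lineage, and an evicted or lost partition is recomputed from it. `checkpoint()` writes the data to the checkpoint directory and then drops the parent dependencies, so the lineage stops growing with each batch. Calling both, as in the sketch, avoids computing the RDD a second time when the checkpoint is written.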

Possible long lineage issue when using DStream to update a normal RDD

2015-05-07 Thread yaochunnan