Re: Does RDD checkpointing store the entire state in HDFS?

2015-07-14 Thread swetha
OK. Thanks a lot TD. -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Does-RDD-checkpointing-store-the-entire-state-in-HDFS-tp7368p13231.html Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

Re: Does RDD checkpointing store the entire state in HDFS?

2015-07-14 Thread swetha
or do I have to use any code like ssc.checkpoint(checkpointDir)? Also, how is the performance if I use both DStream checkpointing for maintaining the state and the Kafka Direct approach for exactly-once semantics? Thanks, Swetha

Regarding sessionization with updateStateByKey

2015-07-14 Thread swetha
, Swetha

Regarding master node failure

2015-07-07 Thread swetha
Hi, What happens if a master node fails in the case of Spark Streaming? Would the data be lost in that case? Thanks, Swetha

Data interaction between various RDDs in Spark Streaming

2015-07-07 Thread swetha
Hi, Suppose I want the data to be grouped by an Id named 12345. I have a certain amount of data for 12345 coming out of one batch, and more data related to 12345 arriving 5 hours later. How do I group by 12345 and end up with a single RDD containing the list? Thanks, Swetha
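The question above is about carrying per-key data across batches, which is the kind of problem updateStateByKey (mentioned in the sessionization thread) is meant for. The following is only a rough sketch of that idea in plain Python, not Spark code: `update_state`, `apply_batch`, and the sample batches are hypothetical names and data made up for illustration. It simplifies the real behaviour by only touching keys that appear in the current batch.

```python
from collections import defaultdict


def update_state(new_values, previous_state):
    """Same shape as an updateStateByKey update function:
    append this batch's values for one key to that key's running list."""
    return (previous_state or []) + list(new_values)


def apply_batch(state, batch):
    """Fold one batch of (key, value) pairs into the per-key state."""
    grouped = defaultdict(list)
    for key, value in batch:
        grouped[key].append(value)

    new_state = dict(state)  # keys absent from this batch keep their old state
    for key, values in grouped.items():
        new_state[key] = update_state(values, state.get(key))
    return new_state


# Hypothetical batches: Id 12345 appears in the first batch and again much later.
batches = [
    [("12345", "a"), ("12345", "b"), ("67890", "x")],
    [("67890", "y")],
    [("12345", "c")],
]

state = {}
for batch in batches:
    state = apply_batch(state, batch)

print(state["12345"])  # ['a', 'b', 'c']
```

In Spark Streaming itself, the state would live in a DStream rather than a dict, and stateful operations of this kind need a checkpoint directory set via ssc.checkpoint(checkpointDir). Whether an ever-growing list per key is acceptable over a 5-hour window depends on data volume, which the question does not state.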