OK. Thanks a lot, TD.
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/Does-RDD-checkpointing-store-the-entire-state-in-HDFS-tp7368p13231.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
or do I have
to use code like ssc.checkpoint(checkpointDir)? Also, how does performance
hold up if I use both DStream checkpointing to maintain the state
and the Kafka direct approach for exactly-once semantics?
Thanks,
Swetha
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/Regarding-sessionization-with-updateStateByKey-tp13226.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com
Hi,
What happens if a master node fails in the case of Spark Streaming? Would
the data be lost in that case?
Thanks,
Swetha
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/Regarding-master-node-failure-tp13055.html
Hi,
Suppose I want the data to be grouped by an ID, say 12345. A certain
amount of data for 12345 arrives in one batch, and more data related to
12345 arrives 5 hours later. How do I group by 12345 across those batches
and end up with a single RDD holding the combined list?
Thanks,
Swetha
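
For what it's worth, here is a rough sketch of the idea behind keeping per-key state across batches, which is what updateStateByKey is meant for. It is plain Python and does not use Spark at all: run_batches is a hypothetical stand-in for the streaming engine, and the event names are made up. Only the shape of append_events, (new values for the key, previous state) -> new state, follows the update function that updateStateByKey expects. In a real job the state would live in a DStream and a checkpoint directory set with ssc.checkpoint(checkpointDir) would be required.

```python
from typing import Dict, List, Optional, Tuple


def append_events(new_values: List[str], state: Optional[List[str]]) -> List[str]:
    """Update function: previous list for the key plus this batch's values."""
    return (state or []) + list(new_values)


def run_batches(batches: List[List[Tuple[str, str]]]) -> Dict[str, List[str]]:
    """Hypothetical stand-in for the engine: applies append_events per key, batch by batch."""
    state: Dict[str, List[str]] = {}
    for batch in batches:
        # Group this batch's values by key.
        grouped: Dict[str, List[str]] = {}
        for key, value in batch:
            grouped.setdefault(key, []).append(value)
        # Call the update function for every key seen so far, even if the
        # current batch has no new values for it.
        for key in set(state) | set(grouped):
            state[key] = append_events(grouped.get(key, []), state.get(key))
    return state


# Made-up batches: key 12345 shows up in the first batch and again much later.
batches = [
    [("12345", "login"), ("67890", "login")],
    [("67890", "click")],
    [("12345", "purchase")],
]
state = run_batches(batches)
print(state["12345"])
```

The point is that the list for 12345 is carried forward between batches, so data arriving hours apart still ends up under the same key. State that grows without bound per key would need some expiry rule, which this sketch leaves out.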