Get Spark Streaming timestamp

2014-07-23 Thread Bill Jay
Hi all, I have a question regarding Spark Streaming. When we use the saveAsTextFiles function and my batch interval is 60 seconds, Spark generates a series of files such as: result-140614896, result-140614802, result-140614808, etc. I think this is the timestamp for the beginning of each
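
As an illustration of the naming pattern described above, here is a minimal sketch of reading the numeric suffix off an output name. It assumes the suffix is the batch time in milliseconds since the Unix epoch and that no file suffix was passed to saveAsTextFiles. The name `result-1406148960000` and the helper `batch_time_from_name` are hypothetical, chosen only for the example.

```python
from datetime import datetime, timezone


def batch_time_from_name(name: str) -> datetime:
    """Parse the trailing '-<milliseconds>' of an output name into a UTC datetime."""
    # Split on the last '-' so that prefixes containing dashes still work.
    _, _, suffix = name.rpartition("-")
    millis = int(suffix)  # assumption: the suffix is epoch time in milliseconds
    return datetime.fromtimestamp(millis / 1000.0, tz=timezone.utc)


# Hypothetical output name, not taken from a real job.
print(batch_time_from_name("result-1406148960000").isoformat())
```

Whether that instant is the start or the end of the batch is a separate question, which the replies take up.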

Re: Get Spark Streaming timestamp

2014-07-23 Thread Tobias Pfeiffer
Bill, Spark Streaming's DStream provides overloaded methods for transform() and foreachRDD() that allow you to access the timestamp of a batch: http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.streaming.dstream.DStream I think the timestamp is the end of the batch, not

Re: Get Spark Streaming timestamp

2014-07-23 Thread Bill Jay
Hi Tobias, It seems this parameter is an input to the function. What I am expecting is output from a function that tells me the starting or ending time of the batch. For instance, if I use saveAsTextFiles, it seems DStream will generate a batch every minute and the starting time is a complete
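
One way to get both ends of a batch from a single batch time is plain arithmetic on the batch interval. The sketch below assumes the batch time marks the end of the interval, as suggested earlier in the thread; if it turns out to mark the start, the subtraction becomes an addition. The helper `batch_window` is hypothetical. In a Spark job, the millisecond value would come from the Time argument that the two-argument foreachRDD or transform callback receives.

```python
from datetime import datetime, timedelta, timezone


def batch_window(batch_time_ms: int, batch_interval_s: int):
    """Return (start, end) of a batch as UTC datetimes.

    Assumption: batch_time_ms is the END of the batch interval.
    """
    end = datetime.fromtimestamp(batch_time_ms / 1000.0, tz=timezone.utc)
    start = end - timedelta(seconds=batch_interval_s)
    return start, end


# Hypothetical batch time with a 60-second batch interval.
start, end = batch_window(1406148960000, 60)
print(start.isoformat(), end.isoformat())
```

Because the callback receives the time as an argument, it is an input from the callback's point of view, but the code inside the callback can treat it as the value being asked for here.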