Hi Tobias,

It seems this parameter is an input to the function. What I am expecting is
output from a function that tells me the starting or ending time of the
batch. For instance, if I use saveAsTextFiles with a 60-second batch size,
DStream seems to generate a batch every minute, and each batch's start time
falls on a whole-minute boundary. Thanks!
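
For concreteness, here is a rough sketch of what I have in mind, based on
the foreachRDD overload you pointed to. The socket source, application name,
and 60-second batch duration below are just placeholders, and I am assuming
the Time value marks the end of the batch as you describe:

  import org.apache.spark.SparkConf
  import org.apache.spark.streaming.{Seconds, StreamingContext, Time}

  object BatchTimeExample {
    def main(args: Array[String]): Unit = {
      val batchDuration = Seconds(60)
      val ssc = new StreamingContext(
        new SparkConf().setAppName("BatchTimeExample"), batchDuration)

      // Placeholder input; substitute whatever source the real job reads.
      val lines = ssc.socketTextStream("localhost", 9999)

      // foreachRDD has an overload that hands the batch Time to the closure
      // along with the RDD. If the Time marks the end of the batch, the
      // start can be recovered by subtracting the batch duration.
      lines.foreachRDD { (rdd, time: Time) =>
        val batchEndMs = time.milliseconds
        val batchStartMs = batchEndMs - batchDuration.milliseconds
        // Reuse the timestamp in the output path, like saveAsTextFiles does.
        rdd.saveAsTextFile(s"result-$batchStartMs")
      }

      ssc.start()
      ssc.awaitTermination()
    }
  }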

Bill


On Wed, Jul 23, 2014 at 6:56 PM, Tobias Pfeiffer <t...@preferred.jp> wrote:

> Bill,
>
> Spark Streaming's DStream provides overloaded methods for transform() and
> foreachRDD() that allow you to access the timestamp of a batch:
>
> http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.streaming.dstream.DStream
>
> I think the timestamp is the end of the batch, not the beginning. For
> example, I compute the runtime by taking the difference between now() and
> the time I get as a parameter in foreachRDD().
>
> Tobias
>
>
>
> On Thu, Jul 24, 2014 at 6:39 AM, Bill Jay <bill.jaypeter...@gmail.com>
> wrote:
>
>> Hi all,
>>
>> I have a question regarding Spark Streaming. When I use the
>> saveAsTextFiles function with a 60-second batch size, Spark generates a
>> series of files such as:
>>
>> result-1406148960000, result-1406149020000, result-1406149080000, etc.
>>
>> I think this is the timestamp for the beginning of each batch. How can we
>> extract this timestamp and use it in our code? Thanks!
>>
>> Bill
>>
>
>
