Hi Sargun,

There have been a few discussions on the list recently about this topic. The
short answer is that this is not supported at the moment.
The following thread is particularly good, as it discusses the current state
and limitations:

http://apache-spark-developers-list.1001551.n3.nabble.com/brainsotrming-Generalization-of-DStream-a-ContinuousRDD-td7349.html
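
In the meantime, one possible workaround is to stop relying on the batch boundaries for windowing and instead key each event by a window derived from its own timestamp, then aggregate per key. Below is a minimal plain-Python sketch of that bucketing idea. It is not Spark API: the function names, the 60-second window length, and the (timestamp, payload) event shape are all assumptions made for illustration.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # hypothetical tumbling-window length, in seconds


def window_start(event_ts, window_seconds=WINDOW_SECONDS):
    """Return the start (epoch seconds) of the tumbling window containing event_ts."""
    return event_ts - (event_ts % window_seconds)


def count_by_event_window(events, window_seconds=WINDOW_SECONDS):
    """Count events per window, using each event's own timestamp.

    events: iterable of (event_ts, payload) pairs, where event_ts is the
    time the event happened, not the time it was received.
    """
    counts = defaultdict(int)
    for event_ts, _payload in events:
        counts[window_start(event_ts, window_seconds)] += 1
    return dict(counts)


# Sample events, including one that arrives out of order ("late").
events = [(120, "a"), (125, "b"), (185, "c"), (119, "late")]
result = count_by_event_window(events)
```

Inside a streaming job, the same keying could be applied to every batch, with the per-window results merged across batches. An event that arrives late, like the last sample record, simply lands in an older window, so the job has to decide how long it keeps old windows open.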

-kr, Gerard.


On Wed, Jul 16, 2014 at 9:56 AM, Sargun Dhillon <sar...@sargun.me> wrote:

> Does anyone here have a way to do Spark Streaming with external timing
> for windows? Right now, it relies on the wall clock of the driver to
> determine the amount of time that each batch read lasts.
>
> We have Kafka and HDFS ingress into our Spark Streaming pipeline,
> where the events are annotated with the timestamps at which they
> happened (in real time). We would like to keep our windows based on
> those timestamps, as opposed to the driver time.
>
> Does anyone have any ideas how to do this?
>
