Spark Streaming, external windowing?

2014-07-16 Thread Sargun Dhillon
Does anyone here have a way to do Spark Streaming with external timing for windows? Right now, it relies on the wall clock of the driver to determine the amount of time that each batch read lasts. We have a Kafka, and HDFS ingress into our Spark Streaming pipeline where the events are annotated

Re: Spark Streaming, external windowing?

2014-07-16 Thread Gerard Maas
Hi Sargun, There have been few discussions on the list recently about the topic. The short answer is that this is not supported at the moment. This is a particularly good thread as it discusses the current state and limitations:

Re: Spark Streaming, external windowing?

2014-07-16 Thread Tathagata Das
One way to do that is currently possible is given here http://mail-archives.apache.org/mod_mbox/spark-user/201407.mbox/%3CCAMwrk0=b38dewysliwyc6hmze8tty8innbw6ixatnd1ue2-...@mail.gmail.com%3E On Wed, Jul 16, 2014 at 1:16 AM, Gerard Maas gerard.m...@gmail.com wrote: Hi Sargun, There have