Yes, and as far as I recall it also has (empty) partitions, which screws up the isEmpty call if the RDD has been transformed further down the line. I will have a look tomorrow at the office and see if I can contribute.

On 11 Feb 2016 9:14 p.m., "Shixiong(Ryan) Zhu" <shixi...@databricks.com> wrote:
> Yeah, DirectKafkaInputDStream always returns an RDD even if it's empty.
> Feel free to send a PR to improve it.
>
> On Thu, Feb 11, 2016 at 1:09 PM, Sebastian Piu <sebastian....@gmail.com>
> wrote:
>
>> I'm using the Kafka direct stream API but I can have a look at extending
>> it to have this behaviour
>>
>> Thanks!
>> On 11 Feb 2016 9:07 p.m., "Shixiong(Ryan) Zhu" <shixi...@databricks.com>
>> wrote:
>>
>>> Are you using a custom input dstream? If so, you can make the `compute`
>>> method return None to skip a batch.
>>>
>>> On Thu, Feb 11, 2016 at 1:03 PM, Sebastian Piu <sebastian....@gmail.com>
>>> wrote:
>>>
>>>> I was wondering if there is any way to skip batches with zero
>>>> events when streaming?
>>>> By skip I mean avoid the empty RDD from being created at all?
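
For anyone reading this thread later, the suggestion above (have `compute` return None so that no batch is produced) can be sketched roughly as follows. This is a plain-Python toy model, not Spark code: `SkippingInputStream`, `run_batches` and `poll` are hypothetical names made up for illustration, and the scheduler loop is only an assumption about how a missing batch would be treated. In Spark itself, `compute` on a DStream returns an Option of an RDD.

```python
from typing import Callable, List, Optional

Batch = List[str]


class SkippingInputStream:
    """Toy stand-in for a custom input stream (hypothetical, not a Spark class)."""

    def __init__(self, poll: Callable[[int], Batch]):
        # poll is a hypothetical callback that fetches the events for a batch time.
        self.poll = poll

    def compute(self, batch_time: int) -> Optional[Batch]:
        events = self.poll(batch_time)
        # Return None instead of an empty batch to signal "nothing for this interval".
        return events if events else None


def run_batches(stream, batch_times, handler):
    """Toy scheduler: calls handler only for intervals that produced a batch."""
    processed = []
    for t in batch_times:
        batch = stream.compute(t)
        if batch is None:
            continue  # skipped: no batch object is created, handler is not called
        handler(t, batch)
        processed.append(t)
    return processed


# Intervals 1 and 3 carry no events.
source = {0: ["a", "b"], 1: [], 2: ["c"], 3: []}
stream = SkippingInputStream(lambda t: source.get(t, []))
seen = []
processed = run_batches(stream, range(4), lambda t, b: seen.append((t, len(b))))
print(processed)
```

The point of the sketch is only that the emptiness decision is made once, at the source, so downstream code never has to call something like isEmpty on an already transformed batch.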