Hi all,
I am using a direct Kafka input stream in my Spark app. When I use the
window(...) function in the chain, the processing pipeline stops: when I
open the Spark UI, I can see that streaming batches keep being queued
while the pipeline reports that it is still processing one of the first
batches.
To be more precise: the issue happens only when the windows overlap (i.e.
when sliding_interval < window_length). Otherwise the system behaves as
expected. Derivatives of the window(..) function, such as
reduceByKeyAndWindow(..), also work as expected and the pipeline doesn't
stop. The same applies when using a different type of stream.
Is this a known limitation of the window(..) function when used with a
direct Kafka input stream?
Java pseudo code:
// batch interval is shorter than the window, so successive windows overlap
org.apache.spark.streaming.kafka.DirectKafkaInputDStream s; // from KafkaUtils.createDirectStream(...)
s.window(Durations.seconds(10)).print(); // the pipeline stops here
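For context, here is a minimal self-contained driver sketching the setup I mean, using the Spark 1.x spark-streaming-kafka direct API. The broker address, topic name, and batch/window durations are my assumptions for illustration; only the window length shorter than twice the batch interval (i.e. overlapping windows) matters for the issue:

```java
// Hypothetical repro sketch (assumed broker "localhost:9092" and topic "test").
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

import kafka.serializer.StringDecoder;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

public class WindowRepro {
  public static void main(String[] args) throws InterruptedException {
    SparkConf conf = new SparkConf().setAppName("window-repro").setMaster("local[2]");
    // 5 s batches: shorter than the 10 s window below, so windows overlap
    JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(5));

    Map<String, String> kafkaParams = new HashMap<>();
    kafkaParams.put("metadata.broker.list", "localhost:9092"); // assumption
    Set<String> topics = Collections.singleton("test");        // assumption

    // Direct (receiver-less) Kafka stream
    JavaPairInputDStream<String, String> stream = KafkaUtils.createDirectStream(
        jssc, String.class, String.class,
        StringDecoder.class, StringDecoder.class,
        kafkaParams, topics);

    // window length 10 s with the default slide (= 5 s batch interval),
    // i.e. sliding_interval < window_length -> overlapping windows
    stream.window(Durations.seconds(10)).print();

    jssc.start();
    jssc.awaitTermination();
  }
}
```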
Thanks
Martin