Sorry, I should have mentioned that Spark only seems reluctant to take the
last windowed, groupBy batch from Kafka when using OutputMode.Append.

I've asked on StackOverflow:
https://stackoverflow.com/questions/62915922/spark-structured-streaming-wont-pull-the-final-batch-from-kafka
but am still struggling. Can anybody please help?

How do people test their SSS code if they have to put an extra message on
Kafka just to get Spark to consume the previous batch?
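For what it's worth, here is my mental model of what seems to be happening, as a toy sketch only. None of these names are Spark APIs; it just mimics the documented Append-mode rule that a window is emitted only once the watermark passes its end, and the watermark only advances when newer events arrive:

```python
# Toy model of Append-mode windowed aggregation (illustrative only,
# not Spark code). A window is emitted only after the watermark
# passes its end; the watermark only moves when *new* events arrive,
# so the final window sits unemitted until a later message lands.

WINDOW = 10   # window length, seconds
DELAY = 5     # allowed lateness (cf. withWatermark), seconds

def window_of(ts):
    start = (ts // WINDOW) * WINDOW
    return (start, start + WINDOW)

def run(batches):
    """Feed micro-batches of event timestamps; return (emitted, pending) windows."""
    watermark = 0
    pending = set()
    emitted = []
    for batch in batches:
        for ts in batch:
            pending.add(window_of(ts))
        if batch:
            watermark = max(watermark, max(batch) - DELAY)
        # Append mode: only emit windows the watermark has passed.
        done = sorted(w for w in pending if w[1] <= watermark)
        emitted.extend(done)
        pending -= set(done)
    return emitted, sorted(pending)

emitted, stuck = run([[1, 3, 12], [14, 18]])
print(emitted)  # [(0, 10)]  - watermark reached 13, past that window's end
print(stuck)    # [(10, 20)] - the last window waits for a later message
```

If this model is right, the last window can only ever be flushed by a subsequent message advancing the watermark, which is exactly the behaviour I'm seeing in tests.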

Kind regards,

Phillip


On Sun, Jul 12, 2020 at 4:55 PM Phillip Henry <londonjava...@gmail.com>
wrote:

> Hi, folks.
>
> I noticed that SSS won't process a waiting batch if there are no batches
> after that. To put it another way, Spark must always leave one batch on
> Kafka waiting to be consumed.
>
> There is a JIRA for this at:
>
> https://issues.apache.org/jira/browse/SPARK-24156
>
> that says it's resolved in 2.4.0, but my code
> <https://github.com/PhillHenry/SSSPlayground/blob/Spark2/src/test/scala/uk/co/odinconsultants/sssplayground/windows/TimestampedStreamingSpec.scala>
> is using 2.4.2, yet I still see Spark reluctant to consume another batch
> from Kafka if that would leave nothing else waiting to be processed in the
> topic.
>
> Do I have to do something special to exploit the behaviour that
> SPARK-24156 says it has addressed?
>
> Regards,
>
> Phillip
>