Re: Checkpoints not cleaned using Spark streaming + watermarking + kafka

2017-09-22 Thread MathieuP
The setting expected to clean up these files is spark.sql.streaming.minBatchesToRetain. More info on Structured Streaming settings: https://github.com/jaceklaskowski/spark-structured-streaming-book/blob/master/spark-sql-streaming-properties.adoc
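
For illustration, here is a minimal PySpark sketch of how that setting could be applied when the session is created. The value 10, the application name, and the helper names streaming_conf and build_session are assumptions made for this example, not taken from the thread; the right number depends on how many batches you need to keep recoverable.

```python
# Hypothetical retention value chosen for this example only.
MIN_BATCHES_TO_RETAIN = 10


def streaming_conf(min_batches=MIN_BATCHES_TO_RETAIN):
    """Return the Spark conf entry that bounds how many batches are retained."""
    if min_batches < 1:
        raise ValueError("min_batches must be at least 1")
    # Spark conf values are passed as strings.
    return {"spark.sql.streaming.minBatchesToRetain": str(min_batches)}


def build_session(app_name="checkpoint-cleanup-demo"):
    """Create a SparkSession with the retention setting applied."""
    # Imported lazily so streaming_conf() can be used without a Spark install.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in streaming_conf().items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

The same key can also be passed on the command line, for example with spark-submit --conf spark.sql.streaming.minBatchesToRetain=10, which avoids hard-coding it in the application.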

Checkpoints not cleaned using Spark streaming + watermarking + kafka

2017-09-21 Thread MathieuP
Hi Spark users! :) I have a question about checkpoints. I have a streaming application that consumes from and produces to Kafka. The computation requires a window and watermarking. Since this is a streaming application with a Kafka output, a checkpoint is expected. The application runs