Re: structured streaming - checkpoint metadata growing indefinitely

2022-05-04 Thread Wojciech Indyk
For posterity: the problem was the FileStreamSourceLog class. I needed to override the method shouldRetain, which by default returns true and whose doc says: "Default implementation retains all log entries. Implementations should override the method to change the behavior." -- Kind regards/ Pozdrawiam,
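For readers landing here later, a minimal sketch (Scala) of what such an override could look like. FileStreamSourceLog and FileStreamSource.FileEntry are Spark-internal classes; the constructor arguments and the shouldRetain(entry, currentTime) signature shown here are assumptions to verify against your Spark version, and retentionMs is a hypothetical knob, not something from the thread:

  import org.apache.spark.sql.SparkSession
  import org.apache.spark.sql.execution.streaming.{FileStreamSource, FileStreamSourceLog}

  // Metadata log that expires old file entries instead of retaining everything.
  // retentionMs is hypothetical: entries older than this are dropped at compaction.
  class ExpiringFileStreamSourceLog(
      metadataLogVersion: Int,
      sparkSession: SparkSession,
      path: String,
      retentionMs: Long)
    extends FileStreamSourceLog(metadataLogVersion, sparkSession, path) {

    // The default implementation returns true for every entry, so each compact
    // file carries the full history forward; here, entries outside the retention
    // window are dropped instead.
    override def shouldRetain(entry: FileStreamSource.FileEntry, currentTime: Long): Boolean =
      currentTime - entry.timestamp <= retentionMs
  }

Because FileStreamSourceLog is instantiated internally by the file stream source, applying an override like this in practice usually means patching the class in your own Spark build rather than plugging it in through configuration.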

Re: structured streaming - checkpoint metadata growing indefinitely

2022-04-30 Thread Wojciech Indyk
Hi Gourav! I use stateless processing, no watermarking, no aggregations. I don't want any data loss, so changing the checkpoint location is not an option for me. -- Kind regards/ Pozdrawiam, Wojciech Indyk On Fri, 29 Apr 2022 at 11:07, Gourav Sengupta wrote: > Hi, > > this may not solve the

Re: structured streaming - checkpoint metadata growing indefinitely

2022-04-29 Thread Gourav Sengupta
Hi, this may not solve the problem, but have you tried to stop the job gracefully and then restart it without much delay, pointing to a new checkpoint location? The approach will have certain uncertainties for scenarios where the source system can lose data, or where we do not expect duplicates to be
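A minimal sketch (Scala) of that workaround, with hypothetical bucket names, formats and schema; as noted above, abandoning the old checkpoint can lose data or introduce duplicates if the source cannot replay from where the old checkpoint left off:

  import org.apache.spark.sql.SparkSession

  object RestartWithNewCheckpoint {
    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder()
        .appName("restart-with-new-checkpoint")
        .getOrCreate()

      val input = spark.readStream
        .format("json")
        .schema("id LONG, payload STRING")   // hypothetical schema
        .load("s3a://my-bucket/input/")

      input.writeStream
        .format("parquet")
        .option("path", "s3a://my-bucket/output/")
        // Fresh checkpoint directory; the old, oversized one is simply abandoned.
        .option("checkpointLocation", "s3a://my-bucket/checkpoints/job-v2/")
        .start()
        .awaitTermination()
    }
  }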

Re: structured streaming - checkpoint metadata growing indefinitely

2022-04-29 Thread Wojciech Indyk
Update for the scenario of deleting compact files: the job recovers from the most recent (not compacted) checkpoint file, but when the next checkpoint compaction happens it fails with a missing recent compaction file. I use Spark 3.1.2. -- Kind regards/ Pozdrawiam, Wojciech Indyk On Fri, 29 Apr 2022 at 07:00

structured streaming - checkpoint metadata growing indefinitely

2022-04-28 Thread Wojciech Indyk
Hello! I use Spark Structured Streaming. I need to use S3 for storing checkpoint metadata (I know it's not an optimal store for checkpoint metadata). The compaction interval is 10 (the default) and I set "spark.sql.streaming.minBatchesToRetain"=5. After the job had been running for a few weeks, checkpointing
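For context, a minimal spark-shell-style sketch (Scala) of such a setup, assuming the compaction interval mentioned above is the file source metadata log interval (spark.sql.streaming.fileSource.log.compactInterval, default 10); bucket names, formats and paths are placeholders:

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder()
    .appName("s3-checkpoint-example")
    // File source metadata log is compacted every 10 micro-batches (the default).
    .config("spark.sql.streaming.fileSource.log.compactInterval", "10")
    // Keep checkpoint metadata for only the last 5 batches.
    .config("spark.sql.streaming.minBatchesToRetain", "5")
    .getOrCreate()

  val input = spark.readStream
    .format("text")
    .load("s3a://my-bucket/landing/")

  input.writeStream
    .format("parquet")
    .option("path", "s3a://my-bucket/output/")
    // Checkpoint metadata kept on S3, as described above.
    .option("checkpointLocation", "s3a://my-bucket/checkpoints/my-job/")
    .start()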