Re: structured streaming - checkpoint metadata growing indefinitely

2022-04-29 Thread Gourav Sengupta
Hi, this may not solve the problem, but have you tried stopping the job gracefully and then restarting it without much delay, pointing it to a new checkpoint location? This approach carries some uncertainty in scenarios where the source system can lose data, or we do not expect duplicates to be
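
A minimal PySpark-style sketch of the stop-and-restart suggestion above, under stated assumptions. "Gracefully" is interpreted here as waiting until no micro-batch is executing before calling stop(), which is only one possible reading, and the check is not race-free. The helper names, the parquet sink and the checkpoint naming scheme are hypothetical and not taken from the thread. A new checkpoint location discards the stored offsets, so what gets reprocessed or skipped depends on the source.

```python
import time
import uuid


def stop_between_batches(query, poll_seconds=1.0, timeout_seconds=300.0):
    """Wait until no micro-batch is running (or the timeout expires), then stop.

    `query` is expected to behave like a PySpark StreamingQuery: `status` is a
    dict containing "isTriggerActive", and `stop()` ends the query.
    """
    deadline = time.monotonic() + timeout_seconds
    while query.status["isTriggerActive"] and time.monotonic() < deadline:
        time.sleep(poll_seconds)
    query.stop()


def new_checkpoint_path(checkpoint_root):
    """Build a fresh, unused checkpoint directory name (hypothetical scheme)."""
    return f"{checkpoint_root.rstrip('/')}/checkpoint-{uuid.uuid4().hex}"


def restart_with_fresh_checkpoint(stream_df, output_path, checkpoint_root):
    """Start the same streaming write again against a new checkpoint location.

    `stream_df` is a streaming DataFrame; the parquet sink is an assumption.
    """
    checkpoint = new_checkpoint_path(checkpoint_root)
    query = (
        stream_df.writeStream
        .format("parquet")
        .option("checkpointLocation", checkpoint)
        .start(output_path)
    )
    return query, checkpoint
```

Keeping each restart in its own directory under a common root leaves the old checkpoint in place, so it can be inspected or removed separately once the new query is confirmed healthy.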

Re: structured streaming - checkpoint metadata growing indefinitely

2022-04-29 Thread Wojciech Indyk
Update on the scenario of deleting compact files: the job recovers from the most recent (non-compacted) checkpoint file, but when it reaches the next checkpoint compaction, it fails because the most recent compact file is missing. I use Spark 3.1.2. -- Kind regards, Wojciech Indyk. On Fri, 29 Apr 2022 at 07:00
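
The behaviour described above can be illustrated with a small plain-Python model of a compactible metadata log. It is a simplified sketch and has not been verified against Spark 3.1.2. It assumes that a batch is a compaction batch when (batch_id + 1) is a multiple of the compact interval, that the interval is 10, and that a compaction batch reads the previous compact file plus every delta file written since. The function names are hypothetical.

```python
COMPACT_INTERVAL = 10  # assumed interval; check the job's actual configuration


def is_compaction_batch(batch_id, interval=COMPACT_INTERVAL):
    """True if this batch writes a compact file in the simplified model."""
    return (batch_id + 1) % interval == 0


def files_read_to_compact(batch_id, interval=COMPACT_INTERVAL):
    """Log files a compaction batch reads: the previous compact file (if any)
    followed by the delta files of the batches written after it."""
    if not is_compaction_batch(batch_id, interval):
        raise ValueError(f"batch {batch_id} is not a compaction batch")
    first = max(batch_id - interval, 0)
    return [
        f"{b}.compact" if is_compaction_batch(b, interval) else str(b)
        for b in range(first, batch_id)
    ]


def missing_for_compaction(batch_id, existing_files, interval=COMPACT_INTERVAL):
    """Files the compaction needs that are absent from `existing_files`."""
    return [
        name
        for name in files_read_to_compact(batch_id, interval)
        if name not in existing_files
    ]


# Scenario: 9.compact was deleted by hand, while delta files 10..18 still exist.
remaining = {str(b) for b in range(10, 19)}
print(missing_for_compaction(19, remaining))  # ['9.compact']
```

In this model, batches 10 to 18 only append delta files, so the job keeps running after the deletion. The failure appears at batch 19, when the compaction step needs 9.compact again.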