Re: [Structured Streaming] SST file does not exist. Race condition corrupting state store

2025-08-25 Thread Siying Dong
Thanks. I think a relatively simple fix can be to include the zip file's modification time in the filtering condition too. If the SST's modification timestamp is earlier than any version x's zip file modification time, it is kept. Thanks, Siying On Mon, Aug 25, 2025 at 11:29 AM Pedro Miguel Duar
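To make the proposed timestamp rule concrete, here is a minimal sketch in Python. It is not Spark's actual cleanup code. The names `RemoteFile`, `keep_by_zip_mtime` and `split_by_zip_mtime` are hypothetical, and the sketch assumes that "earlier than any version x's zip file modification time" means "earlier than at least one version zip's modification time". It covers only the timestamp part of the condition. Files that this rule does not keep are left to whatever filtering conditions already exist.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class RemoteFile:
    """Hypothetical stand-in for a file listed in the checkpoint directory."""
    name: str
    mtime_ms: int  # modification time in milliseconds since the epoch


def keep_by_zip_mtime(sst: RemoteFile, version_zips: Iterable[RemoteFile]) -> bool:
    """Return True when the SST was modified before at least one version zip.

    Assumption: "any" is read as "at least one". This is an interpretation
    of the proposal, not a confirmed implementation detail.
    """
    return any(sst.mtime_ms < z.mtime_ms for z in version_zips)


def split_by_zip_mtime(
    ssts: Iterable[RemoteFile], version_zips: Iterable[RemoteFile]
) -> Tuple[List[RemoteFile], List[RemoteFile]]:
    """Split SST files into (kept by the timestamp rule, left to other rules)."""
    zips = list(version_zips)
    kept: List[RemoteFile] = []
    undecided: List[RemoteFile] = []
    for sst in ssts:
        if keep_by_zip_mtime(sst, zips):
            kept.append(sst)
        else:
            undecided.append(sst)
    return kept, undecided


if __name__ == "__main__":
    zips = [RemoteFile("5.zip", 200)]
    ssts = [RemoteFile("000001.sst", 100), RemoteFile("000002.sst", 300)]
    kept, undecided = split_by_zip_mtime(ssts, zips)
    print([f.name for f in kept], [f.name for f in undecided])
```

Comparing against listed modification times keeps the check cheap, since it needs only the directory listing and not the contents of each zip.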

Re: [Structured Streaming] SST file does not exist. Race condition corrupting state store

2025-08-25 Thread Siying Dong
I suspect that this problem will be mitigated with checkpoint structure V2 (https://issues.apache.org/jira/browse/SPARK-49374, https://github.com/apache/spark/blob/bc36a7db43f287af536bb2767d7d9f1d70bc799f/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L2656). The motivatio

Re: RocksDB State Store Checkpoint Format Improvement

2024-08-27 Thread Siying Dong
of the documentation explains how this can happen. The other links are correct. Thanks, Siying On Tue, Aug 27, 2024 at 7:21 PM Siying Dong wrote: > Hi, > > We are planning to work on a feature that improves how the RocksDB State > Store organizes checkpoint versions in cloud storag

RocksDB State Store Checkpoint Format Improvement

2024-08-27 Thread Siying Dong
Hi, We are planning to work on a feature that improves how the RocksDB State Store organizes checkpoint versions in cloud storage. This improvement aims to: - Address some unexpected query results (see the Appendix