Jerry Zheng created SPARK-55058:
-----------------------------------
Summary: Throw an error if the /metadata file is not present, but
offset or commit directories are non-empty
Key: SPARK-55058
URL: https://issues.apache.org/jira/browse/SPARK-55058
Project: Spark
Issue Type: Task
Components: Structured Streaming
Affects Versions: 4.1.1
Reporter: Jerry Zheng
The {{metadata}} file holds the streaming query ID and should exist whenever
the commit and offset directories are non-empty. If it is missing, downstream
results can contain duplicates and be incorrect when using DeltaSource, which
uses the streaming query ID to deduplicate commits for the same batch. If the
metadata file is missing but commit or offset files are present, we should
throw an error, because the checkpoint is in an inconsistent state.
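
A minimal sketch of the proposed check, written in Python against a local filesystem for illustration only. It is not Spark's implementation: the function name validate_checkpoint and the exception InconsistentCheckpointError are hypothetical, and the layout (a metadata file next to offsets/ and commits/ directories under the checkpoint root) is assumed.

```python
from pathlib import Path


class InconsistentCheckpointError(Exception):
    """Hypothetical error type for this sketch; not a Spark class."""


def _has_entries(directory: Path) -> bool:
    # A missing directory is treated the same as an empty one.
    return directory.is_dir() and any(directory.iterdir())


def validate_checkpoint(checkpoint_dir: str) -> None:
    """Raise if offsets/ or commits/ has entries but metadata is absent."""
    root = Path(checkpoint_dir)
    # Assumed layout: <root>/metadata, <root>/offsets/, <root>/commits/
    metadata_present = (root / "metadata").is_file()
    non_empty = [
        name for name in ("offsets", "commits") if _has_entries(root / name)
    ]
    if not metadata_present and non_empty:
        raise InconsistentCheckpointError(
            f"Checkpoint {root} has no metadata file, but these "
            f"directories are non-empty: {', '.join(non_empty)}"
        )
```

A brand-new checkpoint location (no metadata, nothing in either directory) passes the check, so first-time query starts are unaffected; only the mixed state described above is rejected.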