[ 
https://issues.apache.org/jira/browse/SPARK-55058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry Zheng updated SPARK-55058:
--------------------------------
    Description: The {{metadata}} file holds the streaming query ID and should 
exist whenever the commit and offset directories are non-empty. If it is 
missing, a sink such as DeltaSink, which uses the streaming query ID to 
deduplicate commits for the same batch, can produce duplicates and incorrect 
results downstream. If the metadata file is absent but commit and offset files 
are present, we should throw an error because the checkpoint is in an 
inconsistent state.  
(was: The {{metadata}} file holds the streaming query ID and should exist 
whenever the commit and offset directories are non-empty. If it is missing, a 
source such as DeltaSource, which uses the streaming query ID to deduplicate 
commits for the same batch, can produce duplicates and incorrect results 
downstream. If the metadata file is absent but commit and offset files are 
present, we should throw an error because the checkpoint is in an inconsistent 
state.)

> Throw an error if the /metadata file is not present, but offset or commit 
> directories are non-empty
> ---------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-55058
>                 URL: https://issues.apache.org/jira/browse/SPARK-55058
>             Project: Spark
>          Issue Type: Task
>          Components: Structured Streaming
>    Affects Versions: 4.1.1
>            Reporter: Jerry Zheng
>            Priority: Major
>
> The {{metadata}} file holds the streaming query ID and should exist whenever 
> the commit and offset directories are non-empty. If it is missing, a sink 
> such as DeltaSink, which uses the streaming query ID to deduplicate commits 
> for the same batch, can produce duplicates and incorrect results downstream. 
> If the metadata file is absent but commit and offset files are present, we 
> should throw an error because the checkpoint is in an inconsistent state.
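
A minimal sketch of the consistency check described above, written in Python 
for illustration rather than in Spark's own Scala. It assumes the checkpoint 
root holds a metadata file next to offsets/ and commits/ directories, and it 
works on a local path only. The names validate_checkpoint and 
InconsistentCheckpointError are hypothetical and are not Spark APIs.

```python
from pathlib import Path


class InconsistentCheckpointError(Exception):
    """Hypothetical error type for this sketch; not a Spark class."""


def _has_entries(directory: Path) -> bool:
    # A directory that does not exist is treated the same as an empty one.
    return directory.is_dir() and any(directory.iterdir())


def validate_checkpoint(checkpoint_root: str) -> None:
    """Raise if offsets/ or commits/ has entries but metadata is missing."""
    root = Path(checkpoint_root)
    metadata_present = (root / "metadata").is_file()
    progress_logged = (
        _has_entries(root / "offsets") or _has_entries(root / "commits")
    )
    if progress_logged and not metadata_present:
        raise InconsistentCheckpointError(
            f"Checkpoint {root} has offset or commit entries but no "
            "metadata file; the streaming query ID cannot be recovered."
        )
```

A brand-new checkpoint (no metadata file, empty or missing offsets/ and 
commits/) passes the check, so a first run is unaffected. Only the mixed 
state, where progress was logged but the query ID is gone, is rejected.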



