Hi Wu,

If you are using the FileSink, it generates a random UUID on startup, and 
this UUID changes
after a failover. New records are therefore written to files named like 
<prefix>-<new unique id>-<new count>.suffix.
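
As a rough illustration of the naming pattern above, here is a minimal sketch in 
plain Java. It does not use Flink's classes: PartFileNaming and partFileName are 
hypothetical names, and the prefix, suffix and counter values are made up for 
the example.

```java
import java.util.UUID;

// Hypothetical helper, not part of Flink: it only mimics the
// <prefix>-<unique id>-<count><suffix> pattern described above.
final class PartFileNaming {

    static String partFileName(String prefix, String uniqueId, long count, String suffix) {
        return prefix + "-" + uniqueId + "-" + count + suffix;
    }

    public static void main(String[] args) {
        // One random id per start of the sink; a restart after failover draws a new one.
        String idBeforeFailover = UUID.randomUUID().toString();
        String idAfterFailover = UUID.randomUUID().toString();

        // Same prefix and counter, but the names differ because the id differs.
        System.out.println(partFileName("part", idBeforeFailover, 3, ".txt"));
        System.out.println(partFileName("part", idAfterFailover, 3, ".txt"));
    }
}
```

Under these assumptions, files written after the restart cannot clash with 
files written before it, even if the counter value happens to repeat.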

Logically, the sink in 1.14 should be able to recover smoothly from a 
savepoint taken with a previous version, since
we have not changed the state in recent versions. Namely, you could use 
stop-with-savepoint to take a savepoint while
running the old version, and then start the new version's job from this savepoint. 

Best,
Yun




------------------------------------------------------------------
From:wu shaoj <[email protected]>
Send Time:2021 Oct. 9 (Sat.) 09:49
To:Yun Gao <[email protected]>; [email protected] <[email protected]>
Subject:Re: What's the purpose of uniqueId in FileWriterBucket

If a Flink job recovers from a previous checkpoint/savepoint, will it re-output 
records to a different file with the same partFileIndex? And can we upgrade Flink 
to 1.14 smoothly?
From: Yun Gao <[email protected]>
Date: Friday, October 8, 2021 at 22:43
To: wu shaoj <[email protected]>, [email protected] <[email protected]>
Subject: Re: What's the purpose of uniqueId in FileWriterBucket
Hi Wu,

The uid is used to distinguish between the different subtasks. If it were 
removed, the different subtasks
of the FileSink would have name conflicts when they write to the same bucket, 
so the uid is
necessary if there are multiple subtasks.
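
The conflict described above can be sketched in plain Java as follows. This is 
only an illustration and not Flink code: SubtaskNameConflict and distinctNames 
are hypothetical names, and it assumes that every subtask numbers its own part 
files starting from 0.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

// Hypothetical simulation, not part of Flink.
final class SubtaskNameConflict {

    // Counts the distinct file names produced when `subtasks` writers each
    // create `filesPerSubtask` files in the same bucket.
    static int distinctNames(int subtasks, int filesPerSubtask, boolean withUid) {
        Set<String> names = new HashSet<>();
        for (int s = 0; s < subtasks; s++) {
            String uid = UUID.randomUUID().toString(); // one id per subtask
            // Assumption: each subtask's counter starts at 0.
            for (int count = 0; count < filesPerSubtask; count++) {
                names.add(withUid ? "part-" + uid + "-" + count : "part-" + count);
            }
        }
        return names.size();
    }

    public static void main(String[] args) {
        // Without the uid, both subtasks produce part-0, part-1, part-2.
        System.out.println(distinctNames(2, 3, false)); // 3 distinct names for 6 files
        // With the uid, every file gets its own name.
        System.out.println(distinctNames(2, 3, true));  // 6 distinct names for 6 files
    }
}
```

In this sketch, dropping the uid leaves only 3 distinct names for 6 files, 
which is the kind of name conflict the uid is meant to prevent.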

Best,
Yun

 
------------------Original Mail ------------------
Sender:wu shaoj <[email protected]>
Send Date:Fri Oct 8 14:18:34 2021
Recipients:[email protected] <[email protected]>
Subject:What's the purpose of uniqueId in FileWriterBucket
Hi, folks,

Since 1.14, the file sink adds a uid to committed files. Would you mind telling 
me the purpose of this field? Can it be removed safely?
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/connectors/datastream/file_sink/#part-file-configuration



