If I remember correctly, there's a fix for this in Flink 1.14 (though the
feature is disabled by default in 1.14 and only enabled by default in 1.15).
I believe execution.checkpointing.checkpoints-after-tasks-finish.enabled [1]
takes care of this.
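
For reference, on 1.14/1.15 that's just one line in flink-conf.yaml (the
key name is from the docs linked below; it's off by default on 1.14, on by
default on 1.15):

```yaml
# allow final checkpoints after some tasks have finished,
# so the sink can commit its last in-progress files
execution.checkpointing.checkpoints-after-tasks-finish.enabled: true
```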

With Flink 1.13 I believe you'll have to handle this yourself somehow.
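
On 1.13, one option is a post-job cleanup pass that renames the leftover
.inprogress files. Below is a minimal sketch of just the key-rewriting
logic; the regex assumes the default in-progress naming
(.part-<subtask>-<count>.inprogress.<uuid>) and would need adjusting for a
custom naming strategy. Note S3 has no rename, so applying this means a
copy to the new key plus a delete of the old one (e.g. via boto3 or the
AWS CLI), which, as you note, has a cost.

```python
import re

# Hypothetical helper: map an in-progress part-file key to its final name.
# Assumes the default pattern '.part-<subtask>-<count>.inprogress.<uuid>';
# returns None for keys that don't look like in-progress part files.
def finalize_name(key):
    m = re.match(
        r'(?P<prefix>.*/)?\.?(?P<name>part-\d+-\d+)\.inprogress\.[0-9a-f-]+$',
        key,
    )
    if not m:
        return None
    return (m.group('prefix') or '') + m.group('name')
```

You'd then feed each renamed key into an S3 copy + delete. Before doing
this, verify against a checkpoint that the in-progress files actually
contain complete data for your job.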

Regards,
David

[1]
https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/config/#execution-checkpointing-checkpoints-after-tasks-finish-enabled

On Wed, Aug 31, 2022 at 6:26 AM David Clutter <david.clut...@bridg.com>
wrote:

> I am using Flink 1.13.1 on AWS EMR 6.4.  I have an existing application
> using DataStream API that I would like to modify to write output to S3.  I
> am testing the StreamingFileSink with a bounded input.  I have enabled
> checkpointing.
>
> A couple of questions:
> 1) When the program finishes, all the files remain .inprogress.  Is that
> "Important Note 2" in the documentation
> <https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/connectors/datastream/streamfile_sink/>?
> Is there a solution to this other than renaming the files myself?  Renaming
> the files in S3 could be costly I think.
>
> 2) If I use a deprecated method such as DataStream.writeAsText() is that
> guaranteed to write *all* the records from the stream, as long as the job
> does not fail?  I understand checkpointing will not be effective here.
>
> Thanks,
> David
>
