Matthias Pohl created FLINK-26388:
-------------------------------------
Summary: Release Testing: Repeatable Cleanup
Key: FLINK-26388
URL: https://issues.apache.org/jira/browse/FLINK-26388
Project: Flink
Issue Type: New Feature
Components: Runtime / Coordination
Affects Versions: 1.15.0
Reporter: Matthias Pohl
Repeatable cleanup was introduced with
[FLIP-194|https://issues.apache.org/jira/projects/FLINK/issues/FLINK-26284?filter=allopenissues]
but, from a user's point of view, it should be considered a feature independent of
the {{JobResultStore}} (JRS). The documentation efforts are finalized with
FLINK-26296.
Repeatable cleanup can be triggered by running into an error during cleanup.
This can be achieved by disabling access to S3 after the job has finished, e.g.:
* Set a reasonably short checkpointing interval (checkpointing should be
enabled so that there are S3 artifacts to clean up); see the configuration sketch after this list
* Disable S3 (remove permissions or shut down the S3 server)
* Stop the job with a savepoint
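A minimal {{flink-conf.yaml}} sketch for this setup could look like the following; the bucket name and all concrete values are placeholders, and the {{cleanup-strategy.*}} option names are my reading of the 1.15 retry configuration and should be verified against the documentation:
{code:yaml}
# enable periodic checkpointing so that there are artifacts on S3 to clean up
execution.checkpointing.interval: 10s

# checkpoint artifacts on S3 (placeholder bucket)
state.checkpoints.dir: s3://flink-test-bucket/checkpoints

# local savepoint target so that stop-with-savepoint still succeeds once S3
# access has been revoked (adjust if the savepoint should go to S3 as well)
state.savepoints.dir: file:///tmp/flink-savepoints

# repeatable-cleanup retry behavior; the defaults should already retry with
# exponential backoff, the options are listed here only for visibility
cleanup-strategy: exponential-delay
cleanup-strategy.exponential-delay.initial-backoff: 1s
cleanup-strategy.exponential-delay.max-backoff: 1min
{code}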
Stopping the job should succeed, but the logs should show cleanup failures with
repeated retries. Enabling S3 again should fix the issue.
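For reference, one way to drive this part of the test on a standalone session cluster (job ID, paths and the log file name pattern are placeholders that depend on the actual setup):
{code:bash}
# stop the job with a savepoint; the target defaults to state.savepoints.dir
./bin/flink stop --savepointPath /tmp/flink-savepoints <job-id>

# with S3 access revoked, the JobManager log should show the cleanup being
# retried (the exact wording of the log messages may differ)
grep -i "cleanup" log/flink-*-standalonesession-*.log
{code}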
Keep in mind that, if testing this with HA enabled, you should use a separate
bucket for the file-based JRS artifacts and only change permissions on the
bucket that holds the JRS-unrelated artifacts. Flink would fail fatally if the
JRS is not able to access its backend storage.
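For the HA case, one possible bucket split is sketched below; the bucket names are placeholders, and the intent is that only {{flink-test-bucket}} loses its permissions while the JRS keeps access to its own storage:
{code:yaml}
# HA enabled (ZooKeeper is just an example; Kubernetes HA works as well)
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-host:2181
high-availability.storageDir: s3://flink-jrs-bucket/ha

# the file-based JRS artifacts default to a sub-directory of the HA storage
# directory; spelled out here to make the bucket split explicit
job-result-store.storage-path: s3://flink-jrs-bucket/job-result-store

# JRS-unrelated artifacts go to the bucket whose permissions are revoked
state.checkpoints.dir: s3://flink-test-bucket/checkpoints
{code}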
--
This message was sent by Atlassian Jira
(v8.20.1#820001)