[ 
https://issues.apache.org/jira/browse/FLINK-26236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17494727#comment-17494727
 ] 

Gyula Fora commented on FLINK-26236:
------------------------------------

It seems the operator SDK provides out-of-the-box logic for retrying errors 
and setting a custom status by implementing a simple interface.

We should probably use this directly instead of catching the errors and 
setting the error status ourselves:
[https://javaoperatorsdk.io/docs/features]



{{public interface ErrorStatusHandler<T extends HasMetadata> {}}
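
To make the idea concrete, here is a minimal, self-contained sketch of what 
capping retries and separating fatal from recoverable errors could look like. 
It deliberately does not use the SDK: RetryCapSketch, FatalReconcileException, 
ReconciliationStatus, its fields, and MAX_RETRIES are all hypothetical names 
made up for illustration, and the cap of 3 is an assumption. The real 
ErrorStatusHandler signature depends on the SDK version and should be checked 
in the docs linked above.

```java
/** Hypothetical sketch; none of these types come from the operator SDK or Flink. */
final class RetryCapSketch {

    /** Hypothetical exception type for errors that retrying cannot fix (e.g. config errors). */
    static final class FatalReconcileException extends RuntimeException {
        FatalReconcileException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for the retry-related fields of a reconciliation status. */
    static final class ReconciliationStatus {
        int retryCount;           // retries granted so far
        String error;             // message of the last error
        boolean retryable = true; // whether another retry is allowed
    }

    /** Assumed cap on the number of retries after the initial attempt. */
    static final int MAX_RETRIES = 3;

    /** Records the error in the status and decides whether another retry is allowed. */
    static ReconciliationStatus onError(ReconciliationStatus status, RuntimeException e) {
        status.error = e.getMessage();
        if (e instanceof FatalReconcileException) {
            // Fatal errors are never retried, regardless of the count.
            status.retryable = false;
        } else if (status.retryCount >= MAX_RETRIES) {
            // The cap has been reached: stop retrying.
            status.retryable = false;
        } else {
            // Recoverable error under the cap: grant one more retry.
            status.retryCount++;
            status.retryable = true;
        }
        return status;
    }
}
```

With this sketch, a recoverable error is granted up to three retries and the 
fourth failure marks the status as not retryable, while a fatal error marks 
it as not retryable immediately. A time-based cap would follow the same shape 
with a timestamp in place of the counter.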

> Track and cap retries in ReconciliationStatus
> ---------------------------------------------
>
>                 Key: FLINK-26236
>                 URL: https://issues.apache.org/jira/browse/FLINK-26236
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Kubernetes Operator
>            Reporter: Gyula Fora
>            Priority: Major
>
> At the moment we retry errors again and again indefinitely. As suggested by 
> [~t...@apache.org] we should cap the number of retries (or the time spent 
> retrying).
> For this we can include a retry count in the reconciliation status.
> We should also distinguish fatal errors (like config errors) from recoverable 
> errors with a different exception type; fatal errors should not be retried.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
