[ 
https://issues.apache.org/jira/browse/FLINK-26719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509231#comment-17509231
 ] 

Aitozi edited comment on FLINK-26719 at 3/19/22, 9:36 AM:
----------------------------------------------------------

{quote}If we do not want to provide stronger resiliency/guarantees than the 
Flink native integration in itself then I guess we do not need to check, or 
it's enough to check at larger intervals.
{quote}
I understand the general idea now. In other words, we use the reconcile loop to 
do the periodic check and plan to produce ERROR events, right?

I think it's an interesting feature to explore; it could become a monitoring or 
self-healing capability of the operator. The monitoring could use either 
polling or an informer-based technique.

Thanks for the explanation, everyone. Let's see how this capability 
evolves :).


> Rethink the default reschedule reconcile loop
> ---------------------------------------------
>
>                 Key: FLINK-26719
>                 URL: https://issues.apache.org/jira/browse/FLINK-26719
>             Project: Flink
>          Issue Type: Sub-task
>            Reporter: Aitozi
>            Priority: Major
>
> When testing locally, I found that the operator reschedules and reconciles at 
> the interval set by {{operator.reconciler.reschedule.interval.sec}}. Why do we 
> need this? I think we only need to reconcile when
>  # waiting for a status change
>  # receiving a new event
>  # waiting for the savepoint result
> So when JobManagerDeploymentStatus is Ready, we do not have to trigger the 
> reconcile unless we are waiting for a savepoint result.
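
The rule proposed in the description above could be sketched roughly as follows. This is only an illustration, not the operator's actual code: the class RescheduleSketch, the method nextReconcile, and the enum values other than READY are hypothetical names chosen for the example, and the interval is passed in rather than read from {{operator.reconciler.reschedule.interval.sec}}.

```java
import java.time.Duration;
import java.util.Optional;

// Hypothetical sketch of the proposed rule; not the operator's real classes.
final class RescheduleSketch {

    // Simplified stand-in for the operator's status enum (values besides READY are assumed).
    enum JobManagerDeploymentStatus { DEPLOYING, DEPLOYED_NOT_READY, READY }

    /**
     * Returns the delay before the next timed reconcile, or empty when no
     * timed reschedule is needed and only new events should trigger one.
     */
    static Optional<Duration> nextReconcile(
            JobManagerDeploymentStatus status,
            boolean savepointInProgress,
            Duration pollInterval) {
        // Case 1: still waiting for the deployment status to change.
        if (status != JobManagerDeploymentStatus.READY) {
            return Optional.of(pollInterval);
        }
        // Case 3: deployment is ready, but a savepoint result is pending.
        if (savepointInProgress) {
            return Optional.of(pollInterval);
        }
        // Case 2 (a new event) is delivered by the framework, so no timer is needed.
        return Optional.empty();
    }
}
```

With this shape, a Ready deployment with no pending savepoint returns an empty result, which is the situation where the description argues the periodic reschedule is unnecessary. The periodic check discussed in the comment would correspond to always returning an interval, possibly a larger one.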



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
