[jira] [Commented] (CAMEL-15903) Master component do not retry endpoint startup on failure

Claus Ibsen (Jira) Wed, 03 Jul 2024 06:45:05 -0700


    [ 
https://issues.apache.org/jira/browse/CAMEL-15903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17862792#comment-17862792
 ]


Claus Ibsen commented on CAMEL-15903:
-------------------------------------

Okay, so camel-master will now retry creating and starting the consumer when it 
becomes the leader. There are options to set the delay and the maximum number 
of attempts, so that it does not keep retrying for a long time.
There are no rules for giving up leadership, for example a rule that 
relinquishes leadership if starting keeps failing. That would require some 
more general work.
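
To illustrate the retry behaviour described above, here is a minimal sketch in plain Java. It is not the camel-master implementation: the class RetryingStarter, the Startable interface, and the delayMillis/maxAttempts parameters are hypothetical names chosen for this example only.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RetryingStarter {

    /** Hypothetical stand-in for "create and start the delegated consumer". */
    public interface Startable {
        void start() throws Exception;
    }

    /**
     * Tries to start the target up to maxAttempts times, sleeping delayMillis
     * between failed attempts. Returns the number of the attempt that
     * succeeded, or -1 if every attempt failed.
     */
    public static int startWithRetry(Startable target, long delayMillis, int maxAttempts)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                target.start();
                return attempt;
            } catch (Exception e) {
                System.err.println("Start attempt " + attempt + " failed: " + e.getMessage());
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis); // wait before the next attempt
                }
            }
        }
        // Gave up retrying. Leadership is NOT relinquished by this sketch,
        // matching the limitation mentioned in the comment.
        return -1;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger calls = new AtomicInteger();
        // Simulated consumer that fails twice and starts on the third call.
        Startable flaky = () -> {
            if (calls.incrementAndGet() < 3) {
                throw new IllegalStateException("endpoint not ready");
            }
        };
        int result = startWithRetry(flaky, 10, 5);
        System.out.println("started on attempt " + result);
    }
}
```

Bounding the loop with a maximum attempt count is what keeps the instance from retrying indefinitely; what should happen to the leadership once the attempts are exhausted is left open here, as in the comment.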


> Master component do not retry endpoint startup on failure
> ---------------------------------------------------------
>
>                 Key: CAMEL-15903
>                 URL: https://issues.apache.org/jira/browse/CAMEL-15903
>             Project: Camel
>          Issue Type: Bug
>          Components: camel-master
>            Reporter: EDGAR CHERNICK
>            Assignee: Claus Ibsen
>            Priority: Minor
>             Fix For: 4.7.0
>
>
> The cluster view implementations have a listener attribute where the master 
> component hooks itself to receive leadership change events. 
> When the app instance becomes leader, the cluster view marks that instance 
> as leader and then fires the leadership changed event. This triggers the 
> master component's event handler, which starts the delegated consumer and 
> endpoint.
> The issue happens when the delegated consumer or endpoint fails to start. The 
> exception thrown by them propagates up the stack; however, it does not affect 
> the leadership, i.e., once the app instance becomes leader it stays leader 
> even if the delegated components fail to start.
> Both KubernetesClusterView and FileLockClusterView have this issue.
> KubernetesClusterView uses KubernetesLeadershipController to run the 
> leadership check at an interval. When it acquires the leadership, it updates 
> the ConfigMap with that info and calls the TimedLeaderNotifier 
> refreshLeadership method to check whether the leadership has changed. The 
> issue here is that it marks itself as leader before firing the leadership 
> changed event. Another issue is that the event is fired in a separate thread, 
> so when the start of the delegated components fails, the exception "dies" 
> together with the thread. When the next scheduled leadership check runs, the 
> app instance is already the leader, so it does not fire the leadership 
> changed event and the delegated component never starts.
> FileLockClusterView has a similar issue: it acquires the file lock prior to 
> firing the event, and even if the event processing fails it does not roll 
> back the leader selection.
> Other cluster view implementations might have the same issue.
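
The failure mode in the quoted description can be reproduced with a simplified model. The sketch below does not use the actual Camel classes: LostLeadershipEvent, check, and demo are hypothetical names, and an ExecutorService stands in for the separate notifier thread. Here the exception is captured in a Future that nobody reads, which has the same effect as the exception dying with the thread.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class LostLeadershipEvent {

    private boolean leader; // set BEFORE the listener runs
    private final Runnable listener;
    private final ExecutorService notifier = Executors.newSingleThreadExecutor();

    public LostLeadershipEvent(Runnable listener) {
        this.listener = listener;
    }

    /** Models one scheduled leadership check. */
    public void check(boolean lockAcquired) {
        if (lockAcquired && !leader) {
            leader = true;             // marked as leader first
            notifier.submit(listener); // any exception is lost in the ignored Future
        }
        // Already leader: no event is fired, so a failed start is never retried.
    }

    public boolean isLeader() {
        return leader;
    }

    public void shutdown() throws InterruptedException {
        notifier.shutdown();
        notifier.awaitTermination(5, TimeUnit.SECONDS);
    }

    /** Runs three checks and returns how many times the listener was invoked. */
    public static int demo() throws InterruptedException {
        AtomicInteger startAttempts = new AtomicInteger();
        LostLeadershipEvent view = new LostLeadershipEvent(() -> {
            startAttempts.incrementAndGet();
            throw new IllegalStateException("consumer failed to start");
        });
        view.check(true); // becomes leader, listener fails on the notifier thread
        view.check(true); // still leader, nothing happens
        view.check(true); // still leader, nothing happens
        view.shutdown();
        return startAttempts.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("listener invocations: " + demo());
    }
}
```

In this model the listener runs once, fails, and is never invoked again, because the leader flag was already set when the event was dispatched.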



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
