[ https://issues.apache.org/jira/browse/CAMEL-15903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17246823#comment-17246823 ]
EDGAR CHERNICK edited comment on CAMEL-15903 at 12/9/20, 9:06 PM:
------------------------------------------------------------------

The Master Consumer does not have a try/catch block in its leadership-taken event handler (https://github.com/apache/camel/blob/master/components/camel-master/src/main/java/org/apache/camel/component/master/MasterConsumer.java#L118). Would it be okay if I just added one and called super.handleException in the catch block? If the route has bridgeErrorHandler=true, this should at least surface the exception for handling, right?

This won't solve the issue by itself, but at least end users will get an exception indicating that something went wrong.

was (Author: edgarcher):
The Master Consumer does not have a try/catch block in its leadership-taken event handler (https://github.com/apache/camel/blob/master/components/camel-master/src/main/java/org/apache/camel/component/master/MasterConsumer.java#L118). Would it be okay if I just added one and called super.handleException in the catch block? If the route has bridgeErrorHandler=true, this should at least surface the exception for handling, right?

> Master component do not retry endpoint startup on failure
> ---------------------------------------------------------
>
>                 Key: CAMEL-15903
>                 URL: https://issues.apache.org/jira/browse/CAMEL-15903
>             Project: Camel
>          Issue Type: Bug
>          Components: camel-master
>            Reporter: EDGAR CHERNICK
>            Priority: Major
>             Fix For: 3.8.0
>
>
> The cluster view implementations have a listener attribute through which the
> master component hooks itself in to receive leadership change events.
> When the app instance becomes leader, the cluster view marks that instance
> as leader and then fires the leadership changed event. This triggers the
> master component's event handler, which starts the delegated consumer and
> endpoint.
> The issue happens when the delegated consumer or endpoint fails to start. The
> exception thrown by them propagates up the stack, but it does not affect the
> leadership, i.e., once the app instance becomes leader it stays leader even
> if the delegated components fail to start.
> Both KubernetesClusterView and FileLockClusterView have this issue.
> KubernetesClusterView uses KubernetesLeadershipController to run the
> leadership check at an interval. When it acquires the leadership, it updates
> the configmap with that info and calls the TimedLeaderNotifier
> refreshLeadership method to check whether the leadership has changed. The
> issue here is that it marks itself as leader before firing the leadership
> changed event. Another issue is that the event is fired in a separate thread,
> so when the delegated components fail to start, the exception "dies" together
> with the thread. When the next scheduled leadership check runs, the app
> instance is already the leader, so it does not fire the leadership changed
> event again and the delegated component never starts.
> FileLockClusterView has a similar issue: it acquires the file lock before
> firing the event, and even if the event processing fails it does not roll
> back the leader selection.
> Other cluster view implementations might have the same issue.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
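
The failure mode described in the issue, and the effect of the proposed try/catch, can be sketched in plain Java as below. This is only an illustration under stated assumptions: SketchClusterView and SketchConsumer are hypothetical stand-ins, not the Camel classes or their API, and the "surfaced" list merely stands in for whatever super.handleException would do with the exception.

```java
import java.util.ArrayList;
import java.util.List;

public class LeadershipSketch {

    /** Hypothetical stand-in for a cluster view; NOT the Camel API. */
    static class SketchClusterView {
        private boolean leader;
        private final Runnable onLeadershipTaken;

        SketchClusterView(Runnable onLeadershipTaken) {
            this.onLeadershipTaken = onLeadershipTaken;
        }

        /** One scheduled leadership check; returns true if the event was fired. */
        boolean check() {
            if (leader) {
                return false; // already leader, so the event is never fired again
            }
            leader = true; // marked as leader BEFORE the listener runs
            Thread notifier = new Thread(onLeadershipTaken);
            // Any exception escaping the listener dies with this thread.
            notifier.setUncaughtExceptionHandler((t, e) -> { });
            notifier.start();
            try {
                notifier.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return true;
        }

        boolean isLeader() {
            return leader;
        }
    }

    /** Hypothetical stand-in for the master consumer's event handler. */
    static class SketchConsumer {
        boolean delegateStarted;
        final List<Throwable> surfaced = new ArrayList<>();
        private final boolean catchFailures;

        SketchConsumer(boolean catchFailures) {
            this.catchFailures = catchFailures;
        }

        void onLeadershipTaken() {
            if (!catchFailures) {
                startDelegate(); // current behaviour: the exception escapes
                delegateStarted = true;
                return;
            }
            try {
                startDelegate();
                delegateStarted = true;
            } catch (Exception e) {
                surfaced.add(e); // proposed: hand the exception to an error handler
            }
        }

        // Simulates a delegated endpoint that always fails to start.
        private void startDelegate() {
            throw new IllegalStateException("delegated endpoint failed to start");
        }
    }

    public static void main(String[] args) {
        for (boolean catching : new boolean[] {false, true}) {
            SketchConsumer consumer = new SketchConsumer(catching);
            SketchClusterView view = new SketchClusterView(consumer::onLeadershipTaken);
            boolean firstFired = view.check();
            boolean secondFired = view.check();
            System.out.println("catching=" + catching
                    + " firstFired=" + firstFired
                    + " secondFired=" + secondFired
                    + " leader=" + view.isLeader()
                    + " delegateStarted=" + consumer.delegateStarted
                    + " surfaced=" + consumer.surfaced.size());
        }
    }
}
```

In both variants the instance stays leader with the delegate stopped, and the second check fires nothing. The catch block only makes the failure visible, which matches the comment above: it does not fix the retry problem by itself.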