[ https://issues.apache.org/jira/browse/IGNITE-15886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gaurav Aggarwal updated IGNITE-15886:
-------------------------------------
    Summary: Intermittent [Failed to send message to next nod] exception on node shutdown  (was: Intermittent [Failed to notify direct custom event listener] exception on node shutdown)

> Intermittent [Failed to send message to next nod] exception on node shutdown
> ----------------------------------------------------------------------------
>
>                 Key: IGNITE-15886
>                 URL: https://issues.apache.org/jira/browse/IGNITE-15886
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.9
>            Reporter: Gaurav Aggarwal
>            Assignee: Mikhail Petrov
>            Priority: Major
>
> {+}*Reproducer*{+}:
> Run a cluster with a few nodes. When one of the nodes is brought down, the
> other nodes intermittently throw this exception and eventually go down.
> +*Actual result*+
> Intermittent exception on one of the nodes:
> {noformat}
> SEVERE: Failed to notify direct custom event listener:
> StartRoutineDiscoveryMessage [startReqData=StartRequestData
> [prjPred=com.bfm.libignite.cluster.filter.ClusterGroupFilter@3a0d7950,
> clsName=null, depInfo=null, hnd=CacheContinuousQueryHandlerV2
> [rmtFilterFactoryDep=null, types=0], bufSize=1, interval=0,
> autoUnsubscribe=true], keepBinary=false, deserEx=null,
> routineId=a452f5c7-4a27-4a5e-be3e-77ddae5d50a3]
> java.lang.NullPointerException
>     at org.apache.ignite.internal.processors.continuous.StartRoutineDiscoveryMessage.addUpdateCounters(StartRoutineDiscoveryMessage.java:95)
>     at org.apache.ignite.internal.processors.continuous.StartRoutineDiscoveryMessage.addUpdateCounters(StartRoutineDiscoveryMessage.java:109)
>     at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.processStartRequest(GridContinuousProcessor.java:1482)
>     at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.access$400(GridContinuousProcessor.java:117)
>     at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:220)
>     at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:211)
>     at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:717)
>     at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.lambda$onDiscovery$0(GridDiscoveryManager.java:526)
>     at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body0(GridDiscoveryManager.java:2677)
>     at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body(GridDiscoveryManager.java:2715)
>     at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>     at java.lang.Thread.run(Thread.java:748)
> Nov 03, 2021 12:11:00 PM org.apache.ignite.logger.java.JavaLogger error
> SEVERE: Failed to notify direct custom event listener:
> StartRoutineDiscoveryMessage [startReqData=StartRequestData
> [prjPred=com.bfm.libignite.cluster.filter.ClusterGroupFilter@67cc413e,
> clsName=null, depInfo=null, hnd=CacheContinuousQueryHandlerV2
> [rmtFilterFactoryDep=null, types=0], bufSize=1, interval=0,
> autoUnsubscribe=true], keepBinary=false, deserEx=null,
> routineId=c2ce7d01-5c5b-4654-8942-bd7ab6b90351]
> java.lang.NullPointerException
>     at org.apache.ignite.internal.processors.continuous.StartRoutineDiscoveryMessage.addUpdateCounters(StartRoutineDiscoveryMessage.java:95)
>     at org.apache.ignite.internal.processors.continuous.StartRoutineDiscoveryMessage.addUpdateCounters(StartRoutineDiscoveryMessage.java:109)
>     at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.processStartRequest(GridContinuousProcessor.java:1482)
>     at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.access$400(GridContinuousProcessor.java:117)
>     at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:220)
>     at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:211)
>     at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:717)
>     at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.lambda$onDiscovery$0(GridDiscoveryManager.java:526)
>     at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body0(GridDiscoveryManager.java:2677)
>     at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body(GridDiscoveryManager.java:2715)
>     at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>     at java.lang.Thread.run(Thread.java:748)
> {noformat}
> The node terminates after logging several of these warnings: Failed to send
> message to next node
> {noformat}
> Nov 03, 2021 12:14:40 PM org.apache.ignite.logger.java.JavaLogger warning
> WARNING: Failed to send message to next node, try previous
> [msg=TcpDiscoveryMetricsUpdateMessage [super=TcpDiscoveryAbstractMessage
> [...]]
> Nov 03, 2021 12:14:40 PM org.apache.ignite.logger.java.JavaLogger warning
> WARNING: Unable to connect to next nodes in a ring, it seems local node is
> experiencing connectivity issues. Segmenting local node to avoid case when
> one node fails a big part of cluster. To disable this behavior set
> TcpDiscoverySpi.setConnectionRecoveryTimeout() to 0.
> [connRecoveryTimeout=10000, effectiveConnRecoveryTimeout=10000,
> failedNodes=[TcpDiscoveryNode [...]
> Nov 03, 2021 12:14:40 PM org.apache.ignite.logger.java.JavaLogger info
> {noformat}

--
This message was sent by Atlassian Jira
(v8.20.1#820001)