[ 
https://issues.apache.org/jira/browse/NIFI-7866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hsin-Ying Lee updated NIFI-7866:
--------------------------------
    Comment: was deleted

(was: Hi Mark

Can we catch the NPE and throw a ConnectionException from the 
loadFromConnectionResponse method?)

> When cluster coordinator dies, other nodes may have trouble rejoining cluster
> -----------------------------------------------------------------------------
>
>                 Key: NIFI-7866
>                 URL: https://issues.apache.org/jira/browse/NIFI-7866
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>            Reporter: Mark Payne
>            Priority: Major
>
> When the cluster coordinator is lost, the nodes must now begin communicating 
> with a newly elected Cluster Coordinator. This is handled through the 
> StandardFlowService.
> When the `handleReconnectionRequest` method is called and the request 
> provided does not contain the dataflow, the node is to connect to the cluster 
> coordinator and request the dataflow:
> {code:java}
> private void handleReconnectionRequest(final ReconnectionRequestMessage request) {
>     try {
>         logger.info("Processing reconnection request from cluster coordinator.");
>         // reconnect
>         ConnectionResponse connectionResponse = new ConnectionResponse(getNodeId(), request.getDataFlow(),
>                 request.getInstanceId(), request.getNodeConnectionStatuses(), request.getComponentRevisions());
>         if (connectionResponse.getDataFlow() == null) {
>             logger.info("Received a Reconnection Request that contained no DataFlow. Will attempt to connect to cluster using local flow.");
>             connectionResponse = connect(false, false, createDataFlowFromController());
>         }
>         loadFromConnectionResponse(connectionResponse);
> ... {code}
> However, if the call above to `connect(false, false, 
> createDataFlowFromController())` returns null (which is a valid case), that 
> null value is passed along to `loadFromConnectionResponse`. This method 
> expects a non-null connectionResponse and throws a NullPointerException, 
> resulting in the following stack trace (stack trace based on NiFi 1.11.4):
> {code:java}
> 2020-09-29 10:18:53,324 ERROR [Reconnect to Cluster] o.a.nifi.controller.StandardFlowService Handling reconnection request failed due to: org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster due to: java.lang.NullPointerException
> org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster due to: java.lang.NullPointerException
>     at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:1035)
>     at org.apache.nifi.controller.StandardFlowService.handleReconnectionRequest(StandardFlowService.java:668)
>     at org.apache.nifi.controller.StandardFlowService.access$200(StandardFlowService.java:109)
>     at org.apache.nifi.controller.StandardFlowService$1.run(StandardFlowService.java:415)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException: null
>     at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:989)
>     ... 4 common frames omitted
> {code}
> This results in the node not reconnecting to the cluster.
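
The failure mode described above can be illustrated with a minimal, self-contained sketch. This is not the NiFi code and not necessarily the fix that was applied to this issue. The classes `ConnectionResponse` and `ConnectionException` below are simplified, hypothetical stand-ins for the NiFi types of the same name, and `resolveResponse` is an invented helper that models only the part of `handleReconnectionRequest` that chooses which response to load. The `Supplier` argument stands in for `connect(false, false, createDataFlowFromController())`, which may return null. The sketch checks for null explicitly and raises a descriptive exception before the response reaches the loading step.

```java
import java.util.function.Supplier;

public class ReconnectGuardSketch {

    /** Hypothetical stand-in for NiFi's ConnectionException. */
    static class ConnectionException extends RuntimeException {
        ConnectionException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for ConnectionResponse; only the data flow is modeled. */
    static class ConnectionResponse {
        private final String dataFlow;

        ConnectionResponse(String dataFlow) {
            this.dataFlow = dataFlow;
        }

        String getDataFlow() {
            return dataFlow;
        }
    }

    /**
     * Picks the response to load: the one built from the request if it carries
     * a data flow, otherwise whatever the connect attempt returns. A null result
     * from the connect attempt is turned into a ConnectionException here rather
     * than being handed to the loading step.
     */
    static ConnectionResponse resolveResponse(ConnectionResponse fromRequest,
                                              Supplier<ConnectionResponse> connect) {
        ConnectionResponse response = fromRequest;
        if (response.getDataFlow() == null) {
            // Models connect(false, false, createDataFlowFromController()),
            // which is allowed to return null.
            response = connect.get();
        }
        if (response == null) {
            throw new ConnectionException(
                    "Connect attempt returned no response; cannot load flow from cluster");
        }
        return response;
    }

    public static void main(String[] args) {
        try {
            // Request carries no data flow and the connect attempt yields null.
            resolveResponse(new ConnectionResponse(null), () -> null);
        } catch (ConnectionException e) {
            System.out.println("Reconnection failed cleanly: " + e.getMessage());
        }
    }
}
```

With a guard of this kind the caller sees an exception that names the real cause instead of a NullPointerException raised inside the loading method. Whether the node should then retry, or what state it should move to, is outside the scope of this sketch.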



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
