[ https://issues.apache.org/jira/browse/NIFI-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15159714#comment-15159714 ]
Joseph Witt commented on NIFI-1557:
-----------------------------------

+1 on patch. Tony offered to merge to master and support/0.5.x so yes please do. Tony please run contrib-check as I've not yet done that.

Thanks
Joe

> Controller Services and Reporting Tasks not properly ordered in fingerprint verification - makes restart/upgrades difficult
> ---------------------------------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-1557
>                 URL: https://issues.apache.org/jira/browse/NIFI-1557
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 0.3.0, 0.5.0, 0.4.1
>            Reporter: Joseph Witt
>            Assignee: Mark Payne
>            Priority: Critical
>             Fix For: 0.5.1
>
>         Attachments: 0001-NIFI-1557-Ensure-that-Reporting-Tasks-controller-ser.patch
>
>
> In upgrading a cluster I have found unreliable behavior. In restarting nodes in a cluster I found unreliable behavior as well. It appears we are not ordering the controller services and reporting tasks before doing fingerprint verification. This causes unreliable restarts/upgrades.
> {quote}
> The NCM says:
> ===========
> 2016-02-23 19:42:32,466 INFO [Handle Reconnection Failure Message from [id=61103aa8-1226-44e4-827c-b150ac7d4079, apiAddress=processing-2.demo.onyara.com, apiPort=8443, socketAddress=processing-2.demo.onyara.com, socketPort=6000, siteToSiteAddress=processing-2.demo.onyara.com, siteToSitePort=9000]] o.a.n.c.manager.impl.WebClusterManager Node Event: [id=61103aa8-1226-44e4-827c-b150ac7d4079, apiAddress=processing-2.demo.onyara.com, apiPort=8443, socketAddress=processing-2.demo.onyara.com, socketPort=6000, siteToSiteAddress=processing-2.demo.onyara.com, siteToSitePort=9000] -- 'Node could not rejoin cluster. Setting node to Disconnected. Node reported the following error: org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster because local flow is different than cluster flow.'
> 2016-02-23 19:42:32,530 INFO [Process NCM Request-5] o.a.n.c.p.impl.SocketProtocolListener Finished processing request 6d4a5f0e-e4e5-47cf-843a-7194660197c8 (type=RECONNECTION_FAILURE, length=654 bytes) in 102 millis
> 2016-02-23 19:42:32,530 INFO [Handle Reconnection Failure Message from [id=e5577bb3-290c-4688-be43-78b0b4af06fb, apiAddress=processing-1.demo.onyara.com, apiPort=8443, socketAddress=processing-1.demo.onyara.com, socketPort=6000, siteToSiteAddress=processing-1.demo.onyara.com, siteToSitePort=9000]] o.a.n.c.manager.impl.WebClusterManager Node Event: [id=e5577bb3-290c-4688-be43-78b0b4af06fb, apiAddress=processing-1.demo.onyara.com, apiPort=8443, socketAddress=processing-1.demo.onyara.com, socketPort=6000, siteToSiteAddress=processing-1.demo.onyara.com, siteToSitePort=9000] -- 'Node could not rejoin cluster. Setting node to Disconnected. Node reported the following error: org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster because local flow is different than cluster flow.'
> 2016-02-23 19:42:34,875 INFO [Process NCM Request-6] o.a.n.c.p.impl.SocketProtocolListener Received request 1b681e8d-3f68-4be2-9ab2-7134050cce9d from 172.31.4.123
> 2016-02-23 19:42:34,908 INFO [Process NCM Request-7] o.a.n.c.p.impl.SocketProtocolListener Received request 5008cbb3-5084-4083-9b56-84bd5ad3c55c from 172.31.4.123
> 2016-02-23 19:42:34,994 INFO [Process NCM Request-6] o.a.n.c.p.impl.SocketProtocolListener Finished processing request 1b681e8d-3f68-4be2-9ab2-7134050cce9d (type=BULLETINS, length=1550 bytes) in 118 millis
> 2016-02-23 19:42:35,008 INFO [Process NCM Request-7] o.a.n.c.p.impl.SocketProtocolListener Finished processing request 5008cbb3-5084-4083-9b56-84bd5ad3c55c (type=RECONNECTION_FAILURE, length=654 bytes) in 100 millis
> 2016-02-23 19:42:35,009 INFO [Handle Reconnection Failure Message from [id=2240c663-2b20-48dc-936d-078a11812d49, apiAddress=processing-3.demo.onyara.com, apiPort=8443, socketAddress=processing-3.demo.onyara.com, socketPort=6000, siteToSiteAddress=processing-3.demo.onyara.com, siteToSitePort=9000]] o.a.n.c.manager.impl.WebClusterManager Node Event: [id=2240c663-2b20-48dc-936d-078a11812d49, apiAddress=processing-3.demo.onyara.com, apiPort=8443, socketAddress=processing-3.demo.onyara.com, socketPort=6000, siteToSiteAddress=processing-3.demo.onyara.com, siteToSitePort=9000] -- 'Node could not rejoin cluster. Setting node to Disconnected. Node reported the following error: org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster because local flow is different than cluster flow.'
> ============
> processing-1 says:
> ==========
> 2016-02-23 19:42:32,410 ERROR [Reconnect to Cluster] o.a.nifi.controller.StandardFlowService Handling reconnection request failed due to: org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster because local flow is different than cluster flow.
> org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster because local flow is different than cluster flow.
>     at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:760) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
>     at org.apache.nifi.controller.StandardFlowService.handleReconnectionRequest(StandardFlowService.java:533) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
>     at org.apache.nifi.controller.StandardFlowService.access$200(StandardFlowService.java:81) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
>     at org.apache.nifi.controller.StandardFlowService$1.run(StandardFlowService.java:370) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
>     at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79]
> Caused by: org.apache.nifi.controller.UninheritableFlowException: Proposed configuration is not inheritable by the flow controller because of flow differences: Found difference in Flows:
> Local Fingerprint:
> 9ca96b153807ca8876-6f68-4ba8-9bd7-96febb03cbdbPROCESSORNO_VALUE1a03fca1-fbf2-4029-8c23-1ba2e96f00c81ad4e807-4d92-3475-9e99-e7b5e8b3592aorg.apache.nifi.processors.kafka.PutKafkaNO_VALUEClient NameNiFi-
> Cluster Fingerprint:
> 9ca96b153807ca8876-6f68-4ba8-9bd7-96febb03cbdbPROCESSORNO_VALUE1a03fca1-fbf2-4029-8c23-1ba2e96f00c818c0ac39-0aef-3df1-8d20-f2c1460536c4org.apache.nifi.processors.kafka.PutKafkaNO_VALUEClient NameNiFi-
>     at org.apache.nifi.controller.StandardFlowSynchronizer.sync(StandardFlowSynchronizer.java:216) ~[nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
>     at org.apache.nifi.controller.FlowController.synchronize(FlowController.java:1285) ~[nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
>     at org.apache.nifi.persistence.StandardXMLFlowConfigurationDAO.load(StandardXMLFlowConfigurationDAO.java:72) ~[nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
>     at org.apache.nifi.controller.StandardFlowService.loadFromBytes(StandardFlowService.java:629) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
>     at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:737) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
>     ... 4 common frames omitted
> ==========
>
> processing-2 says:
> ===========
> Caused by: org.apache.nifi.controller.UninheritableFlowException: Proposed configuration is not inheritable by the flow controller because of flow differences: Found difference in Flows:
> Local Fingerprint:
> 9ca96b153807ca8876-6f68-4ba8-9bd7-96febb03cbdbPROCESSORNO_VALUE1a03fca1-fbf2-4029-8c23-1ba2e96f00c81ad4e807-4d92-3475-9e99-e7b5e8b3592aorg.apache.nifi.processors.kafka.PutKafkaNO_VALUEClient NameNiFi-
> Cluster Fingerprint:
> 9ca96b153807ca8876-6f68-4ba8-9bd7-96febb03cbdbPROCESSORNO_VALUE1a03fca1-fbf2-4029-8c23-1ba2e96f00c818c0ac39-0aef-3df1-8d20-f2c1460536c4org.apache.nifi.processors.kafka.PutKafkaNO_VALUEClient NameNiFi-
>     at org.apache.nifi.controller.StandardFlowSynchronizer.sync(StandardFlowSynchronizer.java:216) ~[nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
>     at org.apache.nifi.controller.FlowController.synchronize(FlowController.java:1285) ~[nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
>     at org.apache.nifi.persistence.StandardXMLFlowConfigurationDAO.load(StandardXMLFlowConfigurationDAO.java:72) ~[nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
>     at org.apache.nifi.controller.StandardFlowService.loadFromBytes(StandardFlowService.java:629) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
>     at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:737) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
>     ... 4 common frames omitted
> ===========
> processing-3 says:
> ===========
> 2016-02-23 19:42:34,896 ERROR [Reconnect to Cluster] o.a.nifi.controller.StandardFlowService Handling reconnection request failed due to: org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster because local flow is different than cluster flow.
> org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster because local flow is different than cluster flow.
>     at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:760) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
>     at org.apache.nifi.controller.StandardFlowService.handleReconnectionRequest(StandardFlowService.java:533) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
>     at org.apache.nifi.controller.StandardFlowService.access$200(StandardFlowService.java:81) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
>     at org.apache.nifi.controller.StandardFlowService$1.run(StandardFlowService.java:370) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_65]
> Caused by: org.apache.nifi.controller.UninheritableFlowException: Proposed configuration is not inheritable by the flow controller because of flow differences: Found difference in Flows:
> Local Fingerprint:
> 9ca96b153807ca8876-6f68-4ba8-9bd7-96febb03cbdbPROCESSORNO_VALUE1a03fca1-fbf2-4029-8c23-1ba2e96f00c81ad4e807-4d92-3475-9e99-e7b5e8b3592aorg.apache.nifi.processors.kafka.PutKafkaNO_VALUEClient NameNiFi-
> Cluster Fingerprint:
> 9ca96b153807ca8876-6f68-4ba8-9bd7-96febb03cbdbPROCESSORNO_VALUE1a03fca1-fbf2-4029-8c23-1ba2e96f00c818c0ac39-0aef-3df1-8d20-f2c1460536c4org.apache.nifi.processors.kafka.PutKafkaNO_VALUEClient NameNiFi-
>     at org.apache.nifi.controller.StandardFlowSynchronizer.sync(StandardFlowSynchronizer.java:216) ~[nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
>     at org.apache.nifi.controller.FlowController.synchronize(FlowController.java:1285) ~[nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
>     at org.apache.nifi.persistence.StandardXMLFlowConfigurationDAO.load(StandardXMLFlowConfigurationDAO.java:72) ~[nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
>     at org.apache.nifi.controller.StandardFlowService.loadFromBytes(StandardFlowService.java:629) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
>     at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:737) [nifi-framework-core-0.5.1-SNAPSHOT.jar:0.5.1-SNAPSHOT]
>     ... 4 common frames omitted
>
> ===========
> {quote}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
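
For readers following the issue, here is a minimal sketch of why ordering matters when fingerprints are compared, assuming a simplified fingerprint that only concatenates each component's id and type. The names `FingerprintOrderSketch`, `Component`, `naiveFingerprint` and `orderedFingerprint`, and the ids `svc-a` and `svc-b`, are hypothetical illustrations; this is not NiFi's actual fingerprint code and not the attached patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class FingerprintOrderSketch {

    // Hypothetical stand-in for a controller service or reporting task entry.
    static final class Component {
        final String id;
        final String type;

        Component(String id, String type) {
            this.id = id;
            this.type = type;
        }
    }

    // Concatenates components in the order given, so the result depends on list order.
    static String naiveFingerprint(List<Component> components) {
        StringBuilder sb = new StringBuilder();
        for (Component c : components) {
            sb.append(c.id).append(c.type);
        }
        return sb.toString();
    }

    // Sorts a copy by id first, so the result no longer depends on list order.
    static String orderedFingerprint(List<Component> components) {
        List<Component> sorted = new ArrayList<>(components);
        sorted.sort(Comparator.comparing((Component c) -> c.id));
        return naiveFingerprint(sorted);
    }

    public static void main(String[] args) {
        Component a = new Component("svc-a", "TypeA");
        Component b = new Component("svc-b", "TypeB");

        // Same two components, listed in a different order on each side.
        List<Component> local = Arrays.asList(a, b);
        List<Component> cluster = Arrays.asList(b, a);

        // prints "false": identical content, different order, different fingerprint
        System.out.println(naiveFingerprint(local).equals(naiveFingerprint(cluster)));
        // prints "true": sorting before fingerprinting removes the order dependence
        System.out.println(orderedFingerprint(local).equals(orderedFingerprint(cluster)));
    }
}
```

Sorting on a stable key such as the component id is one straightforward way to make two otherwise identical flows produce the same fingerprint; whether the real fix sorts by id or by some other key is not stated in this message.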