Re: Nifi clusters : duplicate nodes shown in cluster overview

Joe Witt Fri, 19 May 2017 06:02:38 -0700

When a node joins a cluster, it writes its node identifier into its local
state, which is kept in a state directory.  Is that directory being removed?
If so, the node will get a new identifier.  Otherwise, when the node is
started it will reuse its existing identifier.
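
To confirm the symptom, one option is to look for address / apiPort pairs
that appear under more than one nodeId in the cluster endpoint output quoted
below.  This is only a minimal sketch, assuming the JSON has the shape shown
in the quoted message; the function name find_duplicate_nodes is
hypothetical and not part of NiFi, and the sample is a trimmed copy of the
quoted output.

```python
import json
from collections import defaultdict


def find_duplicate_nodes(cluster_json):
    """Group nodes by (address, apiPort) and keep pairs with more than one entry.

    Hypothetical helper: assumes the {"cluster": {"nodes": [...]}} layout
    from the quoted endpoint output.
    """
    nodes = json.loads(cluster_json)["cluster"]["nodes"]
    by_endpoint = defaultdict(list)
    for node in nodes:
        by_endpoint[(node["address"], node["apiPort"])].append(node)
    # Only endpoints registered under several nodeIds are of interest.
    return {key: group for key, group in by_endpoint.items() if len(group) > 1}


# Trimmed copy of the output quoted in this thread (events and roles omitted).
sample = json.dumps({
    "cluster": {
        "nodes": [
            {"nodeId": "62be0e80-306a-4037-80e5-b4def5fbc78e",
             "address": "centos-b", "apiPort": 8080, "status": "DISCONNECTED"},
            {"nodeId": "d41d71f2-0ab4-4d6e-bbf2-793bd4faad06",
             "address": "centos-a", "apiPort": 8080, "status": "CONNECTED"},
            {"nodeId": "ddd371c7-2618-4079-8c61-ee30245d15cc",
             "address": "centos-b", "apiPort": 8080, "status": "CONNECTED"},
        ]
    }
})

for (address, port), group in find_duplicate_nodes(sample).items():
    print(f"{address}:{port} is registered under {len(group)} nodeIds")
    for node in group:
        print(f"  {node['nodeId']}  {node['status']}")
```

If the same address / apiPort pair shows up again with a fresh nodeId after
every reboot, that would be consistent with the local state not surviving
the restart.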


On May 19, 2017 4:13 AM, "ddewaele" <ddewa...@gmail.com> wrote:

> We have a 2-node cluster (centos-a / centos-b).  During one of our failover
> tests, we noticed that when we rebooted centos-b, "duplicate" node entries
> can sometimes be seen in the cluster.
>
> We rebooted centos-b, and when it came back online NiFi showed 2 out of 3
> nodes connected in the cluster.
>
> centos-b was added twice (using different nodeIds).
>
> 1. centos-b : 05/19/2017 06:48:51 UTC : Node disconnected from cluster due
> to Have not received a heartbeat from node in 44 seconds
> 2. centos-b : 05/19/2017 07:42:54 UTC : Received first heartbeat from
> connecting node. Node connected.
>
> Is this by design? In this case (and I assume in most cases), an address /
> apiPort combo should uniquely identify a particular node. Why does it get
> assigned a new nodeId?
>
> As a result, we need to manually remove the duplicate, disconnected
> centos-b entry.
>
>
> Output of the cluster REST endpoint:
>
>
> {
>   "cluster": {
>     "nodes": [
>       {
>         "nodeId": "62be0e80-306a-4037-80e5-b4def5fbc78e",
>         "address": "centos-b",
>         "apiPort": 8080,
>         "status": "DISCONNECTED",
>         "roles": [],
>         "events": [
>           {
>             "timestamp": "05/19/2017 06:48:51 UTC",
>             "category": "WARNING",
>             "message": "Node disconnected from cluster due to Have not received a heartbeat from node in 44 seconds"
>           },
>           {
>             "timestamp": "05/18/2017 13:33:56 UTC",
>             "category": "INFO",
>             "message": "Node Status changed from CONNECTING to CONNECTED"
>           }
>         ]
>       },
>       {
>         "nodeId": "d41d71f2-0ab4-4d6e-bbf2-793bd4faad06",
>         "address": "centos-a",
>         "apiPort": 8080,
>         "status": "CONNECTED",
>         "heartbeat": "05/19/2017 07:44:39 UTC",
>         "roles": [
>           "Primary Node",
>           "Cluster Coordinator"
>         ],
>         "activeThreadCount": 0,
>         "queued": "0 / 0 bytes",
>         "events": [
>           {
>             "timestamp": "05/18/2017 13:33:56 UTC",
>             "category": "INFO",
>             "message": "Node Status changed from CONNECTING to CONNECTED"
>           }
>         ],
>         "nodeStartTime": "05/18/2017 13:33:51 UTC"
>       },
>       {
>         "nodeId": "ddd371c7-2618-4079-8c61-ee30245d15cc",
>         "address": "centos-b",
>         "apiPort": 8080,
>         "status": "CONNECTED",
>         "heartbeat": "05/19/2017 07:44:36 UTC",
>         "roles": [],
>         "activeThreadCount": 0,
>         "queued": "0 / 0 bytes",
>         "events": [
>           {
>             "timestamp": "05/19/2017 07:42:54 UTC",
>             "category": "INFO",
>             "message": "Received first heartbeat from connecting node. Node connected."
>           },
>           {
>             "timestamp": "05/19/2017 07:42:47 UTC",
>             "category": "INFO",
>             "message": "Connection requested from existing node. Setting status to connecting."
>           }
>         ],
>         "nodeStartTime": "05/19/2017 07:42:40 UTC"
>       }
>     ],
>     "generated": "07:44:39 UTC"
>   }
> }
>
>
>
> --
> View this message in context:
> http://apache-nifi-users-list.2361937.n4.nabble.com/Nifi-clusters-duplicate-nodes-shown-in-cluster-overview-tp1966.html
> Sent from the Apache NiFi Users List mailing list archive at Nabble.com.
>
