We have a 2-node cluster (centos-a / centos-b). During one of our failover
tests, we noticed that when we reboot centos-b, "duplicate" node
entries sometimes appear in the cluster.

We rebooted centos-b, and when it came back online the NiFi cluster showed 2 out
of 3 nodes connected.

centos-b was added twice (using different nodeIds).

1. centos-b : 05/19/2017 06:48:51 UTC : Node disconnected from cluster due
to Have not received a heartbeat from node in 44 seconds
2. centos-b : 05/19/2017 07:42:54 UTC : Received first heartbeat from
connecting node. Node connected.

Is this by design? In this case (and I assume in most cases), an address /
apiPort combo should uniquely identify a particular node. Why does it get
assigned a new nodeId?

As a result, we need to manually remove the duplicate, disconnected
centos-b entry.


Output of the cluster REST endpoint:

 
{
  "cluster": {
    "nodes": [
      {
        "nodeId": "62be0e80-306a-4037-80e5-b4def5fbc78e",
        "address": "centos-b",
        "apiPort": 8080,
        "status": "DISCONNECTED",
        "roles": [],
        "events": [
          {
            "timestamp": "05/19/2017 06:48:51 UTC",
            "category": "WARNING",
            "message": "Node disconnected from cluster due to Have not received a heartbeat from node in 44 seconds"
          },
          {
            "timestamp": "05/18/2017 13:33:56 UTC",
            "category": "INFO",
            "message": "Node Status changed from CONNECTING to CONNECTED"
          }
        ]
      },
      {
        "nodeId": "d41d71f2-0ab4-4d6e-bbf2-793bd4faad06",
        "address": "centos-a",
        "apiPort": 8080,
        "status": "CONNECTED",
        "heartbeat": "05/19/2017 07:44:39 UTC",
        "roles": [
          "Primary Node",
          "Cluster Coordinator"
        ],
        "activeThreadCount": 0,
        "queued": "0 / 0 bytes",
        "events": [
          {
            "timestamp": "05/18/2017 13:33:56 UTC",
            "category": "INFO",
            "message": "Node Status changed from CONNECTING to CONNECTED"
          }
        ],
        "nodeStartTime": "05/18/2017 13:33:51 UTC"
      },
      {
        "nodeId": "ddd371c7-2618-4079-8c61-ee30245d15cc",
        "address": "centos-b",
        "apiPort": 8080,
        "status": "CONNECTED",
        "heartbeat": "05/19/2017 07:44:36 UTC",
        "roles": [],
        "activeThreadCount": 0,
        "queued": "0 / 0 bytes",
        "events": [
          {
            "timestamp": "05/19/2017 07:42:54 UTC",
            "category": "INFO",
            "message": "Received first heartbeat from connecting node. Node connected."
          },
          {
            "timestamp": "05/19/2017 07:42:47 UTC",
            "category": "INFO",
            "message": "Connection requested from existing node. Setting status to connecting."
          }
        ],
        "nodeStartTime": "05/19/2017 07:42:40 UTC"
      }
    ],
    "generated": "07:44:39 UTC"
  }
}
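
For reference, here is a rough sketch of how the duplicates can be spotted in a response like the one above, by grouping node entries on the address / apiPort combo. It assumes the JSON has already been parsed into a Python dict; the function names (find_duplicate_nodes, stale_node_ids) are made up for illustration and are not part of NiFi.

```python
from collections import defaultdict


def find_duplicate_nodes(cluster_response):
    """Group node entries by (address, apiPort); keep groups with more than one entry."""
    groups = defaultdict(list)
    for node in cluster_response["cluster"]["nodes"]:
        groups[(node["address"], node["apiPort"])].append(node)
    return {key: nodes for key, nodes in groups.items() if len(nodes) > 1}


def stale_node_ids(duplicates):
    """Within each duplicate group, return the nodeIds that are not CONNECTED."""
    return [
        node["nodeId"]
        for nodes in duplicates.values()
        for node in nodes
        if node["status"] != "CONNECTED"
    ]


# Trimmed copy of the response above (only the fields the sketch needs).
sample = {
    "cluster": {
        "nodes": [
            {"nodeId": "62be0e80-306a-4037-80e5-b4def5fbc78e",
             "address": "centos-b", "apiPort": 8080, "status": "DISCONNECTED"},
            {"nodeId": "d41d71f2-0ab4-4d6e-bbf2-793bd4faad06",
             "address": "centos-a", "apiPort": 8080, "status": "CONNECTED"},
            {"nodeId": "ddd371c7-2618-4079-8c61-ee30245d15cc",
             "address": "centos-b", "apiPort": 8080, "status": "CONNECTED"},
        ]
    }
}

duplicates = find_duplicate_nodes(sample)
print(stale_node_ids(duplicates))
```

With the sample data this flags only the old centos-b entry (62be0e80-...), which is the one we currently have to clean up by hand.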



--
View this message in context: 
http://apache-nifi-users-list.2361937.n4.nabble.com/Nifi-clusters-duplicate-nodes-shown-in-cluster-overview-tp1966.html
Sent from the Apache NiFi Users List mailing list archive at Nabble.com.
