[GitHub] nifi pull request #491: NIFI-1960: Update admin guide regarding documentatio...

2016-06-03 Thread markap14
Github user markap14 closed the pull request at:

https://github.com/apache/nifi/pull/491


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] nifi pull request #491: NIFI-1960: Update admin guide regarding documentatio...

2016-06-03 Thread markap14
Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/491#discussion_r65757147
  
--- Diff: nifi-docs/src/main/asciidoc/administration-guide.adoc ---
@@ -1351,66 +1389,81 @@ in the file specified in 
`nifi.login.identity.provider.configuration.file`. Sett
 |nifi.security.ocsp.responder.certificate|This is the location of the OCSP 
responder certificate if one is being used. It is blank by default.
 |
 
-*Cluster Common Properties* +
+ Cluster Common Properties
 
-When setting up a NiFi cluster, these properties should be configured the 
same way on both the cluster manager and the nodes.
+When setting up a NiFi cluster, these properties should be configured the 
same way on all nodes.
 
 |
 |*Property*|*Description*
-|nifi.cluster.protocol.heartbeat.interval|The interval at which nodes 
should emit heartbeats to the cluster manager. The default value is 5 sec.
+|nifi.cluster.protocol.heartbeat.interval|The interval at which nodes 
should emit heartbeats to the Cluster Coordinator. The default value is 5 sec.
 |nifi.cluster.protocol.is.secure|This indicates whether cluster 
communications are secure. The default value is _false_.
-|nifi.cluster.protocol.socket.timeout|The amount of time to wait for a 
cluster protocol socket to be established before trying again. The default 
value is 30 sec.
-|nifi.cluster.protocol.connection.handshake.timeout|The amount of time to 
wait for a node to connect to the cluster. The default value is 45 sec.
+|nifi.cluster.node.event.history.size|When the state of a node in the 
cluster is changed, an event is generated
--- End diff --

Yes - I updated this in the new commit.
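
For reference, a minimal sketch of how the cluster-common entries discussed
in this hunk would look in nifi.properties. The first two defaults are the
ones quoted in the diff; the event history value is illustrative only, since
the quoted hunk does not state its default:

    # Interval at which nodes emit heartbeats to the Cluster Coordinator
    nifi.cluster.protocol.heartbeat.interval=5 sec
    # Whether cluster communications are secure
    nifi.cluster.protocol.is.secure=false
    # Node state-change events kept in memory per node (illustrative value)
    nifi.cluster.node.event.history.size=25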




[GitHub] nifi pull request #491: NIFI-1960: Update admin guide regarding documentatio...

2016-06-03 Thread alopresto
Github user alopresto commented on a diff in the pull request:

https://github.com/apache/nifi/pull/491#discussion_r65756334
  
--- Diff: nifi-docs/src/main/asciidoc/administration-guide.adoc ---
@@ -485,98 +485,137 @@ It is preferable to request upstream/downstream 
systems to switch to https://cwi
 Clustering Configuration
 
 
-This section provides a quick overview of NiFi Clustering and instructions 
on how to set up a basic cluster. In the future, we hope to provide 
supplemental documentation that covers the NiFi Cluster Architecture in depth.
-
-The design of NiFi clustering is a simple master/slave model where there 
is a master and one or more slaves.
-While the model is that of master and slave, if the master dies, the 
slaves are all instructed to continue operating
-as they were to ensure the dataflow remains live. The absence of the 
master simply means new slaves cannot join the
-cluster and cluster flow changes cannot occur until the master is 
restored. In NiFi clustering, we call the master
-the NiFi Cluster Manager (NCM), and the slaves are called Nodes. See a 
full description of each in the Terminology section below.
+This section provides a quick overview of NiFi Clustering and instructions 
on how to set up a basic cluster.
+In the future, we hope to provide supplemental documentation that covers 
the NiFi Cluster Architecture in depth.
+
+NiFi employs a Zero-Master Clustering paradigm. Each of the nodes in the 
cluster performs the same tasks on
+the data but each operates on a different set of data. One of the nodes is 
automatically elected (via Apache
+ZooKeeper) as the Cluster Coordinator. All nodes in the cluster will then 
send heartbeat/status information
+to this node, and this node is responsible for disconnecting nodes that do 
not report any heartbeat status
+for some amount of time. Additionally, when a new node elects to join the 
cluster, the new node must first
+connect to the currently-elected Cluster Coordinator in order to obtain 
the most up-to-date flow. If the Cluster
+Coordinator determines that the node is allowed to join (based on its 
configured Firewall file), the current
+flow is provided to that node, and that node is able to join the cluster, 
assuming that the node's copy of the
+flow matches the copy provided by the Cluster Coordinator. If the node's 
version of the flow configuration differs
+from that of the Cluster Coordinator's, the node will not join the cluster.
 
 *Why Cluster?* +
 
-NiFi Administrators or Dataflow Managers (DFMs) may find that using one 
instance of NiFi on a single server is not enough to process the amount of data 
they have. So, one solution is to run the same dataflow on multiple NiFi 
servers. However, this creates a management problem, because each time DFMs 
want to change or update the dataflow, they must make those changes on each 
server and then monitor each server individually. By clustering the NiFi 
servers, it's possible to have that increased processing capability along with 
a single interface through which to make dataflow changes and monitor the 
dataflow. Clustering allows the DFM to make each change only once, and that 
change is then replicated to all the nodes of the cluster. Through the single 
interface, the DFM may also monitor the health and status of all the nodes.
+NiFi Administrators or Dataflow Managers (DFMs) may find that using one 
instance of NiFi on a single server is not
+enough to process the amount of data they have. So, one solution is to run 
the same dataflow on multiple NiFi servers.
+However, this creates a management problem, because each time DFMs want to 
change or update the dataflow, they must make
+those changes on each server and then monitor each server individually. By 
clustering the NiFi servers, it's possible to
+have that increased processing capability along with a single interface 
through which to make dataflow changes and monitor
+the dataflow. Clustering allows the DFM to make each change only once, and 
that change is then replicated to all the nodes
+of the cluster. Through the single interface, the DFM may also monitor the 
health and status of all the nodes.
 
 NiFi Clustering is unique and has its own terminology. It's important to 
understand the following terms before setting up a cluster.
 
 [template="glossary", id="terminology"]
 *Terminology* +
 
-*NiFi Cluster Manager*: A NiFi Cluster Manager (NCM) is an instance of 
NiFi that provides the sole management point for the cluster. It communicates 
dataflow changes to the nodes and receives health and status information from 
the nodes. It also ensures that a uniform dataflow is maintained across the 
cluster.  When DFMs manage a dataflow in a cluster, they do so through the User 
Interface of the NCM (i.e., via t

[GitHub] nifi pull request #491: NIFI-1960: Update admin guide regarding documentatio...

2016-06-03 Thread markap14
Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/491#discussion_r65756026
  

[GitHub] nifi pull request #491: NIFI-1960: Update admin guide regarding documentatio...

2016-06-03 Thread markap14
Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/491#discussion_r65755845
  

[GitHub] nifi pull request #491: NIFI-1960: Update admin guide regarding documentatio...

2016-06-03 Thread markap14
Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/491#discussion_r65755455
  

[GitHub] nifi pull request #491: NIFI-1960: Update admin guide regarding documentatio...

2016-06-03 Thread markap14
Github user markap14 commented on a diff in the pull request:

https://github.com/apache/nifi/pull/491#discussion_r65755030
  
--- Diff: nifi-docs/src/main/asciidoc/administration-guide.adoc ---
@@ -485,98 +485,137 @@ It is preferable to request upstream/downstream 
systems to switch to https://cwi
 Clustering Configuration
 
 
-This section provides a quick overview of NiFi Clustering and instructions 
on how to set up a basic cluster. In the future, we hope to provide 
supplemental documentation that covers the NiFi Cluster Architecture in depth.
-
-The design of NiFi clustering is a simple master/slave model where there 
is a master and one or more slaves.
-While the model is that of master and slave, if the master dies, the 
slaves are all instructed to continue operating
-as they were to ensure the dataflow remains live. The absence of the 
master simply means new slaves cannot join the
-cluster and cluster flow changes cannot occur until the master is 
restored. In NiFi clustering, we call the master
-the NiFi Cluster Manager (NCM), and the slaves are called Nodes. See a 
full description of each in the Terminology section below.
+This section provides a quick overview of NiFi Clustering and instructions 
on how to set up a basic cluster.
+In the future, we hope to provide supplemental documentation that covers 
the NiFi Cluster Architecture in depth.
+
+NiFi employs a Zero-Master Clustering paradigm. Each of the nodes in the 
cluster performs the same tasks on
+the data but each operates on a different set of data. One of the nodes is 
automatically elected (via Apache
+ZooKeeper) as the Cluster Coordinator. All nodes in the cluster will then 
send heartbeat/status information
+to this node, and this node is responsible for disconnecting nodes that do 
not report any heartbeat status
+for some amount of time. Additionally, when a new node elects to join the 
cluster, the new node must first
+connect to the currently-elected Cluster Coordinator in order to obtain 
the most up-to-date flow. If the Cluster
+Coordinator determines that the node is allowed to join (based on its 
configured Firewall file), the current
--- End diff --

This is a NiFi thing - not new, though. It has been around since clustering 
was initially introduced. Simply a text file that contains the IP addresses of 
nodes that are allowed to join the cluster.
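
For illustration, such a firewall file is just one entry per line. A minimal
sketch with hypothetical addresses, referenced by the nifi.cluster.firewall.file
property described in the diff above:

    # Nodes permitted to connect to the cluster, one address per line
    10.0.0.11
    10.0.0.12
    10.0.0.13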




[GitHub] nifi pull request #491: NIFI-1960: Update admin guide regarding documentatio...

2016-06-03 Thread alopresto
Github user alopresto commented on a diff in the pull request:

https://github.com/apache/nifi/pull/491#discussion_r65754466
  
--- Diff: nifi-docs/src/main/asciidoc/administration-guide.adoc ---
@@ -1351,66 +1389,81 @@ in the file specified in 
`nifi.login.identity.provider.configuration.file`. Sett
 |nifi.security.ocsp.responder.certificate|This is the location of the OCSP 
responder certificate if one is being used. It is blank by default.
 |
 
-*Cluster Common Properties* +
+ Cluster Common Properties
 
-When setting up a NiFi cluster, these properties should be configured the 
same way on both the cluster manager and the nodes.
+When setting up a NiFi cluster, these properties should be configured the 
same way on all nodes.
 
 |
 |*Property*|*Description*
-|nifi.cluster.protocol.heartbeat.interval|The interval at which nodes 
should emit heartbeats to the cluster manager. The default value is 5 sec.
+|nifi.cluster.protocol.heartbeat.interval|The interval at which nodes 
should emit heartbeats to the Cluster Coordinator. The default value is 5 sec.
 |nifi.cluster.protocol.is.secure|This indicates whether cluster 
communications are secure. The default value is _false_.
-|nifi.cluster.protocol.socket.timeout|The amount of time to wait for a 
cluster protocol socket to be established before trying again. The default 
value is 30 sec.
-|nifi.cluster.protocol.connection.handshake.timeout|The amount of time to 
wait for a node to connect to the cluster. The default value is 45 sec.
+|nifi.cluster.node.event.history.size|When the state of a node in the 
cluster is changed, an event is generated
+and can be viewed in the Cluster page. This value indicates how many 
events to keep in memory for each node.
+|nifi.cluster.node.connection.timeout|When connecting to another node in 
the cluster, specifies how long this node should wait before considering
+the connection a failure.
+|nifi.cluster.node.read.timeout|When communicating with another node in 
the cluster, specifies how long this node should wait to receive information
+from the remote node before considering the communication with the node a 
failure.
+|nifi.cluster.firewall.file|The location of the node firewall file. This 
is a file that may be used to list all the nodes that are allowed to connect
+to the cluster. It provides an additional layer of security. This value is 
blank by default, meaning that no firewall file is to be used.
 |
 
-*Multicast Cluster Common Properties* +
-If multicast is used, the following nifi.cluster.protocol.multicast.xxx 
properties must be configured. By default, unicast is used.
+ Cluster Node Properties
+
+Configure these properties for cluster nodes.
 
 |
 |*Property*|*Description*
-|nifi.cluster.protocol.use.multicast|Indicates whether multicast is being 
used. The default value is _false_.
-|nifi.cluster.protocol.multicast.address|The cluster multicast address. It 
is blank by default.
-|nifi.cluster.protocol.multicast.port|The cluster multicast port. It is 
blank by default.
-|nifi.cluster.protocol.multicast.service.broadcast.delay|The multicast 
service broadcast delay. The default value is 500 ms.
-|nifi.cluster.protocol.multicast.service.locator.attempts|The number of 
multicast service locator attempts to make. The default value is 3.
-|nifi.cluster.protocol.multicast.service.locator.attempts.delay|The 
multicast service locator attempts delay. The default value is 1 sec.
+|nifi.cluster.is.node|Set this to _true_ if the instance is a node in a 
cluster. The default value is _false_.
+|nifi.cluster.node.address|The fully qualified address of the node. It is 
blank by default.
+|nifi.cluster.node.protocol.port|The node's protocol port. It is blank by 
default.
+|nifi.cluster.node.protocol.threads|The number of threads that should be 
used to communicate with other nodes
+in the cluster. This property defaults to 10, but for large clusters, this 
value may need to be larger.
 |
 
-*Cluster Node Properties* +
-
-Only configure these properties for cluster nodes.
+[[claim_management]]
+ Claim Management
+
+Whenever a request is made to change the dataflow, it is important that
+all nodes in the NiFi cluster are kept in-sync. In order to allow for 
this, NiFi employs a two-phase commit. The request
+is first replicated to all nodes in the cluster, simply asking whether or 
not the request is allowed. Each node then determines
+whether or not it will allow the request and if so issues a "Claim" on the 
component(s) being modified. This claim can be
+thought of as a mutually-exclusive lock that is owned by the requestor. 
Once all nodes have voted on whether or not the request
+is allowed, the node from which the request originated must deci
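
The Cluster Node Properties table quoted above maps directly onto
nifi.properties entries. A minimal sketch for a single node, using the
defaults stated in the hunk where given and hypothetical values where the
defaults are blank:

    # Marks this instance as a node in a cluster (default is false)
    nifi.cluster.is.node=true
    # Fully qualified address of the node (blank by default; hypothetical)
    nifi.cluster.node.address=nifi-node1.example.com
    # The node's protocol port (blank by default; hypothetical)
    nifi.cluster.node.protocol.port=11443
    # Threads for communicating with other nodes (defaults to 10 per the hunk)
    nifi.cluster.node.protocol.threads=10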

[GitHub] nifi pull request #491: NIFI-1960: Update admin guide regarding documentatio...

2016-06-03 Thread alopresto
Github user alopresto commented on a diff in the pull request:

https://github.com/apache/nifi/pull/491#discussion_r65753532
  
--- Diff: nifi-docs/src/main/asciidoc/administration-guide.adoc ---
@@ -1351,66 +1389,81 @@ in the file specified in 
`nifi.login.identity.provider.configuration.file`. Sett
 |nifi.security.ocsp.responder.certificate|This is the location of the OCSP 
responder certificate if one is being used. It is blank by default.
 |
 
-*Cluster Common Properties* +
+ Cluster Common Properties
 
-When setting up a NiFi cluster, these properties should be configured the 
same way on both the cluster manager and the nodes.
+When setting up a NiFi cluster, these properties should be configured the 
same way on all nodes.
 
 |
 |*Property*|*Description*
-|nifi.cluster.protocol.heartbeat.interval|The interval at which nodes 
should emit heartbeats to the cluster manager. The default value is 5 sec.
+|nifi.cluster.protocol.heartbeat.interval|The interval at which nodes 
should emit heartbeats to the Cluster Coordinator. The default value is 5 sec.
 |nifi.cluster.protocol.is.secure|This indicates whether cluster 
communications are secure. The default value is _false_.
-|nifi.cluster.protocol.socket.timeout|The amount of time to wait for a 
cluster protocol socket to be established before trying again. The default 
value is 30 sec.
-|nifi.cluster.protocol.connection.handshake.timeout|The amount of time to 
wait for a node to connect to the cluster. The default value is 45 sec.
+|nifi.cluster.node.event.history.size|When the state of a node in the 
cluster is changed, an event is generated
+and can be viewed in the Cluster page. This value indicates how many 
events to keep in memory for each node.
+|nifi.cluster.node.connection.timeout|When connecting to another node in 
the cluster, specifies how long this node should wait before considering
+the connection a failure.
+|nifi.cluster.node.read.timeout|When communicating with another node in 
the cluster, specifies how long this node should wait to receive information
+from the remote node before considering the communication with the node a 
failure.
+|nifi.cluster.firewall.file|The location of the node firewall file. This 
is a file that may be used to list all the nodes that are allowed to connect
--- End diff --

Where is the format of the firewall file described? How does it perform 
hostname verification?




[GitHub] nifi pull request #491: NIFI-1960: Update admin guide regarding documentatio...

2016-06-03 Thread alopresto
Github user alopresto commented on a diff in the pull request:

https://github.com/apache/nifi/pull/491#discussion_r65753430
  
--- Diff: nifi-docs/src/main/asciidoc/administration-guide.adoc ---
@@ -1351,66 +1389,81 @@ in the file specified in 
`nifi.login.identity.provider.configuration.file`. Sett
 |nifi.security.ocsp.responder.certificate|This is the location of the OCSP 
responder certificate if one is being used. It is blank by default.
 |
 
-*Cluster Common Properties* +
+ Cluster Common Properties
 
-When setting up a NiFi cluster, these properties should be configured the 
same way on both the cluster manager and the nodes.
+When setting up a NiFi cluster, these properties should be configured the 
same way on all nodes.
 
 |
 |*Property*|*Description*
-|nifi.cluster.protocol.heartbeat.interval|The interval at which nodes 
should emit heartbeats to the cluster manager. The default value is 5 sec.
+|nifi.cluster.protocol.heartbeat.interval|The interval at which nodes 
should emit heartbeats to the Cluster Coordinator. The default value is 5 sec.
 |nifi.cluster.protocol.is.secure|This indicates whether cluster 
communications are secure. The default value is _false_.
-|nifi.cluster.protocol.socket.timeout|The amount of time to wait for a 
cluster protocol socket to be established before trying again. The default 
value is 30 sec.
-|nifi.cluster.protocol.connection.handshake.timeout|The amount of time to 
wait for a node to connect to the cluster. The default value is 45 sec.
+|nifi.cluster.node.event.history.size|When the state of a node in the 
cluster is changed, an event is generated
--- End diff --

Are there default values for these config properties?




[GitHub] nifi pull request #491: NIFI-1960: Update admin guide regarding documentatio...

2016-06-03 Thread alopresto
Github user alopresto commented on a diff in the pull request:

https://github.com/apache/nifi/pull/491#discussion_r65753157
  

[GitHub] nifi pull request #491: NIFI-1960: Update admin guide regarding documentatio...

2016-06-03 Thread alopresto
Github user alopresto commented on a diff in the pull request:

https://github.com/apache/nifi/pull/491#discussion_r65752750
  

[GitHub] nifi pull request #491: NIFI-1960: Update admin guide regarding documentatio...

2016-06-03 Thread alopresto
Github user alopresto commented on a diff in the pull request:

https://github.com/apache/nifi/pull/491#discussion_r65752470
  

[GitHub] nifi pull request #491: NIFI-1960: Update admin guide regarding documentatio...

2016-06-03 Thread alopresto
Github user alopresto commented on a diff in the pull request:

https://github.com/apache/nifi/pull/491#discussion_r65752318
  

[GitHub] nifi pull request #491: NIFI-1960: Update admin guide regarding documentatio...

2016-06-03 Thread alopresto
Github user alopresto commented on a diff in the pull request:

https://github.com/apache/nifi/pull/491#discussion_r65752022
  

[GitHub] nifi pull request #491: NIFI-1960: Update admin guide regarding documentatio...

2016-06-03 Thread alopresto
Github user alopresto commented on a diff in the pull request:

https://github.com/apache/nifi/pull/491#discussion_r65751497
  
--- Diff: nifi-docs/src/main/asciidoc/administration-guide.adoc ---
@@ -485,98 +485,137 @@ It is preferable to request upstream/downstream 
systems to switch to https://cwi
 Clustering Configuration
 
 
-This section provides a quick overview of NiFi Clustering and instructions 
on how to set up a basic cluster. In the future, we hope to provide 
supplemental documentation that covers the NiFi Cluster Architecture in depth.
-
-The design of NiFi clustering is a simple master/slave model where there 
is a master and one or more slaves.
-While the model is that of master and slave, if the master dies, the 
slaves are all instructed to continue operating
-as they were to ensure the dataflow remains live. The absence of the 
master simply means new slaves cannot join the
-cluster and cluster flow changes cannot occur until the master is 
restored. In NiFi clustering, we call the master
-the NiFi Cluster Manager (NCM), and the slaves are called Nodes. See a 
full description of each in the Terminology section below.
+This section provides a quick overview of NiFi Clustering and instructions 
on how to set up a basic cluster.
+In the future, we hope to provide supplemental documentation that covers 
the NiFi Cluster Architecture in depth.
+
+NiFi employs a Zero-Master Clustering paradigm. Each of the nodes in the 
cluster performs the same tasks on
+the data but each operates on a different set of data. One of the nodes is 
automatically elected (via Apache
+ZooKeeper) as the Cluster Coordinator. All nodes in the cluster will then 
send heartbeat/status information
+to this node, and this node is responsible for disconnecting nodes that do 
not report any heartbeat status
+for some amount of time. Additionally, when a new node elects to join the 
cluster, the new node must first
+connect to the currently-elected Cluster Coordinator in order to obtain 
the most up-to-date flow. If the Cluster
+Coordinator determines that the node is allowed to join (based on its 
configured Firewall file), the current
--- End diff --

Where is the Firewall file? Is this a Zookeeper thing or a new NiFi thing?




[GitHub] nifi pull request #491: NIFI-1960: Update admin guide regarding documentatio...

2016-06-03 Thread markap14
GitHub user markap14 opened a pull request:

https://github.com/apache/nifi/pull/491

NIFI-1960: Update admin guide regarding documentation for clustering



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/markap14/nifi NIFI-1960

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/491.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #491


commit 85c1b3c21dce1d585c7147af6fa6989bacd41acf
Author: Mark Payne 
Date:   2016-06-03T18:02:35Z

NIFI-1960: Update admin guide regarding documentation for clustering



