Mark,

Yes, 2 out of 3 should be sufficient. For testing purposes, a single ZooKeeper
instance is fine as well. For production, though, I would not recommend using an
embedded ZooKeeper at all; use a standalone ZooKeeper ensemble instead. ZooKeeper
tends not to be very happy when running on a box that is already under heavy
resource load, so if your cluster starts getting busy, you'll see far more stable
performance from a standalone ZooKeeper.
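For reference, pointing NiFi at an external ensemble only takes a couple of
nifi.properties entries on each node; the hostnames below are just placeholders
for your own ZooKeeper servers:

```properties
# In nifi.properties on each node: disable the embedded ZooKeeper
# and point NiFi at the standalone ensemble (hostnames are examples).
nifi.state.management.embedded.zookeeper.start=false
nifi.zookeeper.connect.string=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181
```

You'd also update the ZooKeeperStateProvider's Connect String in
conf/state-management.xml to match.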


> On Apr 11, 2017, at 2:06 PM, Mark Bean <mark.o.b...@gmail.com> wrote:
> 
> All 3 nodes are running embedded ZooKeeper. And, the Admin Guide states
> "ZooKeeper requires a majority of nodes be active in order to function".
> So, I assumed 2/3 being active was ok. Perhaps not.
> 
> Related: can a Cluster be setup with only 1 ZooKeeper node? Clearly, in
> production, one would not want to do this. But when testing, this should be
> acceptable, yes?
> 
> 
> 
> On Tue, Apr 11, 2017 at 1:56 PM, Mark Payne <marka...@hotmail.com> wrote:
> 
>> Mark,
>> 
>> Are all of your nodes running an embedded ZooKeeper, or only 1 or 2 of
>> them?
>> 
>> Thanks
>> -Mark
>> 
>>> On Apr 11, 2017, at 1:19 PM, Mark Bean <mark.o.b...@gmail.com> wrote:
>>> 
>>> I have a 3-node Cluster with each Node hosting the embedded ZooKeeper. When
>>> one Node is shutdown (and the Node is not the Cluster Coordinator), the
>>> Cluster becomes unavailable. The UI indicates "Action cannot be performed
>>> because there is currently no Cluster Coordinator elected. The request
>>> should be tried again after a moment, after a Cluster Coordinator has been
>>> automatically elected."
>>> 
>>> The app.log indicates "ConnectionStateManager State change: SUSPENDED".
>>> And, there are an endless number of "CuratorFrameworkImpl Background retry
>>> gave up" messages; the surviving Nodes are not able to allow the Cluster to
>>> survive.
>>> 
>>> I would have thought since 2/3 Nodes are surviving, there wouldn't be a
>>> problem. In addition, since the Node that was shut down was neither the Cluster
>>> Coordinator nor the Primary Node, no Cluster state changes were required.
>>> 
>>> nifi.cluster.flow.election.max.wait.time=2 mins
>>> nifi.cluster.flow.election.max.candidates=
>>> 
>>> The same behavior was observed when max.candidates was set to 2.
>>> 
>>> NiFi 1.1.2
>> 
>> 
