This is a bug. There is a mismatch between the documentation, nifi.properties, and the Java class handling the properties:
https://issues.apache.org/jira/browse/NIFI-8643

kind regards
Jens
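Until that fix lands, a minimal workaround sketch is to set both spellings of the property, so that whichever name the running code resolves takes effect. This assumes, per Joe's reading of the code further down, that the code looks up nifi.cluster.load.balance.address while the shipped nifi.properties template names the property nifi.cluster.load.balance.host; node01.domain.com is a placeholder for each node's own address:

    # set both names; the one the code actually reads wins
    nifi.cluster.load.balance.host=node01.domain.com
    nifi.cluster.load.balance.address=node01.domain.com

After a restart, port 6342 should then bind to the node's address rather than to localhost.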
On Thu, Jun 10, 2021 at 17:01 Jens M. Kofoed <jmkofoed....@gmail.com> wrote:

> In the beginning both parameters were set:
>
> nifi.cluster.is.node=true
> nifi.cluster.node.address=node01.domain.com
> nifi.cluster.node.protocol.port=9443
> nifi.cluster.node.protocol.threads=10
> nifi.cluster.node.protocol.max.threads=50
> nifi.cluster.node.event.history.size=25
> nifi.cluster.node.connection.timeout=5 sec
> nifi.cluster.node.read.timeout=5 sec
> nifi.cluster.node.max.concurrent.requests=100
> nifi.cluster.firewall.file=
> nifi.cluster.flow.election.max.wait.time=5 mins
> nifi.cluster.flow.election.max.candidates=3
>
> # cluster load balancing properties #
> nifi.cluster.load.balance.host=node01.domain.com
> nifi.cluster.load.balance.port=6342
> nifi.cluster.load.balance.connections.per.node=4
> nifi.cluster.load.balance.max.thread.count=8
> nifi.cluster.load.balance.comms.timeout=30 sec
>
> I have tested multiple combinations of host names. The "funny" part is
> that port 9443 binds to 0.0.0.0:
>
> netstat -l
> Active Internet connections (only servers)
> Proto Recv-Q Send-Q Local Address            Foreign Address  State
> tcp        0      0 0.0.0.0:9090             0.0.0.0:*        LISTEN
> tcp        0      0 0.0.0.0:9443             0.0.0.0:*        LISTEN
> tcp        0      0 localhost:6342           0.0.0.0:*        LISTEN
> tcp        0      0 localhost:42603          0.0.0.0:*        LISTEN
> tcp        0      0 0.0.0.0:ssh              0.0.0.0:*        LISTEN
> tcp        0      0 node01.domain.com:8443   0.0.0.0:*        LISTEN
> tcp6       0      0 localhost:42101          [::]:*           LISTEN
> tcp6       0      0 [::]:ssh                 [::]:*           LISTEN
> raw6       0      0 [::]:ipv6-icmp           [::]:*           7
>
> regards
> Jens
>
> On Thu, Jun 10, 2021 at 16:53 Joe Gresock <jgres...@gmail.com> wrote:
>
>> Looking at the code, it appears that if nifi.cluster.load.balance.address
>> is not set, it falls back to choosing nifi.cluster.node.address. If this
>> is not provided, it finally falls back to localhost.
>>
>> I'd recommend setting nifi.cluster.node.address at a minimum, and you
>> might as well also set nifi.cluster.load.balance.address in order to be
>> explicit.
>>
>> On Thu, Jun 10, 2021 at 10:45 AM Jens M. Kofoed <jmkofoed....@gmail.com>
>> wrote:
>>
>>> Hi Joe
>>>
>>> I just found out that port 6342 is bound to localhost. Why?
>>> In the latest build, NiFi binds to localhost by default if no interface
>>> is specified:
>>> nifi.web.https.host=node1.domain.com
>>> nifi.web.https.port=8443
>>> nifi.web.https.network.interface.default=ens192  <----- If this is not
>>> configured, the UI is bound to localhost.
>>>
>>> But how can I configure port 6342 to bind to any interface?
>>>
>>> kind regards
>>> Jens
>>>
>>> On Thu, Jun 10, 2021 at 16:32 Joe Gresock <jgres...@gmail.com> wrote:
>>>
>>>> Ok, and just to confirm, you've verified that each node can talk to
>>>> the others over port 6342?
>>>>
>>>> On Thu, Jun 10, 2021 at 10:29 AM Jens M. Kofoed <jmkofoed....@gmail.com>
>>>> wrote:
>>>>
>>>>> I have the same error for node2 as well.
>>>>> All 3 nodes can talk to each other. If I use a remote process group
>>>>> and connect to a "remote" input port, everything works fine. This is
>>>>> a workaround for round robin.
>>>>> My configuration for cluster load balancing is the default:
>>>>> nifi.cluster.load.balance.host=
>>>>> nifi.cluster.load.balance.port=6342
>>>>> nifi.cluster.load.balance.connections.per.node=4
>>>>> nifi.cluster.load.balance.max.thread.count=8
>>>>> nifi.cluster.load.balance.comms.timeout=30 sec
>>>>>
>>>>> kind regards
>>>>> Jens
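A quick way to confirm which interface the load-balance listener actually bound to is to repeat the check above with numeric output, so localhost shows up as 127.0.0.1. This is a hedged sketch assuming iproute2's ss is available (it ships with Ubuntu 20.04):

    # -t tcp, -l listening, -n numeric, -p owning process (needs root)
    sudo ss -tlnp | grep 6342
    # 127.0.0.1:6342 -> localhost only; 0.0.0.0:6342 or the node's own IP
    # -> reachable from the other cluster nodes
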
>>>>> On Thu, Jun 10, 2021 at 16:18 Joe Gresock <jgres...@gmail.com> wrote:
>>>>>
>>>>>> That would seem to be the culprit :) It sounds like your other nodes
>>>>>> can't connect to node3 over port 8443. Have you verified that the
>>>>>> port is open? Same question for all other ports configured in your
>>>>>> nifi.properties.
>>>>>>
>>>>>> On Thu, Jun 10, 2021 at 10:08 AM Jens M. Kofoed
>>>>>> <jmkofoed....@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Joe
>>>>>>>
>>>>>>> Thanks for replying :-)
>>>>>>> Looking at the Status History for FetchFTP and all the other
>>>>>>> processors in the flow, only the primary node has processed
>>>>>>> flowfiles.
>>>>>>> I have created clusters before with no issues, but there must be
>>>>>>> something tricky I'm missing.
>>>>>>>
>>>>>>> I found this error in the log, which explains why it is only the
>>>>>>> primary node:
>>>>>>> 2021-06-10 16:00:22,078 ERROR [Load-Balanced Client Thread-1]
>>>>>>> org.apache.nifi.controller.queue.clustered.client.async.nio.NioAsyncLoadBalanceClient
>>>>>>> Unable to connect to node3.domain.com:8443 for load balancing
>>>>>>> java.net.ConnectException: Connection refused
>>>>>>>
>>>>>>> But I don't know why the connection should be refused. I can't find
>>>>>>> any other errors about connections. And now I have added the node
>>>>>>> group to all policies, so all nodes should have all access rights.
>>>>>>>
>>>>>>> Any advice for further investigation?
>>>>>>>
>>>>>>> kind regards
>>>>>>> Jens
>>>>>>>
>>>>>>> On Thu, Jun 10, 2021 at 15:36 Joe Gresock <jgres...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Jens,
>>>>>>>>
>>>>>>>> Out of curiosity, when you run the FetchFTP processor, what does
>>>>>>>> the Status History of that processor show? Is the processor
>>>>>>>> processing files on all of your nodes or just the primary?
>>>>>>>>
>>>>>>>> On Thu, Jun 10, 2021 at 9:07 AM Jens M. Kofoed
>>>>>>>> <jmkofoed....@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Dear community
>>>>>>>>>
>>>>>>>>> I have created a 3-node cluster with NiFi 1.13.2 and Java 8 on
>>>>>>>>> Ubuntu 20.04.
>>>>>>>>> I have a ListFTP processor running on the primary node only ->
>>>>>>>>> FetchFTP with Round Robin on the connection. But if I stop the
>>>>>>>>> FetchFTP processor and look at the queue, all flowfiles are listed
>>>>>>>>> as being on the same node, which is also the primary node.
>>>>>>>>>
>>>>>>>>> Just for testing purposes, I've tried to set round robin on other
>>>>>>>>> connections, but all files stay on the primary node. I have been
>>>>>>>>> looking in the logs but can't find any errors yet.
>>>>>>>>>
>>>>>>>>> Please advise.
>>>>>>>>> kind regards
>>>>>>>>> Jens
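On the connectivity questions raised in the thread, a quick hedged check is to probe each configured port from every other node with netcat; the hostname below follows the thread's examples and the ports come from Jens' nifi.properties and the error log:

    # run from each node against each of the other nodes
    nc -vz node3.domain.com 8443   # port from the "Connection refused" error
    nc -vz node3.domain.com 9443   # nifi.cluster.node.protocol.port
    nc -vz node3.domain.com 6342   # nifi.cluster.load.balance.port

Note that even when the host itself is reachable, a listener bound only to localhost on the remote node will still refuse the connection, which is consistent with the behaviour discussed above.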