Hi Mark,

Just back at the office after a short holiday :-)

I have tested my setup with NiFi 1.14.0 regarding hostname and FQDN.

If I run "nslookup node01.domain.lan", I get the address 192.168.1.11.

If I configure nifi.cluster.load.balance.host=node01.domain.lan, netstat -l shows the following:
tcp 0 0 localhost:6342 0.0.0.0:* LISTEN
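An aside on the output above: `localhost:6342` means the load-balance socket is bound to a loopback address. `nslookup` queries the DNS server directly, whereas a server process usually resolves names through the system resolver, which may consult a local hosts file first, so the two can disagree. Whether that explains the behaviour reported here is not confirmed anywhere in this thread. The following is only a minimal Python sketch of how one might check what the system resolver returns for a name; the function names are hypothetical and not part of NiFi.

```python
import ipaddress
import socket
from typing import List


def is_loopback(address: str) -> bool:
    """Return True if the textual IP address is a loopback address."""
    # Drop an IPv6 zone suffix such as "%lo0" before parsing.
    return ipaddress.ip_address(address.split("%", 1)[0]).is_loopback


def resolve_all(hostname: str) -> List[str]:
    """Return the distinct addresses the system resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] holds the textual IP address.
    return sorted({info[4][0] for info in infos})


def loopback_results(hostname: str) -> List[str]:
    """Return only those resolved addresses that are loopback addresses."""
    return [addr for addr in resolve_all(hostname) if is_loopback(addr)]
```

Run on the node itself, `loopback_results("node01.domain.lan")` would return a non-empty list if the system resolver maps that name to a loopback address, which would be one thing to rule out before looking further.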
If I configure nifi.cluster.load.balance.host=192.168.1.11, netstat -l shows the following:
tcp 0 0 node01.domain.lan:6342 0.0.0.0:* LISTEN

I don't know why my result differs from yours, since nslookup returns the correct IP.

Kind regards
Jens M. Kofoed

On Fri, Aug 6, 2021 at 15:48, Mark Payne <marka...@hotmail.com> wrote:

> Jens,
>
> You're right - my mistake, the change from "nifi.cluster.load.balance.address" to "nifi.cluster.load.balance.host" was in 1.14.0, not early on. In 1.14.0, only nifi.cluster.load.balance.host is used. The documentation and properties file both used .host, but the code was making use of .address instead. So the code was fixed in 1.14.0 to match what the documentation and nifi.properties file specified.
>
> I just did some testing locally on my MacBook regarding the IP address vs. hostname.
> What I found is that if I use the IP address, it listens as expected.
> If I use just <hostname> (not fully qualified), interestingly it listens on localhost only.
> If I run "nslookup <hostname>", I get back <hostname>.lan as the FQDN.
> If I use "<hostname>.lan" in my properties, it listens as expected.
>
> Thanks
> -Mark
>
> On Aug 6, 2021, at 12:28 AM, Jens M. Kofoed <jmkofoed....@gmail.com> wrote:
>
> Hi Mark
>
> In version 1.13.2 (at least) the file "main/nifi-commons/nifi-properties/src/main/java/org/apache/nifi/util/NiFiProperties.java" is looking for a property called "nifi.cluster.load.balance.address", which has been reported in https://issues.apache.org/jira/browse/NIFI-8643 and fixed in version 1.14.0.
>
> In version 1.14.0 the only way I can get it to work is to type in the IP address. If I don't specify it, or if I type in the FQDN, the load balance port binds to localhost, which has been reported in https://issues.apache.org/jira/browse/NIFI-9010
> The result from running netstat -l:
> tcp 0 0 localhost:6342 0.0.0.0:* LISTEN
>
> Kind regards
> Jens M. Kofoed
>
> On Thu, Aug 5, 2021 at 23:08, Mark Payne <marka...@hotmail.com> wrote:
>
>> Axel,
>>
>> I think that I can help clarify some of these things.
>>
>> First of all: nifi.cluster.load.balance.host vs. nifi.cluster.load.balance.address
>>
>> * The nifi.cluster.load.balance.host property is what matters.
>>
>> * The nifi.cluster.load.balance.address is not a real property. NiFi has never looked at this property. However, in the first release that included load-balancing, there was a typo in which the nifi.properties file had "…address" instead of "…host". This was later addressed.
>>
>> * So if you have a value for "nifi.cluster.load.balance.address", it does nothing and is always ignored.
>>
>> Next: nifi.cluster.load.balance.host property
>>
>> * nifi.cluster.load.balance.host can be either an IP address or a hostname. But if set, other nodes in the cluster MUST be able to communicate with the node using whatever value you put here. So using a value of 0.0.0.0 will not work. Also, if set, NiFi will listen for incoming connections ONLY on that hostname. So if you set it to "localhost", for instance, no other node can connect to it, because no other host can connect to the node using "localhost". So this needs to be an address that both the NiFi instance knows about/can bind to, and other nodes in the cluster can connect to.
>>
>> * If nifi.cluster.load.balance.host is NOT set: NiFi will listen for incoming requests on all network interfaces / hostnames. It will advertise its hostname to other nodes in the cluster according to whatever is set for the "nifi.cluster.node.address" property. Meaning that other nodes in the cluster must be able to connect to this node using whatever hostname is set for the "nifi.cluster.node.address" property. If the "nifi.cluster.node.address" property is not set, it advertises its hostname as localhost - which means other nodes won't be able to send to it.
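The rules in the two bullets above can be sketched as a small Python function. This is only an illustration of the behaviour as described in this message, not NiFi's actual code; `describe_endpoint` and `LoadBalanceEndpoint` are hypothetical names, and the warning about `nifi.cluster.load.balance.address` follows this message's statement that the property is ignored (the newer reply at the top of the thread narrows that to version 1.14.0).

```python
from typing import Dict, NamedTuple, Tuple


class LoadBalanceEndpoint(NamedTuple):
    listen_on: str             # what the node binds to
    advertised_as: str         # what other nodes use to reach it
    problems: Tuple[str, ...]  # warnings derived from the rules above


def describe_endpoint(props: Dict[str, str]) -> LoadBalanceEndpoint:
    """Apply the rules from the message to a dict of nifi.properties values."""
    host = props.get("nifi.cluster.load.balance.host", "").strip()
    node_address = props.get("nifi.cluster.node.address", "").strip()
    problems = []

    if host:
        # If set, the node listens only on this value and advertises it.
        listen_on = advertised_as = host
        if host in ("0.0.0.0", "localhost"):
            problems.append(
                "other nodes cannot connect using '%s'" % host)
    else:
        # If unset, the node listens on all interfaces and advertises
        # nifi.cluster.node.address, falling back to localhost.
        listen_on = "all interfaces"
        advertised_as = node_address or "localhost"
        if not node_address:
            problems.append(
                "neither load.balance.host nor node.address is set; "
                "node is advertised as localhost")

    if "nifi.cluster.load.balance.address" in props:
        problems.append(
            "nifi.cluster.load.balance.address is ignored per this message")

    return LoadBalanceEndpoint(listen_on, advertised_as, tuple(problems))
```

For example, `describe_endpoint({"nifi.cluster.load.balance.host": "192.168.1.10"})` reports the node as listening on and advertised as 192.168.1.10 with no warnings, while an empty dict yields the "advertised as localhost" warning.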
>>
>> So you must specify either the "nifi.cluster.load.balance.host" property or the "nifi.cluster.node.address" property.
>>
>> Finally: having to delete the state directory
>>
>> If you change the "nifi.cluster.load.balance.host" or "nifi.cluster.load.balance.port" property and restart a node, you must restart all nodes in the cluster. Otherwise, the other nodes won't be able to send to that node.
>> So, for example, when you changed the load.balance.host from FQDN or 0.0.0.0 to the IP address - the other nodes in the cluster would stop sending. I created a JIRA [1] for that. In my testing, when I changed the hostname, the other nodes stopped sending. But restarting them got things back on track. I wasn't able to replicate the issue after restarting all nodes.
>>
>> Hope this is helpful!
>> -Mark
>>
>> [1] https://issues.apache.org/jira/browse/NIFI-9017
>>
>> On Aug 3, 2021, at 3:08 AM, Axel Schwarz <axelkop...@emailn.de> wrote:
>>
>> Hey guys,
>>
>> I think I found the "trick" for at least version 1.13.2 and of course I'll share it with you.
>> I now use the following load balancing properties:
>>
>> # cluster load balancing properties #
>> nifi.cluster.load.balance.host=192.168.1.10
>> nifi.cluster.load.balance.port=6342
>> nifi.cluster.load.balance.connections.per.node=4
>> nifi.cluster.load.balance.max.thread.count=8
>> nifi.cluster.load.balance.comms.timeout=30 sec
>>
>> So I use the host's IP address for balance.host instead of 0.0.0.0 or the FQDN and have no balance.address property at all.
>> This led to partial load balancing in my case, as already mentioned. It looked like I needed one more step to reach the goal, and this step seems to be deleting all state management files.
>>
>> Through the state-management.xml config file I changed the state management directory to be outside of the NiFi installation, because the config file says "it is important, that the directory be copied over to the new version when upgrading nifi". So every time I upgraded or reinstalled NiFi during my load balancing odyssey, the state management directory remained completely untouched.
>> As soon as I changed that, by deleting the entire state management directory before reinstalling NiFi with the above-mentioned properties, load balancing was immediately working throughout the whole cluster.
>>
>> I think for my flow it is not that bad to delete the state management directory, as I only use one stateful processor to increment a counter. In the times I have tried this so far, I have not encountered any wrong behaviour. But of course I can't test everything, so if any of you know important facts about deleting the state management directory, please let me know :)
>>
>> Besides that, I now feel like this solved my problem. I'll have to keep an eye on it when updating to version 1.14.0 later on, but I think I can figure this out. So thanks for all your support! :)
>>
>> --- Original Message ---
>> From: "Jens M. Kofoed" <jmkofoed....@gmail.com>
>> Date: 29.07.2021 11:08:28
>> To: users@nifi.apache.org, Axel Schwarz <axelkop...@emailn.de>
>> Subject: Re: Re: Re: No Load Balancing since 1.13.2
>>
>> Hmm...
>> I can't remember :-( sorry
>>
>> My configuration for version 1.13.2 is like this:
>> # cluster node properties (only configure for cluster nodes) #
>> nifi.cluster.is.node=true
>> nifi.cluster.node.address=nifi-node01.domaine.com
>> nifi.cluster.node.protocol.port=9443
>> nifi.cluster.node.protocol.threads=10
>> nifi.cluster.node.protocol.max.threads=50
>> nifi.cluster.node.event.history.size=25
>> nifi.cluster.node.connection.timeout=5 sec
>> nifi.cluster.node.read.timeout=5 sec
>> nifi.cluster.node.max.concurrent.requests=100
>> nifi.cluster.firewall.file=
>> nifi.cluster.flow.election.max.wait.time=5 mins
>> nifi.cluster.flow.election.max.candidates=3
>>
>> # cluster load balancing properties #
>> nifi.cluster.load.balance.address=192.168.1.11
>> nifi.cluster.load.balance.port=6111
>> nifi.cluster.load.balance.connections.per.node=4
>> nifi.cluster.load.balance.max.thread.count=8
>> nifi.cluster.load.balance.comms.timeout=30 sec
>>
>> So I defined "nifi.cluster.node.address" with the hostname and not an IP address, and "nifi.cluster.load.balance.address" with the IP address of the server.
>> And triple-check the configuration on all servers :-)
>>
>> Kind regards
>> Jens M. Kofoed
>>
>> On Thu, Jul 29, 2021 at 10:11, Axel Schwarz <axelkop...@emailn.de> wrote:
>>
>> Hey Jens,
>>
>> in issue NIFI-8643 you wrote the last comment, describing exactly the same behaviour we're experiencing now: 2 of 3 nodes were load balancing.
>> How did you get the third node to participate in load balancing? An update to 1.14.0 does not change anything for us.
>>
>> https://issues.apache.org/jira/browse/NIFI-8643?focusedCommentId=17361418&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17361418
>>
>> --- Original Message ---
>> From: "Jens M. Kofoed" <jmkofoed....@gmail.com>
>> Date: 28.07.2021 12:07:50
>> To: users@nifi.apache.org, Axel Schwarz <axelkop...@emailn.de>
>> Subject: Re: Re: No Load Balancing since 1.13.2
>>
>> Hi,
>>
>> I can see that you have configured nifi.cluster.load.balance.address=0.0.0.0
>>
>> Have you tried to set the correct IP address?
>> node1: nifi.cluster.load.balance.address=192.168.1.10
>> node2: nifi.cluster.load.balance.address=192.168.1.11
>> node3: nifi.cluster.load.balance.address=192.168.1.12
>>
>> Regards
>> Jens M. Kofoed
>>
>> On Wed, Jul 28, 2021 at 11:17, Axel Schwarz <axelkop...@emailn.de> wrote:
>>
>> Just tried Java 11. But it still does not work. Nothing changed. :(
>>
>> --- Original Message ---
>> From: Jorge Machado <jom...@me.com>
>> Date: 27.07.2021 13:08:55
>> To: users@nifi.apache.org, Axel Schwarz <axelkop...@emailn.de>
>> Subject: Re: No Load Balancing since 1.13.2
>>
>> Did you try Java 11? I have a client running a similar setup to yours but with a lower NiFi version, and it works fine. Maybe it is worth trying.
>>
>> On 27. Jul 2021, at 12:42, Axel Schwarz <axelkop...@emailn.de> wrote:
>>
>> I did indeed, but I updated from u161 to u291, as this was the newest version at that time, because I thought it could help.
>> So the issue started under u161. But I just saw that u301 is out. I will try this as well.
>>
>> --- Original Message ---
>> From: Pierre Villard <pierre.villard...@gmail.com>
>> Date: 27.07.2021 10:18:38
>> To: users@nifi.apache.org, Axel Schwarz <axelkop...@emailn.de>
>> Subject: Re: No Load Balancing since 1.13.2
>>
>> Hi,
>>
>> I believe the minor u291 is known to have issues (for some of its early builds). Did you upgrade the Java version recently?
>>
>> Thanks,
>> Pierre
>>
>> On Tue, Jul 27, 2021 at 08:07, Axel Schwarz <axelkop...@emailn.de> wrote:
>>
>> Dear Community,
>>
>> we're running a secured 3-node NiFi cluster on Java 8_u291 and Debian 7 and have been experiencing problems with load balancing since version 1.13.2.
>>
>> I'm fully aware of issue NIFI-8643 and tested a lot around it, but I have to say that this is not our problem. Mainly because the balance port never binds to localhost, but also because I implemented all workarounds under version 1.13.2 and have even tried version 1.14.0 by now, but load balancing still does not work.
>> What we experience is best described as "the primary node balances with itself"...
>>
>> What it does is open the balancing connections to its own IP instead of the IPs of the other two nodes. And the other two nodes don't open balancing connections at all.
>>
>> When executing "ss | grep 6342" on the primary node, this is what it looks like:
>>
>> [root@nifiHost1 conf]# ss | grep 6342
>> tcp ESTAB 0 0 192.168.1.10:51380 192.168.1.10:6342
>> tcp ESTAB 0 0 192.168.1.10:51376 192.168.1.10:6342
>> tcp ESTAB 0 0 192.168.1.10:51378 192.168.1.10:6342
>> tcp ESTAB 0 0 192.168.1.10:51370 192.168.1.10:6342
>> tcp ESTAB 0 0 192.168.1.10:51372 192.168.1.10:6342
>> tcp ESTAB 0 0 192.168.1.10:6342  192.168.1.10:51376
>> tcp ESTAB 0 0 192.168.1.10:51374 192.168.1.10:6342
>> tcp ESTAB 0 0 192.168.1.10:6342  192.168.1.10:51374
>> tcp ESTAB 0 0 192.168.1.10:51366 192.168.1.10:6342
>> tcp ESTAB 0 0 192.168.1.10:6342  192.168.1.10:51370
>> tcp ESTAB 0 0 192.168.1.10:6342  192.168.1.10:51366
>> tcp ESTAB 0 0 192.168.1.10:51368 192.168.1.10:6342
>> tcp ESTAB 0 0 192.168.1.10:6342  192.168.1.10:51372
>> tcp ESTAB 0 0 192.168.1.10:6342  192.168.1.10:51378
>> tcp ESTAB 0 0 192.168.1.10:6342  192.168.1.10:51368
>> tcp ESTAB 0 0 192.168.1.10:6342  192.168.1.10:51380
>>
>> Executing it on the other, non-primary nodes returns absolutely nothing.
>>
>> Netstat shows the following on each server:
>>
>> [root@nifiHost1 conf]# netstat -tulpn
>> Active Internet connections (only servers)
>> Proto Recv-Q Send-Q Local Address      Foreign Address  State   PID/Program name
>> tcp   0      0      192.168.1.10:6342  0.0.0.0:*        LISTEN  10352/java
>>
>> [root@nifiHost2 conf]# netstat -tulpn
>> Active Internet connections (only servers)
>> Proto Recv-Q Send-Q Local Address      Foreign Address  State   PID/Program name
>> tcp   0      0      192.168.1.11:6342  0.0.0.0:*        LISTEN  31562/java
>>
>> [root@nifiHost3 conf]# netstat -tulpn
>> Active Internet connections (only servers)
>> Proto Recv-Q Send-Q Local Address      Foreign Address  State   PID/Program name
>> tcp   0      0      192.168.1.12:6342  0.0.0.0:*        LISTEN  31685/java
>>
>> And here is what our load balancing properties look like:
>>
>> # cluster load balancing properties #
>> nifi.cluster.load.balance.host=nifiHost1.contoso.com
>> nifi.cluster.load.balance.address=0.0.0.0
>> nifi.cluster.load.balance.port=6342
>> nifi.cluster.load.balance.connections.per.node=4
>> nifi.cluster.load.balance.max.thread.count=8
>> nifi.cluster.load.balance.comms.timeout=30 sec
>>
>> When running NiFi in version 1.12.1 on the exact same setup in the exact same environment, load balancing is working absolutely fine.
>> There was a time when load balancing even worked in version 1.13.2. But I'm not able to reproduce this, and it just stopped working one day after some restart, without any property being changed.
>>
>> If any more information would be helpful, please let me know and I'll try to provide it as fast as possible.
>>
>> Sent with Emailn.de <https://www.emailn.de/> - Freemail
>>
>> * Unlimited storage
>> * Your own online office
>> * 24h best mail reception
>> * Spam protection, address book