Thanks again, Bryan. Just a quick follow-up question: does removing users.xml and authorizations.xml mean that we will need to re-create all users and groups that we had in the original standalone NiFi instance?
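One recurring pitfall in the authorizers.xml snippets quoted below is that the numbered properties ("Initial User Identity 1", "Node Identity 1", and so on) must each have a unique name; if a number is accidentally repeated, one of the identities is silently ignored. A quick stdlib-Python sanity check can be sketched as follows (the inline XML is a deliberately broken example, not a working config):

```python
import xml.etree.ElementTree as ET

# Deliberately broken example: "Node Identity 1" appears twice,
# so one of the two node identities would be lost.
SNIPPET = """
<accessPolicyProvider>
    <property name="Node Identity 1">CN=nifi-host-1, OU=NIFI</property>
    <property name="Node Identity 1">CN=nifi-host-2, OU=NIFI</property>
</accessPolicyProvider>
"""

def duplicate_property_names(xml_text):
    """Return property names that appear more than once in a provider block."""
    seen, dupes = set(), []
    for prop in ET.fromstring(xml_text).iter("property"):
        name = prop.get("name")
        if name in seen and name not in dupes:
            dupes.append(name)
        seen.add(name)
    return dupes

print(duplicate_property_names(SNIPPET))  # -> ['Node Identity 1']
```

Running this over each provider block before restarting saves a restart cycle, since NiFi only reports the downstream symptoms of a bad identity list.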
-----Original Message-----
From: Bryan Bende <bbe...@gmail.com>
Sent: Monday, October 22, 2018 12:48 PM
To: users@nifi.apache.org
Subject: Re: NiFi fails on cluster nodes

Sorry, I was confused when you said two 1-node clusters; I assumed they each had their own ZooKeeper. You don't need to run ZK on both nodes; you can create a 2-node cluster using the embedded ZK on the first node. This blog post shows how to set up a secure 2-node cluster: https://bryanbende.com/development/2016/08/17/apache-nifi-1-0-0-authorization-and-multi-tenancy

The only difference is that authorizers.xml has changed slightly, so instead of:

<authorizer>
    <identifier>file-provider</identifier>
    <class>org.apache.nifi.authorization.FileAuthorizer</class>
    <property name="Authorizations File">./conf/authorizations.xml</property>
    <property name="Users File">./conf/users.xml</property>
    <property name="Initial Admin Identity">CN=bbende, OU=ApacheNiFi</property>
    <property name="Legacy Authorized Users File"></property>
    <property name="Node Identity 1">CN=localhost, OU=NIFI</property>
</authorizer>

you need to add the users to the user-group-provider and then to the access-policy-provider...
<userGroupProvider>
    <identifier>file-user-group-provider</identifier>
    <class>org.apache.nifi.authorization.FileUserGroupProvider</class>
    <property name="Users File">./conf/users.xml</property>
    <property name="Legacy Authorized Users File"></property>
    <property name="Initial User Identity 1">CN=bbende, OU=Apache NiFi</property>
    <property name="Initial User Identity 2">CN=nifi-host-1, OU=NIFI</property>
    <property name="Initial User Identity 3">CN=nifi-host-2, OU=NIFI</property>
</userGroupProvider>

<accessPolicyProvider>
    <identifier>file-access-policy-provider</identifier>
    <class>org.apache.nifi.authorization.FileAccessPolicyProvider</class>
    <property name="User Group Provider">file-user-group-provider</property>
    <property name="Authorizations File">./conf/authorizations.xml</property>
    <property name="Initial Admin Identity">CN=bbende, OU=Apache NiFi</property>
    <property name="Legacy Authorized Users File"></property>
    <property name="Node Identity 1">CN=nifi-host-1, OU=NIFI</property>
    <property name="Node Identity 2">CN=nifi-host-2, OU=NIFI</property>
</accessPolicyProvider>

Also, whenever you change any config in authorizers.xml related to the file-based providers, you will need to remove users.xml and authorizations.xml.

On Mon, Oct 22, 2018 at 12:20 PM Saip, Alexander (NIH/CC/BTRIS) [C] <alexander.s...@nih.gov> wrote:
>
> Hi Bryan,
>
> At this point, we don't want to run ZooKeeper on both nodes (as far as I understand, it prefers an odd number of members in the ensemble). Actually, the ZooKeeper running on one of them sees both NiFi instances, but they don't talk to each other. When we try to make them do so by using a different authorizers.xml file, which is very much just a customized version of the "composite" example from the NiFi Admin Guide, then neither node is able to start at all, throwing the error I mentioned in my previous post.
>
> Are you saying that we have to run ZooKeeper on both nodes?
> BTW, do we still need
>
> nifi.login.identity.provider.configuration.file=./conf/login-identity-providers.xml
>
> in the nifi.properties file when we use that new authorizers.xml? I'm asking since we have the same LDAP authentication/authorization settings in the latter.
>
> Thank you,
>
> Alexander
>
> -----Original Message-----
> From: Bryan Bende <bbe...@gmail.com>
> Sent: Monday, October 22, 2018 11:55 AM
> To: users@nifi.apache.org
> Subject: Re: NiFi fails on cluster nodes
>
> If you are getting separate clusters, then each node is likely only using its own ZooKeeper and therefore doesn't know about the other node.
>
> In nifi.properties the ZK connect string would need to be something like nifi-node1-hostname:2181,nifi-node2-hostname:2181 and in zoo.properties you would need entries for both ZooKeepers:
>
> server.1=nifi-node1-hostname:2888:3888
> server.2=nifi-node2-hostname:2888:3888
>
> On Mon, Oct 22, 2018 at 11:28 AM Saip, Alexander (NIH/CC/BTRIS) [C] <alexander.s...@nih.gov> wrote:
> >
> > I wonder if anyone has run into the same problem when trying to configure composite authentication/authorization (LDAP and local file)? When we use the "stand-alone" authorizers.xml file with the addition of two extra properties
> >
> > <property name="Node Identity 1">...
> > <property name="Node Identity 2">...
> >
> > and let ZooKeeper start on one of the nodes, we end up with two one-node clusters, since apparently, the NiFi instances don't talk to each other, but at least, they come alive...
> >
> > From: Saip, Alexander (NIH/CC/BTRIS) [C] <alexander.s...@nih.gov>
> > Sent: Friday, October 19, 2018 11:18 AM
> > To: users@nifi.apache.org
> > Subject: RE: NiFi fails on cluster nodes
> >
> > We have managed to get past that error by installing the CA cert in the truststore.
> > So, we can get a one-node cluster up and running. In order to add another node, I edited the authorizers.xml file, basically using the "example composite implementation loading users and groups from LDAP and a local file" from the Admin Guide as a template. When I re-started the node running ZooKeeper, though, it crashed with the following error written into the nifi-app.log file:
> >
> > 2018-10-19 08:09:26,992 ERROR [main] o.s.web.context.ContextLoader Context initialization failed
> > org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'org.springframework.security.config.annotation.web.configuration.WebSecurityConfiguration': Unsatisfied dependency expressed through method 'setFilterChainProxySecurityConfigurer' parameter 1; nested exception is org.springframework.beans.factory.BeanExpressionException: Expression parsing failed; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'org.apache.nifi.web.NiFiWebApiSecurityConfiguration': Unsatisfied dependency expressed through method 'setJwtAuthenticationProvider' parameter 0; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'jwtAuthenticationProvider' defined in class path resource [nifi-web-security-context.xml]: Cannot resolve reference to bean 'authorizer' while setting constructor argument; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'authorizer': FactoryBean threw exception on object creation; nested exception is java.lang.NullPointerException: Name is null
> >     at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor$AutowiredMethodElement.inject(AutowiredAnnotationBeanPostProcessor.java:667)
> >     at org.springframework.beans.factory.annotation.InjectionMetadata.inject(InjectionMetadata.java:88)
> >     at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor.postProcessPropertyValues(AutowiredAnnotationBeanPostProcessor.java:366)
> >     at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.populateBean(AbstractAutowireCapableBeanFactory.java:1264)
> >     at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:553)
> >     at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:483)
> >     at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:306)
> >     at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:230)
> >     at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:302)
> >     at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:197)
> >     at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:761)
> >     at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:867)
> >     at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:543)
> >     at org.springframework.web.context.ContextLoader.configureAndRefreshWebApplicationContext(ContextLoader.java:443)
> >     at org.springframework.web.context.ContextLoader.initWebApplicationContext(ContextLoader.java:325)
> >     at org.springframework.web.context.ContextLoaderListener.contextInitialized(ContextLoaderListener.java:107)
> >     at org.eclipse.jetty.server.handler.ContextHandler.callContextInitialized(ContextHandler.java:876)
> >     at org.eclipse.jetty.servlet.ServletContextHandler.callContextInitialized(ServletContextHandler.java:532)
> >     at org.eclipse.jetty.server.handler.ContextHandler.startContext(ContextHandler.java:839)
> >     at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:344)
> >     at org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1480)
> >     at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1442)
> >     at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:799)
> >     at org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:261)
> >     at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:540)
> >     at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
> >     at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:131)
> >     at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:113)
> >     at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
> >     at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
> >     at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:131)
> >     at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:105)
> >     at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
> >     at org.eclipse.jetty.server.handler.gzip.GzipHandler.doStart(GzipHandler.java:290)
> >     at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
> >     at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:131)
> >     at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:113)
> >     at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
> >     at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
> >     at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:131)
> >     at org.eclipse.jetty.server.Server.start(Server.java:452)
> >     at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:105)
> >     at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
> >     at org.eclipse.jetty.server.Server.doStart(Server.java:419)
> >     at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
> >     at org.apache.nifi.web.server.JettyServer.start(JettyServer.java:838)
> >     at org.apache.nifi.NiFi.<init>(NiFi.java:157)
> >     at org.apache.nifi.NiFi.<init>(NiFi.java:71)
> >     at org.apache.nifi.NiFi.main(NiFi.java:292)
> > Caused by: org.springframework.beans.factory.BeanExpressionException: Expression parsing failed; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'org.apache.nifi.web.NiFiWebApiSecurityConfiguration': Unsatisfied dependency expressed through method 'setJwtAuthenticationProvider' parameter 0; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'jwtAuthenticationProvider' defined in class path resource [nifi-web-security-context.xml]: Cannot resolve reference to bean 'authorizer' while setting constructor argument; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'authorizer': FactoryBean threw exception on object creation; nested exception is java.lang.NullPointerException: Name is null
> >     at org.springframework.context.expression.StandardBeanExpressionResolver.evaluate(StandardBeanExpressionResolver.java:164)
> >     at org.springframework.beans.factory.support.AbstractBeanFactory.evaluateBeanDefinitionString(AbstractBeanFactory.java:1448)
> >     at org.springframework.beans.factory.support.DefaultListableBeanFactory.doResolveDependency(DefaultListableBeanFactory.java:1088)
> >     at org.springframework.beans.factory.support.DefaultListableBeanFactory.resolveDependency(DefaultListableBeanFactory.java:1066)
> >     at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor$AutowiredMethodElement.inject(AutowiredAnnotationBeanPostProcessor.java:659)
> >     ... 48 common frames omitted
> > Caused by: org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'org.apache.nifi.web.NiFiWebApiSecurityConfiguration': Unsatisfied dependency expressed through method 'setJwtAuthenticationProvider' parameter 0; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'jwtAuthenticationProvider' defined in class path resource [nifi-web-security-context.xml]: Cannot resolve reference to bean 'authorizer' while setting constructor argument; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'authorizer': FactoryBean threw exception on object creation; nested exception is java.lang.NullPointerException: Name is null
> >     at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor$AutowiredMethodElement.inject(AutowiredAnnotationBeanPostProcessor.java:667)
> >     at org.springframework.beans.factory.annotation.InjectionMetadata.inject(InjectionMetadata.java:88)
> >     at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor.postProcessPropertyValues(AutowiredAnnotationBeanPostProcessor.java:366)
> >     at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.populateBean(AbstractAutowireCapableBeanFactory.java:1264)
> >     at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:553)
> >     at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:483)
> >     at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:306)
> >     at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:230)
> >     at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:302)
> >     at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:202)
> >     at org.springframework.beans.factory.support.DefaultListableBeanFactory.getBeansOfType(DefaultListableBeanFactory.java:519)
> >     at org.springframework.beans.factory.support.DefaultListableBeanFactory.getBeansOfType(DefaultListableBeanFactory.java:508)
> >     at org.springframework.security.config.annotation.web.configuration.AutowiredWebSecurityConfigurersIgnoreParents.getWebSecurityConfigurers(AutowiredWebSecurityConfigurersIgnoreParents.java:53)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >     at java.lang.reflect.Method.invoke(Method.java:498)
> >     at org.springframework.expression.spel.support.ReflectiveMethodExecutor.execute(ReflectiveMethodExecutor.java:113)
> >     at org.springframework.expression.spel.ast.MethodReference.getValueInternal(MethodReference.java:129)
> >     at org.springframework.expression.spel.ast.MethodReference.access$000(MethodReference.java:49)
> >     at org.springframework.expression.spel.ast.MethodReference$MethodValueRef.getValue(MethodReference.java:347)
> >     at org.springframework.expression.spel.ast.CompoundExpression.getValueInternal(CompoundExpression.java:88)
> >     at org.springframework.expression.spel.ast.SpelNodeImpl.getValue(SpelNodeImpl.java:120)
> >     at org.springframework.expression.spel.standard.SpelExpression.getValue(SpelExpression.java:262)
> >     at org.springframework.context.expression.StandardBeanExpressionResolver.evaluate(StandardBeanExpressionResolver.java:161)
> >     ... 52 common frames omitted
> >
> > I tried to Google for possible clues, but so far, there hasn't been any luck...
> >
> > -----Original Message-----
> > From: Bryan Bende <bbe...@gmail.com>
> > Sent: Monday, October 15, 2018 10:27 AM
> > To: users@nifi.apache.org
> > Subject: Re: NiFi fails on cluster nodes
> >
> > I'm not really sure; the error message is indicating that either a certificate was not sent during cluster communications, or possibly the cert was not valid/trusted.
> >
> > In this case, since it is only 1 node, it is the same node talking back to itself, so the only parts involved here are the keystore and truststore of that node, and the config in nifi.properties.
> >
> > Maybe your truststore is not set up correctly to trust certs signed by the CA that created the server cert?
> >
> > On Mon, Oct 15, 2018 at 9:53 AM Saip, Alexander (NIH/CC/BTRIS) [C] <alexander.s...@nih.gov> wrote:
> > >
> > > Yes, 'nifi.cluster.protocol.is.secure' is set to 'true', since otherwise, NiFi would require values for 'nifi.web.http.host' and 'nifi.web.http.port'. We have a cert that is used to serve HTTPS requests to the NiFi web UI, and it works just fine.
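Since the exchange here turns on whether the cluster-protocol TLS settings are complete, a small lint-style check of the relevant nifi.properties keys may help. The property keys below are NiFi's actual nifi.properties names; the sample values are placeholders, and this is only a sketch of the idea, not an official tool:

```python
# Keys NiFi needs when nifi.cluster.protocol.is.secure=true.
REQUIRED_WHEN_SECURE = [
    "nifi.security.keystore",
    "nifi.security.keystoreType",
    "nifi.security.keystorePasswd",
    "nifi.security.truststore",
    "nifi.security.truststoreType",
    "nifi.security.truststorePasswd",
]

def parse_props(text):
    """Parse Java-properties-style 'key=value' lines into a dict."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            props[key.strip()] = value.strip()
    return props

def missing_tls_props(props):
    """Return required TLS keys that are unset or empty when clustering is secure."""
    if props.get("nifi.cluster.protocol.is.secure") != "true":
        return []
    return [k for k in REQUIRED_WHEN_SECURE if not props.get(k)]

sample = """
nifi.cluster.protocol.is.secure=true
nifi.security.keystore=./conf/keystore.jks
nifi.security.keystoreType=JKS
nifi.security.keystorePasswd=changeit
nifi.security.truststore=
nifi.security.truststoreType=JKS
nifi.security.truststorePasswd=changeit
"""

print(missing_tls_props(parse_props(sample)))  # -> ['nifi.security.truststore']
```

An empty truststore path, as in the sample, is exactly the kind of gap that produces "bad_certificate" handshake failures between nodes.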
> > > -----Original Message-----
> > > From: Bryan Bende <bbe...@gmail.com>
> > > Sent: Monday, October 15, 2018 9:43 AM
> > > To: users@nifi.apache.org
> > > Subject: Re: NiFi fails on cluster nodes
> > >
> > > This is not related to ZooKeeper... I think you are missing something related to TLS/SSL configuration. Maybe you set the cluster protocol to be secure, but then didn't configure NiFi with a keystore/truststore?
> > >
> > > On Mon, Oct 15, 2018 at 9:41 AM Mike Thomsen <mikerthom...@gmail.com> wrote:
> > > >
> > > > Not sure what's going on here, but NiFi does not require a cert to set up ZooKeeper.
> > > >
> > > > Mike
> > > >
> > > > On Mon, Oct 15, 2018 at 9:39 AM Saip, Alexander (NIH/CC/BTRIS) [C] <alexander.s...@nih.gov> wrote:
> > > >>
> > > >> Hi Mike and Bryan,
> > > >>
> > > >> I've installed and started ZooKeeper 3.4.13 and re-started a single NiFi node so far. Here is the error from the NiFi log:
> > > >>
> > > >> 2018-10-15 09:19:48,371 ERROR [Process Cluster Protocol Request-1] o.a.nifi.security.util.CertificateUtils The incoming request did not contain client certificates and thus the DN cannot be extracted. Check that the other endpoint is providing a complete client certificate chain
> > > >> 2018-10-15 09:19:48,425 INFO [main] o.a.nifi.controller.StandardFlowService Connecting Node: 0.0.0.0:8008
> > > >> 2018-10-15 09:19:48,452 ERROR [Process Cluster Protocol Request-2] o.a.nifi.security.util.CertificateUtils The incoming request did not contain client certificates and thus the DN cannot be extracted. Check that the other endpoint is providing a complete client certificate chain
> > > >> 2018-10-15 09:19:48,456 WARN [main] o.a.nifi.controller.StandardFlowService Failed to connect to cluster due to: org.apache.nifi.cluster.protocol.ProtocolException: Failed marshalling 'CONNECTION_REQUEST' protocol message due to: javax.net.ssl.SSLHandshakeException: Received fatal alert: bad_certificate
> > > >>
> > > >> It is likely extraneous to NiFi, but does this mean that we need to install a cert into ZooKeeper? Right now, both apps are running on the same box.
> > > >>
> > > >> Thank you.
> > > >>
> > > >> From: Mike Thomsen <mikerthom...@gmail.com>
> > > >> Sent: Monday, October 15, 2018 9:02 AM
> > > >> To: users@nifi.apache.org
> > > >> Subject: Re: NiFi fails on cluster nodes
> > > >>
> > > >> http://nifi.apache.org/docs/nifi-docs/html/administration-guide.html
> > > >>
> > > >> See the properties that start with "nifi.zookeeper."
> > > >>
> > > >> On Mon, Oct 15, 2018 at 8:58 AM Saip, Alexander (NIH/CC/BTRIS) [C] <alexander.s...@nih.gov> wrote:
> > > >>
> > > >> Mike,
> > > >>
> > > >> I wonder if you could point me to instructions on how to configure a cluster with an external instance of ZooKeeper? The NiFi Admin Guide talks exclusively about the embedded one.
> > > >>
> > > >> Thanks again.
> > > >>
> > > >> From: Mike Thomsen <mikerthom...@gmail.com>
> > > >> Sent: Friday, October 12, 2018 10:17 AM
> > > >> To: users@nifi.apache.org
> > > >> Subject: Re: NiFi fails on cluster nodes
> > > >>
> > > >> It very well could become a problem down the road. The reason ZooKeeper is usually on a dedicated machine is that you want it to have enough resources to always communicate within a quorum to reconcile configuration changes and feed configuration details to clients.
> > > >>
> > > >> That particular message is just a warning. From what I can tell, it's just telling you that no cluster coordinator has been elected and it's going to try to do something about that. It's usually a problem with embedded ZooKeeper because each node by default points to the version of ZooKeeper it fires up.
> > > >>
> > > >> For a development environment, a VM with 2 GB of RAM and 1-2 CPU cores should be enough to run an external ZooKeeper.
> > > >>
> > > >> On Fri, Oct 12, 2018 at 9:47 AM Saip, Alexander (NIH/CC/BTRIS) [C] <alexander.s...@nih.gov> wrote:
> > > >>
> > > >> Thanks, Mike. We will get an external ZooKeeper instance deployed. I guess co-locating it with one of the NiFi nodes shouldn't be an issue, or will it? We are chronically short of hardware. BTW, does the following message in the logs point to some sort of problem with the embedded ZooKeeper?
> > > >>
> > > >> 2018-10-12 08:21:35,838 WARN [main] o.a.nifi.controller.StandardFlowService There is currently no Cluster Coordinator. This often happens upon restart of NiFi when running an embedded ZooKeeper. Will register this node to become the active Cluster Coordinator and will attempt to connect to cluster again
> > > >> 2018-10-12 08:21:35,838 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=false] Attempted to register Leader Election for role 'Cluster Coordinator' but this role is already registered
> > > >> 2018-10-12 08:21:42,090 INFO [Curator-Framework-0] o.a.c.f.state.ConnectionStateManager State change: SUSPENDED
> > > >> 2018-10-12 08:21:42,092 INFO [Curator-ConnectionStateManager-0] o.a.n.c.l.e.CuratorLeaderElectionManager org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@17900f5b Connection State changed to SUSPENDED
> > > >>
> > > >> From: Mike Thomsen <mikerthom...@gmail.com>
> > > >> Sent: Friday, October 12, 2018 8:33 AM
> > > >> To: users@nifi.apache.org
> > > >> Subject: Re: NiFi fails on cluster nodes
> > > >>
> > > >> Also, in a production environment NiFi should have its own dedicated ZooKeeper cluster, to be on the safe side. You should not reuse ZooKeeper quora (e.g. have HBase and NiFi point to the same quorum).
> > > >>
> > > >> On Fri, Oct 12, 2018 at 8:29 AM Mike Thomsen <mikerthom...@gmail.com> wrote:
> > > >>
> > > >> Alexander,
> > > >>
> > > >> I am pretty sure your problem is here: nifi.state.management.embedded.zookeeper.start=true
> > > >>
> > > >> That spins up an embedded ZooKeeper, which is generally intended to be used for local development. For example, HBase provides the same feature, but it is intended to let you test a real HBase client application against a single node of HBase running locally.
> > > >>
> > > >> What you need to try is these steps:
> > > >>
> > > >> 1. Set up an external ZooKeeper instance (or set up 3 in a quorum; it must be an odd number).
> > > >> 2. Update nifi.properties on each node to use the external ZooKeeper setup.
> > > >> 3. Restart all of them.
> > > >>
> > > >> See if that works.
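Step 2 of those instructions, pointing each node's nifi.properties at the external ZooKeeper, can be sketched as below. The two property keys are NiFi's real configuration names; the connect string is a placeholder, and this is only an illustrative helper, not part of NiFi:

```python
# Sketch: rewrite a nifi.properties text so the node stops starting the
# embedded ZooKeeper and points at an external ensemble instead.
def point_at_external_zk(props_text, connect_string):
    updates = {
        "nifi.state.management.embedded.zookeeper.start": "false",
        "nifi.zookeeper.connect.string": connect_string,
    }
    out = []
    for line in props_text.splitlines():
        key = line.split("=", 1)[0]
        out.append(f"{key}={updates.pop(key)}" if key in updates else line)
    out.extend(f"{k}={v}" for k, v in updates.items())  # add any keys that were absent
    return "\n".join(out)

before = "nifi.state.management.embedded.zookeeper.start=true\nnifi.zookeeper.connect.string="
print(point_at_external_zk(before, "zk-host1:2181,zk-host2:2181,zk-host3:2181"))
```

Running the same rewrite on every node keeps the connect string identical across the cluster, which is what lets the nodes find one ZooKeeper instead of two.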
> > > >>
> > > >> Mike
> > > >>
> > > >> On Fri, Oct 12, 2018 at 8:13 AM Saip, Alexander (NIH/CC/BTRIS) [C] <alexander.s...@nih.gov> wrote:
> > > >>
> > > >> nifi.cluster.node.protocol.port=11443 by default on all nodes; I haven't touched that property. Yesterday, we discovered some issues preventing two of the boxes from communicating. Now, they can talk okay. Ports 11443, 2181 and 3888 are explicitly open in iptables, but clustering still doesn't happen. The log files are filled up with errors like this:
> > > >>
> > > >> 2018-10-12 07:59:08,494 ERROR [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl Background operation retry gave up
> > > >> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
> > > >>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
> > > >>     at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:728)
> > > >>     at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:857)
> > > >>     at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:809)
> > > >>     at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:64)
> > > >>     at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:267)
> > > >>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> > > >>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> > > >>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> > > >>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> > > >>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> > > >>     at java.lang.Thread.run(Thread.java:748)
> > > >>
> > > >> Is there anything else we should check?
> > > >>
> > > >> From: Nathan Gough <thena...@gmail.com>
> > > >> Sent: Thursday, October 11, 2018 9:12 AM
> > > >> To: users@nifi.apache.org
> > > >> Subject: Re: NiFi fails on cluster nodes
> > > >>
> > > >> You may also need to explicitly open 'nifi.cluster.node.protocol.port' on all nodes to allow cluster communication for cluster heartbeats etc.
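One quick way to act on that advice is to test each port with a TCP connect rather than a plain ping, since ICMP succeeding says nothing about whether iptables allows a given TCP port. A minimal sketch (the hostname is a placeholder from this thread, not a real machine; the ports are the ones discussed above):

```python
import socket

# Check TCP reachability of the cluster-protocol port (11443) and the
# ZooKeeper client/quorum/election ports (2181, 2888, 3888).
def port_open(host, port, timeout=2.0):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or unresolvable host
        return False

for port in (11443, 2181, 2888, 3888):
    print(port, port_open("nifi-node1-hostname", port))
```

Running this from each node against every other node quickly shows whether the firewall rules actually took effect.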
From: ashmeet kandhari <ashmeetkandhar...@gmail.com>
Reply-To: <users@nifi.apache.org>
Date: Thursday, October 11, 2018 at 9:09 AM
To: <users@nifi.apache.org>
Subject: Re: NiFi fails on cluster nodes

Hi Alexander,

Can you verify the three nodes with a TCP ping, or run NiFi in standalone mode and check that each node can ping the other two servers, just to be sure they can communicate with one another?

On Thu, Oct 11, 2018 at 11:49 AM Saip, Alexander (NIH/CC/BTRIS) [C] <alexander.s...@nih.gov> wrote:

How do I do that? The nifi.properties file on each node includes 'nifi.state.management.embedded.zookeeper.start=true', so I assume ZooKeeper does start.

-----Original Message-----
From: ashmeet kandhari <ashmeetkandhar...@gmail.com>
Sent: Thursday, October 11, 2018 4:36 AM
To: users@nifi.apache.org
Subject: Re: NiFi fails on cluster nodes

Can you see if the ZooKeeper node is up and running and can connect to the NiFi nodes?

On Wed, Oct 10, 2018 at 7:34 PM Saip, Alexander (NIH/CC/BTRIS) [C] <alexander.s...@nih.gov> wrote:

Hello,

We have three NiFi 1.7.1 nodes originally configured as independent instances, each on its own server. There is no firewall between them.
When I tried to build a cluster following instructions here, NiFi failed to start on all of them, despite the fact that I even set nifi.cluster.protocol.is.secure=false in the nifi.properties file on each node. Here is the error in the log files:

2018-10-10 13:57:07,506 INFO [main] org.apache.nifi.NiFi Launching NiFi...
2018-10-10 13:57:07,745 INFO [main] o.a.nifi.properties.NiFiPropertiesLoader Determined default nifi.properties path to be '/opt/nifi-1.7.1/./conf/nifi.properties'
2018-10-10 13:57:07,748 INFO [main] o.a.nifi.properties.NiFiPropertiesLoader Loaded 125 properties from /opt/nifi-1.7.1/./conf/nifi.properties
2018-10-10 13:57:07,755 INFO [main] org.apache.nifi.NiFi Loaded 125 properties
2018-10-10 13:57:07,762 INFO [main] org.apache.nifi.BootstrapListener Started Bootstrap Listener, Listening for incoming requests on port 43744
2018-10-10 13:59:15,056 ERROR [main] org.apache.nifi.NiFi Failure to launch NiFi due to java.net.ConnectException: Connection timed out (Connection timed out)
java.net.ConnectException: Connection timed out (Connection timed out)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at java.net.Socket.connect(Socket.java:538)
        at org.apache.nifi.BootstrapListener.sendCommand(BootstrapListener.java:100)
        at org.apache.nifi.BootstrapListener.start(BootstrapListener.java:83)
        at org.apache.nifi.NiFi.<init>(NiFi.java:102)
        at org.apache.nifi.NiFi.<init>(NiFi.java:71)
        at org.apache.nifi.NiFi.main(NiFi.java:292)
2018-10-10 13:59:15,058 INFO [Thread-1] org.apache.nifi.NiFi Initiating shutdown of Jetty web server...
2018-10-10 13:59:15,059 INFO [Thread-1] org.apache.nifi.NiFi Jetty web server shutdown completed (nicely or otherwise).

Without clustering, the instances had no problem starting. Since this is our first experiment building a cluster, I'm not sure where to look for clues.

Thanks in advance,

Alexander
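Since the Curator errors in this thread all reduce to ConnectionLoss against ZooKeeper, it can help to ask the ZooKeeper instance directly whether it is serving. ZooKeeper answers the four-letter command `ruok` on its client port (2181 by default) with `imok`. This is an illustrative sketch, not from the thread; the hostname is a placeholder for the node running the embedded ZooKeeper:

```python
import socket

def zk_four_letter(host, port=2181, cmd=b"ruok", timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd)
        sock.shutdown(socket.SHUT_WR)  # signal end of request
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")

if __name__ == "__main__":
    # Placeholder host: the node with nifi.state.management.embedded.zookeeper.start=true
    reply = zk_four_letter("nifi-node-1")
    print("healthy" if reply.strip() == "imok" else f"unexpected reply: {reply!r}")
```

The `stat` command instead of `ruok` additionally reports whether that server is the leader or a follower, which is useful when checking an ensemble rather than a single embedded instance.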