Re: putsolrcontentstream and kerberos

2018-10-12 Thread Bryan Bende
I've only done it against SolrCloud, but I don't know of a reason why it
wouldn't work against standalone Solr.

Nothing is jumping out at me as being wrong with your config. My JAAS
config was the following:

SolrJClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/Users/bbende/Projects/docker-kdc/krb5.keytab"
  storeKey=true
  useTicketCache=false
  debug=true
  principal="n...@solr.org";
};

Can you successfully access Solr from a curl command?

curl --negotiate -u : "http://vmsol001.epg.nam.gm.com:8983/solr/collection1/select"

Assuming you did a kinit first.
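
For reference, a kinit using the keytab and principal from a JAAS entry like
the one above would look roughly like this (the path and principal here are
just illustrative):

kinit -kt /Users/bbende/Projects/docker-kdc/krb5.keytab n...@solr.org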

On Fri, Oct 12, 2018 at 2:48 PM Dan Caulfield  wrote:
>
> I'm attempting to use PutSolrContentStream (version 1.6.0) to load a JSON
> file into a Solr 6.3 cluster that requires Kerberos authentication.
>
> I have set -Djava.security.auth.login.config=/disk-1/nifi/jaas/jaas.conf
>
> And the JAAS file looks like this:
>
> MicroServicesSolrClient {
>   com.sun.security.auth.module.Krb5LoginModule required
>   useKeyTab=true
>   storeKey=true
>   keyTab="/disk-1/nifi/keytabs/MSSolrClient"
>   serviceName="solr"
>   principal="mssolru...@x.gm.com";
> };
>
> GMASTSolrClient {
>   com.sun.security.auth.module.Krb5LoginModule required
>   useKeyTab=true
>   storeKey=true
>   useTicketCache=true
>   debug=true
>   keyTab="/disk-1/nifi/keytabs/GMSolrClient"
>   serviceName="solr"
>   principal="gmsolru...@x.gm.com";
> };
>
> The processor is set to:
>
> Solr Type - Standard
> Solr Location - http://vmsol001.epg.nam.gm.com:8983/solr/collection1/
> Content Stream Path - /update/json/docs
> Content-Type - application/json
> JAAS Client App Name - GMASTSolrClient
>
> Does the 1.6 version of PutSolrContentStream support Kerberos? We are
> getting a 401 authentication error:
>
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
> from server at http://vmsol001.epg.nam.gm.com:8983/solr: Expected mime type
> application/octet-stream but got text/html.
>
> Error 401 Authentication required
> HTTP ERROR 401
> Problem accessing /solr/select. Reason:
> Authentication required
>
> [full error output and stack trace in the original message below]

putsolrcontentstream and kerberos

2018-10-12 Thread Dan Caulfield
I'm attempting to use PutSolrContentStream (version 1.6.0) to load a JSON
file into a Solr 6.3 cluster that requires Kerberos authentication.

I have set -Djava.security.auth.login.config=/disk-1/nifi/jaas/jaas.conf
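
In NiFi, that system property is typically supplied through conf/bootstrap.conf
as an additional java.arg entry; a minimal sketch, assuming the arg index 20 is
unused:

java.arg.20=-Djava.security.auth.login.config=/disk-1/nifi/jaas/jaas.conf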

And the JAAS file looks like this:
MicroServicesSolrClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  storeKey=true
  keyTab="/disk-1/nifi/keytabs/MSSolrClient"
  serviceName="solr"
  principal="mssolru...@x.gm.com";
};
GMASTSolrClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  storeKey=true
  useTicketCache=true
  debug=true
  keyTab="/disk-1/nifi/keytabs/GMSolrClient"
  serviceName="solr"
  principal="gmsolru...@x.gm.com";
};

The Processor is set to -
Solr Type - Standard
Solr Location - http://vmsol001.epg.nam.gm.com:8983/solr/collection1/
Content Stream Path - /update/json/docs
Content-Type - application/json
JAAS Client App Name - GMASTSolrClient

Does the 1.6 version of PutSolrContentStream support Kerberos? We are
getting a 401 authentication error:


198730787914459,size=119100] to Solr due to
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
from server at http://vmsol001.epg.nam.gm.com:8983/solr: Expected mime type
application/octet-stream but got text/html.

Error 401 Authentication required
HTTP ERROR 401
Problem accessing /solr/select. Reason:
Authentication required

; routing to failure:
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
from server at http://vmsol001.epg.nam.gm.com:8983/solr: Expected mime type
application/octet-stream but got text/html.

Error 401 Authentication required
HTTP ERROR 401
Problem accessing /solr/select. Reason:
Authentication required

org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
from server at http://vmsol001.epg.nam.gm.com:8983/solr: Expected mime type
application/octet-stream but got text/html.

Error 401 Authentication required
HTTP ERROR 401
Problem accessing /solr/select. Reason:
Authentication required


    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:560)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:261)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:250)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:166)
    at org.apache.nifi.processors.solr.PutSolrContentStream$1.process(PutSolrContentStream.java:242)
    at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2207)
    at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2175)
    at org.apache.nifi.processors.solr.PutSolrContentStream.onTrigger(PutSolrContentStream.java:199)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1147)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:175)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Dan Caulfield
Hyper Scale Engineer
GM - NA Information Technology - Hyper Scale Data Solutions
dan.caulfi...@gm.com




Whitelisting Proxy Host values in a Container Environment?

2018-10-12 Thread Jon Logan
We are running into issues with NiFi not allowing secure connections in a
container due to the proxy. The only documentation we've found on this
involves whitelisting specific proxy addresses. Is this the only solution?
Specifically, we're concerned that we don't know the proxy address ahead of
time to whitelist -- the port is arbitrary and assigned at runtime, and the
proxy name could be any of the nodes of our Kubernetes cluster.
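
For context, the whitelist in question is the nifi.web.proxy.host property in
nifi.properties; a sketch with placeholder hostnames and ports, since every
host:port combination the proxy might present has to be listed:

nifi.web.proxy.host=node-1.k8s.local:30443,node-2.k8s.local:30443,node-3.k8s.local:30443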

Are we missing something?


Thanks!


Re: How do I logout of NiFi UI

2018-10-12 Thread Joe Witt
Vijay

What mechanism are you using to log in to NiFi?

Thanks
On Fri, Oct 12, 2018 at 10:37 AM Vijay Chhipa  wrote:
>
> Hello
>
> I can't find the 'logout' link on the canvas, under my profile name, or in
> the hamburger menu.
> What's the recommended way to log out of Apache NiFi?
>
> Can you please point me to this link on a screenshot, or if it's somewhere
> in the docs, a pointer to that would be highly appreciated.
>
> Thanks
>


How do I logout of NiFi UI

2018-10-12 Thread Vijay Chhipa
Hello 

I can't find the 'logout' link on the canvas, under my profile name, or in the
hamburger menu.
What's the recommended way to log out of Apache NiFi?

Can you please point me to this link on a screenshot, or if it's somewhere in
the docs, a pointer to that would be highly appreciated.

Thanks



RE: [EXT] Back Pressure on Process Group

2018-10-12 Thread John McGinn
Hi Lee,

Thanks for the link to that thread; it was one of the ones I had gone through
before posting here. I don't think it works for my use case, though, since I
have a JSON string coming in to both the Process Group and the MergeContent.
I modify the flow file in the process group to add attributes, and when it
exits and gets into the MergeContent, the JSON flow file content is now
duplicated. Oddly, the back pressure also didn't seem to work right: while I
was testing things, I had a lot of flow files coming into my process group.

This thread, 
http://apache-nifi.1125220.n5.nabble.com/Wait-only-if-flagged-td19925.html
though, talks about the same situation, which is where Koji mentions his
Wait/Notify example at 
https://gist.github.com/ijokarumawak/9e1a4855934f2bb9661f88ca625bd244, 
but as he states on that page, "The tricky part is how to setup the initial
state in a DistributedMapCache in order to pass the first incoming
FlowFile."

I also wondered if a property could be added to the ConnectionDTO to allow
the backpressure queue to be applied to the entire object that the connection
points to, in this case a Process Group. Since the Connection knows the
'Within Group' of both sides of the connection, it could identify the Process
Group and check whether there are any flowfiles in it. This could be
problematic, though, for a Process Group with multiple Input Ports, or for
the code that would have to identify the paths from the Input Port to any
Output Ports to determine the back pressure information. I'm not sure whether
this kind of suggestion conforms to the NiFi paradigm of flow processing and
doing one thing well.

I'm going to try a DistributedCacheService, along the lines of Wait/Notify:
use the not-found result of the FetchCache, then RouteOnAttribute based on
that value to determine whether we can proceed or go to a Wait processor, and
then use PutCache at the end of the Process Group (or groups of processors,
even), possibly with a Notify, and see how that might get me forward.
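
For reference, the Wait/Notify pairing discussed in this thread hinges on two
properties on each processor; a sketch, with an illustrative signal key:

Wait:
  Release Signal Identifier:  ${office.key}
  Distributed Cache Service:  DistributedMapCacheClientService
Notify:
  Release Signal Identifier:  ${office.key}
  Distributed Cache Service:  DistributedMapCacheClientService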

Thanks,
John


On Thu, 10/11/18, Lee Laim (leelaim)  wrote:

 Subject: RE: [EXT] Back Pressure on Process Group
 To: "users@nifi.apache.org" , "John McGinn" 
 Date: Thursday, October 11, 2018, 3:49 PM
 
 Hi John,

 You can send the initiating flowfile into the Process Group and
 simultaneously send a duplicate flowfile around the process group into a
 MergeContent processor. This duplicate will act as a back pressure "latch".

 Upon successful exit of the process group, MergeContent with a strict
 correlation strategy will clear the back pressure latch, allowing the next
 flowfile into the group.

 Personally, I think Wait/Notify is the more elegant solution, but have
 successfully used the back pressure latch before Wait/Notify was readily
 available.

 This thread might offer some additional insight:
 http://apache-nifi.1125220.n5.nabble.com/Having-a-processor-wait-for-all-inputs-td15614.html
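
 Roughly, the shape of the latch (a sketch -- the "duplicate" is just a second
 connection drawn from the same upstream relationship):

   incoming --+--> Process Group --> MergeContent (correlate) --> downstream
              |                            ^
              +-------- duplicate ---------+   <- this queue acts as the latch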
 
 
 Thanks,
 Lee
 
 
 -Original Message-
 From: John McGinn [mailto:amruginn-n...@yahoo.com]
 Sent: Thursday, October 11, 2018 12:17 PM
 To: users@nifi.apache.org
 Subject: [EXT] Back Pressure on Process Group
 
 I've been going through mailing list archives, and looking at blog posts
 around Wait/Notify, and these don't seem to be the solution for my use case.

 My basic use case is as follows. I have 4 DB tables, 3 of which are id/name
 pairs (office name, city, state), and the 4th table joins the 3 ids together
 to a new id which is used elsewhere in the database.

 Using NiFi to ingest data from a different database system, we have to
 verify whether that office is active, and if it isn't active or
 non-existent, create a new record, as well as any of the other 3 tables
 necessary.

 The first step, then, is to join the 4 tables together to search for the
 name fields, and if the join comes back with a row, use that top level id as
 an attribute. No problem, works fine. (FetchDatabaseTable -> AvroToJson ->
 EvaluateJsonPath, etc.) If the join comes back empty, I need to insert rows
 for the 3 pieces and then join them together. Ideally, this would be a flow
 of 3 PutSqls, then a connection back to the top level search of the
 database. (Currently I'm using a modified custom processor,
 LookupAttributeFromSQL, that Brett Ryan did on January 18th, before he
 worked on a SQLLookupService.)
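
 A simplified version of that lookup join, with illustrative table and column
 names (the real schema differs):

 SELECT j.id
 FROM office_join j
 JOIN office o ON o.id = j.office_id
 JOIN city   c ON c.id = j.city_id
 JOIN state  s ON s.id = j.state_id
 WHERE o.name = ? AND c.name = ? AND s.name = ?;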
 
 The problem is that I could have 2 records coming in with the same pieces of
 information, and because it's flow based, the check for the 4 table join
 will come up empty on the second record before the first record is done
 creating the 4 table records. I've investigated the Wait/Notify pattern, but
 the odd part for me is that you need to have a separate "initialization" of
 the Wait/Notify release signal indicator
 (https://gist.github.com/ijokarumawak/9e1a4855934f2bb9661f88ca625bd244)
 and that seems "hack-ish" to me.
 
 With 

Re: NiFi fails on cluster nodes

2018-10-12 Thread Mike Thomsen
It very well could become a problem down the road. The reason ZooKeeper is
usually on a dedicated machine is that you want it to be able to have
enough resources to always communicate within a quorum to reconcile
configuration changes and feed configuration details to clients.

That particular message is just a warning message. From what I can tell,
it's just telling you that no cluster coordinator has been elected and it's
going to try to do something about that. It's usually a problem with
embedded ZooKeeper because each node by default points to the version of
ZooKeeper it fires up.

For a development environment, a VM with 2GB of RAM and 1-2 CPU cores
should be enough to run an external ZooKeeper.
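
A minimal standalone zoo.cfg for a box like that would be along these lines
(the data directory is a placeholder):

tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181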

On Fri, Oct 12, 2018 at 9:47 AM Saip, Alexander (NIH/CC/BTRIS) [C] <
alexander.s...@nih.gov> wrote:

> Thanks Mike. We will get an external ZooKeeper instance deployed. I guess
> co-locating it with one of the NiFi nodes shouldn’t be an issue, or will
> it? We are chronically short of hardware. BTW, does the following message
> in the logs point to some sort of problem with the embedded ZooKeeper?
>
>
>
> 2018-10-12 08:21:35,838 WARN [main]
> o.a.nifi.controller.StandardFlowService There is currently no Cluster
> Coordinator. This often happens upon restart of NiFi when running an
> embedded ZooKeeper. Will register this node to become the active Cluster
> Coordinator and will attempt to connect to cluster again
>
> 2018-10-12 08:21:35,838 INFO [main]
> o.a.n.c.l.e.CuratorLeaderElectionManager
> CuratorLeaderElectionManager[stopped=false] Attempted to register Leader
> Election for role 'Cluster Coordinator' but this role is already registered
>
> 2018-10-12 08:21:42,090 INFO [Curator-Framework-0]
> o.a.c.f.state.ConnectionStateManager State change: SUSPENDED
>
> 2018-10-12 08:21:42,092 INFO [Curator-ConnectionStateManager-0]
> o.a.n.c.l.e.CuratorLeaderElectionManager
> org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@17900f5b
> Connection State changed to SUSPENDED
>
>
>
> *From:* Mike Thomsen 
> *Sent:* Friday, October 12, 2018 8:33 AM
> *To:* users@nifi.apache.org
> *Subject:* Re: NiFi fails on cluster nodes
>
>
>
> Also, in a production environment NiFi should have its own dedicated
> ZooKeeper cluster to be on the safe side. You should not reuse ZooKeeper
> quora (ex. have HBase and NiFi point to the same quorum).
>
>
>
> On Fri, Oct 12, 2018 at 8:29 AM Mike Thomsen 
> wrote:
>
> Alexander,
>
>
>
> I am pretty sure your problem is here:
> *nifi.state.management.embedded.zookeeper.start=true*
>
>
>
> That spins up an embedded ZooKeeper, which is generally intended to be
> used for local development. For example, HBase provides the same feature,
> but it is intended to allow you to test a real HBase client application
> against a single node of HBase running locally.
>
>
>
> What you need to try is these steps:
>
>
>
> 1. Set up an external ZooKeeper instance (or set up 3 in a quorum; must be
> odd numbers)
>
> 2. Update nifi.properties on each node to use the external ZooKeeper setup.
>
> 3. Restart all of them.
>
>
>
> See if that works.
>
>
>
> Mike
>
>
>
> On Fri, Oct 12, 2018 at 8:13 AM Saip, Alexander (NIH/CC/BTRIS) [C] <
> alexander.s...@nih.gov> wrote:
>
> *nifi.cluster.node.protocol.port=11443* by default on all nodes, I
> haven’t touched that property. Yesterday, we discovered some issues
> preventing two of the boxes from communicating. Now, they can talk okay.
> Ports 11443, 2181 and 3888 are explicitly open in *iptables*, but
> clustering still doesn’t happen. The log files are filled up with errors
> like this:
>
>
>
> 2018-10-12 07:59:08,494 ERROR [Curator-Framework-0]
> o.a.c.f.imps.CuratorFrameworkImpl Background operation retry gave up
>
> org.apache.zookeeper.KeeperException$ConnectionLossException:
> KeeperErrorCode = ConnectionLoss
>
> at
> org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>
> at
> org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:728)
>
> at
> org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:857)
>
> at
> org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:809)
>
> at
> org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:64)
>
> at
> org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:267)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>
> 

Re: Nifi fetching files

2018-10-12 Thread Mike Thomsen
Another thing: you can also have the producing process add a period to the
start of the filename to hide the file until it's done being written, then
rename it, if you want to be extra safe.
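
Something like this on the producing side, with illustrative paths and a
placeholder for the real generation step:

tmp="/data/landing/.output-$$.json"
generate_output > "$tmp"                    # written under a dotted (hidden) name
mv "$tmp" "/data/landing/output-$$.json"    # rename once fully written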

On Fri, Oct 12, 2018 at 10:01 AM Aldrin Piri  wrote:

> Hi Tom,
>
> You can make use of the minimum file age property on ListFile to ignore a
> file until it has reached the desired 10 minute buffer.
>
> On Fri, Oct 12, 2018 at 9:59 AM Tomislav Novosel 
> wrote:
>
>> Hi Nifi team,
>>
>> My use case is to list and fetch files, using the ListFile and FetchFile
>> processors, from an intermediate folder which is also the destination of
>> my external .exe script.
>>
>> Fetching from that folder uses the completion strategy that deletes files.
>>
>> How can I wait for, let's say, 10 minutes before files are fetched from
>> that folder, to avoid a conflict between the two processes (NiFi and the
>> external script) and exceptions like "Access denied"?
>>
>> Thanks,
>> Tom
>>
>


Re: Nifi fetching files

2018-10-12 Thread Aldrin Piri
Hi Tom,

You can make use of the minimum file age property on ListFile to ignore a
file until it has reached the desired 10 minute buffer.
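
On ListFile that is a single property; for example (the directory is a
placeholder):

Input Directory:   /data/intermediate
Minimum File Age:  10 min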

On Fri, Oct 12, 2018 at 9:59 AM Tomislav Novosel 
wrote:

> Hi Nifi team,
>
> My use case is to list and fetch files, using the ListFile and FetchFile
> processors, from an intermediate folder which is also the destination of my
> external .exe script.
>
> Fetching from that folder uses the completion strategy that deletes files.
>
> How can I wait for, let's say, 10 minutes before files are fetched from that
> folder, to avoid a conflict between the two processes (NiFi and the external
> script) and exceptions like "Access denied"?
>
> Thanks,
> Tom
>


Nifi fetching files

2018-10-12 Thread Tomislav Novosel
Hi Nifi team,

My use case is to list and fetch files, using the ListFile and FetchFile
processors, from an intermediate folder which is also the destination of my
external .exe script.

Fetching from that folder uses the completion strategy that deletes files.

How can I wait for, let's say, 10 minutes before files are fetched from that
folder, to avoid a conflict between the two processes (NiFi and the external
script) and exceptions like "Access denied"?

Thanks,
Tom


RE: NiFi fails on cluster nodes

2018-10-12 Thread Saip, Alexander (NIH/CC/BTRIS) [C]
Thanks Mike. We will get an external ZooKeeper instance deployed. I guess 
co-locating it with one of the NiFi nodes shouldn’t be an issue, or will it? We 
are chronically short of hardware. BTW, does the following message in the logs 
point to some sort of problem with the embedded ZooKeeper?

2018-10-12 08:21:35,838 WARN [main] o.a.nifi.controller.StandardFlowService 
There is currently no Cluster Coordinator. This often happens upon restart of 
NiFi when running an embedded ZooKeeper. Will register this node to become the 
active Cluster Coordinator and will attempt to connect to cluster again
2018-10-12 08:21:35,838 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager 
CuratorLeaderElectionManager[stopped=false] Attempted to register Leader 
Election for role 'Cluster Coordinator' but this role is already registered
2018-10-12 08:21:42,090 INFO [Curator-Framework-0] 
o.a.c.f.state.ConnectionStateManager State change: SUSPENDED
2018-10-12 08:21:42,092 INFO [Curator-ConnectionStateManager-0] 
o.a.n.c.l.e.CuratorLeaderElectionManager 
org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener@17900f5b
 Connection State changed to SUSPENDED

From: Mike Thomsen 
Sent: Friday, October 12, 2018 8:33 AM
To: users@nifi.apache.org
Subject: Re: NiFi fails on cluster nodes

Also, in a production environment NiFi should have its own dedicated ZooKeeper 
cluster to be on the safe side. You should not reuse ZooKeeper quora (ex. have 
HBase and NiFi point to the same quorum).

On Fri, Oct 12, 2018 at 8:29 AM Mike Thomsen <mikerthom...@gmail.com> wrote:
Alexander,

I am pretty sure your problem is here: 
nifi.state.management.embedded.zookeeper.start=true

That spins up an embedded ZooKeeper, which is generally intended to be used for 
local development. For example, HBase provides the same feature, but it is 
intended to allow you to test a real HBase client application against a single 
node of HBase running locally.

What you need to try is these steps:

1. Set up an external ZooKeeper instance (or set up 3 in a quorum; must be odd 
numbers)
2. Update nifi.properties on each node to use the external ZooKeeper setup.
3. Restart all of them.

See if that works.

Mike

On Fri, Oct 12, 2018 at 8:13 AM Saip, Alexander (NIH/CC/BTRIS) [C] 
<alexander.s...@nih.gov> wrote:
nifi.cluster.node.protocol.port=11443 by default on all nodes, I haven’t 
touched that property. Yesterday, we discovered some issues preventing two of 
the boxes from communicating. Now, they can talk okay. Ports 11443, 2181 and 
3888 are explicitly open in iptables, but clustering still doesn’t happen. The 
log files are filled up with errors like this:

2018-10-12 07:59:08,494 ERROR [Curator-Framework-0] 
o.a.c.f.imps.CuratorFrameworkImpl Background operation retry gave up
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:728)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:857)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:809)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:64)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:267)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Is there anything else we should check?

From: Nathan Gough <thena...@gmail.com>
Sent: Thursday, October 11, 2018 9:12 AM
To: users@nifi.apache.org
Subject: Re: NiFi fails on cluster nodes

You may also need to explicitly open ‘nifi.cluster.node.protocol.port’ on all 
nodes to allow cluster communication for cluster heartbeats etc.

From: ashmeet kandhari <ashmeetkandhar...@gmail.com>
Reply-To: <users@nifi.apache.org>
Date: Thursday, October 11, 2018 at 9:09 AM
To: <users@nifi.apache.org>
Subject: Re: NiFi fails on cluster nodes

Hi Alexander,

Can you verify by pinging if the 3 nodes (tcp ping) or run nifi in standalone 
mode and see if you can ping them from other 2 servers just to be sure if they 
can communicate with one another.
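
A TCP-level check of the specific ports would be something like this (the
hostname is a placeholder):

nc -zv nifi-node2 11443
nc -zv nifi-node2 2181
nc -zv nifi-node2 3888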

On Thu, Oct 11, 2018 at 11:49 AM Saip, Alexander 

Re: NiFi fails on cluster nodes

2018-10-12 Thread Mike Thomsen
Also, in a production environment NiFi should have its own dedicated
ZooKeeper cluster to be on the safe side. You should not reuse ZooKeeper
quora (ex. have HBase and NiFi point to the same quorum).

On Fri, Oct 12, 2018 at 8:29 AM Mike Thomsen  wrote:

> Alexander,
>
> I am pretty sure your problem is here:
> *nifi.state.management.embedded.zookeeper.start=true*
>
> That spins up an embedded ZooKeeper, which is generally intended to be
> used for local development. For example, HBase provides the same feature,
> but it is intended to allow you to test a real HBase client application
> against a single node of HBase running locally.
>
> What you need to try is these steps:
>
> 1. Set up an external ZooKeeper instance (or set up 3 in a quorum; must be
> odd numbers)
> 2. Update nifi.properties on each node to use the external ZooKeeper setup.
> 3. Restart all of them.
>
> See if that works.
>
> Mike
>
> On Fri, Oct 12, 2018 at 8:13 AM Saip, Alexander (NIH/CC/BTRIS) [C] <
> alexander.s...@nih.gov> wrote:
>
>> *nifi.cluster.node.protocol.port=11443* by default on all nodes, I
>> haven’t touched that property. Yesterday, we discovered some issues
>> preventing two of the boxes from communicating. Now, they can talk okay.
>> Ports 11443, 2181 and 3888 are explicitly open in *iptables*, but
>> clustering still doesn’t happen. The log files are filled up with errors
>> like this:
>>
>>
>>
>> 2018-10-12 07:59:08,494 ERROR [Curator-Framework-0]
>> o.a.c.f.imps.CuratorFrameworkImpl Background operation retry gave up
>>
>> org.apache.zookeeper.KeeperException$ConnectionLossException:
>> KeeperErrorCode = ConnectionLoss
>>
>> at
>> org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>>
>> at
>> org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:728)
>>
>> at
>> org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:857)
>>
>> at
>> org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:809)
>>
>> at
>> org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:64)
>>
>> at
>> org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:267)
>>
>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>
>> at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>
>> at
>> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>
>> at java.lang.Thread.run(Thread.java:748)
>>
>>
>>
>> Is there anything else we should check?
>>
>>
>>
>> *From:* Nathan Gough 
>> *Sent:* Thursday, October 11, 2018 9:12 AM
>> *To:* users@nifi.apache.org
>> *Subject:* Re: NiFi fails on cluster nodes
>>
>>
>>
>> You may also need to explicitly open ‘nifi.cluster.node.protocol.port’ on
>> all nodes to allow cluster communication for cluster heartbeats etc.
>>
>>
>>
>> *From: *ashmeet kandhari 
>> *Reply-To: *
>> *Date: *Thursday, October 11, 2018 at 9:09 AM
>> *To: *
>> *Subject: *Re: NiFi fails on cluster nodes
>>
>>
>>
>> Hi Alexander,
>>
>>
>>
>> Can you verify by pinging if the 3 nodes (tcp ping) or run nifi in
>> standalone mode and see if you can ping them from other 2 servers just to
>> be sure if they can communicate with one another.
>>
>>
>>
>> On Thu, Oct 11, 2018 at 11:49 AM Saip, Alexander (NIH/CC/BTRIS) [C] <
>> alexander.s...@nih.gov> wrote:
>>
>> How do I do that? The *nifi.properties* file on each node includes ‘
>> *nifi.state.management.embedded.zookeeper.start=true’*, so I assume
>> Zookeeper does start.
>>
>>
>>
>> *From:* ashmeet kandhari 
>> *Sent:* Thursday, October 11, 2018 4:36 AM
>> *To:* users@nifi.apache.org
>> *Subject:* Re: NiFi fails on cluster nodes
>>
>>
>>
>> Can you see if zookeeper node is up and running and can connect to the
>> nifi nodes
>>
>>
>>
>> On Wed, Oct 10, 2018 at 7:34 PM Saip, Alexander (NIH/CC/BTRIS) [C] <
>> alexander.s...@nih.gov> wrote:
>>
>> Hello,
>>
>>
>>
>> We have three NiFi 1.7.1 nodes originally configured as independent
>> instances, each on its own server. There is no firewall between them. When
>> I tried to build a cluster following the instructions here,
>> NiFi failed to start on all of them, despite the fact that I even set *
>> nifi.cluster.protocol.is.secure=false* in the *nifi.properties* file on
>> each node. Here is the error in the log files:
>>
>>
>>
>> 2018-10-10 13:57:07,506 INFO [main] org.apache.nifi.NiFi Launching NiFi...
>>
>> 2018-10-10 13:57:07,745 INFO [main]

Re: NiFi fails on cluster nodes

2018-10-12 Thread Mike Thomsen
Alexander,

I am pretty sure your problem is here:
*nifi.state.management.embedded.zookeeper.start=true*

That spins up an embedded ZooKeeper, which is generally intended to be used
for local development. For example, HBase provides the same feature, but it
is intended to allow you to test a real HBase client application against a
single node of HBase running locally.

What you need to try is these steps:

1. Set up an external ZooKeeper instance (or set up 3 in a quorum; must be
odd numbers)
2. Update nifi.properties on each node to use the external ZooKeeper setup.
3. Restart all of them.
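
For step 2, the relevant nifi.properties changes would look roughly like this
(hostnames are placeholders; the same connect string also belongs in the
zk-provider element of conf/state-management.xml):

nifi.state.management.embedded.zookeeper.start=false
nifi.zookeeper.connect.string=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181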

See if that works.

Mike

On Fri, Oct 12, 2018 at 8:13 AM Saip, Alexander (NIH/CC/BTRIS) [C] <
alexander.s...@nih.gov> wrote:

> *nifi.cluster.node.protocol.port=11443* by default on all nodes, I
> haven’t touched that property. Yesterday, we discovered some issues
> preventing two of the boxes from communicating. Now, they can talk okay.
> Ports 11443, 2181 and 3888 are explicitly open in *iptables*, but
> clustering still doesn’t happen. The log files are filled up with errors
> like this:
>
>
>
> 2018-10-12 07:59:08,494 ERROR [Curator-Framework-0]
> o.a.c.f.imps.CuratorFrameworkImpl Background operation retry gave up
>
> org.apache.zookeeper.KeeperException$ConnectionLossException:
> KeeperErrorCode = ConnectionLoss
>
> at
> org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>
> at
> org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:728)
>
> at
> org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:857)
>
> at
> org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:809)
>
> at
> org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:64)
>
> at
> org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:267)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>
> at java.lang.Thread.run(Thread.java:748)
>
>
>
> Is there anything else we should check?
>
>
>
> *From:* Nathan Gough 
> *Sent:* Thursday, October 11, 2018 9:12 AM
> *To:* users@nifi.apache.org
> *Subject:* Re: NiFi fails on cluster nodes
>
>
>
> You may also need to explicitly open ‘nifi.cluster.node.protocol.port’ on
> all nodes to allow cluster communication for cluster heartbeats etc.
>
>
>
> *From: *ashmeet kandhari 
> *Reply-To: *
> *Date: *Thursday, October 11, 2018 at 9:09 AM
> *To: *
> *Subject: *Re: NiFi fails on cluster nodes
>
>
>
> Hi Alexander,
>
>
>
> Can you verify by pinging if the 3 nodes (tcp ping) or run nifi in
> standalone mode and see if you can ping them from other 2 servers just to
> be sure if they can communicate with one another.
>
>
>
> On Thu, Oct 11, 2018 at 11:49 AM Saip, Alexander (NIH/CC/BTRIS) [C] <
> alexander.s...@nih.gov> wrote:
>
> How do I do that? The *nifi.properties* file on each node includes ‘
> *nifi.state.management.embedded.zookeeper.start=true’*, so I assume
> Zookeeper does start.
>
>
>
> *From:* ashmeet kandhari 
> *Sent:* Thursday, October 11, 2018 4:36 AM
> *To:* users@nifi.apache.org
> *Subject:* Re: NiFi fails on cluster nodes
>
>
>
> Can you see if zookeeper node is up and running and can connect to the
> nifi nodes
>
>
>
> On Wed, Oct 10, 2018 at 7:34 PM Saip, Alexander (NIH/CC/BTRIS) [C] <
> alexander.s...@nih.gov> wrote:
>
> Hello,
>
>
>
> We have three NiFi 1.7.1 nodes originally configured as independent
> instances, each on its own server. There is no firewall between them. When
> I tried to build a cluster following the instructions here,
> NiFi failed to start on all of them, despite the fact that I even set *
> nifi.cluster.protocol.is.secure=false* in the *nifi.properties* file on
> each node. Here is the error in the log files:
>
>
>
> 2018-10-10 13:57:07,506 INFO [main] org.apache.nifi.NiFi Launching NiFi...
>
> 2018-10-10 13:57:07,745 INFO [main]
> o.a.nifi.properties.NiFiPropertiesLoader Determined default nifi.properties
> path to be '/opt/nifi-1.7.1/./conf/nifi.properties'
>
> 2018-10-10 13:57:07,748 INFO [main]
> o.a.nifi.properties.NiFiPropertiesLoader Loaded 125 properties from
> /opt/nifi-1.7.1/./conf/nifi.properties
>
> 2018-10-10 13:57:07,755 INFO [main] org.apache.nifi.NiFi Loaded 125
> properties
>
> 2018-10-10 13:57:07,762 INFO [main] 

RE: NiFi fails on cluster nodes

2018-10-12 Thread Saip, Alexander (NIH/CC/BTRIS) [C]
nifi.cluster.node.protocol.port=11443 by default on all nodes, I haven’t 
touched that property. Yesterday, we discovered some issues preventing two of 
the boxes from communicating. Now, they can talk okay. Ports 11443, 2181 and 
3888 are explicitly open in iptables, but clustering still doesn’t happen. The 
log files are filled up with errors like this:

2018-10-12 07:59:08,494 ERROR [Curator-Framework-0] 
o.a.c.f.imps.CuratorFrameworkImpl Background operation retry gave up
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:728)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:857)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:809)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:64)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:267)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Is there anything else we should check?

From: Nathan Gough 
Sent: Thursday, October 11, 2018 9:12 AM
To: users@nifi.apache.org
Subject: Re: NiFi fails on cluster nodes

You may also need to explicitly open ‘nifi.cluster.node.protocol.port’ on all 
nodes to allow cluster communication for cluster heartbeats etc.

From: ashmeet kandhari <ashmeetkandhar...@gmail.com>
Reply-To: <users@nifi.apache.org>
Date: Thursday, October 11, 2018 at 9:09 AM
To: <users@nifi.apache.org>
Subject: Re: NiFi fails on cluster nodes

Hi Alexander,

Can you verify by pinging if the 3 nodes (tcp ping) or run nifi in standalone 
mode and see if you can ping them from other 2 servers just to be sure if they 
can communicate with one another.

On Thu, Oct 11, 2018 at 11:49 AM Saip, Alexander (NIH/CC/BTRIS) [C] 
<alexander.s...@nih.gov> wrote:
How do I do that? The nifi.properties file on each node includes 
‘nifi.state.management.embedded.zookeeper.start=true’, so I assume Zookeeper 
does start.

From: ashmeet kandhari <ashmeetkandhar...@gmail.com>
Sent: Thursday, October 11, 2018 4:36 AM
To: users@nifi.apache.org
Subject: Re: NiFi fails on cluster nodes

Can you see if zookeeper node is up and running and can connect to the nifi 
nodes

On Wed, Oct 10, 2018 at 7:34 PM Saip, Alexander (NIH/CC/BTRIS) [C] 
<alexander.s...@nih.gov> wrote:
Hello,

We have three NiFi 1.7.1 nodes originally configured as independent instances, 
each on its own server. There is no firewall between them. When I tried to 
build a cluster following the instructions here, 
NiFi failed to start on all of them, despite the fact that I even set 
nifi.cluster.protocol.is.secure=false in the nifi.properties file on each node. 
Here is the error in the log files:

2018-10-10 13:57:07,506 INFO [main] org.apache.nifi.NiFi Launching NiFi...
2018-10-10 13:57:07,745 INFO [main] o.a.nifi.properties.NiFiPropertiesLoader 
Determined default nifi.properties path to be 
'/opt/nifi-1.7.1/./conf/nifi.properties'
2018-10-10 13:57:07,748 INFO [main] o.a.nifi.properties.NiFiPropertiesLoader 
Loaded 125 properties from /opt/nifi-1.7.1/./conf/nifi.properties
2018-10-10 13:57:07,755 INFO [main] org.apache.nifi.NiFi Loaded 125 properties
2018-10-10 13:57:07,762 INFO [main] org.apache.nifi.BootstrapListener Started 
Bootstrap Listener, Listening for incoming requests on port 43744
2018-10-10 13:59:15,056 ERROR [main] org.apache.nifi.NiFi Failure to launch 
NiFi due to java.net.ConnectException: Connection timed out (Connection timed 
out)
java.net.ConnectException: Connection timed out (Connection timed out)
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
at